Overview

This vignette presents a reproducible workflow for performing, processing, analyzing, and visualizing Molecular Dynamics (MD) simulations of vaccine–receptor complexes using GROMACS, R, APBS, and PyMOL.

The workflow spans the complete simulation pipeline, beginning with SLURM-based high-performance computing (HPC) simulation setup and continuing through trajectory processing, structural analysis, representative structure extraction, and electrostatic surface visualization.

The workflow includes:

This workflow is designed to support reproducible analysis of protein–protein interaction systems and can be adapted for additional molecular simulation studies.


Setup GROMACS Environment

conda create -n gromacs2025 python=3.11 cmake make gcc gxx fftw openmpi -c conda-forge

conda activate gromacs2025

Running MD Simulation Jobs on SLURM

The simulation stage of this workflow is designed to run on a SLURM-based high-performance computing (HPC) system. The simulation scripts are provided in the scripts/simulation/ directory and should be submitted using sbatch.

The workflow includes three main simulation stages:

  1. Minimization and equilibration
  2. Production run 1
  3. Production run 2 / continuation run

Submit Minimization and Equilibration

The minimization and equilibration script performs energy minimization followed by NVT and NPT equilibration.

From the repository root, submit the job using:

sbatch scripts/simulation/minimization_equilibration.sh

After submission, SLURM will return a job ID:

Submitted batch job 123456

Submit Production Run 1

After minimization and equilibration complete successfully, submit the first production run:

sbatch scripts/simulation/production_run1.sh

This script starts the first production MD simulation using the equilibrated system files.


Submit Production Run 2

After production run 1 completes successfully, submit the continuation run:

sbatch scripts/simulation/production_run2.sh

This script continues the simulation from the checkpoint or output files generated by production run 1.


Full Pipeline Submission

A full SLURM submission script is also provided at the repository root:

bash run_all.sh

This script submits the simulation and post-processing workflow using SLURM job dependencies.


Notes

SLURM options such as partition, wall time, number of nodes, number of tasks, and memory requirements should be adjusted according to the target HPC system.

The simulation scripts are intended for HPC execution and are not expected to run directly on a local laptop unless GROMACS and the required input files are configured locally.


Run the Full Analysis

The complete local analysis workflow can be run from the repository root using:

source("scripts/analysis/run_analysis.R")

Trajectory Combination and Processing

Multiple production trajectory segments were combined and processed prior to downstream analysis. This step generates a continuous trajectory and applies periodic boundary condition correction and centering for visualization and analysis.

From the repository root:

bash scripts/post_processing/combine_trajectories.sh

Workflow Summary

The script performs the following steps:

  1. Generates an index file using the input structure.
  2. Combines production trajectory segments into a single trajectory.
  3. Applies periodic boundary condition correction.
  4. Centers the system for downstream analysis.
  5. Generates processed trajectory files for full-system and protein-focused analyses.

GROMACS Commands

Create an index file:

gmx make_ndx -f step3_input.gro -o index.ndx

Combine trajectory segments:

gmx trjcat -f step5_production.xtc step5_production.part*.xtc -o step-all.xtc

Apply PBC correction and centering:

gmx trjconv -f step-all.xtc -s step5_production.tpr -n index.ndx \
  -pbc mol -ur compact -center -o out.xtc

Generate a protein-centered trajectory:

gmx trjconv -f step-all.xtc -s step5_production.tpr -n index.ndx \
  -pbc mol -ur compact -center -o out_prot.xtc

Output Files

  • index.ndx — GROMACS index file
  • step-all.xtc — combined production trajectory
  • out.xtc — PBC-corrected and centered trajectory
  • out_prot.xtc — protein-centered trajectory for downstream analysis

RMSD Analysis

Root Mean Square Deviation (RMSD) was used to evaluate the overall structural stability of the vaccine-receptor complex over time. RMSD measures the average positional deviation of selected atoms relative to a reference structure.

Lower and plateauing RMSD values generally suggest that the system has reached structural stability, while large or continuously increasing RMSD values may indicate conformational instability or major structural rearrangement.

GROMACS Command

gmx rms -f out.xtc -s step5_production.tpr -n index.ndx -o rmsd.xvg -tu ns

Output Plot

Figure: RMSD profile showing structural deviation of the vaccine-receptor complex over time.


Component-Specific Analysis: Separating Receptor and Vaccine Groups

For RMSF and radius of gyration (Rg) analyses, the receptor and vaccine components were separated into distinct index groups. This is important because the receptor and vaccine may differ substantially in size, residue numbering, flexibility, and structural compactness.

Analyzing the full complex alone can obscure component-specific behavior. For example, high flexibility in the vaccine construct may be masked by the larger receptor, while receptor compactness may dominate the radius of gyration profile of the full complex. Separating the components allows RMSF and Rg to be interpreted independently for each molecular partner.

Residue ranges were first confirmed in PyMOL by inspecting the receptor and vaccine sequences. These residue ranges were then used to create separate GROMACS index groups.

Create Receptor and Vaccine Index Groups

gmx make_ndx -f step5_production.tpr -n index.ndx -o index_separated.ndx

Within the interactive GROMACS prompt, enter l to list available residue and atom groups, then define the receptor and vaccine groups by residue index:

l
ri 1-279
name 16 Receptor
ri 280-387
name 17 Vaccine
q

The residue ranges shown here are system-specific and should be adjusted according to the receptor and vaccine residue numbering in each complex.


RMSF Analysis

Root Mean Square Fluctuation (RMSF) was used to evaluate residue-level flexibility throughout the MD simulation. RMSF identifies regions of the protein or vaccine construct that exhibit increased atomic displacement over time.

Higher RMSF values typically correspond to flexible loops, termini, or exposed regions, while lower RMSF values suggest more stable or structurally constrained residues.

GROMACS Commands

Calculate RMSF for the vaccine component:

echo "Vaccine" | gmx rmsf \
  -s step5_production.tpr \
  -f step-all.xtc \
  -o rmsf_vaccine.xvg \
  -res \
  -n index_separated.ndx

Calculate RMSF for the receptor component:

echo "Receptor" | gmx rmsf \
  -s step5_production.tpr \
  -f step-all.xtc \
  -o rmsf_receptor.xvg \
  -res \
  -n index_separated.ndx

Output Plot

Figure: Per-residue RMSF profile showing local flexibility across the separated components of the vaccine-receptor complex in order to analyze them individually.


Radius of Gyration Analysis

Radius of Gyration (Rg) was used to assess the compactness of the vaccine-receptor complex over time. Rg measures the distribution of atoms around the center of mass of the molecule.

Stable Rg values suggest that the system maintains compactness during simulation, while increasing Rg values may indicate unfolding, expansion, or loosening of the complex.

GROMACS Commands

Calculate radius of gyration for the vaccine component:

echo "Vaccine" | gmx gyrate \
  -f step-all.xtc \
  -s step5_production.tpr \
  -n index_separated.ndx \
  -o rg_vaccine.xvg

Calculate radius of gyration for the receptor component:

echo "Receptor" | gmx gyrate \
  -f step-all.xtc \
  -s step5_production.tpr \
  -n index_separated.ndx \
  -o rg_receptor.xvg

Output Plot

Figure: Radius of gyration profile showing changes in structural compactness of the separated vaccine-receptor complex components over time.


Hydrogen Bond Analysis

Hydrogen bond analysis was used to evaluate intramolecular interaction stability between the vaccine and receptor components during the MD simulation. The number of hydrogen bonds provides insight into structural integrity and interaction persistence over time.

A relatively stable hydrogen bond profile may indicate sustained internal interactions, while large fluctuations may reflect conformational rearrangement or transient interaction changes.

GROMACS Command

gmx hbond -f out.xtc -s step5_production.tpr -n index.ndx -num hbnum.xvg -g hbond.log -hbn hbond.ndx -hbm hbmap.xpm

Output Plot

Figure: Hydrogen bond profile showing changes in intramolecular hydrogen bonding between the vaccine and receptor throughout the simulation.


Representative Structure Extraction

Representative structures were extracted from the processed MD trajectory using GROMACS. This step fits the trajectory, performs clustering, and extracts a central representative structure for downstream visualization and electrostatic surface analysis.

From the repository root:

bash scripts/post_processing/extract_average_frame.sh

Workflow Summary

The script performs the following steps:

  1. Fits the trajectory using rotational and translational alignment.
  2. Clusters the fitted trajectory using the GROMOS method.
  3. Extracts a representative central structure from the clustered trajectory.

Note: This step is performed after trajectory combination and PBC correction, which prepare the trajectory for accurate clustering and structural alignment.

GROMACS Commands

Fit the combined trajectory:

gmx trjconv -s step5_production.tpr -f step-all.xtc -o fitted.xtc -fit rot+trans

Cluster the fitted trajectory:

gmx cluster -s step5_production.tpr -f fitted.xtc -n index.ndx \
  -o clusters.xpm -g cluster.log -cl central_structures.pdb \
  -method gromos -cutoff 0.25

Extract the representative structure:

gmx trjconv -f central_structures.pdb -s step5_production.tpr \
  -o md_rep_structure.pdb -dump 0

Output Files

  • fitted.xtc — fitted trajectory
  • clusters.xpm — cluster matrix
  • cluster.log — clustering summary
  • central_structures.pdb — representative structures from clustering
  • md_rep_structure.pdb — final representative structure used for visualization

Comparison of Docking and MD-Refined Structures

To evaluate structural changes induced by molecular dynamics simulations, the representative MD structure (md_rep_structure.pdb) was compared to the initial docking complex.

This comparison assesses structural stability, interface refinement, and conformational adjustments that occur during the simulation.


Purpose

  • Evaluate deviation from the initial docking pose
  • Assess structural stability after MD simulation
  • Identify interface rearrangements and conformational refinement

Structural Alignment in PyMOL

Load both structures:

load docking_complex.pdb, docking
load md_rep_structure.pdb, md

Align the MD-refined structure to the docking structure:

align md, docking

Alternatively, for more accurate alignment:

super md, docking

Visualization

Overlay both structures:

show cartoon, docking
color purple, docking

show cartoon, md
color blue, md

Figure: Overlay of the initial docked complex (TLR4/MD2 and the vaccine) and representative MD-refined structure, with labels for clarification.


RMSD Evaluation

PyMOL reports RMSD after alignment:

rms_cur md, docking

Lower RMSD values indicate structural stability, while higher values suggest conformational rearrangement during simulation.


Interpretation

  • Low RMSD (< 2–3 Ă…):
    Indicates structural stability and preservation of the docking pose.

  • Moderate RMSD (3–5 Ă…):
    Suggests minor conformational adjustments, often improving interface packing.

  • High RMSD (> 5 Ă…):
    Indicates significant structural rearrangement, potentially reflecting instability or alternative binding modes.

Structural deviations should be interpreted alongside RMSD, RMSF, and hydrogen bonding analyses to determine whether changes are stabilizing or destabilizing.


Interface Visualization

Define vaccine and receptor components (manual renaming may be required depending on chain assignments):

# Rename selections appropriately in PyMOL
# Example:
# rename complex → receptor
# rename chain(s) → vaccine

Highlight interface residues:

select interface_sel, byres ((vaccine within 4 of receptor) or (receptor within 4 of vaccine))
create interface, interface_sel

show surface, interface

color blue, interface and (interface like receptor)
color green, interface and (interface like vaccine)

Output

Representative comparison images can be generated in PyMOL:

ray 3000,3000
png docking_vs_md.png

Multiple orientations are recommended to capture interface changes.



Electrostatic Surface Representation

This section describes the workflow used to compute and visualize Electrostatic Surface Potentials for representative MD structures using APBS and PyMOL.


Environment Setup

Create and activate a dedicated Conda environment:

conda create -n electrostatics_env python=3.10 -y
conda activate electrostatics_env
conda install -c conda-forge pdb2pqr apbs

Structure Preparation

Convert the representative MD structure (md_rep_structure.pdb) to PQR format:

pdb2pqr30 --ff CHARMM md_rep_structure.pdb complex_md.pqr

APBS Input File

Create an APBS input file (e.g., complex_md.in):

nano complex_md.in

Insert the following:

read
  mol pqr complex_md.pqr
end

elec
  mg-auto
  dime 161 161 161
  cglen 160.0 160.0 160.0
  fglen 140.0 140.0 140.0
  cgcent mol 1
  fgcent mol 1
  mol 1
  lpbe
  bcfl sdh
  ion charge 1 conc 0.15 radius 2.0
  ion charge -1 conc 0.15 radius 2.0
  pdie 2.0
  sdie 78.54
  srfm smol
  chgm spl2
  sdens 10.0
  srad 1.4
  swin 0.3
  temp 298.15
  calcenergy no
  calcforce no
  write
    pot dx complex_md
  end
end

quit

Save and exit.


Run APBS

apbs complex_md.in

This generates an electrostatic potential map (.dx file).


Visualization in PyMOL

Load the structure and electrostatic map:

Note: It will again be necessary to separate receptor and vaccine chains/sequences, either manually or using CLI

load complex_md.pqr, complex
load complex_md.dx, complex_map

color splitpea, vaccine
color lightblue, receptor

Create an electrostatic color ramp:

ramp_new complex_pot, complex_map, [-15, 0, 15]


Electrostatic Surface Rendering

Vaccine Electrostatic Representation

set surface_color, complex_pot, vaccine
show surface, vaccine
set transparency, 0.15, vaccine
set surface_quality, 1
set two_sided_lighting, on

Receptor Electrostatic Representation

hide surface, vaccine
show cartoon, vaccine
set transparency, 0, vaccine

set surface_color, complex_pot, receptor
show surface, receptor
set transparency, 0.15, receptor
set surface_quality, 1
set two_sided_lighting, on


Notes


Conclusion

Presented here is a reproducible and modular workflow for the analysis and visualization of molecular dynamics simulations of vaccine–receptor complexes. Beginning with raw trajectory data, the pipeline integrates trajectory processing, quantitative stability metrics, representative structure extraction, and structure-based visualization into a unified framework.

System stability and conformational behavior are evaluated using complementary metrics, including root mean square deviation (RMSD), root mean square fluctuation (RMSF), and radius of gyration (Rg), alongside hydrogen bonding analysis (hbond) to characterize interaction persistence. Together, these measures provide a multidimensional assessment of structural integrity, flexibility, and intermolecular stability throughout the simulation.

Extraction of a representative structure via clustering enables downstream structural interpretation and facilitates direct comparison between initial docking models and MD-refined conformations. This comparison reveals interface-level adjustments and conformational refinement, offering insight into the stability and plausibility of predicted binding modes.

Electrostatic surface potential calculations, performed using APBS and visualized in PyMOL, further extend the analysis by mapping charge distributions and identifying potential interaction hotspots at the molecular interface.

Collectively, this workflow establishes a robust and extensible platform for the interpretation of MD simulations in the context of structure-based vaccine design and protein–protein interaction analysis.


Reproducibility and Extensibility

All scripts, example datasets, and visualization workflows are provided within this repository. The pipeline is designed to be modular, enabling straightforward adaptation to alternative molecular systems generated using GROMACS.


Future Directions

This framework can be extended to incorporate additional analyses, including:

  • Binding free energy calculations (MM/PBSA, MM/GBSA)
  • Principal component analysis (PCA) of conformational dynamics
  • Residue-level interaction network and contact analysis
  • Comparative evaluation across multiple complexes or simulation replicates

Availability

The complete workflow, including documentation and interactive visualization, is available at:

https://vsgosnell.github.io/MD-Simulation-Analysis/