# Tutorial 1: standard molar entropy of liquid water

The first tutorial is the simplest calculation you can carry out with DoSPT: estimating the standard molar entropy of liquid water.

## Contents

### Generate the trajectory

Note: If you have already generated a trajectory file, you can skip this step. Also note that Gromacs (or else) topology files are not required to run a DoSPT calculation: you only need a trajectory file generated with your MD package of choice (*.xyz and *.gro are currently supported formats) and DoSPT input files.

DoSPT cannot generate a molecular dynamics trajectory, therefore we need to generate our MD trajectory using some other package. At the moment DoSPT has native support for .xyz and .gro trajectory files. Since Gromacs is an easy-to-use and freely available package, we will be generating our trajectory with Gromacs. We will assume that you are already familiar with MD calculations.

Initial configuration with 100 TIP4P water molecules

To avoid wasting time on equilibrating the system, you can download this pre-equilibrated system made up of 100 water molecules. The model used is TIP4P, which contains one dummy atom per water molecule. Therefore each water molecule is assigned a total of 4 atoms (three real and one dummy). The file name is "water.gro".

Now we generate a trajectory from this initial configuration. We are not going to worry about the quality of the dynamics for this tutorial, since we only need an example trajectory for post-processing with DoSPT. The Gromacs files needed are "topol.top":

#include "oplsaa.ff/forcefield.itp"
#include "oplsaa.ff/tip4p.itp"

[ system ]
Water

[ molecules ]
SOL      100


And an "md.mdp" file with some basic options:

integrator =  md
dt = 0.001
nsteps = 20000
constraints = h-bonds
constraint_algorithm = lincs
lincs_order = 4
nstxout = 4
nstvout = 4
cutoff-scheme = group
ns_type = grid
nstlist = 5
rlist = 0.7
coulombtype = PME
rcoulomb = 0.7
vdw-type = cut-off
rvdw = 0.7
tc-grps = SOL
Tcoupl = nose-hoover
tau_t = 0.1
ref_t =  298.15


Note that we are running dynamics for a total of 20ps and we are saving position and velocity information every 4 steps, which is every 4fs in this case given our 1fs time step. It is also very important to note that we are using the "md" integrator, which is a "leap-frog" integrator. This means that velocities and positions are offset by half a time step, and cannot be used to accurately estimate angular momenta.

We generate our topology files and run the dynamics:

gmx grompp -f md.mdp -c water.gro -p topol.top -o md.tpr

gmx mdrun -v -deffnm md


After a few seconds the simulation will be finished. We now need to extract the required trajectory information to a file called "traj.gro":

gmx trjconv -f md.trr -s md.tpr -o traj.gro -dt 0.004 -ndec 5 -vel


Note that we are extracting velocity and position information every 0.004ps = 4fs, and we are requesting 5 decimal places for positions (the default 0.001nm precision of Gromacs is not enough for the 2PT analysis). We are not going to use the velocities in this tutorial because of the mentioned issue with leap-frog, so we could as well have left the "-vel" option out.

The generated trajectory file is quite large, 156MB in this case. You'll have to take memory limitations into account when extracting trajectory info.

### Prepare the DoSPT input files

Now that we have generated a trajectory DoSPT comes into play. We need 5 files to run this simulation, which should all be located within the same directory:

• traj.gro (trajectory file from the MD simulation)
• input (all the DoSPT input options)
• groups (defines the groups of atoms)
• supergroups (all groups of the same type)
• masses (defines atomic masses)

The following input file will be enough for our purposes:

# Number of points in the trajectory (5001)
points = 5001
# Trajectory period in ps (20)
tau = 20.
# Size of the box in nm
cell = 1.472 1.472 1.472
# Temperature in K
temperature = 298.15
# Trajectory info
format = gro
# Estimate velocities from positions
estimate_velocities = .true.


Note that we had to enable velocity estimation because of the problem with leap-frog integration. The cell size can be taken from the last line in the "water.gro" file. The units in this input file are strictly given in the units specified in the comments. Comments always start with a "#" symbol.

The "masses" file assigns a mass in a.m.u. to the atomic labels. For TIP4P, atomic labels are "OW", "HW1", "HW2" and "MW", for oxygen, the two hydrogens, and the dummy atom, respectively. Our "masses" file looks like this:

OW   16.00
HW1   1.01
HW2   1.01
MW    0.00


We have assigned zero mass to the dummy atom. You will need to adapt the mass definitions depending on the specific topology of your molecule when dummy atoms are included.

Now we need to generate the "groups" file. The mandatory structure is as follows:

natoms ngroups
n1 s1
n1_1 n1_2 n1_3 ...
n2 s2
n2_1 n2_2 n2_3 ...
...


The first line contains the total number of atoms "natoms" in the system followed by the number of groups "ngroups". This is then followed by two lines for each group, one line with the number of atoms in the group ("n1", "n2", etc.) and the symmetry number of the group ("s1", "s2", etc.) and another line with the labels of all the atoms belonging to that group. In our present example, TIP4P water molecules have 4 atoms, and their symmetry number is 2; therefore our "groups" file will look like this:

400 100
4 2
1 2 3 4
4 2
5 6 7 8
...


The easiest way to generate this file is with a Python script "setup_groups.py", e.g.:

print 400, 100

for i in range(0,100):
print 4,2
print 4*i+1, 4*i+2, 4*i+3, 4*i+4


Then run:

python setup_groups.py > groups


Finally, we need a "supergroups" file. Since this is a monocomponent system, all the groups, from 1 to 100, belong to the same supergroup. The "supergroups" file can be constructed like this:

1-100


### Run DoSPT and analyze the results

At this point we have everything we need to run a DoSPT calculation. Simply type "DoSPT" and look at the program's progress being printed on the screen. At the end of the execution several files will have been created. The entropy information is contained in the "entropy" file. Type "tail -2 entropy" to get this info printed on the screen:

# Supergroup; Entropies in eV/K [trans, rot, vib]; Total entropy in eV/K; Total entropy in J/K
1   5.29264712E-02   1.06849280E-02   4.54780340E-07   6.36118540E-02   6.13771060E+03


The last number tells you that the total entropy for this system is 6137.7 J/K, if this system constituted one molecule of the substance under study. Since we have 100 molecules, this means that the calculated molar entropy is 61.377 J/K/mol. Compare this to the experimental value of 69.9 J/K/mol. Note that the discrepancy with experiment is mostly due to the water model and force field used.

The density of states for supergroups is written to file "dos_sg". You can visualize it using for instance "gnuplot":

set xlabel "Frequency (ps^-1)"
set ylabel "DoS (ps)"
plot "dos_sg" u 1:2 t "Trans" w l, "" u 1:3 t "Rot" w l, "" u 1:4 t "Vib" w l


As expected there are no vibrational contributions to the DoS since TIP4P is a rigid model. Since there are no high frequency vibrational contributions, we can plot a close-up:

Since the integral of the DoS should equal the total number of degrees of freedom (DoF), we can use the "smooth cumulative" option of gnuplot to get a quick estimate of this number (the distance between data points is 0.05 ps^-1):

set key bottom
set ylabel "Integrated no. of DoF"
plot "dos_sg" u 1:($2*0.05) smooth cumulative t "Trans DoF" w l, "" u 1:($3*0.05) smooth cumulative t "Rot DoF" w l


It's easy to see that the computed number of DoFs is less than expected: 300 translational and 300 vibrational - adjusted to the specifics of your MD code, which might be subtracting the linear and/or angular velocities of the center of mass of your system (i.e., minus 3 degrees of freedom for each). The problem of underestimation of the number of degrees of freedom is associated to equilibration, total trajectory time and system size, and severely affects the rate of convergence of calculated entropy results. Check out the DoS renormalization article to learn how DoSPT can correct the leading error and significantly speed up convergence of results with respect to the number of sampled configurations.

### Homework

Repeat this tutorial with other water models to get an idea of the impact of the force field on the results. Also, repeat the calculation for different starting configurations/number of molecules (with the same force field) to get an idea of the statistical variation of entropy results and the impact of system size.