Tutorial 2: entropy of mixing of methanol+water

From DoSPT
Jump to: navigation, search

When two pure liquids are put in contact the new configurations that the molecules can adopt and the new molecular interactions will lead to a change in entropy. Typically, this change is positive because of disorder, although in principle specific molecular interactions can lead to a decrease in entropy in some cases. For ideal gases, the entropy of mixing has an analytical expression, and it depends only on the relative number of molecules of each component in the mixture. The total entropy of a mixture of ideal gases is:

[math]S_\text{mixture}^\text{ideal} = \sum\limits_\Lambda N_\Lambda \bar{S}_\Lambda - k_\text{B} \sum\limits_\Lambda N_\Lambda \ln{\left(\frac{N_\Lambda}{N}\right)},[/math]

where [math]N_\Lambda[/math] is the number of molecules of component [math]\Lambda[/math] and [math]N[/math] is the total number of molecules. [math]\bar{S}_\Lambda[/math] refers to the entropy per molecule in the pure component [math]\Lambda[/math]. For real mixtures, one defines the excess entropy of mixture [math]S_\text{mix}^\text{E}[/math] to account for the deviation of the mixing entropy with respect to what one would expect for ideal systems:

[math]S_\text{mix}^\text{E} = S_\text{mixture}^\text{real} - S_\text{mixture}^\text{ideal}[/math].

In this tutorial we will calculate [math]S_\text{mix}^\text{E}[/math] for a 2:1 mixture of water and methanol. More detailed information can be found in Ref.[1]

Generating the trajectories

For this tutorial we will use Gromacs to generate the mixture of liquids and the pure liquids. We will use the OPLS force field and the SPCE water model. For methanol we will use the topology available from virtualchemistry.org. We will constrain all bonds with H atoms.

You can skip this section and download the trajectories that we will be analyzing from Zenodo. Then continue with the 2PT analysis section.

Generating the starting configurations

We need the OPLS topology file for methanol (methanol.itp), which we retrieved from virtualchemistry.org. If you reuse this topology file make sure to properly credit the authors.[2] We also need two sample molecules to generate our liquids, one for water, "water.gro":

One water
    1SOL     OW    1   0.114   0.845   0.401 -0.5236  0.3981  0.1442
    1SOL    HW1    2   0.118   0.886   0.491  1.8112 -1.0518 -0.9491
    1SOL    HW2    3   0.147   0.758   0.425  1.3601  2.0733  3.6058
   1.20000   1.20000   1.20000

and one for methanol, "methanol.gro":

One methanol
    1MET    C      1   0.697   0.788   0.843
    1MET    H      2   0.600   0.829   0.873
    1MET    H      3   0.781   0.853   0.870
    1MET    H      4   0.706   0.773   0.736
    1MET    O      5   0.701   0.664   0.911
    1MET    H      6   0.761   0.600   0.875
   1.20000   1.20000   1.20000

We are going to work with 2400 atoms total. This means that our pure water box will contain 800 molecules, our pure methanol box will contain 400 methanol molecules, and our 2:1 mixture will contain 400 water molecules and 200 methanol molecules. We can build our pure water and pure methanol boxes as follows:

gmx insert-molecules -box 3.0 3.0 3.0 -ci water.gro -nmol 800 -try 20 -o pure_water.gro
gmx insert-molecules -box 3.1 3.1 3.1 -ci methanol.gro -nmol 400 -try 20 -o pure_methanol.gro

For the mixture we are going to generate 10 different configurations. This is so that we can obtain statistics. It is important that these initial configurations are completely uncorrelated, and so we will use bash random variables as seed for the generation:

One of the boxes containing a 2:1 water/methanol mixture
for i in `seq 1 1 10`; do
gmx insert-molecules -box 3.1 3.1 3.1 -ci water.gro -nmol 400 -try 20 -o temp1.gro -seed $RANDOM
gmx insert-molecules -f temp1.gro -ci methanol.gro -nmol 200 -try 20 -o temp2.gro -seed $RANDOM
rm temp1.gro; mv temp2.gro mixture_${i}.gro


These systems are far away from equilibrium. We need to first optimize the box size (NPT), then equilibrate at constant volume (NVT) and then do the 2PT analysis (again NVT). I will explain it here for "mixture_3.gro" only; the other ones can be equilibrated in the same way. We need the following input files:


#include "oplsaa.ff/forcefield.itp"
#include "oplsaa.ff/spce.itp"
#include "methanol.itp"

[ system ]

[ molecules ]
; name  number
SOL       400
MET       200


integrator      = steep
nsteps          = 20000
cutoff-scheme   = Verlet
nstlist         = 10
rlist           = 1.0
coulombtype     = pme
rcoulomb        = 1.0
rvdw            = 1.0
nstenergy       = 10
continuation	     =  no
constraints          =  h-bonds
constraint_algorithm =  lincs
lincs_order          =  4


constraints = h-bonds
integrator =  md-vv
dt = 0.002
nsteps = 250000
tinit = 0
nstcomm = 1
continuation = no
constraint_algorithm = lincs
lincs_order = 4
nstxout = 1000
nstvout = 1000
nstfout = 0
nstlog = 2500
nstenergy = 1000
nstxtcout = 1000
xtc_precision = 1000
ns_type = grid
nstlist = 10
cutoff-scheme = Verlet
rlist = 1.0
coulombtype = pme
rcoulomb = 1.0
optimize_fft = yes
rvdw = 1.0
Tcoupl = v-rescale
Pcoupl = Berendsen
tau_p = 1.0
compressibility = 4.5e-5
ref_p = 1.0
gen_vel = yes
gen_temp = 298.
gen_seed = 83872
tc-grps = MOL SOL
energygrps = MOL SOL
tau_t = 0.1 0.1
ref_t = 298. 298.
nsttcouple = 1

We now carry out an initial energy minimization:

gmx grompp -f em.mdp -c mixture_3.gro -p topol.top -o em.tpr
gmx mdrun -v -deffnm em

and from there we do the constant-pressure equilibration:

gmx grompp -f md.mdp -c em.gro -p topol.top -o md.tpr
gmx mdrun -v -deffnm md

Since we are equilibrating for 0.5 ns, this simulation might take a while (it took about 6 minutes on my "oldish" 8-core desktop machine). After the simulation is done, we need to extract the box length:

echo "Box-X" | gmx energy -f md.edr -o density.xvg; grep -v "\@" density.xvg | grep -v "\#" > box

we can plot and fit the box length during the last 250 ps of dynamics with gnuplot:

Evolution of box size during the dynamics
set fit quiet; fit[250:500] a "box" via a
set ylabel "Box side length (nm)"; set xlabel "Time (ps)"
plot "box" w l t "Box size", a t sprintf("Average after 250 ps = %.4f nm", a)

I got L = 2.9369 nm for my mixture (since we're using random seeds you'll get something slightly different). I can now modify the last line of "md.gro" with the new box side lengths:

   2.93690   2.9369   2.9369

At this point, we need to further equilibrate at constant volume. Edit the following "md.mdp" lines to this:

Tcoupl = nose-hoover
Pcoupl = no
gen_vel = no

and let's rock again:

cp md.gro NPT.gro
gmx grompp -f md.mdp -c NPT.gro -p topol.top -o md.tpr
gmx mdrun -v -deffnm md

Trajectories for analysis with DoSPT

Now you can save this system equilibrated in the NVT ensemble:

 cp md.gro NVT.gro

and run the final MD simulation for the DoSPT analysis. Modify these lines in "md.mdp" to run for 20 ps and save every 4 fs (every two steps):

nsteps = 10000
nstxout = 2
nstvout = 2

Run Gromacs again:

gmx grompp -f md.mdp -c NVT.gro -p topol.top -o md.tpr
gmx mdrun -v -deffnm md

and extract the trajectory:

gmx trjconv -f md.trr -s md.tpr -o traj.gro -dt 0.004 -ndec 5 -vel

Since we used the Verlet velocity integrator we have synchronous positions and velocities, which means that we do not need to estimate velocities with DoSPT.

This bash script should allow you to run all the dynamics above with minimal human intervention once your initial structures "pure_water.gro", "pure_methanol.gro", "mixture_1.gro", etc. have been generated.

DoSPT analysis

Input files

Now that we have all of our 20 ps MD trajectories for the pure liquids and the 10 liquid mixture configurations, we need to prepare the DoSPT input files to compute the corresponding entropies.

Our "masses" file should contain the masses for all the atom types. We need to take into account that the water and methanol H and O atoms have different naming conventions:

HW1    1.01
HW2    1.01
OW    16.00
O     16.00
C     12.01
H      1.01

Now our "groups" file, for which we will use the following Python script "generate_groups.py":


print nwat*3+nmet*6, nwat+nmet

for i in range(0,nwat):
    print 3,2
    print i*3+1, i*3+2, i*3+3

for i in range(0,nmet):
    print 6,1
    print nwat*3+i*6+1, nwat*3+i*6+2, nwat*3+i*6+3

and run

python generate_groups.py > groups

Our "supergroups" file is very simple:


Of course, "groups" and "supergroups" will need to be modified for pure liquids. In the future, it will be possible for the groups file to be substituted by the "topology file", and this process will become much easier.

The core of our simulation options resides in the "input" file:

points = 5000
tau = 20.
temperature = 298.
smooth = .false.
format = gro
estimate_velocities = .false.
renormalize_dos = .true.
entropy_mixture = vol
hs_formalism = lin

Note that the cell keyword is still not defined. We will need to extract a different one for each simulation box, since we allowed the box sizes to change during our NPT simulations. Since we used Verlet velocity we can do estimate_velocities = .false. and we also select renormalize_dos for better statistical convergence of the results.[1] We are also choosing Lin's way to estimate hard sphere diameters via the hs_formalism keyword. And very important when dealing with mixtures is what kind of combinatorial mixing entropy we use as the ideal limit (entropy_mixture keyword), where we have chosen partial volume based ideal mixing. Again, the details are explained in the original paper.[1]

To extract the appropriate cell size, we can do the following:

# Backup the version of "input" without cell info
cp input input.bk

l=`tail -1 NVT_pure_water.gro`; echo "cell = $l" >> input

Running DoSPT

Make sure that "groups", "supergroups" and cell inside "input" are consistent with the number of molecules in the .gro file. Let's rock:

for i in "pure_methanol" "pure_water" "mixture_1" "mixture_2" "mixture_3" "mixture_4" "mixture_5" "mixture_6" "mixture_7" "mixture_8" "mixture_9" "mixture_10"; do

if [ $i == "pure_methanol" ]; then
nmet=400; nwat=0
elif [ $i == "pure_water" ]; then
nmet=0; nwat=800
nmet=200; nwat=400

echo -e $sup > supergroups


print nwat*3+nmet*6, nwat+nmet

for i in range(0,nwat):
    print 3,2
    print i*3+1, i*3+2, i*3+3

for i in range(0,nmet):
    print 6,1
    print nwat*3+i*6+1, nwat*3+i*6+2, nwat*3+i*6+3, nwat*3+i*6+4, nwat*3+i*6+5, nwat*3+i*6+6

python generate_groups.py > groups

cp input.bk input
l=`tail -1 NPT_${i}.gro`
echo "cell = $l" >> input

mv traj_${i}.gro traj.gro
mv traj.gro traj_${i}.gro

cp entropy entropy_$i

Extracting the entropies

Now the entropies have been calculated and saved to the different entropy_* files. We can check the end of file "entropy_pure_water":

tail -6 entropy_pure_water

This is the output:

# Supergroup; Entropies in eV/K [trans, rot, vib]; Total entropy in eV/K; Total entropy in J/K
      1   4.13456046E-01   8.80711771E-02   9.52792811E-11   5.01527224E-01   4.83908071E+04

# Entropy results with DoS normalized to the expected degrees of freedom
# Supergroup; Entropies in eV/K [trans, rot, vib]; Total entropy in eV/K; Total entropy in J/K
      1   4.12011614E-01   8.81638786E-02   8.28867038E-05   5.00258379E-01   4.82683802E+04

We can see that for this pure liquid the results with and without DoS renormalization are quite similar: 60.5 J/mol/K and 60.3 J/mol/K, respectively. Let's check for one of the mixtures:

tail -8 entropy_mixture_4

with output:

# Supergroup; Entropies in eV/K [trans, rot, vib]; Total entropy in eV/K; Total entropy in J/K
      1   2.25777654E-01   4.17108936E-02   4.75239016E-11   2.67488547E-01   2.58091407E+04
      2   1.40468723E-01   9.53297770E-02   1.40387307E-02   2.49837231E-01   2.41060199E+04

# Entropy results with DoS normalized to the expected degrees of freedom
# Supergroup; Entropies in eV/K [trans, rot, vib]; Total entropy in eV/K; Total entropy in J/K
      1   2.25115191E-01   4.17555990E-02   3.15295619E-05   2.66902319E-01   2.57525774E+04
      2   1.37401314E-01   9.35018991E-02   1.42622772E-02   2.45165490E-01   2.36552581E+04

So the molar entropies of the [water; methanol] components are [64.5; 120.5] J/mol/K and [64.4; 118.3] J/mol/K without and with DoS renormalization, respectively. We can see that for methanol the results with and without DoS renormalization start to vary significantly. But let's get some statistics:

for i in "pure_methanol" "pure_water" "mixture_1" "mixture_2" "mixture_3" "mixture_4" "mixture_5" "mixture_6" "mixture_7" "mixture_8" "mixture_9" "mixture_10"; do

if [ $i == "pure_methanol" ]; then
echo "# System; S_raw (J/mol/K); S_ren (J/mol/K)"
S_raw=`tail -6 entropy_$i | awk 'NR == 2{print $6/400}'`
S_ren=`tail -1 entropy_$i | awk '{print $6/400}'`
elif [ $i == "pure_water" ]; then
S_raw=`tail -6 entropy_$i | awk 'NR == 2{print $6/800}'`
S_ren=`tail -1 entropy_$i | awk '{print $6/800}'`
S_raw=`tail -8 entropy_$i | awk '{if(NR == 2){wat=$6} else if(NR == 3){met=$6}} END {print (wat+met)/600}'`
S_ren=`tail -2 entropy_$i | awk '{if(NR == 1){wat=$6} else if(NR == 2){met=$6}} END {print (wat+met)/600}'`

echo $i ${S_raw} ${S_ren}


with the following output:

# System; S_raw (J/mol/K); S_ren (J/mol/K)
pure_methanol 121.118 119.765
pure_water 60.4885 60.3355
mixture_1 82.4227 82.2495
mixture_2 82.4857 82.3186
mixture_3 83.2054 82.4541
mixture_4 83.1919 82.3464
mixture_5 82.2207 82.047
mixture_6 82.0043 81.8674
mixture_7 83.0834 82.2327
mixture_8 82.7768 82.3641
mixture_9 81.7088 81.751
mixture_10 82.1603 81.8783

You can already see that the DoS renormalization is stabilizing the mixture results. We would also be able to see similar stabilization for the pure liquids, but we would need more samples. But don't take my word for it, put these results in a file ("file") and use awk to print averages and standard deviation:

av=`awk 'NR>3{raw += $2/10.; ren += $3/10.} END {print raw, ren}' file`
std=`awk -v av_raw=${av[0]} -v av_ren=${av[1]} 'NR>3{raw += ($2-av_raw)**2/10.; ren += ($3-av_ren)**2/10.} END {print sqrt(raw), sqrt(ren)}' file`
printf " Raw: %.1f +- %.1f J/mol/K\n Ren: %.1f +- %.1f J/mol/K\n" ${av[0]} ${std[0]} ${av[1]} ${std[1]}

And we get the final results, showcasing the utility of the DoS renormalization scheme:

Raw: 82.5 +- 0.5 J/mol/K
Ren: 82.2 +- 0.2 J/mol/K

Computing the excess entropy of mixing

To compute the excess entropy of mixing we will use the DoS-renormalized results. The standard molar entropies of the pure liquids are [math]\bar{S}_\text{wat} = 60.3[/math] J/mol/K and [math]\bar{S}_\text{met} = 119.8[/math] J/mol/K, the entropy of the mixture is [math]\bar{S}_\text{mixture} = 82.2[/math] J/mol/K. The excess entropy of mixing is:

[math] \bar{S}^\text{E}_\text{mix} = \bar{S}_\text{mixture} - \frac{400}{600} \bar{S}_\text{wat} - \frac{200}{600} \bar{S}_\text{met} + R \frac{400}{600} \ln{\left( \frac{400}{600} \right)} + R \frac{200}{600} \ln{\left( \frac{200}{600} \right)}, [/math]

where [math]R = 8.314[/math] J/mol/K is the gas constant ([math]R = N_\text{A} k_\text{B}[/math]). The final result is [math]\bar{S}^\text{E}_\text{mix} = -3.2[/math] J/mol/K. The experimental value is approximately [math]-3.8[/math] J/mol/K.[3]


We didn't check the statistical variation of the pure liquid, which can in fact be large for methanol. Check the effect of that quantity on the final result. You can also check the effect of using different combinatorial entropy of mixing (entropy_mixture tag) and diffusivity models (hs_formalism tag) on the results.


  1. 1.0 1.1 1.2 M.A. Caro, T. Laurila, and O. Lopez-Acevedo. Accurate schemes for calculation of thermodynamic properties of liquid mixtures from molecular dynamics simulations. J. Chem. Phys. 145, 244504 (2016).
  2. C. Caleman and P.J. van Maaren and M. Hong and J.S. Hub and L.T. Costa and D. van der Spoel. Force field benchmark of organic liquids: Density, enthalpy of vaporization, heat capacities, surface tension, isothermal compressibility, volumetric expansion coefficient, and dielectric constant. J. Chem. Theory Comput. 8, 61 (2012).
  3. R.F. Lama and B.C.-Y. Lu. Excess thermodynamic properties of aqueous alcohol solutions. J. Chem. Eng. Data 10, 216 (1965).