Tutorial 2: entropy of mixing of methanol+water
When two pure liquids are put in contact the new configurations that the molecules can adopt and the new molecular interactions will lead to a change in entropy. Typically, this change is positive because of disorder, although in principle specific molecular interactions can lead to a decrease in entropy in some cases. For ideal gases, the entropy of mixing has an analytical expression, and it depends only on the relative number of molecules of each component in the mixture. The total entropy of a mixture of ideal gases is:
[math]S_\text{mixture}^\text{ideal} = \sum\limits_\Lambda N_\Lambda \bar{S}_\Lambda - k_\text{B} \sum\limits_\Lambda N_\Lambda \ln{\left(\frac{N_\Lambda}{N}\right)},[/math]
where [math]N_\Lambda[/math] is the number of molecules of component [math]\Lambda[/math] and [math]N[/math] is the total number of molecules. [math]\bar{S}_\Lambda[/math] refers to the entropy per molecule in the pure component [math]\Lambda[/math]. For real mixtures, one defines the excess entropy of mixture [math]S_\text{mix}^\text{E}[/math] to account for the deviation of the mixing entropy with respect to what one would expect for ideal systems:
[math]S_\text{mix}^\text{E} = S_\text{mixture}^\text{real} - S_\text{mixture}^\text{ideal}[/math].
In this tutorial we will calculate [math]S_\text{mix}^\text{E}[/math] for a 2:1 mixture of water and methanol. More detailed information can be found in Ref.[1]
Contents
Generating the trajectories
For this tutorial we will use Gromacs to generate the mixture of liquids and the pure liquids. We will use the OPLS force field and the SPCE water model. For methanol we will use the topology available from virtualchemistry.org. We will constrain all bonds with H atoms.
You can skip this section and download the trajectories that we will be analyzing from Zenodo. Then continue with the 2PT analysis section.
Generating the starting configurations
We need the OPLS topology file for methanol (methanol.itp), which we retrieved from virtualchemistry.org. If you reuse this topology file make sure to properly credit the authors.[2] We also need two sample molecules to generate our liquids, one for water, "water.gro":
One water 3 1SOL OW 1 0.114 0.845 0.401 -0.5236 0.3981 0.1442 1SOL HW1 2 0.118 0.886 0.491 1.8112 -1.0518 -0.9491 1SOL HW2 3 0.147 0.758 0.425 1.3601 2.0733 3.6058 1.20000 1.20000 1.20000
and one for methanol, "methanol.gro":
One methanol 6 1MET C 1 0.697 0.788 0.843 1MET H 2 0.600 0.829 0.873 1MET H 3 0.781 0.853 0.870 1MET H 4 0.706 0.773 0.736 1MET O 5 0.701 0.664 0.911 1MET H 6 0.761 0.600 0.875 1.20000 1.20000 1.20000
We are going to work with 2400 atoms total. This means that our pure water box will contain 800 molecules, our pure methanol box will contain 400 methanol molecules, and our 2:1 mixture will contain 400 water molecules and 200 methanol molecules. We can build our pure water and pure methanol boxes as follows:
gmx insert-molecules -box 3.0 3.0 3.0 -ci water.gro -nmol 800 -try 20 -o pure_water.gro gmx insert-molecules -box 3.1 3.1 3.1 -ci methanol.gro -nmol 400 -try 20 -o pure_methanol.gro
For the mixture we are going to generate 10 different configurations. This is so that we can obtain statistics. It is important that these initial configurations are completely uncorrelated, and so we will use bash random variables as seed for the generation:
for i in `seq 1 1 10`; do gmx insert-molecules -box 3.1 3.1 3.1 -ci water.gro -nmol 400 -try 20 -o temp1.gro -seed $RANDOM gmx insert-molecules -f temp1.gro -ci methanol.gro -nmol 200 -try 20 -o temp2.gro -seed $RANDOM rm temp1.gro; mv temp2.gro mixture_${i}.gro done
Equilibration
These systems are far away from equilibrium. We need to first optimize the box size (NPT), then equilibrate at constant volume (NVT) and then do the 2PT analysis (again NVT). I will explain it here for "mixture_3.gro" only; the other ones can be equilibrated in the same way. We need the following input files:
"topol.top":
#include "oplsaa.ff/forcefield.itp" #include "oplsaa.ff/spce.itp" #include "methanol.itp" [ system ] Water+methanol [ molecules ] ; name number SOL 400 MET 200
"em.mdp":
integrator = steep nsteps = 20000 cutoff-scheme = Verlet nstlist = 10 rlist = 1.0 coulombtype = pme rcoulomb = 1.0 rvdw = 1.0 nstenergy = 10 continuation = no constraints = h-bonds constraint_algorithm = lincs lincs_order = 4
"md.mdp":
constraints = h-bonds integrator = md-vv dt = 0.002 nsteps = 250000 tinit = 0 nstcomm = 1 continuation = no constraint_algorithm = lincs lincs_order = 4 nstxout = 1000 nstvout = 1000 nstfout = 0 nstlog = 2500 nstenergy = 1000 nstxtcout = 1000 xtc_precision = 1000 ns_type = grid nstlist = 10 cutoff-scheme = Verlet rlist = 1.0 coulombtype = pme rcoulomb = 1.0 optimize_fft = yes rvdw = 1.0 Tcoupl = v-rescale Pcoupl = Berendsen tau_p = 1.0 compressibility = 4.5e-5 ref_p = 1.0 gen_vel = yes gen_temp = 298. gen_seed = 83872 tc-grps = MOL SOL energygrps = MOL SOL tau_t = 0.1 0.1 ref_t = 298. 298. nsttcouple = 1
We now carry out an initial energy minimization:
gmx grompp -f em.mdp -c mixture_3.gro -p topol.top -o em.tpr gmx mdrun -v -deffnm em
and from there we do the constant-pressure equilibration:
gmx grompp -f md.mdp -c em.gro -p topol.top -o md.tpr gmx mdrun -v -deffnm md
Since we are equilibrating for 0.5 ns, this simulation might take a while (it took about 6 minutes on my "oldish" 8-core desktop machine). After the simulation is done, we need to extract the box length:
echo "Box-X" | gmx energy -f md.edr -o density.xvg; grep -v "\@" density.xvg | grep -v "\#" > box
we can plot and fit the box length during the last 250 ps of dynamics with gnuplot:
set fit quiet; fit[250:500] a "box" via a set ylabel "Box side length (nm)"; set xlabel "Time (ps)" plot "box" w l t "Box size", a t sprintf("Average after 250 ps = %.4f nm", a)
I got L = 2.9369 nm for my mixture (since we're using random seeds you'll get something slightly different). I can now modify the last line of "md.gro" with the new box side lengths:
Water+methanol 2400 ... ... ... 2.93690 2.9369 2.9369
At this point, we need to further equilibrate at constant volume. Edit the following "md.mdp" lines to this:
Tcoupl = nose-hoover Pcoupl = no gen_vel = no
and let's rock again:
cp md.gro NPT.gro gmx grompp -f md.mdp -c NPT.gro -p topol.top -o md.tpr gmx mdrun -v -deffnm md
Trajectories for analysis with DoSPT
Now you can save this system equilibrated in the NVT ensemble:
cp md.gro NVT.gro
and run the final MD simulation for the DoSPT analysis. Modify these lines in "md.mdp" to run for 20 ps and save every 4 fs (every two steps):
nsteps = 10000 nstxout = 2 nstvout = 2
Run Gromacs again:
gmx grompp -f md.mdp -c NVT.gro -p topol.top -o md.tpr gmx mdrun -v -deffnm md
and extract the trajectory:
gmx trjconv -f md.trr -s md.tpr -o traj.gro -dt 0.004 -ndec 5 -vel
Since we used the Verlet velocity integrator we have synchronous positions and velocities, which means that we do not need to estimate velocities with DoSPT.
This bash script should allow you to run all the dynamics above with minimal human intervention once your initial structures "pure_water.gro", "pure_methanol.gro", "mixture_1.gro", etc. have been generated.
DoSPT analysis
Input files
Now that we have all of our 20 ps MD trajectories for the pure liquids and the 10 liquid mixture configurations, we need to prepare the DoSPT input files to compute the corresponding entropies.
Our "masses" file should contain the masses for all the atom types. We need to take into account that the water and methanol H and O atoms have different naming conventions:
HW1 1.01 HW2 1.01 OW 16.00 O 16.00 C 12.01 H 1.01
Now our "groups" file, for which we will use the following Python script "generate_groups.py":
nwat=400 nmet=200 print nwat*3+nmet*6, nwat+nmet for i in range(0,nwat): print 3,2 print i*3+1, i*3+2, i*3+3 for i in range(0,nmet): print 6,1 print nwat*3+i*6+1, nwat*3+i*6+2, nwat*3+i*6+3, nwat*3+i*6+4, nwat*3+i*6+5, nwat*3+i*6+6
and run
python generate_groups.py > groups
Our "supergroups" file is very simple:
1-400 401-600
Of course, "groups" and "supergroups" will need to be modified for pure liquids. In the future, it will be possible for the groups file to be substituted by the "topology file", and this process will become much easier.
The core of our simulation options resides in the "input" file:
points = 5000 tau = 20. temperature = 298. smooth = .false. format = gro estimate_velocities = .false. renormalize_dos = .true. entropy_mixture = vol hs_formalism = lin
Note that the cell keyword is still not defined. We will need to extract a different one for each simulation box, since we allowed the box sizes to change during our NPT simulations. Since we used Verlet velocity we can do estimate_velocities = .false. and we also select renormalize_dos for better statistical convergence of the results.[1] We are also choosing Lin's way to estimate hard sphere diameters via the hs_formalism keyword. And very important when dealing with mixtures is what kind of combinatorial mixing entropy we use as the ideal limit (entropy_mixture keyword), where we have chosen partial volume based ideal mixing. Again, the details are explained in the original paper.[1]
To extract the appropriate cell size, we can do the following:
# Backup the version of "input" without cell info cp input input.bk l=`tail -1 NVT_pure_water.gro`; echo "cell = $l" >> input
Running DoSPT
Make sure that "groups", "supergroups" and cell inside "input" are consistent with the number of molecules in the .gro file. Let's rock:
for i in "pure_methanol" "pure_water" "mixture_1" "mixture_2" "mixture_3" "mixture_4" "mixture_5" "mixture_6" "mixture_7" "mixture_8" "mixture_9" "mixture_10"; do if [ $i == "pure_methanol" ]; then sup="1-400" nmet=400; nwat=0 elif [ $i == "pure_water" ]; then sup="1-800" nmet=0; nwat=800 else sup="1-400\n401-600" nmet=200; nwat=400 fi echo -e $sup > supergroups cat>generate_groups.py<<eof nwat=$nwat nmet=$nmet print nwat*3+nmet*6, nwat+nmet for i in range(0,nwat): print 3,2 print i*3+1, i*3+2, i*3+3 for i in range(0,nmet): print 6,1 print nwat*3+i*6+1, nwat*3+i*6+2, nwat*3+i*6+3, nwat*3+i*6+4, nwat*3+i*6+5, nwat*3+i*6+6 eof python generate_groups.py > groups cp input.bk input l=`tail -1 NPT_${i}.gro` echo "cell = $l" >> input mv traj_${i}.gro traj.gro DoSPT mv traj.gro traj_${i}.gro cp entropy entropy_$i done
Extracting the entropies
Now the entropies have been calculated and saved to the different entropy_* files. We can check the end of file "entropy_pure_water":
tail -6 entropy_pure_water
This is the output:
# Supergroup; Entropies in eV/K [trans, rot, vib]; Total entropy in eV/K; Total entropy in J/K 1 4.13456046E-01 8.80711771E-02 9.52792811E-11 5.01527224E-01 4.83908071E+04 # Entropy results with DoS normalized to the expected degrees of freedom # Supergroup; Entropies in eV/K [trans, rot, vib]; Total entropy in eV/K; Total entropy in J/K 1 4.12011614E-01 8.81638786E-02 8.28867038E-05 5.00258379E-01 4.82683802E+04
We can see that for this pure liquid the results with and without DoS renormalization are quite similar: 60.5 J/mol/K and 60.3 J/mol/K, respectively. Let's check for one of the mixtures:
tail -8 entropy_mixture_4
with output:
# Supergroup; Entropies in eV/K [trans, rot, vib]; Total entropy in eV/K; Total entropy in J/K 1 2.25777654E-01 4.17108936E-02 4.75239016E-11 2.67488547E-01 2.58091407E+04 2 1.40468723E-01 9.53297770E-02 1.40387307E-02 2.49837231E-01 2.41060199E+04 # Entropy results with DoS normalized to the expected degrees of freedom # Supergroup; Entropies in eV/K [trans, rot, vib]; Total entropy in eV/K; Total entropy in J/K 1 2.25115191E-01 4.17555990E-02 3.15295619E-05 2.66902319E-01 2.57525774E+04 2 1.37401314E-01 9.35018991E-02 1.42622772E-02 2.45165490E-01 2.36552581E+04
So the molar entropies of the [water; methanol] components are [64.5; 120.5] J/mol/K and [64.4; 118.3] J/mol/K without and with DoS renormalization, respectively. We can see that for methanol the results with and without DoS renormalization start to vary significantly. But let's get some statistics:
for i in "pure_methanol" "pure_water" "mixture_1" "mixture_2" "mixture_3" "mixture_4" "mixture_5" "mixture_6" "mixture_7" "mixture_8" "mixture_9" "mixture_10"; do if [ $i == "pure_methanol" ]; then echo "# System; S_raw (J/mol/K); S_ren (J/mol/K)" S_raw=`tail -6 entropy_$i | awk 'NR == 2{print $6/400}'` S_ren=`tail -1 entropy_$i | awk '{print $6/400}'` elif [ $i == "pure_water" ]; then S_raw=`tail -6 entropy_$i | awk 'NR == 2{print $6/800}'` S_ren=`tail -1 entropy_$i | awk '{print $6/800}'` else S_raw=`tail -8 entropy_$i | awk '{if(NR == 2){wat=$6} else if(NR == 3){met=$6}} END {print (wat+met)/600}'` S_ren=`tail -2 entropy_$i | awk '{if(NR == 1){wat=$6} else if(NR == 2){met=$6}} END {print (wat+met)/600}'` fi echo $i ${S_raw} ${S_ren} done
with the following output:
# System; S_raw (J/mol/K); S_ren (J/mol/K) pure_methanol 121.118 119.765 pure_water 60.4885 60.3355 mixture_1 82.4227 82.2495 mixture_2 82.4857 82.3186 mixture_3 83.2054 82.4541 mixture_4 83.1919 82.3464 mixture_5 82.2207 82.047 mixture_6 82.0043 81.8674 mixture_7 83.0834 82.2327 mixture_8 82.7768 82.3641 mixture_9 81.7088 81.751 mixture_10 82.1603 81.8783
You can already see that the DoS renormalization is stabilizing the mixture results. We would also be able to see similar stabilization for the pure liquids, but we would need more samples. But don't take my word for it, put these results in a file ("file") and use awk to print averages and standard deviation:
av=`awk 'NR>3{raw += $2/10.; ren += $3/10.} END {print raw, ren}' file` av=($av) std=`awk -v av_raw=${av[0]} -v av_ren=${av[1]} 'NR>3{raw += ($2-av_raw)**2/10.; ren += ($3-av_ren)**2/10.} END {print sqrt(raw), sqrt(ren)}' file` std=($std) printf " Raw: %.1f +- %.1f J/mol/K\n Ren: %.1f +- %.1f J/mol/K\n" ${av[0]} ${std[0]} ${av[1]} ${std[1]}
And we get the final results, showcasing the utility of the DoS renormalization scheme:
Raw: 82.5 +- 0.5 J/mol/K Ren: 82.2 +- 0.2 J/mol/K
Computing the excess entropy of mixing
To compute the excess entropy of mixing we will use the DoS-renormalized results. The standard molar entropies of the pure liquids are [math]\bar{S}_\text{wat} = 60.3[/math] J/mol/K and [math]\bar{S}_\text{met} = 119.8[/math] J/mol/K, the entropy of the mixture is [math]\bar{S}_\text{mixture} = 82.2[/math] J/mol/K. The excess entropy of mixing is:
[math] \bar{S}^\text{E}_\text{mix} = \bar{S}_\text{mixture} - \frac{400}{600} \bar{S}_\text{wat} - \frac{200}{600} \bar{S}_\text{met} + R \frac{400}{600} \ln{\left( \frac{400}{600} \right)} + R \frac{200}{600} \ln{\left( \frac{200}{600} \right)}, [/math]
where [math]R = 8.314[/math] J/mol/K is the gas constant ([math]R = N_\text{A} k_\text{B}[/math]). The final result is [math]\bar{S}^\text{E}_\text{mix} = -3.2[/math] J/mol/K. The experimental value is approximately [math]-3.8[/math] J/mol/K.[3]
Homework
We didn't check the statistical variation of the pure liquid, which can in fact be large for methanol. Check the effect of that quantity on the final result. You can also check the effect of using different combinatorial entropy of mixing (entropy_mixture tag) and diffusivity models (hs_formalism tag) on the results.
References
- ↑ 1.0 1.1 1.2 M.A. Caro, T. Laurila, and O. Lopez-Acevedo. Accurate schemes for calculation of thermodynamic properties of liquid mixtures from molecular dynamics simulations. J. Chem. Phys. 145, 244504 (2016).
- ↑ C. Caleman and P.J. van Maaren and M. Hong and J.S. Hub and L.T. Costa and D. van der Spoel. Force field benchmark of organic liquids: Density, enthalpy of vaporization, heat capacities, surface tension, isothermal compressibility, volumetric expansion coefficient, and dielectric constant. J. Chem. Theory Comput. 8, 61 (2012).
- ↑ R.F. Lama and B.C.-Y. Lu. Excess thermodynamic properties of aqueous alcohol solutions. J. Chem. Eng. Data 10, 216 (1965).