Practical Implementation of Molecular Dynamics code for beginners using Python

In this paper, we introduce a simple yet powerful and working version of the molecular dynamics code using the Python 3.9 language. The code contents are published in the link given in the appendix 1. The structure and components of the program is given in detail using flowcharts and code snippets. The program consists of major features like velocity verlet integrator, thermostats, COM removal, input and output modules, virial, pressure, and other thermodynamic quantities estimation etc. The author believes that this program will be helpful to graduate students who perform research in molecular dynamics simulations who intend to write their own code instead of the sophisticated open source packages.


Introduction
Molecular dynamics (MD) is a computer simulation method for analyzing the physical movements of atoms and molecules. The atoms and molecules can interact for a fixed period, giving a view of the dynamic "evolution" of the system. In the most common version, the trajectories of atoms and molecules are determined by numerically solving Newton's equations of motion for a system of interacting particles, where forces between the particles and their potential energies are often calculated using interatomic potentials or molecular mechanics force fields. The method is applied mostly in chemical physics, materials science, and biophysics (1).
With the increasing computational power and widespread application of molecular simulations in various areas of science and engineering (2)(3)(4)(5)(6), the need for multiscale simulations are also growing (7)(8)(9)(10)(11). There exist many open source computer codes available to perform these simulations. However, often there arises a problem which needs customized molecular simulations. This can be due to the boundary condition, the construction of the system itself, approximate models for introducing new force fields (12)(13)(14)(15)(16)(17)(18)(19)(20)(21)(22)(23)(24)(25)(26)(27)(28)(29)(30) or can be coupling between different time or length scales. To account these studies, graduate students often write their own code which is a cumbersome task (20,21,28). Writing a molecular dynamics code from scratch is a daunting task which involves lot of time and effort in fixing the bugs, optimizing the execution speed, customizing the input and output of the computer code etc (31) (32). To address these concerns, we have created a simple, yet a robust working version of molecular dynamics code (33) using the Python programming language.
The Molecular dynamics code can be written in languages like C, C++, Fortran, Python, Java, or any other computer language. Due to the surge in popularity and vast support community and example code snippets, Python is a good candidate to develop a MD code. Though Python is not compiled and as fast as C or Fortran, the memory handling and debugging can be easier on Python. Once the code is working then we can move on to implement it in a high-speed code like C or Fortran. A sample atomic system of Argon with 2676 atoms in a 5 3 cubic system is shown in Fig. 1 at the density corresponding to the 90 K. This system is given in the published code link (33) and can be used to understand how a molecular or atomic model is created and linked to an MD code.

Code Implementation
The details of the implementation of the Molecular Dynamics code is given the Fig. 2. The program starts with the reading module for the MD parameter inputs. This is the module which gather the information on time step of integration, total steps of simulation aka simulation duration, frequency of writing the coordinate information of the atoms etc. Other information that the reading module expects are type of thermostat to be used, reference temperature, thermostat parameters, cutoff radius, force field parameters, and frequency of center of mass (COM) removal.
The overall code structure is shown in the Fig. 2. The read coordinates module will read the (x,y,z) coordinates and velocity (vx, vy, vz) of the Argon atoms from the comma separated value (csv) file and also will allocate the memory to store the positions (displacements) and velocity. The csv file has the format of "molecule id, atom name, atom id, x, y, x, vx, vy, vz". A sample csv file for Argon system is given in the GitHub website where the complete code is hosted.
Once the MD input parameters and coordinate information is available, then the program should follow the standard molecular dynamics flow as shown in the Fig. 2. The integrator that we choose is the velocity verlet version which is stable and have advantages over the other methods (20). There are two ways to implement the velocity verlet integrator. Method 1 -Standard Implementation: Obtain ⃗( + Δ ) from the interaction potential using ⃗( + Δ ) Calculate ⃗( + Δ ) = ⃗ ( + Method 2 -Half step velocity: Obtain ⃗( + Δ ) from the interaction potential using ⃗( + Δ ) While comparing both these implementations, one will notice that the method 2 has the advantage of shorter code length, however the memory needed will be more due to the storage of acceleration from the previous step. This can be avoided with the Method 1 implementation, where we can use less memory to estimate the same quantities. Hence, in our code, we will implement the Method 1 strategy.
Apart from the standard implementation as shown in the above equations, the molecular dynamics code must account for 1) boundary conditions, 2) Thermostats, 3) Momentum correction and 4) Estimation of thermodynamic quantities.

Boundary conditions
Often the atoms will move out of the simulation box with sides (box_L, box_B, box_H) after the integration step. These atoms can be put back into box (essential while applying periodic boundary conditions) through the below code snippet. The lowest box value will be denoted as x_lo and highest value as x_hi, which can be 0 or box_L/2 or any other value, depending on how you define the system. For our code, we have used a system cornered at origin, which means x_lo = 0, x_hi = box_L.
If x < x_lo: This will bring back an atom that went outside the box in the x-axis dimension. Similarly, we can do this for the other two dimensions. If the system is non periodic in a particular dimension (say z-axis), then just do not use the above code for z-axis, instead the force field will take care of it.
The most used boundary condition in molecular dynamics simulation is periodic boundary condition (PBC) which is usually implemented as the minimum image convention as shown in the below code snippet.

Force Estimation
The atoms are interacted through a potential field and the resulting force of attraction or repulsion is estimated based on the equation below.
Here, F is the force, and U is the potential energy function. These interactions can be between two atoms (pair potentials), or three or more atoms (multi body potentials). For the simplest cases, we use pair potentials and would be able to produce reasonable thermodynamic properties. The simplest of such pair potentials is called Lennard-Jones Potential as given below.
Here, r is the inter atomic separation, is the energy value at which the force of interaction becomes zero, is the pair separation value at which the potential energy becomes zero. The force estimation module will start with the estimation of the inter atomic separation . This will be followed by the minimum image convention module to account for the PBC. The separation distance is estimated using, The interaction strength of the Lennard-Jones (LJ) potential becomes significantly smaller when the atomic separations are large. There is no advantage on estimating the force values beyond such large separations. Often, computer simulations are expensive and a cutoff radius ( ) is defined to limit the force estimation as below.
If r < rc: This involves, estimation of using square root function which is computationally expensive. It is further modified as below to save time. rsqr = rx*rx + ry*ry + rz*rz rc2 = rc*rc if rsqr < rc2: "Estimate Force" Note that 2 will be estimated only once during the beginning of the simulation. For the rest of the force evaluation between ℎ atom and ℎ atom an efficient version of the algorithm is given below. This version of the force coding can be further improvised by using a shifted LJ potential which goes to zero force at cutoff radius. This implementation is used in the Python code. The acceleration estimation is straight forward from Newton's Law of motion ( = ) once Force is estimated.

Thermodynamic quantities estimation
Thermodynamic quantities like Virial pressure, Temperature, Kinetic Energy, Potential Energy, and the total energy of the system is estimated at regular intervals set by the simulation parameter in the input file.

COM Removal
This module is equivalent to the "fix momentum" function of LAMMPS software. The average velocity of the system in every dimension is estimated and then this average velocity is subtracted from velocity of every atom. This will remove the net linear momentum of the system which is unphysical in a periodic system. The pseudocode is given below.

Thermostats
The two thermostats to implement easier is velocity scaling and Berendsen thermostats. Note that these two thermostats do not reproduce any true canonical ensemble and instead should use an accurate thermostat like Nose-Hoover for studies.
Simple velocity scaling thermostat is straightforward to implement as shown below. Here, TAU is the Berendsen parameter which controls the strength of the thermostat and DT is the timestep of integration.

Results Output
The main core of the molecular dynamics code has been explained in the previous sections. The last but most important task is to write out the results in a user-friendly manner at regular intervals. This includes writing out the temperature, kinetic energy, potential energy, and total energy in "energy.res" file. The pressure needs some more calculations including the kinetic component and unit conversions, before finally writing out to the file "pressure.res". The coordinate information of the atoms is written at regular intervals (nstxout) to a file called "traj_pyth.xyz".
Apart from these calculations and writing out into files, the user may need to know how the program is running during the execution time. This includes information like how much time or steps are remaining in the simulations, energy, temperature etc. A sample screenshot of such information of a run is shown in the Fig. 3.

Conclusion
In this work, we have introduced a simple yet powerful and working version of the molecular dynamics code using Python language. The code contents are published in the link given in the appendix 1. The structure and components of the program is given in detail using flowcharts and code snippets. The author believes that this program will be helpful to graduate students who perform research in molecular dynamics simulations who intend to write their own code instead of the sophisticated open source packages. The program consists of major features like velocity verlet integrator, thermostats, COM removal, input and output modules, pressure and other thermodynamic quantities estimation etc. https://github.com/ydsumith/Molecular-Dynamics/tree/master/Python_MD