The current version of ptraj (previously called rdparm) is really two programs:
If the executable name does not contain the string "rdparm", ptraj is run instead. ptraj also requires specification of parameter/topology information, however it currently supports both the AMBER prmtop format and CHARMM psf files. Note that the ptraj program can also be accessed from rdparm by typing ptraj (or newtransform). The commands to ptraj can either be piped in through standard input or supplied in a file, where the filename (script) is passed in as the second command line argument. Note that if the prmtop filename is absent, the user will be prompted for a filename.
To use the program it is necessary to (1) read in a parameter/topology file, (2) set up a list of input coordinate files, (3) optionally specify an output file, (4) specify a series of actions to be performed on each coordinate set read in, and <5> optionally analyze any accumulated results.
This is done at startup and currently either an AMBER prmtop or CHARMM psf file can be read in. The type of the file is autodetected. The information in these files is used to setup the state (ptrajState *) which gives information about the number of atoms, residues, atom names, residue names, residue boundaries, etc. This information is used to guide the reading of input coordinates which MUST match the order specified by the state, otherwise garbage may be obtained (although this may be detected by the program for some file formats, leading to a warning to the user). In other words, when reading a pdb file, the atom order must correspond exactly to that of the parameter/topology information; in the pdb the names/residues are ignored and only the coordinates are read in based.
This is done with the trajin command (described in more detail below) which specifies the name of a coordinate file and optionally the start, stop and offset for reading coordinates. The type of coordinate file is detected automatically and currently the following input coordinate types are supported:
This is done with the trajout command (discussed in more detail below). Trajectories can currently be written in AMBER trajectory (default), AMBER restrt, Scripps binpos, PDB or CHARMM trajectory (in little or big endian binary format).
There are a variety of coordinate analysis/manipulation actions provided and each is applied sequentially to the coordinates in the order listed by the user in the input file.
Anaysis tools are currently under development to aid in processing data accumulated on the scalarStack (with reference to a particular entry via the name assignments in routines such as distance, angle, dihedral, pucker and rms). All of the analyze commands are preceeded by the analyze keyword and are discussed in more detail below.
trajin traj1.Z 1 20 trajin traj2.Z 1 100 trajin restrt.Z trajout fixed.traj rms first out rms @CA,C,N center :1-20 image strip :WAT go
Note that the current parser is really very stupid. Hopefully this will be fixed to properly handle quoted strings and spaces. Until this is "fixed", check the output very carefully; note that whenever an atom mask is used, a summary of the atoms selected is printed.
Load the trajectory file specified by filename using only the frames starting with start (default 1) and ending with (and including) stop (default, the final configuration) using an offset of offset (default 1) if specified... AMBER trajectory, restrt/inpcrd, PDB, Scripps BINPOS and CHARMM binary trajectory files are all currently supported. Note that the coordinates must match the names/ordering of the parameter/topology information previously read in. Compressed files (filenames with a .Z or .gz appended) are also auto-detected and treated appropriately.
Loads up a single coordinate set for use as a reference structure. Currently reference structures are only used (optionally) in RMS fitting (via the "rms" command). This command can be called multiple times, but note that the most recent "reference" structure will be used when an "rms" action is set up, for example:
reference A.pdb rms reference out rms_to_A reference B.pdb rms reference out rms_to_BThe above loads up "A.pdb" as a reference structure; this structure is used in the subsequent rms call (with data output to a file rms_to_A). Then, "B.pdb" is loaded up as a reference structure and the next rms uses this in the fitting.
Note that as the state is modified (for example by \fBstrip\fR or \fBclosestwaters\fR), the reference coordinates are also modified internally.
This command is used to specify the name of the single output trajectory file (filename) to create. An optional argument format specifies the format, currently "trajectory" (default), "restart", "binpos", "pdb" or "charmm" are supported. With CHARMM files it is possible to specify the byte ordering as "little" or "big" endian. With the PDB output, it is possible to include charges and radii in the temperature/occupancy columns with the additional keyword "dumpq" (or "parse" if you want the parse radii). By default, atom names are wrapped in the PDB file to put the 4th letter of the atom name first; if you want to avoid this behavior, specify "nowrap"; the former is more consistent with standard PDB usage but departs from the format written in previous versions of this program. Note that if more than one coordinate set is to be output, with the pdb and restrt/inpcrd formats, extensions (based on the current configuration number) will be appended to the filenames and therefore only one coordinate set will be written per file. The optional keyword "nobox" will prevent box coordinates from being dumped to trajectory files; this is useful if one is stripping the solvent from a trajectory file and you don't want that pesky box information cluttering up the trajectory and messing with other programs... Note that if periodic box information is present in the CHARMM trajectory file, when a new CHARMM trajectory file is written (in versions > 22) the symmetric box information will be *very* slightly different due to numerical issues in the diagonalization procedure; this will not effect analysis but shows up if diffing the binary files.
This command allows specification of the initial box parameters. Using this command, it is also possible to fix various values throughout the run, via the "fixN" commands above. These will override those values read from the prmtop or coordinate files (only if fixed).
This command can be used to override the solvent information specified in the AMBER prmtop file or that which is set by default (based on residue name) upon reading a CHARMM psf. Applying this command overwrites any previously set solvent definitions. The solvent can be selected by residue with the "byres" modifier using all the residues specified in the one or more atom masks listed. The byname option searches for solvent by residue name (where the mask contains the name of the residue), searching over all residues. The "bytype" option is intended for use in selecting solvents that span multiple residues, however it is not yet implemented since I haven't found a case where it is necessary (and setting the solvent information in the code is a real nightmare).
As an example, say you want to select the solvent to be all residues from 20-100, then you would do
solvent byres :20-100Note that if you don't know the final residue number of your system offhand, yet you do know that the solvent spans all residues starting at residue 20 until the end of the system, just chose an upper bound and the program will reset accordingly, i.e.
solvent byres :20-9999To select all residues named "WAT" and "TIP3" and "ST2":
solvent byname WAT TIP3 ST2Note that if you just want to peruse what the current solvent information is (or more generally get some information about the current state), specify solvent with no arguments and a summary of the current state will be printed.
Calculated the angle between the three atoms listed, each specified in a mask, mask1 through mask3. If more than one atom is listed in each mask, treat the position of that atom as the center of mass of the atoms in that mask. The results are saved internally with the name name. Data will be dumped to a file named filename if "out" is specified (with a time interval between configurations of interval if time is listed). All the angles are listed in degrees.
mask2-----mask3 / / /--angle / mask1
Compute the atomic positional fluctuations for all the atoms; output is performed only for the atoms in mask. If "byatom" is specified, dump the calculated fluctuations by atom (default). If "byres" is specified, dump the average (mass-weighted) for each residue. If "bymask" is specified, dump the average (mass-weighted) over all the atoms in the original mask. If "out" is specified, the data will be dumped to filename (otherwise the values will be dumped to the standard output). The "start", "stop" and "offset" keywords can be used to specify the range of coordinates processed (as a subset of all of those read in across all input files, not to be confused with the individual specification in each trajin command). If the keyword "bfactor" is specified, the data is output as B-factors rather than atomic positional fluctuations (which means multiplying the squared fluctuations by (8/3)pi**2).
So, to dump the mass-weighted B-factors for the protein backbone atoms, by residue:
atomicfluct out back.apf @C,CA,N byres bfactor
Compute the average structure over all the configurations read in (subject to start, stop and offset if set) dumping the results to a file named filename. If the keyword "stddev" is present, save the standard deviations (fluctuations) instead of the average coordinates. Output is by default to an AMBER trajectory, however can also be to a pdb, binpos or restrt file (depending on the keyword chosen). The "nobox" keyword will suppress box coordinates, and with the PDB format, it is possible to dump charges and radii (with the "dumpq" keyword for AMBER radii and charges or the "parse" for parse radii and AMBER charges). The optional mask trims the output coordinates (but does not change the state).
If we are in periodic boundary conditions, center all the atoms based on the center of geometry of the atoms in the mask to the center of the periodic box or the origin if the optional argument "origin" is specified. If the trajectory is not a periodic boundary trajectory, then the molecule is implicitly centered to the origin. If no mask is specified, centering is relative to all the atoms. If "mass" is specified, center with respect to the center of mass instead.
This command is useful to find if there are atomic overlaps (vdw conflicts) or more generally to see if atoms are close to other atoms. Without the optional "around" keyword, this will search for pair distances in the selected atoms (all by default) that are less than the specified minimum value (in angstroms, 0.95 by default) apart or greater than the maximum value (if specified). To avoid checking for all overlaps within a single selection, it is possible to only search for close/distant approach for two atoms in different atom selections, i.e. for one set of atoms around another set of atoms. For example, to check if ions are too close to the solute atoms.
Note that this command is rather computationally demanding, particularly if imaging is turned on (by default), since a double loop over all atoms is performed. Therefore, it is not recommended to apply this command to an entire trajectory. To look at ion contact, for example, the "hbond" command is better.
Example 1: Check for strong vdw overlaps in your inpcrd
trajin inpcrd checkoverlap goExample 2: Check for direct ion interaction of DNA residues 1-40 to all of the ions in the system (located after the DNA). It is assumed, in this example, that all the solvent is in residues named "WAT".
trajin inpcrd strip :WAT checkoverlap :41-99999 min 3.5 around :1-40 go
Retain only total solvent molecules (using the solvent information specified, see solvent above) in each coordinate set. The solvent molecules saved are those which are closest to the atoms in the mask. If "oxygen" or "first" are specified, only the distance to the first atom in the solvent molecule (to each atom in the mask) is measured. This command is rather time consuming since many distances need to be measured. Note that imaging is implicitly performed on the distances and this gets extremely expensive in non-orthorhomic systems due to the need to possibly check all the distances of the nearest images (up to 26!). Imaging can be disabled by specifying the "noimage" keyword.
Note that the behavior of this command is slightly different than in previous ptraj versions; now the solvent molecules are ordered at output such that the closest solvent is first and the PDB file residue numbers no longer represent the identity of the water in the original coordinate set. This command should now work with non-sequential solvent molecules and be independent of where the water is located. Like the strip command, this modifies the current state (i.e. pars down the size of the trajectory in cases where subsets of a trajectory may be loaded into memory). A restriction of this command is that each of the solvent molecules must have the same number of atoms; this leads to a fixed size "configuration" in each coordinate set output which is necessary for most of the file formats.
Calculated the dihedral angle for the four atoms listed in mask1 through mask4 (representing rotation about the bond from mask2 to mask3). If more than one atom is listed in each mask, treat the position of that atom as the center of mass of the atoms in the mask. The results are saved internally with the name name. Data will be dumped to a file if "out" is specified (with a filename appended). All the angles are listed in degrees and are in the range from -180.0 to 180.0 degrees.
Compute a mean square displacement plot for the atoms in the mask. The time between frames in picoseconds is specified by time_per_frame. If "average" is specified, then the average mean square displacement is calculated and dumped (only). If "average" is not specified, then the average and individual mean squared displacements are dumped. They are all dumped to a file in the format appropriate for xmgr (dumped in multicolumn format if necessary, i.e. use xmgr -nxy). The units are displacements (in angstroms**2) vs time (in ps). The "filenameroot" is used as the root of the filename to be dumped. The average mean square displacements are dumped to "filenameroot_r.xmgr", the x, y and z mean square displacements to "filenameroot_x.xmgr", etc and the total distance travelled to "filenameroot_a.xmgr".
This will fail if a coordinate moves more than 1/2 the box in a single step. Also, this command implicitly unfolds the trajectory (in periodic boundary simulations) hence will currently only work with orthorhombic unit cells.
Same as grid (see below) except that dipoles of the solvent molecules are binned. Dumping is to a grid in a format for Chris Bayly's discern delegate that comes with Midas/Plus.
This command will calculate a distance between the center of mass of the atoms in mask1 to the center of mass of the atoms in mask2 and store this information into an array with name as the identifier for each frame in the trajectory. If the optional keyword "out" is specified, then the data is dumped to a file named filename. The arrays are named (name) so that post-processing of the array can be performed, if necessary. The distance is implicitly imaged and the shortest imaged distance will be saved (unless the "noimage" keyword is specified which disables imaging).
Create a grid representing the histogram of atoms in mask1 on the 3D grid that is "nx * x_spacing by ny * y_spacing by nz * z_spacing angstroms (cubed). Either "origin" or "box" can be specified and this states whether the grid is centered on the origin or half box (however note that centering to the half-box is currently disabled in non-orthorhombic unit cells). Note that to provide any meaningful representation of the density, the solute of interest (about which the atomic densities are binned) should be rms fit, centered and imaged prior to the grid call. If the optional keyword "negative" is also specified, then these density will be stored as negative numbers. Output is in the format of a XPLOR formatted contour file (which can be visualized by the density delegate to Midas/Plus or other programs). Upon dumping the file, pseudo-pdb HETATM records are also dumped to standard out which have the most probable grid entries (those that are 80% of the maximum by default which can be changed with the max keyword, i.e. max .5 makes the dumping at 50% of the maximum).
NOTE: the behavior on this command has changed slightly!
Image the current snapshot. If the optional argument "origin" is specified, then the imaging will occur to the coordinate origin (like in SPASMS) rather than the center of the box (as is the AMBER standard). The goal here is to place atoms outside the periodic box back into the box. By default all atoms are imaged by molecule based on the position of the first atom (or the center of mass of the molecule if "center" is specified; the latter is recommended). If the mask is specified, only the atoms in the mask will be imaged. It is now possible to image by atom (byatm), by residue (byres), by molecule (bymol, default) or by atom mask (where all the atoms in the mask are treated as belonging to a single molecule). The behavior of the "by molecule" imaging is different in CHARMM and AMBER; with AMBER the molecules are specified directly by the periodic box information whereas with the CHARMM parameter/topology, each segid is treated as a different molecule. With this newer implementation of the imaging code, it is possible to avoid breaking up double stranded DNA during imaging, i.e.:
image :1-20 bymask :1-20 image byres :WATImaging only makes sense if there is periodic box information present.
Note: non-orthorhomic unit cells are now supported by default!!!
Use of the triclinic imaging can be forced with the "triclinic" keyword. Note that this puts the box into the triclinic shape, not the more familiar, more spherical shapes one might expect for some of the unit cells (i.e. truncated octahedron). To get into the more familiar shape, specify the "familiar" keyword. In this case, to specify atoms that imaged molecules should be closest to, specify a center of the atoms in the mask specified with the "com" keyword. Note that imaging "familiar" is time consuming since each of the possible imaged distances (27) are checked to see which is closest to the center.
Principal axis align the atoms specified in mask. This is more or less functional as there are issues with degenerate eigenvalues and swapping, however it basically works...
Calculate the pucker for the five atoms specified in each of the mask's, mask1 through mask5, associating name with the calculated values. If more than one atom is specified in a given mask, the center of mass of all the atoms in that mask is used to define the position. If the "out" keyword is specified the data is dumped to filename. If the keyword "amplitude" is present, the amplitudes are saved rather than the pseudorotation values. If the keyword "altona" is listed, use the Altona & Sundarlingam conventions/algorithm (for nucleic acids) (the default) [see Altona & Sundaralingam, JACS 94, 8205-8212 (1972) or Harvey & Prabhakaran, JACS 108, 6128-6136 (1986).] In this convention, both the pseudorotation phase and amplitude are in degrees. The phases are in the range from -180.0 to 180.0 degrees. If "cremer" is specified, use the Cremer & Pople conventions/algorithm [see Cremer & Pople, JACS 97:6, 1354-1358 (1975).]
Note that to calculate nucleic acid puckers, specify C1' first, followed by C2', C3', C4' and finally O4'. Also note that the Cremer & Pople convention is offset from the Altona & Sundarlingam convention (with nucleic acids) by 90.0; to add in an extra 90.0 to "cremer" (offset -90.0) or subtract 90.0 from the "Altona" (offset 90.0) specify an offset with the offset keyword; this value is subtracted from the calculated psueodorotation value (or amplitude).
Compute radial distribution functions and store the results into files with root-filename as the root of the filename. Three files are currently produced, "root-filename_carnal.xmgr" (which corresponds to a carnal style RDF), "root-filename_standard.xmgr" (which uses the more traditional RDF with a density input by the user) and "root-filename_volume.xmgr" (which uses the more traditional RDF and the average volume of the system). The total number of bins for the histogram is determined by the spacing between bins (spacing) and the range which runs from zero to maximum. If only a solvent-mask is listed (i.e. a list of atoms) then the RDF will be calculated for the interaction of every solute-mask atom with ALL the other solute-mask atoms.
If the optional solute-mask is specified, then the RDF will represent the interaction of each solute-mask atom with ALL of the solvent-mask atoms. If the optional keyword "closest" is specified, then the histogram will bin, over all the solvent-mask atoms, the distance of the closest atom in the solute mask. [This later case is similar to the behavior in carnal when the ALL keyword is not specified.] If the solute-mask and solvent-mask atoms are not mutually exclusive, zero distances will be binned (although this should not break the code). If the optional keyword "density", followed by the density value is specified, this will be used in the calculations. The default value is 0.033456 molecules/angstrom**3 which corresponds to a density of water equal to 1.0 g/mL. To convert a standard density in g/mL, multiply the density by "6.022 / (10 * weight)" where "weight" is the mass of the molecule in atomic mass units. This will only effect the "root-filename_standard.xmgr" file.
Note that although imaging of distances is performed (to find the shortest imaged distance unless the "noimage" keyword is specified), minimum image conventions are applied and therefore the RDF will tail to zero outside of the periodic cell.
For the ions in the selected set of atoms (as specified by the ion-mask), swap positions with randomly chosen solvent molecules (WAT residue by default or those defined with the solvent command. The "overlap" keyword specifies the closest approach of any two ions (default is 3.5 angstroms). If the optional keyword "around" is specified, along with an addition set of selected atoms, the randomly placed ions cannot come any closer to any atom in the selected set than the value specified with the "by" keyword (or 3.5 angstroms by default). The noimage keyword disables imaging.
This is a time consuming command.
This command is useful to randomize the ion positions in a coordinate set. For example, for a "inpcrd" that has DNA in residues 1-20 and ions in residues 21-38, if we want to randomize the ion positions (swap with waters) such that the ions can come no closer than 5.0 to the DNA or 3.5 angstroms from each other:
trajin inpcrd trajout inpcrd.moved_ions restart randomizeions :21-38 around :1-20 by 5.0 overlap 3.5 go
This will RMS fit all the atoms in the mask based on the current mode which is:
Scale the atoms in the mask in each direction by the offset specified.
Strip all atoms matching the atoms in mask. This changes the state of the system such that all commands following the strip (including output of the coordinates which is done last) are performed on the stripped coordinates (i.e. if you strip all the waters and then on a later action try to do something with the waters, you will have trouble since the waters are gone). A benefit of stripping, beyong paring down trajectories is with the data intensive commands that read entire sections of the trajectory into memory; with the strip to retain only selected atoms, it is much less likely that you will blow memory.
Move the atoms in the mask in each direction by the offset specified.
Create a truncated octahedron box with solvent stripped to a distance distance away from the atoms in the mask. Coordinates are output to the trajectory specified with the trajout command. Note that this is a special command and will only really make sense if a single coordinate set is processed (i.e. any prmtop written out will only correspond to the first configuration!!!) and commands after the truncoct will have undefined behavior since the state will not be consistent. It is intended only as a aid for creating truncated octahedron restrt files for running in AMBER.
The "prmtop" keyword can be used to specify the writing of a new prmtop (to a file named filename; this prmtop is only consistent with the first set of coordinates written. Moreover, this command will only work with AMBER prmtop files and assumes an AMBER prmtop file has previously been read in (rather than a CHARMM PSF). This command also assumes that all the solvent is located contiguously at the end of the file and that the solvent information has previously been set (see the solvent command).
This command will keep track of a vector value (and it's origin) over the trajectory; the data can be referenced for later use based on the name specified. If the optional keyword "out" is specified, the data will be dumped to the file named filename. The format is frame number, followed by the value of the vector, the reference point, and the reference point plus the vector. What kind of vector is stored depends on the keyword chosen.
This option will count the number of waters within a certain distance of the atoms in the mask in order to represent the first and second solvation shells. The output is a file into filename (appropriate for xmgr) which has, on each line, the frame number, number of waters in the first shell and number of waters in the second shell. If lower is specified, this represents the distance from the mask which represents the first solvation shell; if this is absent 3.4 angstroms is assumed. Likewise, upper represents the range of the second solvation shell and if absent is assumed to be 5.0 angstroms. The optional solvent-mask can be used to consider other atoms as the solvent; the default is ":WAT". The distances are implicitly imaged unless the "noimage" keyword is specified.
Calculate a 2D RMS plot for the atoms specified, dumping the output to one of the specified file formats.
Calculate the correlation coefficient between the data specified by name1 and name2. If the ALL keyword is specified treat the (first or second or both) sets of data to represent ALL the data on the scalarStack. Output of the calculated correlation coefficient is to standard output.
Calculate the average and standard deviation of the scalarStack name name or all entries if the "all" keyword is specified. The optional keyword "shift" can be used to shift the values during averaging; this can be particularly important with dihedral angles. Dihedral angle values are currently all stored in the range of -180.0 to 180.0; if the actual average value is near 180.0, raw averaging of the values will tend to distort the average towards zero due to the presence of the negative values. To avoid this, an shift value of 180.0 can be applied which will shift the values for the averaging; after calculating the shifted average, the average value will automatically be shifted back. Note that the shift is applied in a periodic fashion to keep the (shifted) values between -180.0 and 180.0 for angular quantities.