Getting Started#

IOData can be used to read and write different quantum chemistry file formats.

Script usage#

The simplest way to use IOData, without writing any code, is to use the iodata-convert script.

iodata-convert in.fchk out.molden

See the --help option for more details on usage.

Code usage#

More complex use cases can be implemented in Python, using IOData as a library. The load functions return an IOData object containing the data read from the file.

Reading#

To read a file, use something like this:

from iodata import load_one

mol = load_one('water.xyz')  # XYZ files contain atomic coordinates in Angstrom
print(mol.atcoords)  # print coordinates in Bohr.

IOData also has basic support for loading databases of molecules. For example, the following will iterate over all frames in an XYZ file:

from iodata import load_many

# Print the title line from each frame in the trajectory.
for mol in load_many('trajectory.xyz'):
    print(mol.title)

Writing#

IOData can also be used to write different file formats:

from iodata import load_one, dump_one

mol = load_one('water.fchk')
# Here you may put some code to manipulate mol before writing the data
# to a different file.
dump_one(mol, 'water.molden')

One could also convert (and manipulate) an entire trajectory. The following example converts a geometry optimization trajectory from a Gaussian FCHK file to an XYZ file:

from iodata import load_many, dump_many

# Conversion without manipulation.
dump_many(load_many('water_opt.fchk'), 'water_opt.xyz')

If you wish to perform some manipulations before writing the trajectory, the simplest way is to load the entire trajectory into a list of IOData objects and dump it later:

from iodata import load_many, dump_many

# Read the trajectory
trj = list(load_many('water_opt.fchk'))
# Manipulate if desired
# ...
# Write the trajectory
dump_many(trj, 'water_opt.xyz')

For very large trajectories, you may want to avoid loading the whole trajectory into memory. To do so, avoid creating the list object used in the example above. The following approach is more memory efficient, because frames are read, modified, and written one at a time:

from iodata import load_many, dump_many

def itermols():
    for mol in load_many("traj1.xyz"):
        # Do some manipulations on mol here, then pass it on.
        yield mol

dump_many(itermols(), "traj2.xyz")
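
The lazy behaviour of this pattern can be sketched without IOData itself. In the following, fake_load_many and fake_dump_many are hypothetical stand-ins (they are not part of IOData) that mimic a reader yielding one frame at a time and a writer consuming frames as they arrive. No list of all frames is ever built.

```python
from typing import Iterable, Iterator


def fake_load_many(nframe: int) -> Iterator[dict]:
    """Hypothetical stand-in for load_many: yields one frame at a time."""
    for i in range(nframe):
        yield {"title": f"frame {i}"}


def itermols(frames: Iterable[dict]) -> Iterator[dict]:
    """Modify each frame as it passes through, without storing it."""
    for mol in frames:
        mol["title"] = mol["title"].upper()
        yield mol


def fake_dump_many(frames: Iterable[dict]) -> int:
    """Hypothetical stand-in for dump_many: consumes frames, returns the count."""
    count = 0
    for _ in frames:
        count += 1
    return count


nwritten = fake_dump_many(itermols(fake_load_many(1000)))
print(nwritten)
```

Each frame becomes garbage as soon as the writer has handled it, so peak memory use does not grow with the length of the trajectory.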

Input files#

IOData can be used to write input files for quantum-chemistry software. By default, minimal settings are used, and these can be changed if needed. For example, the following will prepare a Gaussian input for a HF/STO-3G calculation from a PDB file:

from iodata import load_one, write_input

write_input(load_one("water.pdb"), "water.com", fmt="gaussian")

The level of theory and other settings can be modified by setting corresponding attributes in the IOData object:

from iodata import load_one, write_input

mol = load_one("water.pdb")
mol.lot = "B3LYP"
mol.obasis_name = "6-31g*"
mol.run_type = "opt"
write_input(mol, "water.com", fmt="gaussian")

The run type can be any of the following: energy, energy_force, opt, scan, or freq. It is translated into program-specific keywords when the file is written.
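
As an illustration of such a translation, the sketch below maps the generic run types onto Gaussian route keywords. The dictionary GAUSSIAN_RUN_TYPES and the helper route_keyword are assumptions made for this example; the mapping used internally by IOData may differ.

```python
# Hypothetical mapping from generic run types to Gaussian route keywords.
GAUSSIAN_RUN_TYPES = {
    "energy": "sp",
    "energy_force": "force",
    "opt": "opt",
    "scan": "scan",
    "freq": "freq",
}


def route_keyword(run_type: str) -> str:
    """Return the Gaussian keyword for a generic run type (illustrative only)."""
    try:
        return GAUSSIAN_RUN_TYPES[run_type.lower()]
    except KeyError:
        raise ValueError(f"Unsupported run type: {run_type!r}") from None


keyword = route_keyword("energy_force")
print(keyword)
```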

It is possible to define a custom input file template to allow for specialized commands. This is done by passing a template string using the optional template keyword, placing each IOData attribute (or additional keyword, as shown below) in curly brackets:

from iodata import load_one, write_input

mol = load_one("water.pdb")
mol.lot = "B3LYP"
mol.obasis_name = "Def2QZVP"
mol.run_type = "opt"
custom_template = """\
%NProcShared=4
%mem=16GB
%chk=B3LYP_def2qzvp_H2O
#n {lot}/{obasis_name} scf=(maxcycle=900,verytightlineq,xqc) integral=(grid=ultrafinegrid) pop=(cm5, hlygat, mbs, npa, esp)

{title}

{charge} {spinmult}
{geometry}

"""
write_input(mol, "water.com", fmt="gaussian", template=custom_template)

The input file template may also include keywords that are not part of the IOData object:

from iodata import load_one, write_input

mol = load_one("water.pdb")
mol.lot = "B3LYP"
mol.obasis_name = "Def2QZVP"
mol.run_type = "opt"
custom_template = """\
%chk={chk_name}
#n {lot}/{obasis_name} {run_type}

{title}

{charge} {spinmult}
{geometry}

"""
# Custom keywords as arguments (best for few extra arguments)
write_input(mol, "water.com", fmt="gaussian", template=custom_template, chk_name="B3LYP_def2qzvp_water")

# Custom keywords from a dict (in cases with many extra arguments)
custom_keywords = {"chk_name": "B3LYP_def2qzvp_water"}
write_input(mol, "water.com", fmt="gaussian", template=custom_template, **custom_keywords)
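
The curly-bracket fields look like ordinary Python format fields. The sketch below fills such a template with plain str.format, assuming that object attributes and extra keywords end up in one set of named fields before substitution. This is an assumption about the mechanism, not a description of IOData internals, and the field values (including the geometry) are placeholders made up for the example.

```python
template = """\
%chk={chk_name}
#n {lot}/{obasis_name} {run_type}

{title}

{charge} {spinmult}
{geometry}

"""

# Values that would normally come from the IOData object (placeholders here).
fields = {
    "lot": "B3LYP",
    "obasis_name": "Def2QZVP",
    "run_type": "opt",
    "title": "water",
    "charge": 0,
    "spinmult": 1,
    # Placeholder coordinates, not a real optimized geometry.
    "geometry": "O 0.0 0.0 0.0\nH 0.0 0.8 0.6\nH 0.0 -0.8 0.6",
}
# Extra keywords that are not attributes of the object.
extra = {"chk_name": "B3LYP_def2qzvp_water"}

text = template.format(**fields, **extra)
print(text)
```

With str.format, a field that appears in the template but is missing from the keywords raises a KeyError, which makes misspelled field names easy to spot.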

In some cases, it may be preferable to load the template from file, instead of defining it in the script:

from iodata import load_one, write_input

mol = load_one("water.pdb")
mol.lot = "B3LYP"
mol.obasis_name = "6-31g*"
mol.run_type = "opt"
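# Read the template with a context manager, so the file is closed afterwards.
with open("my_template.com") as fh:
    template = fh.read()
write_input(mol, "water.com", fmt="gaussian", template=template)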

Data storage#

IOData can be used to store data in a consistent format for writing at a future point.

import numpy as np
from iodata import IOData

mol = IOData(title="water")
mol.atnums = np.array([8, 1, 1])
mol.atcoords = np.array([[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, -1.0, 0.0]])  # in Bohr

Unit conversion#

IOData always represents all quantities in atomic units. Unit conversion constants are defined in iodata.utils, and conversion to atomic units is done by multiplication with a unit constant. This convention can be easily remembered with the following examples:

When you say “this bond length is 1.5 Å”, the IOData equivalent is bond_length = 1.5 * angstrom.
The conversion from atomic units is similar to axes labels in old papers. For example, a bond length in angstrom is printed as “Bond length / Å”. Expressing this with IOData’s conventions gives print("Bond length in Angstrom:", bond_length / angstrom).

(This is rather different from the ASE conventions.)
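
Both directions of the conversion can be tried out with a locally defined constant. In the sketch below, angstrom is a stand-in for the constant of the same name in iodata.utils; its value is computed here from the CODATA 2018 Bohr radius and may differ in the last digits from the library's value.

```python
# Local stand-in for the constant from iodata.utils: one angstrom
# expressed in bohr, using the Bohr radius 0.529177210903 Å (CODATA 2018).
angstrom = 1.0 / 0.529177210903

# To atomic units: multiply by the unit constant.
bond_length = 1.5 * angstrom
print("Bond length in Bohr:", bond_length)

# From atomic units: divide by the unit constant.
print("Bond length in Angstrom:", bond_length / angstrom)
```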