QMzyme Serialization

Objective

The objective of this tutorial is to show how to save and continue QMzyme workflow. In this example we follow the same steps from the QMQM2 Calculation example, up until it is time to assign the calculation method. Let’s pretend we got to that point and needed to read some literature to decide what level of theory we should use. So we will serialize the QMzymeModel object using the Python library pickle by calling the method store_pickle, take a (hypothetical) break to read the literature, and then reload the pickle file to continue!

It is recommended to first visit QM/QM2 calculation workflow. You can find it here.

This workflow allows you to:

Save and continue QMzyme workflow using pickled files

In this specific example, we are using ketosteroid isomerase (KSI) as the model system. The structure for KSI is obtained from the PDB 1OH0 and MM-minimized prior to this tutorial. We will utilize it to create an input file for ORCA for geometry optimization and frequency calculation.

Classes used in this example

Required Files

To start, you will need:

A fully prepped and protonated PDB.

[ ]:

# Here are the necesary imports for this tutorial!

import QMzyme
from QMzyme.SelectionSchemes import DistanceCutoff
from QMzyme.data import PDB

Define the QM and QM2 regions

For doing QM/QM2 calculation, we need a QMzymeModel with two regions, each designated with their own QM method. Thus, the first step is to create 2 distinct QMzymeRegion. For our enzyme, we will utilize the bound ligand (residue 263) and the 5 Angstrom distance cutoff region as our two regions for QM/QM2 calculation. To initialize and define our regions, we will first load our structure file and set regions using set_catalytic_center() and set_region(). You can find more information about DistanceCutoff class here.

[2]:

# We first initilize QMzymeModel by loading structure file.
# The initial pdb file should be prepared prior (hydrogens must be present).
model = QMzyme.GenerateModel(PDB)

# This adds unknown residue charge.
QMzyme.data.residue_charges.update({'EQU': -1})

# We select first region using set_catalytic_center and update the charge of the region.
model.set_catalytic_center(selection='resid 263')
model.catalytic_center.set_charge(-1)

# We use DistanceCutoff class to create our second region.
model.set_region(selection=DistanceCutoff, cutoff=5)

# Selecting and freeznig CA atoms.
c_alpha_atoms = model.cutoff_5.get_atoms(attribute='name', value='CA')
model.cutoff_5.set_fixed_atoms(atoms=c_alpha_atoms)


Charge information not present. QMzyme will try to guess region charges based on residue names consistent with AMBER naming conventions (i.e., aspartate: ASP --> Charge: -1, aspartic acid: ASH --> Charge: 0.). See QMzyme.data.residue_charges for the full set.

        Nonconventional Residues Found
        ------------------------------
        EQU --> Charge: UNK, defaulting to 0

You can update charge information for nonconventional residues by running
        >>>QMzyme.data.residue_charges.update({'3LETTER_RESNAME':INTEGER_CHARGE}).
Note your changes will not be stored after you exit your session. It is recommended to only alter the residue_charges dictionary. If you alter the protein_residues dictionary instead that could cause unintended bugs in other modules (TruncationSchemes).

Serialize QMzymeModel object

The pickle file will by default be named after the QMzymeModel.name attribute, which by default is the base name of the file originally used to initialize the model. You can also specify a filename by passing the argument filename=.

[3]:

model.store_pickle()

Upload serialized QMzymeModel object

Now that you know what methods you want to use, let’s load the

[4]:

model = QMzyme.GenerateModel(pickle_file=f'1oh0.pkl')

Build the QM Methods and Assign to Regions

For QM/QM2 input file generation, we need to create 2 QM methods and assign them to 2 regions. There are multiple attributes that can be given to the method, which will impact the input file! Some of the essential options are:

basis_set: defines the basis set to use for calculation.
functional: defines the functional to use for calculation.
qm_input: keywords to include in the input file route line to declare any details.
program: defines the program that will be used for QM calculation.

You can find more information about QM_Method here. For our case, we will set the catalytic center to be treated with 6-31+G**/wB97X-D3 with optimization and frequency calculation and our QM2 region to be treated with 6-31G*/wB97X-D3. At last, we will prepare these two regions with model.truncate()`.

[5]:

qm1_method = QMzyme.QM_Method(basis_set='6-31+G**',
               functional='wB97X-D3',
               qm_input='OPT FREQ',
               program='orca')

qm2_method = QMzyme.QM_Method(basis_set='6-31G*',
               functional='wB97X-D3')

qm1_method.assign_to_region(region=model.catalytic_center)
qm2_method.assign_to_region(region=model.cutoff_5)

QMzymeRegion cutoff_5 has an estimated charge of -2.

Truncate Model

The last step before creating an input file is to use model.truncate(). This takes the full QMzymeModel (which, if you recall, is a region with a method), truncates it, and prepares it for input file generation. If not specified, model.truncate() will undergo TerminalAlphaCarbon truncation class, which will preserves peptide bonds between consecutive residues and remove terminal backbone amine and carboxylate groups. You can find more information about it here.

[6]:

# We truncate the model with TerminalAlphaCarbon class.
model.truncate()

QMzymeRegion catalytic_center_cutoff_5_combined has an estimated charge of -2.

Truncated model has been created and saved to attribute 'truncated' and stored in QMzyme.CalculateModel.calculation under key QMQM2. This model will be used to write the calculation input.

Write QM Input File

The final step now is to create the calculation input file. This can be done by using model.write_input()` method, which will create a directory named QCALC and save the input file there. When this method is called, the store_pickle() method is triggered, and you will have the corresponding .pkl file for your QMzymeModel. This is especially useful because after your calculation is completed, you can then load the calculation results back into the corresponding QMzymeRegion using the QMzymeRegion method store_calculation_results(), passing it the calculation file.

[ ]:

model.write_input()