QM/QM2 Calculation

Objective

The objective of this tutorial is to create an input file for QM/QM2 calculation. In this example you will define a catalytic center as the bound ligand, residue 263, and a second region containing atoms within 5 Å of the catalytic center. You will then set a high-level QM method for the catalytic center region and a lower-level QM (QM2) method to treat the other atoms. All C-alpha atoms will be constrained for optimization.

It is recommended to first visit the QM-only calculation workflow since it provides a more basic understanding of the QMzyme workflow. You can find it here.

This workflow allows you to:

  • Create QM/QM2 input file using QMzyme.

In this specific example, we are using ketosteroid isomerase (KSI) as the model system. The structure for KSI is obtained from the PDB 1OH0 and MM-minimized prior to this tutorial. We will utilize it to create an input file for ORCA for geometry optimization and frequency calculation.

Classes used in this example

Required Files

To start, you will need:

  • A fully prepped and protonated PDB.


[ ]:
# Here are the necesary imports for this tutorial!

import QMzyme
from QMzyme.SelectionSchemes import DistanceCutoff
from QMzyme.data import PDB

Initialize Model and Define Regions

For doing QM/QM2 calculation, we need a QMzymeModel with two regions, each designated with their own QM method. Thus, the first step is to create 2 distinct QMzymeRegion. For our enzyme, we will utilize the bound ligand (residue 263) and the 5 Å distance cutoff region as our two regions for QM/QM2 calculation. To initialize and define our regions, we will first load our structure file and set regions using set_catalytic_center() and set_region(). You can find more information about DistanceCutoff class here.

[2]:
# We first initilize QMzymeModel by loading structure file.
# The initial pdb file should be prepared prior (hydrogens must be present).
model = QMzyme.GenerateModel(PDB)

# Since our bound ligand, equilenin, is not a standard AMBER protein residue, we need to specify the chage.
QMzyme.data.residue_charges.update({'EQU': -1})

# We select first region using set_catalytic_center and update the charge of the region.
model.set_catalytic_center(selection='resid 263')
model.catalytic_center.set_charge(-1)

# We use DistanceCutoff class to create our second region.
model.set_region(selection=DistanceCutoff, cutoff=5)

# Selecting and freeznig CA atoms.
c_alpha_atoms = model.cutoff_5.get_atoms(attribute='name', value='CA')
model.cutoff_5.set_fixed_atoms(atoms=c_alpha_atoms)

Charge information not present. QMzyme will try to guess region charges based on residue names consistent with AMBER naming conventions (i.e., aspartate: ASP --> Charge: -1, aspartic acid: ASH --> Charge: 0.). See QMzyme.data.residue_charges for the full set.

        Nonconventional Residues Found
        ------------------------------
        EQU --> Charge: UNK, defaulting to 0

You can update charge information for nonconventional residues by running
        >>>QMzyme.data.residue_charges.update({'3LETTER_RESNAME':INTEGER_CHARGE}).
Note your changes will not be stored after you exit your session. It is recommended to only alter the residue_charges dictionary. If you alter the protein_residues dictionary instead that could cause unintended bugs in other modules (TruncationSchemes).

Build the QM Methods and Assign to Regions

One of the cool features of QMzyme is the automated workflow for multi-level calculation setup. Specifically, if there is more than 1 region with method designated to it, QMzyme will automatically decide an appropriate multi-level for input file generation. For QM/QM2 input file generation, we need to create 2 QM methods and assign them to 2 regions.There are multiple attributes that can be given to the method, which will impact the input file! Some of the essential options are:

  • basis_set: defines the basis set to use for calculation.

  • functional: defines the functional to use for calculation.

  • qm_input: keywords to include in the input file route line to declare any details.

  • program: defines the program that will be used for QM calculation.

You can find more information about QM_Method here. For our case, we will set the catalytic center to be treated with 6-31+G**/wB97X-D3 with optimization and frequency calculation and our QM2 region to be treated with 6-31G*/wB97X-D3. At last, we will prepare these two regions with model.truncate().

[3]:
# We first create two QM method, each with desired level of theory.
qm1_method = QMzyme.QM_Method(basis_set='6-31+G**',
               functional='wB97X-D3',
               qm_input='OPT FREQ',
               program='orca')

qm2_method = QMzyme.QM_Method(basis_set='6-31G*',
               functional='wB97X-D3')

# Afterwards, we assign two methods to their own respective region.
qm1_method.assign_to_region(region=model.catalytic_center)
qm2_method.assign_to_region(region=model.cutoff_5)
QMzymeRegion cutoff_5 has an estimated charge of -2.

Truncate Model

The last step before creating an input file is to use model.truncate(). This takes the full QMzymeModel (which, if you recall, is a region with a method), truncates it, and prepares it for input file generation. If not specified, model.truncate() will undergo TerminalAlphaCarbon truncation class, which will preserves peptide bonds between consecutive residues and remove terminal backbone amine and carboxylate groups. You can find more information about it here.

[4]:
# We truncate the model with TerminalAlphaCarbon class.
model.truncate()
QMzymeRegion catalytic_center_cutoff_5_combined has an estimated charge of -2.

Truncated model has been created and saved to attribute 'truncated' and stored in QMzyme.CalculateModel.calculation under key QMQM2. This model will be used to write the calculation input.

Write QM Input File

The final step now is to create the calculation input file. This can be done by using model.write_input() method, which will create a directory named QCALC and save the input file there. When this method is called, the store_pickle() method is triggered, and you will have the corresponding .pkl file for your QMzymeModel. This is especially useful because after your calculation is completed, you can then load the calculation results back into the corresponding QMzymeRegion using the QMzymeRegion method store_calculation_results(), passing it the calculation file.

[ ]:
model.write_input()