Ligand Parameterization

Objective

The objective of this tutorial is to do a ligand parameterization. Molecular mechanics forcefields need to be told how to treat each atom via a set of parameters. If there is a molecule (residue) in your system that your forcefield of choice does not already have parameters, you will need to build these yourself. In this example, we will see how QMzyme can automate the calculations required to parameterize a ligand in line with the RESP procedure (J. Phys. Chem. 1993, 97, 40, 10269–1028).

This workflow allows you to:

Prepare ligand for ligand parameterization.

In this specific example, we are using ketosteroid isomerase (KSI) as the model system. The structure for KSI is obtained from the PDB 1OH0 and MM-minimized prior to this tutorial. We will utilize it to create an input file for ORCA for geometry optimization and frequency calculation.

Classes used in this example

Required Files

To start, you will need:

A fully prepped and protonated PDB.

[ ]:

# Here are the necesary imports for this tutorial!

import QMzyme
from QMzyme.data import PDB

Initialize Model

To begin our example workflow, we need to start by initializing QMzymeModel using GenerateModel().

[3]:

# We first initilize QMzymeModel by loading structure file.
# The initial pdb file should be prepared prior (hydrogens must be present).
model = QMzyme.GenerateModel(PDB)


Charge information not present. QMzyme will try to guess region charges based on residue names consistent with AMBER naming conventions (i.e., aspartate: ASP --> Charge: -1, aspartic acid: ASH --> Charge: 0.). See QMzyme.data.residue_charges for the full set.

        Nonconventional Residues Found
        ------------------------------
        EQU --> Charge: UNK, defaulting to 0

You can update charge information for nonconventional residues by running
        >>>QMzyme.data.residue_charges.update({'3LETTER_RESNAME':INTEGER_CHARGE}).
Note your changes will not be stored after you exit your session. It is recommended to only alter the residue_charges dictionary. If you alter the protein_residues dictionary instead that could cause unintended bugs in other modules (TruncationSchemes).

Add Ligand Charge

When loading the PDB file, we quickly run into an issue: a nonconventional residue is found. This simply means that there is a non-AMBER protein residue with an unknown charge. This can be resolved using residue_charges.update(), where we can manually update the charge of an unknown residue. In this case, the charge of equilenin (EQU) is -1 and we will update it using residue_charges.update().

[4]:

QMzyme.data.residue_charges.update({'EQU': -1})

Designate Ligand Region

In QMzyme, set_region() is used to create QMzymeRegion, a Python class object that is the collection of QMzymeAtom objects. To select protein amino acids directly, we use MDAnalysis nomenclature for selecting our residues of interest!1,2

[5]:

# We'll select the region of interest.
model.set_region(name='EQU', selection='resid 263 and resname EQU')
print("Ligand region: ", model.EQU)

Ligand region:  <QMzymeRegion EQU contains 37 atom(s) and 1 residue(s)>

Build the QM Method and Assign to Region

Ultimately, QMzymeModel is a collection of QMzymeRegon and a QM method for calculation. This can be achieved by creating a QM method and assigning QM method to the region. You can find more about different QM methods and their usages at:

QM_Method

For the purpose of this example, we will forgo geometry optimization and only perform a single point energy. Calculation and population analysis at the level used in the original RESP procedure (J. Phys. Chem. 1993, 97, 40, 10269–1028).

After creating QM method of choice, we can assign the QM method to a QMzymeRegion using assign_to_region(). This will make a complete set of QMzymeModel, which can be used to now write a QM input file.

[6]:

# We first create two QM method, each with desired level of theory.
qm_method = QMzyme.QM_Method(
    basis_set='6-31G*',
    functional='HF',
    qm_input='SCF=Tight Pop=MK IOp(6/33=2,6/42=6,6/43=20)',
    program='gaussian'
)

# Now we assign this method to our QMzymeRegion EQU
qm_method.assign_to_region(region=model.EQU)

QMzymeRegion EQU has an estimated charge of -1.

Write Calculation Input File

The final step now is to create the calculation input file. This can be done by using model.write_input() method, which will create a directory named QCALC and save the input file there. When this method is called, the store_pickle() method is triggered, and you will have the corresponding .pkl file for your QMzymeModel. This is especially useful because after your calculation is completed, you can then load the calculation results back into the corresponding QMzymeRegion using the QMzymeRegion method store_calculation_results(), passing it the calculation file.

[ ]:

# QMzyme will know we only have one region with a calculation method set,
# so it will logically create the input file for that scenario
model.write_input('EQU_resp')

References Cited

1. Michaud-Agrawal, E. J. Denning, T. B. Woolf, and O. Beckstein. MDAnalysis: A Toolkit for the Analysis of Molecular Dynamics Simulations. J. Comput. Chem. 32 (2011), 2319–2327. doi:10.1002/jcc.21787
1. 1. Gowers, M. Linke, J. Barnoud, T. J. E. Reddy, M. N. Melo, S. L. Seyler, D. L. Dotson, J. Domanski, S. Buchoux, I. M. Kenney, and O. Beckstein. MDAnalysis: A Python package for the rapid analysis of molecular dynamics simulations. In S. Benthall and S. Rostrup, editors, Proceedings of the 15th Python in Science Conference, pages 98-105, Austin, TX, 2016. SciPy. doi:10.25080/Majora-629e541a-00e