Manipulating QMzymeRegion
Objective
This tutorial shows different ways to modify a QMzymeRegion so that you end up with the selection you want for QM input generation. In this workflow you will:
Learn methods to combine and subtract QMzymeRegion objects.
In this example, we use ketosteroid isomerase (KSI) as the model system. The KSI structure was obtained from PDB entry 1OH0 and MM-minimized prior to this tutorial.
Classes used in this example
Required Files
To start, you will need:
A fully prepared and protonated PDB file
[8]:
# Here are the necessary imports for this tutorial!
import QMzyme
from QMzyme import GenerateModel
from QMzyme.SelectionSchemes import DistanceCutoff
from QMzyme.data import PDB
from QMzyme.RegionBuilder import RegionBuilder
import pandas as pd
import MDAnalysis
Combining Two Regions
We’ll first look at combining two regions. To do this, we can use the combine() method of the QMzymeRegion class. Using it is simple: decide on a base region and the region you want to add, then join them with combine(). Here, we use it to add Tyr 57 to our 3 Å distance-cutoff region.
[11]:
# We first initialize model and update the unknown residue charge.
model = QMzyme.GenerateModel(PDB)
QMzyme.data.residue_charges.update({'EQU': -1})
# We create regions of interest.
model.set_catalytic_center(selection='resname EQU and segid A')
model.set_region(selection=DistanceCutoff, cutoff=3)
model.set_region(selection="resid 57", name="Tyr_57")
# We combine the Tyr_57 region with the cutoff_3 region and store the result.
combined_region = model.get_region("cutoff_3")
combined_region = combined_region.combine(model.Tyr_57)
model.set_region(selection=combined_region, name="combined_region")
Charge information not present. QMzyme will try to guess region charges based on residue names consistent with AMBER naming conventions (i.e., aspartate: ASP --> Charge: -1, aspartic acid: ASH --> Charge: 0.). See QMzyme.data.residue_charges for the full set.
Nonconventional Residues Found
------------------------------
EQU --> Charge: UNK, defaulting to 0
You can update charge information for nonconventional residues by running
>>>QMzyme.data.residue_charges.update({'3LETTER_RESNAME':INTEGER_CHARGE}).
Note your changes will not be stored after you exit your session. It is recommended to only alter the residue_charges dictionary. If you alter the protein_residues dictionary instead that could cause unintended bugs in other modules (TruncationSchemes).
We can examine a region by passing the output of its summarize() method to a pandas DataFrame. We’ll first look at our cutoff_3 region, then compare it with combined_region!
[12]:
df = pd.DataFrame(model.cutoff_3.summarize())
df
[12]:
| | Resid | Resname | Charge | Removed atoms | Fixed atoms | Segids |
|---|---|---|---|---|---|---|
| 0 | 16 | TYR | 0 | [] | [] | A |
| 1 | 20 | VAL | 0 | [] | [] | A |
| 2 | 40 | ASP | -1 | [] | [] | A |
| 3 | 60 | GLY | 0 | [] | [] | A |
| 4 | 61 | LEU | 0 | [] | [] | A |
| 5 | 66 | VAL | 0 | [] | [] | A |
| 6 | 86 | PHE | 0 | [] | [] | A |
| 7 | 88 | VAL | 0 | [] | [] | A |
| 8 | 90 | MET | 0 | [] | [] | A |
| 9 | 99 | LEU | 0 | [] | [] | A |
| 10 | 101 | VAL | 0 | [] | [] | A |
| 11 | 103 | ASH | 0 | [] | [] | A |
| 12 | 118 | ALA | 0 | [] | [] | A |
| 13 | 120 | TRP | 0 | [] | [] | A |
| 14 | 263 | EQU | -1 | [] | [] | A |
| 15 | 372 | WAT | 0 | [] | [] | A |
| 16 | 373 | WAT | 0 | [] | [] | A |
| 17 | 376 | WAT | 0 | [] | [] | A |
| 18 | 378 | WAT | 0 | [] | [] | A |
[13]:
df = pd.DataFrame(model.combined_region.summarize())
df
[13]:
| | Resid | Resname | Charge | Removed atoms | Fixed atoms | Segids |
|---|---|---|---|---|---|---|
| 0 | 16 | TYR | 0 | [] | [] | A |
| 1 | 20 | VAL | 0 | [] | [] | A |
| 2 | 40 | ASP | -1 | [] | [] | A |
| 3 | 57 | TYR | 0 | [] | [] | A |
| 4 | 60 | GLY | 0 | [] | [] | A |
| 5 | 61 | LEU | 0 | [] | [] | A |
| 6 | 66 | VAL | 0 | [] | [] | A |
| 7 | 86 | PHE | 0 | [] | [] | A |
| 8 | 88 | VAL | 0 | [] | [] | A |
| 9 | 90 | MET | 0 | [] | [] | A |
| 10 | 99 | LEU | 0 | [] | [] | A |
| 11 | 101 | VAL | 0 | [] | [] | A |
| 12 | 103 | ASH | 0 | [] | [] | A |
| 13 | 118 | ALA | 0 | [] | [] | A |
| 14 | 120 | TRP | 0 | [] | [] | A |
| 15 | 263 | EQU | -1 | [] | [] | A |
| 16 | 372 | WAT | 0 | [] | [] | A |
| 17 | 373 | WAT | 0 | [] | [] | A |
| 18 | 376 | WAT | 0 | [] | [] | A |
| 19 | 378 | WAT | 0 | [] | [] | A |
As you can see, Tyr 57 now appears in combined_region (row 3), confirming that the two regions were combined successfully!
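Reading two long tables side by side is error-prone, so it can help to compare the residue IDs programmatically. The snippet below is a minimal sketch in plain Python: the resid lists are copied by hand from the two tables above, and compare_resids is a hypothetical helper name of our own, not part of QMzyme.

```python
# Resid values copied from the cutoff_3 and combined_region tables above.
cutoff_3_resids = [16, 20, 40, 60, 61, 66, 86, 88, 90, 99, 101, 103,
                   118, 120, 263, 372, 373, 376, 378]
combined_resids = [16, 20, 40, 57, 60, 61, 66, 86, 88, 90, 99, 101, 103,
                   118, 120, 263, 372, 373, 376, 378]


def compare_resids(before, after):
    """Return (added, removed) resids when going from `before` to `after`.

    Hypothetical helper for illustration only; it is not a QMzyme function.
    """
    before_set, after_set = set(before), set(after)
    added = sorted(after_set - before_set)
    removed = sorted(before_set - after_set)
    return added, removed


added, removed = compare_resids(cutoff_3_resids, combined_resids)
print("added:", added)      # added: [57]
print("removed:", removed)  # removed: []
```

In a live session, the same lists could presumably be taken from the "Resid" column of the DataFrames built above rather than typed by hand.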
Subtracting Two Regions
Now, let’s subtract a region from our QMzyme region! To do this, we can use the subtract() method of the QMzymeRegion class. This time, we’ll consider a case where you want to remove the amino acid residues that form the oxyanion hole in KSI (Tyr 16 and Asp 103) to see how their absence influences coordination of the substrate.
[17]:
# We first initialize model and update the unknown residue charge.
model = QMzyme.GenerateModel(PDB)
QMzyme.data.residue_charges.update({'EQU': -1})
# We create regions of interest.
model.set_catalytic_center(selection='resname EQU and segid A')
model.set_region(selection=DistanceCutoff, cutoff=3)
model.set_region(selection="resid 16 or resid 103", name="oxyanion_hole")
# We subtract the oxyanion_hole region from the cutoff_3 region and store the result.
subtracted_region = model.get_region("cutoff_3")
subtracted_region = subtracted_region.subtract(model.oxyanion_hole)
model.set_region(selection=subtracted_region, name="subtracted_region")
Charge information not present. QMzyme will try to guess region charges based on residue names consistent with AMBER naming conventions (i.e., aspartate: ASP --> Charge: -1, aspartic acid: ASH --> Charge: 0.). See QMzyme.data.residue_charges for the full set.
We can again examine a region by passing the output of its summarize() method to a pandas DataFrame. We’ll first look at our cutoff_3 region, then compare it with subtracted_region!
[18]:
df = pd.DataFrame(model.cutoff_3.summarize())
df
[18]:
| | Resid | Resname | Charge | Removed atoms | Fixed atoms | Segids |
|---|---|---|---|---|---|---|
| 0 | 16 | TYR | 0 | [] | [] | A |
| 1 | 20 | VAL | 0 | [] | [] | A |
| 2 | 40 | ASP | -1 | [] | [] | A |
| 3 | 60 | GLY | 0 | [] | [] | A |
| 4 | 61 | LEU | 0 | [] | [] | A |
| 5 | 66 | VAL | 0 | [] | [] | A |
| 6 | 86 | PHE | 0 | [] | [] | A |
| 7 | 88 | VAL | 0 | [] | [] | A |
| 8 | 90 | MET | 0 | [] | [] | A |
| 9 | 99 | LEU | 0 | [] | [] | A |
| 10 | 101 | VAL | 0 | [] | [] | A |
| 11 | 103 | ASH | 0 | [] | [] | A |
| 12 | 118 | ALA | 0 | [] | [] | A |
| 13 | 120 | TRP | 0 | [] | [] | A |
| 14 | 263 | EQU | -1 | [] | [] | A |
| 15 | 372 | WAT | 0 | [] | [] | A |
| 16 | 373 | WAT | 0 | [] | [] | A |
| 17 | 376 | WAT | 0 | [] | [] | A |
| 18 | 378 | WAT | 0 | [] | [] | A |
[19]:
df = pd.DataFrame(model.subtracted_region.summarize())
df
[19]:
| | Resid | Resname | Charge | Removed atoms | Fixed atoms | Segids |
|---|---|---|---|---|---|---|
| 0 | 20 | VAL | 0 | [] | [] | A |
| 1 | 40 | ASP | -1 | [] | [] | A |
| 2 | 60 | GLY | 0 | [] | [] | A |
| 3 | 61 | LEU | 0 | [] | [] | A |
| 4 | 66 | VAL | 0 | [] | [] | A |
| 5 | 86 | PHE | 0 | [] | [] | A |
| 6 | 88 | VAL | 0 | [] | [] | A |
| 7 | 90 | MET | 0 | [] | [] | A |
| 8 | 99 | LEU | 0 | [] | [] | A |
| 9 | 101 | VAL | 0 | [] | [] | A |
| 10 | 118 | ALA | 0 | [] | [] | A |
| 11 | 120 | TRP | 0 | [] | [] | A |
| 12 | 263 | EQU | -1 | [] | [] | A |
| 13 | 372 | WAT | 0 | [] | [] | A |
| 14 | 373 | WAT | 0 | [] | [] | A |
| 15 | 376 | WAT | 0 | [] | [] | A |
| 16 | 378 | WAT | 0 | [] | [] | A |
As you can see, Tyr 16 and Asp 103 (listed as ASH, its protonated form) are no longer present in subtracted_region, confirming that the region was subtracted successfully!
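Because the region is meant for QM input generation, it is also worth checking how removing residues affects the total charge. The sketch below redoes that bookkeeping in plain Python using the (Resid, Resname, Charge) values from the cutoff_3 table above. The list filter is only a stand-in that mimics the outcome of subtract(); it is not how QMzyme implements it, and total_charge and excluded are hypothetical names chosen for illustration.

```python
# (resid, resname, charge) rows copied from the cutoff_3 table above.
cutoff_3_rows = [
    (16, "TYR", 0), (20, "VAL", 0), (40, "ASP", -1), (60, "GLY", 0),
    (61, "LEU", 0), (66, "VAL", 0), (86, "PHE", 0), (88, "VAL", 0),
    (90, "MET", 0), (99, "LEU", 0), (101, "VAL", 0), (103, "ASH", 0),
    (118, "ALA", 0), (120, "TRP", 0), (263, "EQU", -1), (372, "WAT", 0),
    (373, "WAT", 0), (376, "WAT", 0), (378, "WAT", 0),
]

# Oxyanion-hole residues removed in this tutorial.
excluded = {16, 103}

# Plain-Python stand-in for the result of subtract(): drop the excluded resids.
subtracted_rows = [row for row in cutoff_3_rows if row[0] not in excluded]


def total_charge(rows):
    """Sum the per-residue charges (hypothetical helper, not a QMzyme function)."""
    return sum(charge for _, _, charge in rows)


print("residues before:", len(cutoff_3_rows))          # residues before: 19
print("residues after:", len(subtracted_rows))         # residues after: 17
print("charge before:", total_charge(cutoff_3_rows))   # charge before: -2
print("charge after:", total_charge(subtracted_rows))  # charge after: -2
```

Here the total charge stays at -2 because both removed residues (TYR and ASH) carry a charge of 0 in the summary. Subtracting a charged residue such as Asp 40 would change the total, and the charge used for the QM calculation would need to follow.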