SelectionSchemes
Module containing functions to define a QMzymeRegion based on some logic/workflow.
- class QMzyme.SelectionSchemes.SelectionScheme(model, name)
Bases: ABC
SelectionScheme is the abstract base class used to prescribe concrete selection scheme subclasses. Fork QMzyme to build your own concrete selection scheme class and submit a pull request (PR) on GitHub to have your scheme added to QMzyme! You will need to create comprehensive tests before the PR is accepted. See the documentation on contributing for more information.
Below is a template that you can use to implement your own selection scheme class:
- class InformativeName(SelectionScheme):
This is an example docstring.
Include a detailed description of how the scheme works in the class-level docstring. Also document any parameters that the __init__() method of your class accepts.
- Parameters:
- model:
(QMzymeModel) QMzymeModel to provide the starting structure that the selection will be performed on. When using the main GenerateModel class, the QMzyme model is automatically passed as an argument to the selection scheme. It is recommended you use the Universe (universe attribute) representing the starting structure to perform the selection on.
- name:
(str, required) Name of the region generated.
- Returns:
The return should always be a QMzyme region.
Note
Include any notes you want users to be aware of.
- __init__(model, name)
Assign any keyword arguments as attributes to self. Then in your select_atoms() method you can pull any necessary arguments from the attributes of self, instead of relying on passing them.
Every concrete scheme __init__() method should include this line at the very end:
super().__init__(model, name)
This will automatically run your select_atoms() method and return the resulting region.
- abstractmethod select_atoms()
Write your code to perform the selection.
At the end of your code you should set self.region = {region}.
The product of your selection scheme needs to be a QMzymeRegion in order for it to work with GenerateModel().set_region().
This method is automatically called in the super().__init__(model, name) line of your __init__() method.
- method_name()
You can add whatever other methods you want in your class, but you should call those methods as necessary in __init__(); otherwise your scheme will not be automated in GenerateModel.set_region().
- return_region()
This method belongs to the base class and is automatically called in the super().__init__(model, name) line of your __init__() method. All you have to do is make sure you have created an attribute called region.
- abstractmethod reference()
This method needs to be included in your class. All it should do is create an attribute called reference that provides a citable reference of the scheme, to give credit where credit is due. The reference will be automatically printed when the class is instantiated. This is taken care of in the super().__init__(model, name) line of your __init__() method.
Example:
self.reference = "1. Alegre‐Requena, J. V., Sowndarya S. V., S., Pérez‐Soto, R., Alturaifi, T. M. & Paton, R. S. AQME: Automated quantum mechanical environments for researchers and educators. WIREs Comput Mol Sci 13, e1663 (2023)."
In some cases there might not be a direct reference (see the DistanceCutoff class), but there might be relevant work a user would be interested in. Please refer to such work only in the class docstring, not in the reference method.
If there are no references, please only include the line:
self.reference = None
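The template above follows a template-method pattern: the base class constructor drives the workflow, and a subclass only fills in select_atoms() and reference(). The sketch below illustrates that control flow with the standard library alone. SketchScheme, EveryOtherAtom, the step argument, and the list used as a "model" are hypothetical stand-ins rather than QMzyme classes, and the real base class may do more work or order its calls differently.

```python
from abc import ABC, abstractmethod


class SketchScheme(ABC):
    """Hypothetical stand-in for an abstract selection-scheme base class."""

    def __init__(self, model, name):
        self.model = model
        self.name = name
        self.select_atoms()   # subclass sets self.region here
        self.reference()      # subclass sets self.reference here
        if self.reference is not None:
            print(f"Please cite: {self.reference}")
        self.return_region()

    @abstractmethod
    def select_atoms(self):
        """Perform the selection and set self.region."""

    @abstractmethod
    def reference(self):
        """Set self.reference to a citation string or to None."""

    def return_region(self):
        # Assumption: the real base class hands back a QMzymeRegion;
        # in this sketch the region is just a list of atom labels.
        return self.region


class EveryOtherAtom(SketchScheme):
    """Toy scheme: keep every `step`-th atom of the model."""

    def __init__(self, model, name, step=2):
        self.step = step                # store keyword arguments first
        super().__init__(model, name)   # must be the last line

    def select_atoms(self):
        self.region = self.model[::self.step]

    def reference(self):
        self.reference = None           # no citable reference


scheme = EveryOtherAtom(["N", "CA", "C", "O", "CB"], name="toy_region")
print(scheme.name, scheme.region)
```

Because super().__init__() runs select_atoms() immediately, any attribute the selection needs (here step) has to be assigned before that call, which is why the template asks for it on the last line.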
- class QMzyme.SelectionSchemes.DistanceCutoff(model, name, cutoff, include_whole_residues=True)
Bases: SelectionScheme
The DistanceCutoff class performs a selection simply based on the distance of atoms from a pre-defined catalytic_center region. Users must first call GenerateModel().set_catalytic_center(kwargs) in order to then run DistanceCutoff via GenerateModel().set_region(selection=DistanceCutoff, name={str}, cutoff={int}).
This scheme is known to require rather large QM regions to achieve agreement with experiment (e.g., Kulik HJ, Zhang J, Klinman JP, Martínez TJ. How Large Should the QM Region Be in QM/MM Calculations? The Case of Catechol O-Methyltransferase. J Phys Chem B. 2016 Nov 10;120(44):11381-11394. doi: 10.1021/acs.jpcb.6b07814.).
- Parameters:
model (QMzymeModel, required.) -- QMzymeModel to provide starting structure that selection will be performed on.
name (str, required.) -- Name of the region generated.
cutoff (float, required.) -- Numerical value to define cutoff.
include_whole_residues (bool, default=True.) -- Informs code whether or not to only include atoms within the cutoff, or include whole residues if they have at least one atom within the cutoff.
- Returns:
Note
Users are encouraged to evaluate the resulting region. There may be situations where a charged residue is within the cutoff distance but its charge partner is not. Such situations can drastically alter the chemistry of the model! Someone could write a less generic distance-based selection scheme that takes such situations into consideration, or modify the current class to include an argument include_charge_partners=True.
- __init__(model, name, cutoff, include_whole_residues=True)
Assign any keyword arguments as attributes to self. Then in your select_atoms() method you can pull any necessary arguments from the attributes of self, instead of relying on passing them.
Every concrete scheme __init__() method should include this line at the very end:
super().__init__(model, name)
This will automatically run your select_atoms() method and return the resulting region.
- select_atoms()
Executes the distance cutoff selection to identify neighbor atoms and residues surrounding the catalytic center.
This method queries the model's universe to find all atoms within the configured distance threshold of the 'catalytic_center' region.
- Returns:
- Return type:
- Raises:
UserWarning -- If the model has no defined 'catalytic_center' region.
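To make the effect of include_whole_residues concrete, here is a minimal sketch of a distance-cutoff selection on plain tuples. It does not use the QMzyme or MDAnalysis APIs: the distance_cutoff function, the atom tuples, and the choice of an inclusive comparison (<=) are assumptions made for illustration only.

```python
import math

# Each atom is (residue_id, atom_name, (x, y, z)); made-up coordinates.
atoms = [
    (1, "N",  (0.0, 0.0, 0.0)),
    (1, "CA", (1.5, 0.0, 0.0)),
    (2, "N",  (3.0, 0.0, 0.0)),
    (2, "CA", (6.0, 0.0, 0.0)),
    (3, "N",  (9.0, 0.0, 0.0)),
]
center = [(0.0, 0.0, 0.0)]  # coordinates of the catalytic-center atoms


def distance_cutoff(atoms, center, cutoff, include_whole_residues=True):
    """Return atoms within `cutoff` of any catalytic-center atom."""
    near = [
        atom for atom in atoms
        if any(math.dist(atom[2], c) <= cutoff for c in center)
    ]
    if not include_whole_residues:
        return near
    # Expand to every atom of any residue with at least one atom in range.
    hit_residues = {atom[0] for atom in near}
    return [atom for atom in atoms if atom[0] in hit_residues]


strict = distance_cutoff(atoms, center, 4.0, include_whole_residues=False)
whole = distance_cutoff(atoms, center, 4.0, include_whole_residues=True)
print(len(strict), len(whole))  # prints "3 4"
```

With a cutoff of 4.0, residue 2 has one atom inside and one outside the sphere, so the strict selection cuts the residue in half while the whole-residue selection keeps both atoms.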
- reference()
Writes out the reference for the selection scheme. This method is automatically called in the super().__init__(model, name) line of your __init__() method.
- class QMzyme.SelectionSchemes.ChargeShiftAnalysis(model, name, method: QM_Method, min_atoms=900, max_atoms=1000, include_whole_residues=True, alanine_mutation=[], memory=None, nprocs=None, holo_output_files=None, apo_output_files=None, pop='hirshfeld', charge_threshold=<class 'float'>, charge_output_csv: bool = False)
Bases: SelectionScheme
The ChargeShiftAnalysis Selection Scheme is based on the CSA approach developed by Prof. Heather Kulik (see (1) Kulik, Heather J.; Zhang, Jianyu; Klinman, Judith P.; Martinez, Todd J. (2016) How Large Should the QM Region Be in QM/MM Calculations? The Case of Catechol O-Methyltransferase. Journal of Physical Chemistry B, 120(44). and (2) Karelina, M., & Kulik, H. J. (2017). Systematic quantum mechanical region determination in QM/MM simulation. Journal of Chemical Theory and Computation, 13(2)). In this scheme, two initial large single-point QM calculations are performed with atomic population analysis, with and without the substrate present (Holo and Apo, respectively). The rank, or importance, of including a residue in the QM region is determined by how much the residue-summed partial charges shift in the presence of the substrate. Residues above a threshold charge shift value will be included in the returned QMzymeRegion.
This class requires two steps, and therefore two separate calls. The first call will create the Apo and Holo QM calculation input files. The user will then need to run those calculations. After the calculations are complete, the user can reload their saved serialized QMzyme.GenerateModel object (".pkl" format) and then call the ChargeShiftAnalysis class, this time uploading the Apo and Holo QM output files and specifying a charge shift cutoff to return the QMzymeRegion.
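The ranking step described above can be sketched in a few lines of plain Python. This is not the QMzyme implementation: the function names, the use of the absolute difference as the shift, and the partial charges below are all assumptions or made-up numbers chosen only to illustrate the idea of summing atomic charges per residue and comparing Holo against Apo.

```python
from collections import defaultdict


def residue_charges(atom_charges):
    """Sum per-atom partial charges into per-residue totals."""
    totals = defaultdict(float)
    for residue, charge in atom_charges:
        totals[residue] += charge
    return dict(totals)


def charge_shift_selection(holo, apo, threshold):
    """Rank residues by |q_holo - q_apo| and keep those above threshold."""
    holo_q = residue_charges(holo)
    apo_q = residue_charges(apo)
    # Only residues present in both calculations can be compared;
    # the substrate itself is absent from the apo system.
    shifts = {
        res: abs(holo_q[res] - apo_q[res])
        for res in apo_q if res in holo_q
    }
    ranked = sorted(shifts, key=shifts.get, reverse=True)
    return [res for res in ranked if shifts[res] > threshold], shifts


# Illustrative (residue, partial charge) pairs, not real QM output.
holo = [("GLU199", -0.60), ("GLU199", -0.30), ("LYS144", 0.45),
        ("LYS144", 0.40), ("MET40", 0.02), ("MET40", 0.01),
        ("SUB", -0.10)]
apo = [("GLU199", -0.50), ("GLU199", -0.30), ("LYS144", 0.44),
       ("LYS144", 0.40), ("MET40", 0.00), ("MET40", 0.01)]

selected, shifts = charge_shift_selection(holo, apo, threshold=0.05)
print(selected)
```

In this toy data only GLU199 shifts by more than 0.05 e when the substrate is present, so it is the only residue selected.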
- Parameters:
model (QMzymeModel, required.) -- QMzymeModel to provide starting structure that selection will be performed on.
name (str, required.) -- Name of the region generated.
method (QM_Method, required.) -- Method used to write QM input file.
min_atoms (int, default=900.) -- Minimum number of atoms for first QM input file.
max_atoms (int, default=1000.) -- Maximum number of atoms for first QM input file.
include_whole_residues (bool, default=True.) -- Informs code whether or not to only include atoms within the cutoff, or include whole residues if they have at least one atom within the cutoff.
alanine_mutation (str.) -- Selects specific residues to mutate into alanine. The input should be the format for MDAnalysis residue selection.
memory (str, default=None.) -- Memory for QM input file.
nprocs (int, default=None.) -- Number of processors for QM input file.
CSA_elbow (bool, default=False.) -- Chooses elbow method for CSA selection.
holo_output_files (str, default=None.) -- Location of holo output file.
apo_output_files (str, default=None.) -- Location of apo output file.
pop (str, default="hirshfeld".) -- Method for population analysis.
charge_threshold (float, required.) -- Partial charge threshold for CSA selection.
charge_output_csv (bool, default=False.) -- Whether to write the charge results to a CSV file.
- Returns:
- Usage:
Users must first set a catalytic center and define a QM_Method.
Then they can call the ChargeShiftAnalysis Selection Scheme.
This will create the Holo and Apo input files. The user will then run these QM calculations. Once the calculations have completed, the user can start back where they left off by loading in the saved .pkl file containing the QMzyme.GenerateModel object they instantiated in the previous step.
Note
Users are encouraged to evaluate the resulting region. There may be situations where a charged residue is included because it falls within the distance and atom number range, but its charge partner is excluded. Such imbalances can drastically alter the chemistry of your model! Additionally, please note that the minimum and maximum atom limits set within the class apply to the number of atoms before truncation; therefore, the final atom count might not fall within that initial range. It is highly recommended to verify the structure of your initial input file to ensure it is of high quality.
In addition, it is crucial to set the method as QM_Method. This setting is strictly required for performing the initial population analysis and defining the QMzyme region using charge shift analysis. Any other method selection will return an error message.
For the second part of the ChargeShiftAnalysis class, you must have the .pkl file generated from the first run. Without this file, the ChargeShiftAnalysis class cannot perform the charge shift analysis.
- __init__(model, name, method: QM_Method, min_atoms=900, max_atoms=1000, include_whole_residues=True, alanine_mutation=[], memory=None, nprocs=None, holo_output_files=None, apo_output_files=None, pop='hirshfeld', charge_threshold=<class 'float'>, charge_output_csv: bool = False)
Assign any keyword arguments as attributes to self. Then in your select_atoms() method you can pull any necessary arguments from the attributes of self, instead of relying on passing them.
Every concrete scheme __init__() method should include this line at the very end:
super().__init__(model, name)
This will automatically run your select_atoms() method and return the resulting region.
- select_cat_residues()
Selects catalytic center for apo form generation.
- Parameters:
self -- QMzymeRegion, required.
- create_apo_holo_regions()
Using attributes from ChargeShiftAnalysis class, creates holo and apo region and QM input file for these regions.
- Parameters:
self -- QMzymeModel, required.
- select_atoms()
Write your code to perform the selection.
At the end of your code you should set self.region = {region}.
The product of your selection scheme needs to be a QMzymeRegion in order for it to work with GenerateModel().set_region().
This method is automatically called in the super().__init__(model, name) line of your __init__() method.
- reference()
Writes out the reference for the selection scheme. This method is automatically called in the super().__init__(model, name) line of your __init__() method.