Residue naming issue in protein preparation #291
Hi @junsuhas. For the first example:
This is because the protonation or tautomeric state of the histidine residue is unspecified, and Meeko flags it rather than making an assumption. This histidine can be HIE, HID, or HIP (not just HIP), and the decision must be made in the input structure. Meeko currently has no internal mechanism to enumerate or evaluate macromolecule protonation states. Popular choices for assigning and evaluating protonation states are reduce and PDB2PQR. Both are actively maintained, can be installed in a Python environment, and offer command-line scripts and/or Python usage, just like Meeko.
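As a rough illustration of what "the decision must be made in the input structure" means, the sketch below infers the AMBER-style residue name from which ring-nitrogen hydrogens are present. It assumes the conventional atom names HD1 (hydrogen on ND1) and HE2 (hydrogen on NE2). The helper classify_histidine is hypothetical and is not part of Meeko, reduce, or PDB2PQR.

```python
def classify_histidine(atom_names):
    """Guess the histidine state from the atom names of one residue.

    Returns 'HID', 'HIE', 'HIP', or None when neither ring nitrogen
    carries a hydrogen (i.e. the state is unspecified in the input).
    """
    names = {name.strip().upper() for name in atom_names}
    has_hd1 = "HD1" in names  # hydrogen on the delta nitrogen (ND1)
    has_he2 = "HE2" in names  # hydrogen on the epsilon nitrogen (NE2)
    if has_hd1 and has_he2:
        return "HIP"  # both nitrogens protonated, positively charged
    if has_hd1:
        return "HID"
    if has_he2:
        return "HIE"
    return None  # nothing to go on: the user has to decide


if __name__ == "__main__":
    # Heavy atoms plus backbone hydrogens only: state cannot be inferred.
    print(classify_histidine(["N", "CA", "C", "O", "H", "HA"]))
    # A ring hydrogen on NE2 identifies the epsilon tautomer.
    print(classify_histidine(["N", "CA", "ND1", "NE2", "HE2"]))
```

A residue for which this returns None is the kind of input that needs a protonation tool or a manual decision before preparation.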
For the second example:
This is because GL3 is present in the system as a linking fragment, but it doesn't have a standard protein-like or nucleic-acid-like backbone. Meeko currently processes nonstandard residues automatically only when they have a standard backbone, so that the chemistry (atom and bond types at the end points) is predictable. Please let us know what you think and if you have any further questions. Thanks!
Thanks for your response!
The residue matching algorithm depends on whether or not the input residue has any hydrogens. For histidine, if there are zero input hydrogens, it will default to HIE. If there is any hydrogen, even one bound to carbon as in the example, then the algorithm needs to identify the single template with the fewest missing hydrogens. Since there aren't any hydrogens on the sidechain (only on the backbone), HID and HIE tie for fewest missing Hs and the error is raised.
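The tie described above can be reproduced with a small sketch. This is not Meeko's implementation: the hydrogen sets are simplified AMBER-style names, the function and exception names are hypothetical, and the rule that a template must contain every input hydrogen is an assumption made for the illustration.

```python
# Simplified hydrogen inventories per histidine template (assumed names).
HIS_TEMPLATE_HYDROGENS = {
    "HID": {"H", "HA", "HB2", "HB3", "HD1", "HD2", "HE1"},
    "HIE": {"H", "HA", "HB2", "HB3", "HD2", "HE1", "HE2"},
    "HIP": {"H", "HA", "HB2", "HB3", "HD1", "HD2", "HE1", "HE2"},
}


class AmbiguousTemplateError(ValueError):
    """Raised when several templates tie for fewest missing hydrogens."""


def match_histidine(input_hydrogens):
    observed = set(input_hydrogens)
    if not observed:
        return "HIE"  # no hydrogens at all: fall back to the default
    # Count missing hydrogens for each template that can hold every
    # observed hydrogen (assumption: extra input hydrogens disqualify).
    missing = {
        name: len(hydrogens - observed)
        for name, hydrogens in HIS_TEMPLATE_HYDROGENS.items()
        if observed <= hydrogens
    }
    if not missing:
        raise ValueError(f"no template contains {sorted(observed)}")
    fewest = min(missing.values())
    best = sorted(name for name, count in missing.items() if count == fewest)
    if len(best) > 1:
        raise AmbiguousTemplateError(f"tie between {best}")
    return best[0]


if __name__ == "__main__":
    print(match_histidine([]))                   # default case
    print(match_histidine(["H", "HA", "HD1"]))   # sidechain H breaks the tie
    try:
        match_histidine(["H", "HA"])             # backbone H only
    except AmbiguousTemplateError as err:
        print("error:", err)
```

With backbone hydrogens only, HID and HIE are each missing five hydrogens and HIP is missing six, so no single template wins. Adding either HD1 or HE2 to the input resolves it.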
Hey @rwxayheee @diogomart, I know this issue is already closed, but I'm running into the same problem for some structures. I noticed that even with
Hi @frgoe003, I had thought about that too. I implemented the template generation but excluded the linking fragments for several reasons. One of them is that auto-removing a linking fragment affects the embedding state of the residue it is attached to. Doing this can leave the system with a possibly different residue template, for example if the fragment is linked to a Cys or Lys.
There's currently no automatic way to delete linking fragments, and the intention was to urge users to pause and manually correct the structures. There are workarounds for automated pipelines, though, since we allow the deletion of linking fragments with --delete_residues. Here's what I might do:
1- extract the residue ID of the linking fragment from the logging
The ideal situation would be to allow the incorporation of linking fragments as well as the graceful deletion of residues with minimal disruption to the structure. I've wanted to implement this, but it may require a different approach for end capping and template matching. Currently, we rely on pre-registration of the protonation states and embedding forms of all residues, but that isn't feasible for large databases. Generating the template and checking the linking patterns becomes something of a chicken-and-egg problem when a new pattern is encountered. Given this, a pause for manual inspection is always expected.
Hello!
Thank you for updating Meeko.
I find it very useful.
I have two questions about errors that occur in the updated protein preparation when the information in the PDB file is slightly incorrect.
For protein preparation, I used mk_prepare_receptor.py -p -a. The two problems are very similar.
First:
An error occurs when a residue that should be named HIP is listed as HIS by mistake.
Is it possible to be more flexible about this?
example1
Second:
If a residue name outside the recognized set appears, an error is raised.
For HIC, preparation proceeds with a newly built template, but for GL3 it fails with an unknown-residue error.
I would like GL3 to be ignored or atomized.
example2
Thank you for taking the time to look at my question.
I would also appreciate knowing if you plan to update that part or just ignore it and leave it as is.