Commit

v0.6.1
diogomart committed Nov 18, 2024
2 parents 7880da1 + aa8d4df commit 07a38f8
Showing 33 changed files with 303 additions and 207 deletions.
10 changes: 9 additions & 1 deletion MANIFEST.in
@@ -1,7 +1,15 @@
include README.md
include LICENSE
include test/*
include test/*
include test/flexibility_data/*
include test/macrocycle_data/*
include test/polymer_data/*
include test/rdkitmol_from_docking_data/*
include test/small_cycle_data/*
include example/*
include example/tutorial1/*
include meeko/data/*
include meeko/data/params/*
include meeko/tmp/*
include meeko/utils/*
include meeko/utils/*
8 changes: 4 additions & 4 deletions README.md
@@ -1,8 +1,8 @@
# Meeko: interface for AutoDock

[![API stability](https://img.shields.io/badge/stable%20API-no-orange)](https://shields.io/)
[![PyPI version fury.io](https://img.shields.io/badge/version-0.6.0-green.svg)](https://pypi.python.org/pypi/meeko/)
[![Documentation Status](https://readthedocs.org/projects/meeko/badge/?version=readthedocs)](https://meeko.readthedocs.io/en/readthedocs/?badge=readthedocs)
[![PyPI version fury.io](https://img.shields.io/badge/version-0.6.1-green.svg)](https://pypi.python.org/pypi/meeko/)
[![Documentation Status](https://readthedocs.org/projects/meeko/badge/?version=release)](https://meeko.readthedocs.io/en/release/?badge=release)

Meeko prepares the input for AutoDock and processes its output.
It is developed alongside AutoDock-GPU and AutoDock-Vina.
@@ -16,7 +16,7 @@ at [Scripps Research](https://www.scripps.edu/).

## Documentation

The docs are hosted on [meeko.readthedocs.io](meeko.readthedocs.io)
The docs are hosted on [meeko.readthedocs.io](https://meeko.readthedocs.io/en/release)


## Reporting bugs
@@ -42,7 +42,7 @@ pip install meeko

Meeko exposes a Python API to enable scripting. Here we share very minimal examples
using the command line scripts just to give context.
Please visit the [meeko.readthedocs.io](meeko.readthedocs.io) for more information.
Please visit the [meeko.readthedocs.io](https://meeko.readthedocs.io/en/release) for more information.

Parameterizing a ligand and writing a PDBQT file:
```bash
1 change: 1 addition & 0 deletions docs/requirements.txt
@@ -1 +1,2 @@
sphinx-book-theme
sphinx-design
4 changes: 2 additions & 2 deletions docs/source/cli_rec_prep.rst
@@ -23,7 +23,7 @@ Write flags
~~~~~~~~~~~

The option flags starting with ``--write`` in ``mk_prepare_receptor`` can
be used both with an argument to specify the outpuf filename:
be used both with an argument to specify the output filename:

.. code-block:: bash
@@ -72,7 +72,7 @@ The arguments involving assignment of residues to properties:
-s, --reactive_name_specific <residue:atom>
use the residue selection lanaguge described above, followed by an equal sign (``=``) as the delimiter and the assigned value, which could be the name of a residue template, the atom index for the blunt end, the wanted altloc ID, or the atom name of the reactive atom. Each residue selection is comibned with the most recent assignment that precedes it, resulting in a further expanded list of residue-assignment pairs.
use the residue selection language described above, followed by an equal sign (``=``) as the delimiter and the assigned value, which could be the name of a residue template, the atom index for the blunt end, the wanted altloc ID, or the atom name of the reactive atom. Each residue selection is combined with the most recent assignment that precedes it, resulting in a further expanded list of residue-assignment pairs.

For an input like ``"A:5,7=CYX,A:19A,B:17=HID"``, this assignment language represents: ``residues (number) 5 and 7 in Chain A are set to (template name) CYX`` and ``residue (number) 19 A in Chain A, and residue (number) 17 in Chain B are set to (template name) HID``.
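
The expansion described above can be sketched in a few lines of plain Python. This is only an illustration and not Meeko's own parser: the function name ``expand_assignments`` is hypothetical, and the sketch assumes, following the example, that a bare residue number inherits the chain of the selection before it and that every selection takes the value after the next ``=``.

```python
def expand_assignments(spec: str) -> dict[str, str]:
    """Expand 'A:5,7=CYX,A:19A,B:17=HID' into residue -> value pairs.

    Hypothetical helper for illustration; not part of the Meeko API.
    """
    assignments = {}
    pending = []   # selections still waiting for an '=value'
    chain = None   # chain ID inherited by bare residue numbers
    for token in spec.split(","):
        selection, sep, value = token.strip().partition("=")
        if ":" in selection:
            chain, resid = selection.split(":", 1)
        else:
            resid = selection
        if chain is None:
            raise ValueError(f"no chain given for residue {resid!r}")
        pending.append(f"{chain}:{resid}")
        if sep:
            # An '=' closes the group: assign the value to all pending selections.
            for residue in pending:
                assignments[residue] = value
            pending = []
    if pending:
        raise ValueError(f"no value assigned to {pending}")
    return assignments


print(expand_assignments("A:5,7=CYX,A:19A,B:17=HID"))
```

Each key in the returned dictionary is one residue, so the four residues of the example map to two distinct template names.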

4 changes: 2 additions & 2 deletions docs/source/colab_examples.rst
@@ -69,14 +69,14 @@ The basic docking example is developed to showcase the usage of **import additio

`Run on Colab! <https://colab.research.google.com/drive/1tzQoguVQDCguOaLSsGvQuL57ry_PY3UG?usp=sharing>`_

The reactive docking example is based on reactive docking method that has been developed for high-throughput virtual screenings of reactive species. In this example, a small molecule substrate (Adenosine monophosphate, PDB token AMP) is targeting at the catalytic histidine residue of a hollow protein structure of bacteria RNA 3' cyclase (PDB token 3KGD) to generate the near-attack conformation for the formation of the phosphoamide bond. A docked pose that closely resembles the original position of the ligand is expected among the top-ranked poses.
The reactive docking example is based on the reactive docking method developed for high-throughput virtual screening of reactive species. In this example, a small molecule substrate (Adenosine monophosphate, PDB token AMP) targets the catalytic histidine residue of a hollow protein structure of bacterial RNA 3' cyclase (PDB token 3KGD) to generate the near-attack conformation for the formation of the phosphoamide bond. A docked pose that closely resembles the original position of the ligand is expected among the top-ranked poses.


[AutoDock-GPU] Tethered Docking
-------------------------------

`Run on Colab! <https://colab.research.google.com/drive/1tf9xOgn6u8eDTeFJtc8GCEGRX-8aR9Bo?usp=sharing>`_

The covalent docking example is based on the two-point attractor and flexible side chain method. In this example, a small molecule substrate (Adenosine monophosphate, PDB token AMP) is attached onto the catalytic histidine residue of a hollow protein structure of bacteria RNA 3' cyclase (PDB token 3KGD) to reproduce the covalent intermediate complex structure. A docked pose that closely resembles the original position of the ligand is expected among the top-ranked poses.
The covalent docking example is based on the two-point attractor and flexible side chain method. In this example, a small molecule substrate (Adenosine monophosphate, PDB token AMP) is attached to the catalytic histidine residue of a hollow protein structure of bacterial RNA 3' cyclase (PDB token 3KGD) to reproduce the covalent intermediate complex structure. A docked pose that closely resembles the original position of the ligand is expected among the top-ranked poses.


1 change: 1 addition & 0 deletions docs/source/conf.py
@@ -22,6 +22,7 @@
'sphinx.ext.autodoc',
'sphinx.ext.napoleon',
'sphinx.ext.intersphinx',
'sphinx_design',
]

html_logo = "images/raccoon.png"
28 changes: 23 additions & 5 deletions docs/source/index.rst
@@ -33,17 +33,35 @@ convert docking output files ``mk_export.py``.
Running a docking
-----------------

To run a docking, more packages are required besides Meeko:
To run a docking, more packages are required besides Meeko. At a minimum,
either Vina or AutoDock-GPU is needed to run the actual docking.

.. grid:: 3

    .. grid-item-card:: AutoDock-GPU
        :link: https://github.com/ccsb-scripps/AutoDock-GPU

        Docking for GPUs. Implements the AutoDock4.2 scoring function.
        Command line executable only.

    .. grid-item-card:: AutoDock-Vina
        :link: https://autodock-vina.readthedocs.io/

        Docking on CPUs.
        Implements Vina and AutoDock4.2 scoring functions.
        Has a Python API and command line executable.

    .. grid-item-card:: Ringtail
        :link: https://github.com/forlilab/ringtail

        Store and analyze virtual screening with SQLite.
        Has a Python API and command line scripts.

* AutoDock-Vina
* AutoDock-GPU
* Ringtail

Check the tutorials page to learn about using meeko with these other packages
to run molecular docking and virtual screening.



.. toctree::
   :maxdepth: 3
   :hidden:
14 changes: 8 additions & 6 deletions docs/source/lig_overview.rst
@@ -14,28 +14,30 @@ Use of RDKit
Meeko takes as input an RDKit molecule that has 3D positions and
all hydrogens as real atoms, and creates an object called MoleculeSetup
that stores all the parameters, such as atom types, partial charges, or
rotatable bonds. Adding hydrogens and 3D positions is not performed by Meeko.
rotatable bonds. RDKit checks the valence of atoms, making it difficult
to have molecules with incorrect bond orders or formal charges.
Adding hydrogens and 3D positions is not performed by Meeko.

Parameterization using SMARTS patterns
--------------------------------------

Many of the parameters are assigned using SMARTS strings, which are a compact
and versatile query language for identigying chemical substructures. This makes
and versatile query language for identifying chemical substructures. This makes
it easier to define custom atom types and other parameters. The SMARTS
that define the AutoDock parameters are included in Meeko and set as the
default, This goal of this feature is to facilitate the implementation of
default. The goal of this feature is to facilitate the implementation of
custom docking workflows.

The MoleculePreparation class stores all the configuration required to
parameterize a ligand, and contains the methods that take an RDKit molecule
as input, and return a parameterized MoleculeSetup. The MoleculePreparation
offers several option to control how molecules are parameterized, and many of
which are exposed as command line options in ``mk_prepare_ligand.py``.
offers several options to control how molecules are parameterized, and most
are exposed as command line options in ``mk_prepare_ligand.py``.

Preparing the input for docking
-------------------------------

AutoDock-GPU and Vina use the PDBQT format for input molecules.
PDBQT files, or strings in Python, are produced from a parameterized instance
of MoleculeSetup. The methods for that live in the PDBQTWriterLegacy class.
of MoleculeSetup. The methods to write PDBQT are in the PDBQTWriterLegacy class.
PDBQT strings can be passed directly to Vina using its Python API.
2 changes: 1 addition & 1 deletion docs/source/lig_prep_advanced.rst
@@ -150,7 +150,7 @@ rotatable, which often leads to unreasonable geometries, but is necessary to
visit both amide rotamers during docking.

Here, we configure Meeko to make single bonds in some conjugated systems rigid,
as defined byt the SMARTS ``"C=CC=C"``, and rigidify all amide bonds matched
as defined by the SMARTS ``"C=CC=C"``, and rigidify all amide bonds matched
by ``"[CX3](=O)[NX3]"``, which includes tertiary amides but not thioamides or
amidines:

4 changes: 2 additions & 2 deletions docs/source/lig_prep_basic.rst
@@ -26,7 +26,7 @@ Writing a single PDBQT file:
If the ``-o`` option is omitted, the output filename will be the same as the
input but with ``.pdbqt`` extension. Option ``-`` prints the PDBQT
string to standard output instead of writting to a file.
string to standard output instead of writing to a file.
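
The filename rule above can be sketched with the standard library. This is a minimal illustration under stated assumptions, not the code in ``mk_prepare_ligand.py``: the helper ``default_output_name`` is hypothetical, and the ``-`` option is modelled here as the value ``"-"`` of its ``output`` argument.

```python
from pathlib import Path
from typing import Optional


def default_output_name(input_path: str, output: Optional[str] = None) -> Optional[Path]:
    """Return the path to write to, or None when the PDBQT goes to stdout.

    Hypothetical helper for illustration; not part of the Meeko API.
    """
    if output == "-":
        return None                # print the PDBQT string to standard output
    if output is not None:
        return Path(output)        # explicit output filename
    # No output given: reuse the input name with a .pdbqt extension.
    return Path(input_path).with_suffix(".pdbqt")


print(default_output_name("molecule.sdf"))
```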

Writing multiple PDBQT files from an SD file with multiple molecules:

@@ -67,6 +67,6 @@ AutoDock-Vina, or passed directly to Vina within Python using Vina's Python API,
and avoiding writing PDBQT files to the filesystem.

Note that calling ``mk_prep`` returns a list of molecule setups.
As of v0.6.0, this list constains only one element unless ``mk_prep`` is
As of v0.6.0, this list contains only one element unless ``mk_prep`` is
configured for reactive docking, which is not the case in this example. This is
why we are considering the first (and only) molecule setup in the list.
20 changes: 10 additions & 10 deletions docs/source/py_build_temp.rst
@@ -5,9 +5,9 @@ Advanced information about templates

The interpretation of the valence (bonds) and formal charge of atoms is an essential step when parsing a PDB/CIF file, and the accuracy of residue mapping is crucial to the creation of a macrobiomolecule system. In Meeko, the input residue names are used as keys and the chemical templates are retrieved accordingly based on :ref:`templates <templates>`.

For the command line script for receptor preparatoin, ``mk_prepare_receptor.py``, there are three major ways of obtaining such templates:
For the command line script for receptor preparation, ``mk_prepare_receptor.py``, there are three major ways of obtaining such templates:

**(1) Loading from the default JSON file:** ``Meeko/meeko/data/residue_chem_templates.json``
**(1) Load from the default JSON file:** ``Meeko/meeko/data/residue_chem_templates.json``

This is the default residue template set curated by us, including:

@@ -30,22 +30,22 @@ This is the default residue template set curated by us, including:
├── parmBSC1.lib
└── all_modrna08.lib
(c) residues or ligands in CCD (Chemical Component Dictionary) that have conflicting names with the above residues.
(c) the ligands in the CCD (Chemical Component Dictionary) that have conflicting names with the above residues.

**(2) Loading by ``--add_template`` from an additional JSON file:** (example) ``Meeko/meeko/data/NAKB_templates.json``
**(2) Load from an additional JSON file** by argument ``--add_templates``, e.g., ``--add_templates Meeko/meeko/data/NAKB_templates.json``

This is an optional add-on template set generated by us, based on the curated set of modified nucleotides by Nucleic Acid Knowledgebase (NAKB).

**(3) Fetching from PDB by CCD name on the run**
**(3) Fetch from PDB by CCD name during runtime**

When an unknown residue is encountered, ``mk_prepare_receptor.py`` attempts to resolve its chemical identity by fetching a definition CIF file from PDB (Protein Data Bank) and generates chemical templates of all possible embedding forms of it when there are inter-residue bonds. Currently, this is an automated yet relatively simple process that only supports noncovalent ligands and residues with unmodified backbones.

Here, we present a quick guide of building potentially complicated residue templates on your own using the ``meeko.chemtempgen`` submodule. In this example, we will be working with residue ``CRO``, a naturally occuring fluorophore in green fluorescent proteins formed by condensation of three consecutive residues Ser-Tyr-Gly.
Here, we present a quick guide to building potentially complicated residue templates using the ``meeko.chemtempgen`` submodule. In this example, we will prepare a residue with a modified backbone: ``CRO``, a naturally occurring fluorophore in green fluorescent proteins, formed by condensation of three consecutive residues Ser-Tyr-Gly.

Example usage
-------------

Before we start, we will import the required modules and optionally, suppress excess rdkit loggings that may occur during the editing of molecular structures. Then we will create a ``ChemicalComponent`` from a definition CIF file, which will be obtained by ``fetch_from_pdb`` (Internet connection is required).
Before we start, we will import the required modules and, optionally, suppress excess RDKit log messages that may occur during the editing of molecular structures. Then, we will create a ``ChemicalComponent`` from a definition CIF file, which will be obtained by ``fetch_from_pdb`` (an internet connection is required).

.. code-block:: python
@@ -92,7 +92,7 @@ The created ``ChemicalComponent`` object, ``CRO_from_cif``, has a corresponding
:width: 60%
:align: center

As we may see from the picture above, in order to forge ``CRO`` into a linking embedded fragment in a protein, some atoms need to be removed. In this example, we will simply do so by specifying the atom names. ``make_embedded`` calls function ``embed`` on the duplicated object ``cc``, which takes ``embed_allowed_smarts`` as the editable zone and removes atoms matching the names in ``leaving_names``. Here, the ``embed_allowed_smarts`` is chosen to be the SMARTS of altered backbone in residue ``CRO``. Note that by default, ``embed`` removes associated hydrogens for convenience. Therefore, in this case, ``leaving_names = {"H2", "OXT"}`` removes atoms ``H2``, ``OXT`` as well as the bonded hydrogen, ``HXT``. The same task could be alternatively done by the equivalent SMARTS pattern.
As we may see from the picture above, in order to forge ``CRO`` into a linking embedded fragment in a protein, some atoms need to be removed. In this example, we will simply do so by specifying the atom names. ``make_embedded`` calls function ``embed`` on the duplicated object ``cc``, which takes ``embed_allowed_smarts`` as the editable zone and removes atoms matching the names in ``leaving_names``. Here, the ``embed_allowed_smarts`` is chosen to be the SMARTS of altered backbone in residue ``CRO``. Note that by default, ``embed`` removes associated hydrogens for convenience. Therefore, in this case, ``leaving_names = {"H2", "OXT"}`` removes atoms ``H2``, ``OXT`` as well as the bonded hydrogen, ``HXT``. The equivalent SMARTS pattern could alternatively do the same task.

.. code-block:: python
@@ -108,7 +108,7 @@ As we may see from the picture above, in order to forge ``CRO`` into a linking e
:width: 60%
:align: center

Looking at the structure of the edited picture, we will see that the unneccessary atoms have gone and the hydrogens at the broken (blunt) ends become implict, which is exactly needed to generate the Smiles string for the chemical template. Function ``make_pretty_smiles`` makes the Smiles string with all Hs explicit for the template's RDKit molecule. Last but not least, we will determin the ``link_labels`` which specifies how ``CRO`` should be connected to other residues. Here, we will use the pattern from a built-in recipe, ``AA_recipe.pattern_to_label_mapping_standard``, which also applies to all other standard amino acid residues: ``{'[NX3h1]': 'N-term', '[CX3h1]': 'C-term'}``. Opionally, we can run a ``ResidueTemplate_check`` to see potential problems with the generated template.
Looking at the structure of the edited picture, we will see that the unnecessary atoms are gone and the hydrogens at the broken (blunt) ends have become implicit, which is exactly what is needed to generate the SMILES string for the chemical template. Function ``make_pretty_smiles`` makes the SMILES string with all Hs explicit for the template's RDKit molecule. Last but not least, we will determine the ``link_labels``, which specify how ``CRO`` should be connected to other residues. Here, we will use the pattern from a built-in recipe, ``AA_recipe.pattern_to_label_mapping_standard``, which also applies to all other standard amino acid residues: ``{'[NX3h1]': 'N-term', '[CX3h1]': 'C-term'}``. Optionally, we can run a ``ResidueTemplate_check`` to see potential problems with the generated template.

.. code-block:: python
@@ -165,7 +165,7 @@ To make the N-terminal embedding variant of ``CRO``:
cc_N.resname += "_N"
export_chem_templates_to_json([cc_N])
In the chained procedure above, we have removed ``OXT`` and protonated ``N1``, which is done by ``make_capped`` that adds hydrogen(s) to matching atom(s) with specified ``capping_names`` within the region of ``allowed_smarts``. The expected outout from ``export_chem_templates_to_json`` is:
In the chained procedure above, we have removed ``OXT`` and protonated ``N1``, which is done by ``make_capped`` that adds hydrogen(s) to matching atom(s) with specified ``capping_names`` within the region of ``allowed_smarts``. The expected output from ``export_chem_templates_to_json`` is:

.. code-block:: bash
4 changes: 2 additions & 2 deletions docs/source/rec_overview.rst
@@ -5,9 +5,9 @@ Polymers and monomers
---------------------

Receptors are represented as a collection of subunits where each subunit is
treated as an individual molecule. THe same code path for ligand parameterization
is also used herein. The Python class for receptors is called
treated as an individual molecule. The Python class for receptors is called
``Polymer`` and the class for subunits is ``Monomer``.
The code for ligand parameterization is also used for the receptor.

During most of the v0.6.0 development, the name of the ``Polymer`` class was
``LinkedRDKitChorizo``. Many of the commit messages, as well as GitHub issues and pull