Merge pull request #7 from ebi-jdispatcher/dev
Dev
biomadeira authored Sep 30, 2024
2 parents 2066efb + ad40b01 commit ca3b15c
Showing 13 changed files with 285 additions and 143 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -134,4 +134,4 @@ dmypy.json
.idea
.history

-testdata/tree*
+# testdata/tree*
169 changes: 169 additions & 0 deletions CONTRIBUTING.rst
@@ -0,0 +1,169 @@
Contributing to Taxonomy Resolver
=================================

Thank you for your interest in contributing to **Taxonomy Resolver**! We welcome all contributions to make this project better and more useful for the community. This document will guide you through the steps required to contribute to this project.

Table of Contents
-----------------

- `Getting Started <#getting-started>`_
- `How to Contribute <#how-to-contribute>`_

  - `Reporting Bugs <#reporting-bugs>`_
  - `Suggesting Features <#suggesting-features>`_
  - `Contributing Code <#contributing-code>`_

    - `Fork the Repository <#fork-the-repository>`_
    - `Create a Branch <#create-a-branch>`_
    - `Write Your Code <#write-your-code>`_
    - `Run Tests <#run-tests>`_
    - `Submit a Pull Request <#submit-a-pull-request>`_

- `Code Style <#code-style>`_
- `Testing <#testing>`_
- `Documentation <#documentation>`_
- `License <#license>`_

Getting Started
---------------

To start contributing:

1. Fork the repository from `taxonomy-resolver <https://github.com/ebi-jdispatcher/taxonomy-resolver>`_.
2. Clone your fork locally:

   .. code-block:: bash

      git clone https://github.com/your-username/taxonomy-resolver.git

3. Install the necessary dependencies:

   .. code-block:: bash

      pip install -r requirements.txt

4. Alternatively, use Poetry to install the necessary dependencies:

   .. code-block:: bash

      poetry install

Make sure to check the issues list before contributing. It helps to discuss your contribution idea beforehand, either by opening a new issue or commenting on an existing one.

How to Contribute
-----------------

Reporting Bugs
~~~~~~~~~~~~~~

If you encounter any bugs or unexpected behavior, feel free to open an issue on GitHub. When submitting an issue:

- Use a clear and descriptive title.
- Describe the steps to reproduce the problem.
- Mention the Python version and platform you're using.
- Include any relevant logs, screenshots, or error messages.

Suggesting Features
~~~~~~~~~~~~~~~~~~~

We welcome new feature suggestions! If you have an idea to improve **Taxonomy Resolver**, open an issue with:

- A detailed description of the feature.
- Potential use cases for the feature.
- Why the feature would be useful for the community.

Contributing Code
~~~~~~~~~~~~~~~~~

If you'd like to contribute code, follow these steps:

Fork the Repository
^^^^^^^^^^^^^^^^^^^

1. Fork the repository by clicking the "Fork" button on the top right of the repository page.
2. Clone the forked repository to your local machine:

   .. code-block:: bash

      git clone https://github.com/your-username/taxonomy-resolver.git

Create a Branch
^^^^^^^^^^^^^^^

Always create a new branch for your changes. Choose a descriptive name for the branch based on the feature or fix you're working on:

.. code-block:: bash

   git checkout -b feature/my-feature-name

Write Your Code
^^^^^^^^^^^^^^^

- Add or modify functionality in the appropriate module.
- Ensure your code follows the `Code Style <#code-style>`_ guidelines.
- Write or update tests for your changes.

Run Tests
^^^^^^^^^

Before submitting your contribution, run the test suite to ensure your changes don't break existing functionality:

.. code-block:: bash

   pytest tests/test_*.py
   # or simply
   pytest

If you're adding new functionality, be sure to include tests to cover that behavior.

Submit a Pull Request
^^^^^^^^^^^^^^^^^^^^^

Once you're ready to submit your changes:

1. Push the changes to your branch on your forked repository:

   .. code-block:: bash

      git push origin feature/my-feature-name

2. Open a pull request (PR) from your fork to the original repository. In your PR description:

   - Explain the purpose of the changes.
   - Link to the relevant issue if it exists.
   - Provide any additional context or background for reviewers.

Code Style
----------

We follow `PEP 8 <https://pep8.org/>`_ for Python code style. Please ensure your code adheres to these guidelines. Additionally, we use the following tools to maintain code quality:

- **Black** for code formatting:

  .. code-block:: bash

     black .

Testing
-------

We use the ``pytest`` framework for testing. Please ensure that all new features and changes are covered by unit tests.

To run the test suite:

.. code-block:: bash

   pytest

You are encouraged to write tests that cover edge cases and typical usage patterns. All tests should pass before you submit a pull request.
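
As a minimal sketch of what such tests might look like, the example below covers one typical input and a few edge cases. ``split_taxids`` is a hypothetical helper written only for this illustration; it is not part of Taxonomy Resolver, and the TaxIDs are sample values.

```python
def split_taxids(raw):
    """Hypothetical helper: split a comma-separated TaxID string into a list."""
    # Strip whitespace around each entry and drop empty entries.
    return [part.strip() for part in raw.split(",") if part.strip()]


def test_typical_input():
    # Typical usage: two TaxIDs separated by a comma.
    assert split_taxids("9606,10090") == ["9606", "10090"]


def test_edge_cases():
    # Edge cases: empty input, stray whitespace, and a trailing comma.
    assert split_taxids("") == []
    assert split_taxids(" 9606 , ,10090,") == ["9606", "10090"]
```

Saved in a file matching ``tests/test_*.py``, functions named ``test_*`` like these should be collected automatically when ``pytest`` runs.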

Documentation
-------------

Documentation is important! If you make changes to the codebase, please ensure the relevant documentation is updated. Documentation is currently provided as part of the main `README.rst <./README.rst>`_.

- Ensure all public methods, functions, and classes are well-documented with docstrings.
- If adding a new feature or CLI command, update the README or other relevant documentation.
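
For illustration, a documented function might look like the sketch below. The function ``count_children``, the dictionary-based tree, and the reStructuredText field-list docstring style are all assumptions made for this example; follow whatever docstring conventions the existing modules already use.

```python
def count_children(tree, taxid):
    """Count the direct children of a TaxID.

    :param tree: mapping of a TaxID to a list of its child TaxIDs
        (a hypothetical structure used only in this example)
    :param taxid: TaxID to look up
    :return: number of direct children, or 0 if the TaxID is unknown
    """
    # Unknown TaxIDs fall back to an empty list of children.
    return len(tree.get(taxid, []))
```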

License
-------

By contributing to this project, you agree that your contributions will be licensed under the `Apache License 2.0 <./LICENSE>`_.

Thank you for your contributions!
69 changes: 54 additions & 15 deletions README.rst
@@ -1,3 +1,12 @@
+|PyPI license| |PyPI version|
+
+.. |PyPI version| image:: https://img.shields.io/pypi/v/taxonomy-resolver.svg?label=PyPI%20version&color=blue
+   :target: https://pypi.org/project/taxonomy-resolver/
+
+.. |PyPI license| image:: https://img.shields.io/pypi/l/taxonomy-resolver.svg?label=License&color=blue
+   :target: https://pypi.org/project/taxonomy-resolver/
+
+
#################
Taxonomy Resolver
#################
@@ -95,12 +104,12 @@ The resulting tree can be represented in tabular form:
Dependencies and Installation
=============================

-Installation requires `Python`_ 3.9+ (recommended version 3.11). Additional requirements, which will be downloaded and installed automatically. See full list of dependencies in `requirements.txt`_
+Installation requires `Python`_ 3.10+ (recommended version 3.11). Additional requirements will be downloaded and installed automatically. See the full list of dependencies in `requirements.txt`_.

Python Environment
------------------

-Dependencies for the Python tools developed here, are the typical Python stack (3.9+ and pip). A good approach is to set a virtual environment:
+Dependencies for the Python tools developed here are the typical Python stack (3.10+ and pip). A good approach is to set up a virtual environment:

.. code-block:: bash
@@ -132,7 +141,7 @@ Example of typical usage of the Taxonomy Resolver module is provided below:

.. code-block:: python
-from taxonresolver import TaxonResolver
+from taxonomyresolver import TaxonResolver
resolver = TaxonResolver()
@@ -150,6 +159,7 @@ Example of typical usage of the Taxonomy Resolver module is provided below:
# Get a list of children TaxIDs that compose a set of TaxIDs
searchfile = "taxids_search.txt"
tax_ids = resolver.search(searchfile)
# Write the TaxIDs to a file
taxidsfile = "taxids_list.txt"
with open(taxidsfile, "w") as outfile:
@@ -160,7 +170,7 @@ When a Taxonomy Tree is already available one can simply load it with ``resolver

.. code-block:: python
-from taxonresolver import TaxonResolver
+from taxonomyresolver import TaxonResolver
resolver = TaxonResolver()
@@ -177,13 +187,11 @@ When a Taxonomy Tree is already available one can simply load it with ``resolver
CLI
---

-Explore the CLI and each command by running
-``python taxonomy_resolver.py (COMMAND) --help``. If Taxonomy Resolver was installed with
-``python setup.py install``, then simply run ``taxonomy_resolver --help``:
+Explore the CLI by running ``taxonomy-resolver (COMMAND) --help``

.. code-block:: bash
-Usage: taxonomy_resolver [OPTIONS] COMMAND1 [ARGS]... [COMMAND2
+Usage: taxonomy-resolver [OPTIONS] COMMAND1 [ARGS]... [COMMAND2
[ARGS]...]...
Taxonomy Resolver: Build a NCBI Taxonomy Tree, validate and search TaxIDs.
@@ -199,44 +207,74 @@ Explore the CLI and each command by running
validate Validates a list of TaxIDs against a Tree data structure.
+Additional help is provided for each command, for example, running ``taxonomy-resolver (command) --help``, returns:
+
+.. code-block:: bash
+
+  Usage: taxonomy-resolver search [OPTIONS]
+
+    Searches a Tree data structure and writes a list of TaxIDs.
+
+  Options:
+    -in, --infile TEXT             Path to input NCBI BLAST dump or a prebuilt tree file, (currently: 'pickle'). [required]
+    -out, --outfile TEXT           Path to output file.
+    -inf, --informat TEXT          Input format (currently: 'pickle').
+    -taxid, --taxid TEXT           Comma-separated TaxIDs or pass multiple values. Output to STDOUT by default, unless an output file is provided.
+    -taxids, --taxidinclude TEXT   Path to Taxonomy id list file used to search the Tree.
+    -taxidexc, --taxidexc TEXT     Comma-separated TaxIDs or pass multiple values.
+    -taxidse, --taxidexclude TEXT  Path to Taxonomy id list file excluded from the search.
+    -taxidsf, --taxidfilter TEXT   Path to Taxonomy id list file used to filter the search.
+    -ignore, --ignoreinvalid       Ignores invalid TaxIDs.
+    -level, --log_level TEXT       Log level to use. Expects: 'DEBUG', 'INFO', 'WARN', 'ERROR', and 'CRITICAL'.
+    -l, --log_output TEXT          File name to be used to writing logging.
+    --quiet                        Disables logging.
+    -sep, --sep TEXT               String Separator to use.
+    -indx, --indx INTEGER          String positional index to use (starts with 0).
+    -h, --help                     Show this message and exit
+
Getting the NCBI Taxonomy Data from the `NCBI ftp server`_:

.. code-block:: bash
-python taxonomy-resolver.py download -out taxdmp.zip
+taxonomy-resolver download -out taxdmp.zip
Building a Tree structure from the ``taxdmp.zip`` file and saving it in JSON (or alternatively in ``pickle`` format):

.. code-block:: bash
-python taxonomy-resolver.py build -in taxdmp.zip -out tree.pickle
+taxonomy-resolver build -in taxdmp.zip -out tree.pickle
Filtering an existing Tree structure in ``pickle`` format by passing a file containing a list of TaxIDs, and saving it in ``pickle`` format:

.. code-block:: bash
-python taxonomy-resolver.py build -in tree.pickle -inf pickle -out tree_filtered.pickle -outf pickle -taxidf testdata/taxids_filter.txt
+taxonomy-resolver build -in tree.pickle -inf pickle -out tree_filtered.pickle -outf pickle -taxidf testdata/taxids_filter.txt
Loading a previously built Tree data structure in ``pickle`` format and generating a list of TaxIDs that compose the hierarchy based on a list of TaxIDs:

.. code-block:: bash
-python taxonomy-resolver.py search -in tree.pickle -taxids testdata/taxids_search.txt
+taxonomy-resolver search -in tree.pickle -taxids testdata/taxids_search.txt
Loading a previously built Tree data structure in ``pickle`` format, generating a list of TaxIDs (included TaxIDs), excluding TaxIDs from the search (excluded TaxIDs), and filtering the final result to only those TaxIDs that are available in the list of filter TaxIDs (filtered TaxIDs):

.. code-block:: bash
-python taxonomy-resolver.py search -in tree.pickle -taxids testdata/taxids_search.txt -taxidse testdata/taxids_exclude.txt -taxidsf testdata/taxids_filter.txt -out taxids_list.txt
+taxonomy-resolver search -in tree.pickle -taxids testdata/taxids_search.txt -taxidse testdata/taxids_exclude.txt -taxidsf testdata/taxids_filter.txt -out taxids_list.txt
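
The way the included, excluded, and filter lists combine can be pictured with plain set operations. The sketch below is an illustration only, not Taxonomy Resolver's implementation: ``TOY_TREE``, ``subtree``, and ``toy_search`` are hypothetical names, the TaxIDs are made up, and it assumes that excluding a TaxID also removes its descendants.

```python
# Toy parent -> children map; these TaxIDs are made up for illustration.
TOY_TREE = {
    "1": ["2", "3"],
    "2": ["4", "5"],
    "3": ["6"],
    "4": [],
    "5": [],
    "6": [],
}


def subtree(tree, taxid):
    """Return the given TaxID plus all of its descendants in the toy tree."""
    found = set()
    stack = [taxid]
    while stack:
        node = stack.pop()
        if node in found:
            continue
        found.add(node)
        stack.extend(tree.get(node, []))
    return found


def toy_search(tree, include, exclude=(), keep_only=None):
    """Combine included, excluded, and filter TaxIDs with set operations."""
    result = set()
    for taxid in include:
        result |= subtree(tree, taxid)   # add each included subtree
    for taxid in exclude:
        result -= subtree(tree, taxid)   # drop each excluded subtree (assumption)
    if keep_only is not None:
        result &= set(keep_only)         # keep only TaxIDs present in the filter
    return sorted(result, key=int)


print(toy_search(TOY_TREE, include=["1"], exclude=["3"], keep_only=["2", "4", "6"]))
# ['2', '4']
```

Here TaxID ``6`` appears in the filter list but is dropped anyway, because it sits under the excluded TaxID ``3``.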
Validating a list of TaxIDs against a Tree data structure in ``pickle`` format:

.. code-block:: bash
-python taxonomy-resolver.py validate -in tree.pickle -taxids testdata/taxids_validate.txt
+taxonomy-resolver validate -in tree.pickle -taxids testdata/taxids_validate.txt
+
+Contributing
+============
+
+See `CONTRIBUTING.rst`_ for more information about contributing to Taxonomy Resolver.
+
Bug Tracking
============
@@ -251,7 +289,7 @@ See release notes on `CHANGELOG.rst`_
Acknowledgments
===============

-I would like to thanks Adrian Tivey for insightful discussions.
+I would like to thank Adrian Tivey for insightful discussions.

License
=======
@@ -267,5 +305,6 @@ Apache License 2.0. See `license`_ for details.
.. _NCBI Taxonomy: https://www.ncbi.nlm.nih.gov/taxonomy
.. _NCBI ftp server: https://ftp.ncbi.nih.gov/pub/taxonomy/
.. _CHANGELOG.rst: CHANGELOG.rst
+.. _CONTRIBUTING.rst: CONTRIBUTING.rst
.. _nodes_mock.dmp: testdata/nodes_mock.dmp
.. _EMBL-EBI: https://www.ebi.ac.uk/
14 changes: 0 additions & 14 deletions __main__.py

This file was deleted.
