updates to pass tests w newer program versions (pandas, black, etc) #101

Merged: 12 commits, May 23, 2024
5 changes: 0 additions & 5 deletions .flake8

This file was deleted.

45 changes: 45 additions & 0 deletions .github/workflows/test.yml
@@ -0,0 +1,45 @@
name: Run tests

on:
  push:
    branches:
      - master
  pull_request:
    branches:
      - master

jobs:
  test:
    name: Run tests
    runs-on: ubuntu-latest
    timeout-minutes: 60
    steps:
      - name: checkout
        uses: actions/checkout@v4

      - name: build conda environment
        uses: conda-incubator/setup-miniconda@v3
        with:
          activate-environment: alignparse
          environment-file: environment.yml
          auto-activate-base: false
          auto-update-conda: true
          channel-priority: strict

      - name: install package and dependencies
        # NOTE: must specify the shell so that conda init updates bashrc see:
        # https://github.com/conda-incubator/setup-miniconda#IMPORTANT
        shell: bash -el {0}
        run: pip install -e . && pip install -r test_requirements.txt

      - name: lint code with ruff
        shell: bash -el {0}
        run: ruff check .

      - name: check code format with black
        shell: bash -el {0}
        run: black --check .

      - name: test code with `pytest`
        shell: bash -el {0}
        run: pytest
1 change: 1 addition & 0 deletions .gitignore
@@ -6,6 +6,7 @@ _temp*
!.travis.yml
!.flake8
!.nojekyll
!.github

*.pyc
docs/alignparse.*
30 changes: 0 additions & 30 deletions .travis.yml

This file was deleted.

19 changes: 19 additions & 0 deletions CHANGELOG.rst
@@ -6,6 +6,25 @@ All notable changes to this project will be documented in this file.

The format is based on `Keep a Changelog <https://keepachangelog.com>`_.

0.6.3
-----

Fixed
+++++
* Fix bug in handling ``minimap2`` errors (`see this issue <https://github.com/jbloomlab/alignparse/issues/99>`_).
* Pass formatting with new ``black`` version.
* Pass tests with new ``pandas`` version.
* Fixed ``simple_mutconsensus`` for newer versions of ``pandas`` when grouping by just one variable.

Changed
+++++++
* Change code linting to ``ruff`` rather than ``flake8``.
* Test with GitHub Actions rather than Travis CI.
* Remove ``mybinder`` examples.
* Test on Python 3.11 rather than 3.9.
* Don't allow ``pysam`` version 0.22.1 as it was causing an OpenSSL-related import error.
* Test with ``minimap2`` version 2.22.

0.6.2
-----

32 changes: 7 additions & 25 deletions CONTRIBUTING.rst
@@ -40,7 +40,7 @@ Formatting
++++++++++
The code is formatted using `Black <https://black.readthedocs.io/en/stable/index.html>`_, which you can install using `pip install "black[jupyter]"`.
You may also wish to install a Black extension in your editor to, for example, auto-format upon save.
In any case, please run Black using `black .` before submitting your PR, because the Travis tests will not pass unless the files have been formatted.
In any case, please run Black using `black .` before submitting your PR, because the tests will not pass unless the files have been formatted.
Note that this will change files/notebooks that you may be actively editing.

Versions and CHANGELOG
@@ -57,15 +57,6 @@ When you add code that uses a new package that is not in the standard python library,
`See here <https://packaging.python.org/discussions/install-requires-vs-requirements/>`_ for information on how to do this, and how to specify minimal required versions.
As described in the above link, you should **not** pin exact versions in `install_requires` in `setup.py <setup.py>`_ unless absolutely necessary.

Notebooks on mybinder
-----------------------
The `Jupyter notebooks`_ in notebooks_ can be run interactively on mybinder_ by going to the following link:
https://mybinder.org/v2/gh/jbloomlab/alignparse/master?filepath=notebooks

In order for this to work, you need to keep the `environment.yml <environment.yml>`_ configuration file up to date with the dependencies for running these notebooks as `described here <https://mybinder.readthedocs.io/en/latest/config_files.html>`_.
Note that unlike for the `install_requires` in `setup.py <setup.py>`_, you may want to pin exact versions here to get reproducible installations.
Look into the `pip freeze <https://pip.pypa.io/en/stable/reference/pip_freeze/>`_ and `conda env export <https://packaging.python.org/discussions/install-requires-vs-requirements>`_ commands on how to automatically create such a configuration file.

Testing
---------

@@ -87,29 +78,21 @@ If these are not installed, install them with::

pip install -r test_requirements.txt

Then use flake8_ to `lint the code <https://en.wikipedia.org/wiki/Lint_%28software%29>`_ by running::
Then use ruff_ to `lint the code <https://en.wikipedia.org/wiki/Lint_%28software%29>`_ by running::

flake8
ruff check .

If you need to change the flake8_ configuration, edit the `.flake8 <.flake8>`_ file.
If you need to change the ruff_ configuration, edit the `ruff.toml <ruff.toml>`_ file.

Then run the tests with pytest_ by running::

pytest

If you need to change the pytest_ configuration, edit the `pytest.ini <pytest.ini>`_ file.

Automated testing on Travis
+++++++++++++++++++++++++++
The aforementioned flake8_ and pytest_ tests will be run automatically by the Travis_ continuous integration system as specified in the `.travis.yml <.travis.yml>`_ file.
Note that running the Travis_ tests requires you to register the project with Travis_.

If the tests are passing, you will see this on the Travis_ badge on GitHub repo main page.

Slack notifications of test results
Automated testing with GitHub Actions
+++++++++++++++++++++++++++++++++++++
You can configure Travis_ to provide automatic Slack notifications of the test results.
To do that, follow the `instructions here <https://docs.travis-ci.com/user/notifications/#configuring-slack-notifications>`_.
The aforementioned ruff_ and pytest_ tests will be run automatically by GitHub Actions as specified in the `.github/workflows/test.yml <.github/workflows/test.yml>`_ workflow file.


Building documentation
@@ -133,8 +116,7 @@ Finally, upload to PyPI_ with twine_ as `described here <https://github.com/pypa
Note that this requires you to have registered the package on PyPI_ if this is the first version of the package there.

.. _pytest: https://docs.pytest.org
.. _flake8: http://flake8.pycqa.org
.. _Travis: https://docs.travis-ci.com
.. _ruff: https://github.com/astral-sh/ruff
.. _PyPI: https://pypi.org/
.. _pip: https://pip.pypa.io
.. _sphinx: https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html
13 changes: 8 additions & 5 deletions README.rst
@@ -5,18 +5,21 @@ alignparse
.. image:: https://img.shields.io/pypi/v/alignparse.svg
:target: https://pypi.python.org/pypi/alignparse

.. image:: https://app.travis-ci.com/jbloomlab/alignparse.svg?branch=master
:target: https://app.travis-ci.com/github/jbloomlab/alignparse

.. image:: https://mybinder.org/badge_logo.svg
:target: https://mybinder.org/v2/gh/jbloomlab/alignparse/master?filepath=notebooks
.. image:: https://github.com/jbloomlab/alignparse/actions/workflows/test.yml/badge.svg
:target: https://github.com/jbloomlab/alignparse/actions/workflows/test.yml

.. image:: https://zenodo.org/badge/194140958.svg
:target: https://zenodo.org/badge/latestdoi/194140958

.. image:: https://joss.theoj.org/papers/10.21105/joss.01915/status.svg
:target: https://doi.org/10.21105/joss.01915

.. image:: https://img.shields.io/badge/code%20style-black-000000.svg
:target: https://github.com/psf/black

.. image:: https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/charliermarsh/ruff/main/assets/badge/v2.json
:target: https://github.com/astral-sh/ruff

``alignparse`` is a Python package written by `the Bloom lab <https://research.fhcrc.org/bloom/en.html>`_.
It is designed to align long sequencing reads (such as those from PacBio circular consensus sequencing) to targets, filter these alignments based on user-provided specifications, and parse out user-defined sequence features.
For each read that passes the filters, information about the features (e.g. accuracy, sequence, mutations) is retained for further analyses.
2 changes: 1 addition & 1 deletion alignparse/__init__.py
@@ -7,5 +7,5 @@

__author__ = "`the Bloom lab <https://research.fhcrc.org/bloom/en.html>`_"
__email__ = "[email protected]"
__version__ = "0.6.2"
__version__ = "0.6.3"
__url__ = "https://github.com/jbloomlab/alignparse"
7 changes: 3 additions & 4 deletions alignparse/ccs.py
@@ -8,7 +8,6 @@

"""


import collections
import io
import itertools
@@ -75,7 +74,7 @@ def __init__(self, name, fastqfile, reportfile):
        self.name = name
        self.fastqfile = fastqfile
        if not os.path.isfile(fastqfile):
            raise IOError(f"cannot find `fastqfile` {fastqfile}")
            raise OSError(f"cannot find `fastqfile` {fastqfile}")

        ccs_stats = get_ccs_stats(self.fastqfile)
        self.passes = ccs_stats.passes
@@ -87,7 +86,7 @@ def __init__(self, name, fastqfile, reportfile):
        if reportfile:
            self.reportfile = reportfile
            if not os.path.isfile(reportfile):
                raise IOError(f"cannot find `reportfile` {reportfile}")
                raise OSError(f"cannot find `reportfile` {reportfile}")
            self.zmw_stats = report_to_stats(self.reportfile)
            zmw_stats_nccs = self.zmw_stats[
                self.zmw_stats["status"].str.match("^Success")
@@ -672,7 +671,7 @@ def report_to_stats(reportfile):
    if df is not None:
        return df

    raise IOError(f"Cannot match report in {reportfile}")
    raise OSError(f"Cannot match report in {reportfile}")


def _reportfile_version_check(reportfile, pattern):
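A note on the ``IOError`` to ``OSError`` renames in this file (and the matching changes in ``minimap2.py`` and ``targets.py`` below): since Python 3.3, ``IOError`` is only an alias of ``OSError``, so the rename is a ruff-driven style cleanup with no behavior change. A minimal sketch, not taken from the PR, using a hypothetical file name:

```python
# Minimal sketch (not from the PR): IOError has been an alias of OSError
# since Python 3.3, so renaming the raised exception changes nothing at runtime.
assert IOError is OSError

try:
    raise OSError("cannot find `fastqfile` some_file.fastq")  # hypothetical path
except IOError as err:  # still caught, because IOError *is* OSError
    print(f"caught {type(err).__name__}: {err}")
```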
13 changes: 7 additions & 6 deletions alignparse/consensus.py
@@ -8,7 +8,6 @@

"""


import collections
import io # noqa: F401
import itertools
@@ -461,9 +460,11 @@ def empirical_accuracy(
        .assign(
            **{
                mutation_col: (
                    lambda x: x[mutation_col].map(lambda s: " ".join(sorted(s.split())))
                    if sort_mutations
                    else x[mutation_col]
                    lambda x: (
                        x[mutation_col].map(lambda s: " ".join(sorted(s.split())))
                        if sort_mutations
                        else x[mutation_col]
                    )
                )
            }
        )
@@ -475,7 +476,7 @@
        .rename("_ngroups")
        .reset_index()
        # get error rate
        .groupby(upstream_group_cols)
        .groupby(upstream_group_cols)[["_n", "_u", "_ngroups"]]
        .apply(
            lambda x: 1
            - _LnL_error_rate(
@@ -696,7 +697,7 @@ def simple_mutconsensus(
    dropped = []
    consensus = []
    for g, g_df in df.groupby(group_cols, observed=True)[mutation_col]:
        if len(group_cols) == 1:
        if len(group_cols) == 1 and isinstance(g, str):
            g = [g]

        nseqs = len(g_df)
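The two ``pandas``-related edits above in ``consensus.py`` track behavior changes in newer ``pandas``: ``DataFrameGroupBy.apply`` now deprecates operating on the grouping columns, which is why the needed columns are selected explicitly before ``.apply``, and iterating over a groupby keyed on a length-one list yields one-element tuples rather than bare scalars, which is why ``simple_mutconsensus`` only wraps ``g`` in a list when it is still a string. A rough sketch with toy data (the ``barcode``, ``_n``, and ``_u`` names here are illustrative, not from the package):

```python
# Hedged sketch (toy data, illustrative column names) of the newer-pandas
# behaviors the consensus.py changes above accommodate.
import pandas as pd

df = pd.DataFrame({"barcode": ["A", "A", "B"], "_n": [3, 2, 4], "_u": [1, 1, 2]})

# 1. Group keys: with a length-1 list of groupers, newer pandas yields
#    1-tuples like ("A",) instead of bare scalars "A" when iterating.
for g, g_df in df.groupby(["barcode"], observed=True)["_n"]:
    key = [g] if isinstance(g, str) else list(g)  # normalize either form
    print(key, len(g_df))

# 2. apply on grouping columns: selecting the needed columns before .apply
#    keeps the group keys out of the applied frame and avoids the deprecation.
summary = (
    df.groupby("barcode")[["_n", "_u"]]
    .apply(lambda x: x["_n"].sum() / x["_u"].sum())
    .rename("ratio")
    .reset_index()
)
print(summary)
```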
1 change: 0 additions & 1 deletion alignparse/constants.py
@@ -7,7 +7,6 @@

"""


CBPALETTE = (
    "#999999",
    "#E69F00",
1 change: 0 additions & 1 deletion alignparse/cs_tag.py
@@ -11,7 +11,6 @@

"""


import functools

import numpy
6 changes: 3 additions & 3 deletions alignparse/minimap2.py
@@ -152,7 +152,7 @@ class Mapper:
m54228_181120_212724/4194376/ccs 0 refseq 1 1 63M * 0 0
ATGCAAAATGATGCATAGTATTAGCATAAATAGGATAGCCATAAGGTTACTGCATAAGAGTAT
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NM:i:4 ms:i:111 AS:i:111 nn:i:3 tp:A:P cm:i:7 s1:i:41 s2:i:0 de:f:0.0167
NM:i:4 ms:i:111 AS:i:111 nn:i:3 tp:A:P cm:i:7 s1:i:41 s2:i:0 de:f:0.0635
cs:Z::6*na*na*nt:49*ga:4 rl:i:0
>>> print(tag_names)
['NM', 'ms', 'AS', 'nn', 'tp', 'cm', 's1', 's2', 'de', 'cs', 'rl']
@@ -168,7 +168,7 @@ class Mapper:
m54228_181120_212724/4194376/ccs 0 refseq 1 1 63M * 0 0
ATGCAAAATGATGCATAGTATTAGCATAAATAGGATAGCCATAAGGTTACTGCATAAGAGTAT
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
NM:i:4 ms:i:111 AS:i:111 nn:i:3 tp:A:P cm:i:7 s1:i:41 s2:i:0 de:f:0.0167
NM:i:4 ms:i:111 AS:i:111 nn:i:3 tp:A:P cm:i:7 s1:i:41 s2:i:0 de:f:0.0635
cs:Z::6*na*na*nt:49*ga:4 rl:i:0 np:i:127
>>> print(tag_names) # doctest: +NORMALIZE_WHITESPACE
['NM', 'ms', 'AS', 'nn', 'tp', 'cm', 's1', 's2', 'de', 'cs', 'rl', 'np']
@@ -227,7 +227,7 @@ def map_to_sam(self, targetfile, queryfile, samfile):
        """
        for fname, f in [("target", targetfile), ("query", queryfile)]:
            if not os.path.isfile(f):
                raise IOError(f"cannot find `{fname}file` {f}")
                raise OSError(f"cannot find `{fname}file` {f}")

        if os.path.splitext(samfile)[1] != ".sam":
            raise ValueError(f"`samfile` lacks extension '.sam': {samfile}")
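The updated doctest lines above only change the expected ``de:f:`` value (minimap2's gap-compressed per-base divergence), which differs under the ``minimap2`` 2.22 version now used for testing; the tag names themselves are unchanged. A small sketch, assuming a hypothetical SAM file, of inspecting those tags with ``pysam`` (already a dependency of the package):

```python
# Hedged sketch (hypothetical SAM path): reading the minimap2 tags shown in
# the doctest above with pysam's standard AlignedSegment methods.
import pysam

with pysam.AlignmentFile("alignments.sam") as sam:  # hypothetical file
    for read in sam:
        tag_names = [name for name, _ in read.get_tags()]
        print(read.query_name, tag_names)
        if read.has_tag("de"):
            # gap-compressed per-base divergence, e.g. 0.0635 in the doctest
            print("de =", read.get_tag("de"))
```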
9 changes: 4 additions & 5 deletions alignparse/targets.py
@@ -8,7 +8,6 @@

"""


import contextlib
import copy
import itertools
@@ -160,7 +159,7 @@ def __init__(
            )
            if not (allow_extra_features or (feature_name in allow_features)):
                raise ValueError(f"feature {feature_name} not allowed feature")
            if bio_feature.strand != 1:
            if bio_feature.location.strand != 1:
                raise ValueError(
                    f"feature {feature_name} of {self.name} is - "
                    "strand, but only + strand features handled"
@@ -902,7 +901,7 @@ def map_func(f, *args):
                if overwrite:
                    os.remove(tup.samfile)
                else:
                    raise IOError(f"file {tup.samfile} already exists")
                    raise OSError(f"file {tup.samfile} already exists")
        _ = map_func(
            self.align, df[queryfile_col], df["samfile"], itertools.repeat(mapper)
        )
@@ -953,7 +952,7 @@ def map_func(f, *args):
        for f in list(filtered.values()) + list(aligned.values()):
            if os.path.isfile(f):
                if not overwrite:
                    raise IOError(f"file {f} already exists.")
                    raise OSError(f"file {f} already exists.")
                else:
                    os.remove(f)

@@ -1131,7 +1130,7 @@ def parse_alignment(
            }
            filenames = list(filtered.values()) + list(aligned.values())
            if (not overwrite_csv) and any(map(os.path.isfile, filenames)):
                raise IOError(f"existing file with name in: {filenames}")
                raise OSError(f"existing file with name in: {filenames}")
        else:
            filtered = {t: [] for t in self.target_names}
            aligned = {t: [] for t in self.target_names}
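The switch from ``bio_feature.strand`` to ``bio_feature.location.strand`` follows Biopython, where the strand is stored on the feature's location and direct ``SeqFeature.strand`` access is deprecated in recent releases. A minimal sketch with a toy feature (names are illustrative only, not from the package's test data):

```python
# Hedged sketch (toy feature): in current Biopython the strand lives on the
# feature's location, which is why the check above reads
# `bio_feature.location.strand` rather than the deprecated `bio_feature.strand`.
from Bio.SeqFeature import FeatureLocation, SeqFeature

feature = SeqFeature(FeatureLocation(0, 63, strand=1), type="gene", id="gene2")
if feature.location.strand != 1:
    raise ValueError("only + strand features handled")
print(feature.id, feature.location.strand)  # gene2 1
```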
1 change: 0 additions & 1 deletion alignparse/utils.py
@@ -5,7 +5,6 @@

"""


import math
import numbers
import re