pad the target when aligning #312

Open
wants to merge 7 commits into
base: main

Conversation

ekg
Collaborator

@ekg ekg commented Jan 22, 2025

Changes so far:

  1. Added a new target_padding parameter to the Parameters struct in align_parameters.hpp. It sets how much padding is added around target sequences.

  2. In computeAlignments.hpp, the main changes are:

    • Modified parseMashmapRow function to accept a new target_padding parameter
    • Added logic to apply the padding to the reference sequence coordinates (rStartPos and rEndPos), clamping them to valid bounds (not below 0 and not above the reference length)
    • Updated function calls to pass through the new target_padding parameter

  3. In parse_args.hpp, added command-line support for the target padding feature:

    • Added new command-line option -E or --target-padding to specify padding around target sequence
    • Default value is set to 0
    • Added validation to ensure the padding value is non-negative
    • Integrated the parameter into the alignment parameters structure
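
The clamping described in item 2 can be sketched roughly as follows. This is a minimal illustration, not the code in computeAlignments.hpp: the `PaddedRange` struct and the `pad_target` helper are hypothetical names, and the sketch assumes a half-open `[start, end)` interval and signed 64-bit coordinates, neither of which is stated in the PR.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical holder for the padded target interval (not a wfmash type).
struct PaddedRange {
    std::int64_t start;  // assumed 0-based, inclusive
    std::int64_t end;    // assumed 0-based, exclusive
};

// Widen [r_start, r_end) by target_padding on each side,
// clamping the result to [0, ref_len].
inline PaddedRange pad_target(std::int64_t r_start, std::int64_t r_end,
                              std::int64_t ref_len,
                              std::int64_t target_padding) {
    // Mirrors the non-negativity check described for the -E option.
    assert(target_padding >= 0);
    assert(0 <= r_start && r_start <= r_end && r_end <= ref_len);

    PaddedRange out;
    out.start = std::max<std::int64_t>(0, r_start - target_padding);
    out.end = std::min<std::int64_t>(ref_len, r_end + target_padding);
    return out;
}
```

Signed arithmetic is used here so that `r_start - target_padding` cannot wrap around near the start of the reference; if the real coordinates are unsigned, the subtraction would need a guard before it rather than a clamp after it.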

Objectives:

  • Integrate https://github.com/ekg/indelswizzle/blob/main/cigar_swap.cpp, ideally within the biWFA alignment function in wflign. This means left-aligning leading indels and right-aligning trailing indels, so that the gaps sit at the start and end of the alignment. We should then trim back the coordinates of the target.
  • Test!
  • Only run this on mappings that have been split (via -P 50k, for instance), and not at the starts and ends of chains. This means adding a new piece of information to the mapping records that says where we are in the chain, in addition to the chain id. We could stash that in a single variable or put multiple fields on the row (three are needed: chain id, chain length, and position in chain).
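
To make the first and third objectives concrete, here is a rough sketch of the two pieces of bookkeeping they imply. Everything in it is hypothetical: `ChainInfo`, `is_chain_interior`, `Cigar`, `Trimmed`, and `trim_target_gaps` are illustrative names, not wfmash, wflign, or cigar_swap.cpp APIs. It assumes 1-based chain positions, a CIGAR already held as (op, length) pairs, and the usual convention that `D` consumes only target bases. It also assumes the indel shifting has already been done, so it covers only the coordinate trimming that would follow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical per-mapping annotation holding the three values named above.
struct ChainInfo {
    std::int64_t chain_id;
    std::int32_t chain_len;  // number of mappings in the chain
    std::int32_t chain_pos;  // assumed 1-based position within the chain
};

// One reading of "not at the starts and ends of chains":
// true only for mappings with a neighbour on both sides.
inline bool is_chain_interior(const ChainInfo& c) {
    assert(c.chain_pos >= 1 && c.chain_pos <= c.chain_len);
    return c.chain_pos > 1 && c.chain_pos < c.chain_len;
}

using Cigar = std::vector<std::pair<char, std::int32_t>>;  // (op, length)

struct Trimmed {
    std::int64_t t_start;
    std::int64_t t_end;
    Cigar cigar;
};

// Drop leading and trailing 'D' runs (target-only gaps) and pull the
// target interval [t_start, t_end) in by the same number of bases.
inline Trimmed trim_target_gaps(const Cigar& cigar,
                                std::int64_t t_start, std::int64_t t_end) {
    std::size_t first = 0;
    while (first < cigar.size() && cigar[first].first == 'D') {
        t_start += cigar[first].second;
        ++first;
    }
    std::size_t last = cigar.size();
    while (last > first && cigar[last - 1].first == 'D') {
        t_end -= cigar[last - 1].second;
        --last;
    }
    Cigar kept(cigar.begin() + first, cigar.begin() + last);
    return {t_start, t_end, kept};
}
```

With these assumptions, a caller would check `is_chain_interior` before applying the padding and trimming, and leave mappings at chain ends untouched. Whether the chain information ends up as one packed field or three separate ones on the row does not change this logic.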
