Fix gene ID mismatch in projection command #208
Merged
This PR fixes the bug reported in #207.
When the projection command was run with a GBFF file as the input genome, the input genes weren't linked to any gene families, so every gene ended up without a pangenome family. This was due to a mismatch between PPanGGOLiN's internal IDs and the original IDs from the GBFF file.
When gene IDs are unique across input genomes, the original ID (`local_identifier`) should be used instead of the PPanGGOLiN-generated IDs. However, this replacement wasn't happening: a flaw in the iterator management left the iterator empty, so no IDs were replaced.
This PR fixes the iterator handling, which resolves the issue.
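For readers unfamiliar with this class of bug, here is a minimal sketch of how an emptied iterator can silently skip an ID replacement. It is not the PPanGGOLiN code: the function names, the dict-based gene records, and the `internal_*` IDs are hypothetical, and only the `local_identifier` field name comes from the description above. The sketch assumes the iterator was consumed by an earlier pass (here, a uniqueness check) before the replacement loop ran.

```python
def make_genes():
    # Hypothetical gene records: "ID" stands in for the generated internal ID,
    # "local_identifier" for the original ID read from the input file.
    return [
        {"ID": "internal_0", "local_identifier": "geneA"},
        {"ID": "internal_1", "local_identifier": "geneB"},
    ]


def assign_ids_buggy(genes):
    """Illustrates the failure mode: one iterator is used for two passes."""
    gene_iter = iter(genes)
    # First pass consumes the iterator entirely.
    local_ids = [gene["local_identifier"] for gene in gene_iter]
    replaced = 0
    if len(set(local_ids)) == len(local_ids):
        # Second pass: the iterator is already empty, so the body never runs.
        for gene in gene_iter:
            gene["ID"] = gene["local_identifier"]
            replaced += 1
    return replaced


def assign_ids_fixed(genes):
    """Materializes the genes once so both passes see every record."""
    gene_list = list(genes)
    local_ids = [gene["local_identifier"] for gene in gene_list]
    replaced = 0
    if len(set(local_ids)) == len(local_ids):
        for gene in gene_list:
            gene["ID"] = gene["local_identifier"]
            replaced += 1
    return replaced
```

With the two records from `make_genes()`, `assign_ids_buggy` returns 0 and leaves the internal IDs in place, while `assign_ids_fixed` returns 2 and each gene's `ID` equals its `local_identifier`. No exception is raised in the buggy version, which is why this kind of error tends to surface only downstream, for example as genes that match no family.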