Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test digest two xmls for same protein #824

Conversation

trishorts
Copy link
Contributor

@trishorts trishorts commented Jan 14, 2025

the order of peptides digested from a protein depend on the order in which localized modifications are encoded in the xml. This is particularly problematic when the number of modified forms of a peptide is limited to a specific number (e.g. take the first 1024).

This PR adds code that sorts all protein onebasedlocalizedmodifications first by the position and then by the modification. This prevents variations in the xml encoded order from influencing the resultant peptides in the digest.

This PR also includes a test, where the modifications are present in different orders. This test will fail without the sort. But since the sort is added, the failure is eliminated and the test passes.

@trishorts trishorts added the WIP label Jan 14, 2025
var localizedModsOrderedByPositionThenByModification = Protein.OneBasedPossibleLocalizedModifications
.SelectMany(kvp => kvp.Value.Select(value => (kvp.Key, value)))
.OrderBy(tuple => tuple.Key)
.ThenBy(tuple => tuple.value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given how many times this gets called, I think it would be better to use a custom comparer, that way we only have to iterate through the list once.

Additionally, I'm not sure that it makes sense to store this as a list, since we're only iterating through it once. We could just put the linq expression in the foreach

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated according to your comments

.SelectMany(kvp => kvp.Value.Select(value => (kvp.Key, value)))
.OrderBy(tuple => tuple.Key)
.ThenBy(tuple => tuple.value)
.ToList();
foreach (var localizedMod in localizedModsOrderedByPositionThenByModification)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant a custom comparer that considers the tuple's key and value simultaneously. So there would only be an OrderBy statement with a custom comparer, as opposed to an OrderBy().ThenBy().

}

return Nullable.Compare(this.MonoisotopicMass, other.MonoisotopicMass);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought this was fine

@trishorts trishorts closed this Jan 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants