sequence flattening algorithm / pseudocode in Spec #8
Comments
One key thing to remember is this is different depending on the type of ComponentDefinition. The statements below apply only to things that have explicit sequences, such as DNA.
For this type, there is a starting point implemented in libSBOLj, in ComponentDefinition.java: the method getImpliedNucleicAcidSequence. It recursively attempts to generate the sequence implied by the Components and their corresponding SequenceAnnotations, and it is used to check validation rule 10520. It starts with a sequence of Ns and overwrites each region as annotations are found that specify it. Finally, in the validation check, the generated sequence is compared with the one on the top-level CD to see whether that sequence includes the implied sequence.
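To make the description above concrete, here is a minimal Python sketch of the idea. It is not the libSBOLj code: the `Definition` and `Annotation` classes, the function names, and the reading of "includes" as "agrees wherever the implied sequence is not n" are all assumptions made for illustration. Orientation (reverse complement) and non-contiguous locations are ignored.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Annotation:
    start: int                 # 1-based, inclusive
    end: int                   # 1-based, inclusive
    component: "Definition"    # hypothetical: definition of the annotated part


@dataclass
class Definition:
    length: int
    sequence: Optional[str] = None
    annotations: List[Annotation] = field(default_factory=list)


def implied_sequence(cd: Definition) -> str:
    """Start from all 'n' and fill in each annotated range from its child."""
    implied = ["n"] * cd.length
    for ann in cd.annotations:
        child = ann.component
        # Assumption: use the child's explicit sequence if it has one,
        # otherwise recurse into the child's own annotations.
        sub = child.sequence if child.sequence is not None else implied_sequence(child)
        if len(sub) != ann.end - ann.start + 1:
            raise ValueError("annotation range and child sequence length differ")
        implied[ann.start - 1:ann.end] = sub.lower()
    return "".join(implied)


def includes(explicit: str, implied: str) -> bool:
    """Assumed reading of 'includes': equal wherever implied is not 'n'."""
    if len(explicit) != len(implied):
        return False
    return all(i == "n" or i == e
               for e, i in zip(explicit.lower(), implied.lower()))
```

For example, a 10-base parent with a 4-base part at 1..4 and a 3-base part at 7..9 yields an implied sequence with known bases in those ranges and n everywhere else; `includes` then checks the parent's own sequence against it.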
Currently, inconsistencies are not checked for. The assumption is that these would be flagged as 10520 errors when the lower-level CDs are validated. So, essentially, if an SBOLDocument does not have 10520 errors, then it meets condition 1 below. If it does violate this rule, then it meets condition 3 below. I’m not sure what condition 2 is, and I’m also not sure what “unique sequence” means. In IUPAC, you are allowed to express uncertainty in bases, so any time you have a character that is not A, T, G, or C, you are not specifying a “unique” sequence.
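The IUPAC point can be illustrated with a short Python sketch. The table is the standard IUPAC DNA ambiguity code; the function names are hypothetical and only show one way to test whether a sequence is "unique" in the strict A/T/G/C sense.

```python
from math import prod

# Standard IUPAC nucleotide codes (DNA), mapped to the bases they allow.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "ag", "y": "ct", "s": "cg", "w": "at", "k": "gt", "m": "ac",
    "b": "cgt", "d": "agt", "h": "act", "v": "acg",
    "n": "acgt",
}


def is_unambiguous(seq: str) -> bool:
    """True if every character is exactly one of a, c, g, t."""
    return all(ch in "acgt" for ch in seq.lower())


def count_expansions(seq: str) -> int:
    """Number of concrete A/C/G/T sequences the IUPAC string can stand for."""
    return prod(len(IUPAC[ch]) for ch in seq.lower())
```

Under this reading, a sequence is "unique" exactly when `count_expansions` returns 1.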
On Jan 19, 2017, at 4:35 PM, Raik Grünberg wrote:
We need a clear definition of how to "flatten" the overall sequence of an SBOL construct. This needs to be described in the spec.
Three possible outcomes:
1. straightforward to flatten without conflicts
2. cannot generate a unique sequence (although it might be possible)
3. conflicts, no actual sequence defined
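One possible reading of these three outcomes, sketched in Python. This is an interpretation, not something the spec defines: the `Outcome` names, the `(start, bases)` fragment representation, and the choice to treat leftover unknown positions as outcome 2 are all assumptions.

```python
from enum import Enum
from typing import Iterable, Tuple


class Outcome(Enum):
    FLATTENED = 1    # every position filled, no disagreement
    NOT_UNIQUE = 2   # no disagreement, but some positions are not a/c/g/t
    CONFLICT = 3     # two fragments disagree on at least one position


def flatten(length: int, fragments: Iterable[Tuple[int, str]]) -> Tuple[Outcome, str]:
    """Overlay (1-based start, bases) fragments on a sequence of 'n'."""
    seq = ["n"] * length
    conflict = False
    for start, bases in fragments:
        if start < 1 or start - 1 + len(bases) > length:
            raise ValueError("fragment does not fit into the parent sequence")
        for offset, base in enumerate(bases.lower()):
            pos = start - 1 + offset
            if seq[pos] == "n":
                seq[pos] = base          # first fragment to cover this position
            elif seq[pos] != base:
                conflict = True          # overlapping fragments disagree
    flat = "".join(seq)
    if conflict:
        return Outcome.CONFLICT, flat
    if any(ch not in "acgt" for ch in flat):
        return Outcome.NOT_UNIQUE, flat
    return Outcome.FLATTENED, flat
```

On a conflict, the sketch keeps whichever base was written first; a real tool would more likely report the conflicting positions instead.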
Great comment. The libSBOLj method could serve as a reference and starting point then. Option 2 is supposed to indicate that we have to draw the line somewhere with regard to what kind of complexity an SBOL-compatible tool is supposed to resolve and understand.
While flattening is now much simpler in SBOL 3, writing up some canonical examples would still be highly valuable.
See also somewhat related issue #7