-
It sounds like there are two different types of information that potentially rely on this operation:

1. Summary information about lengths
2. Information about positions

So for instance, say you have a chromosome of length 100 with the p arm going from 0-45, the centromere going from 45-55, and the q arm from 55-100. Then if
-
I'm not sure about implicit use of a stored mask: it wouldn't be obvious to someone reading the code that one was being used. It would also be a breaking change to every method modified to use one, as the current expectation is that all sites will be used. Once we have mask usage in the stats framework, storing a mask may be useful, at least in top-level metadata.
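To make the explicit-versus-implicit point concrete, here is a minimal sketch in plain Python, not the tskit API. The function names and the `mask` parameter are hypothetical, and the mask is assumed to be a list of non-overlapping half-open `(left, right)` intervals. The default of `mask=None` keeps the current behaviour of using everything.

```python
# Hypothetical helpers for illustration only; none of these names exist in tskit.

def unmasked_length(sequence_length, mask=None):
    """Return the span remaining after removing masked intervals.

    mask is assumed to be a list of non-overlapping, half-open
    (left, right) intervals. None means nothing is masked.
    """
    if mask is None:
        return sequence_length
    return sequence_length - sum(right - left for left, right in mask)


def is_masked(position, mask=None):
    """Return True if position falls inside any masked interval."""
    if mask is None:
        return False
    return any(left <= position < right for left, right in mask)


# Chromosome of length 100 with the centromere at 45-55, as in the example above.
centromere_mask = [(45, 55)]

print(unmasked_length(100))                        # no mask: full length
print(unmasked_length(100, mask=centromere_mask))  # length summary excludes the centromere
print(is_masked(50, mask=centromere_mask))         # position query inside the centromere
```

Because the mask is passed at the call site, anyone reading the code can see that one is in use, and existing callers that pass nothing are unaffected.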
-
In 2a7341c two sorts of mask were implemented for VCF output. In some cases, we might want to save a default mask within the tree sequence itself, for example one that masks off the telomeres, or (in the case of the tree sequence of, say, the q arm of a chromosome) a large region which is there simply in order to get the positions aligned correctly. By default, these regions could also be ignored by the stats framework (see #346 for a related discussion). @gtsambos noticed this problem with the unified genealogy tree sequences recently.
Would it be possible to store a mask somehow at the top level of the tree sequence? I assume this isn't metadata if it is used by functions such as `write_VCF` or the stats stuff.