Clustering a set of haplotypes by successive splitting and merging steps.
Running the notebook requires:
- a Python 3 installation,
- Jupyter Notebook,
- (optional) a dedicated virtual environment, activated before launching the notebook.
This notebook is intended as a case study of the algorithm implemented in the BEAGLE software, which is used in my research project. The purpose was to gain insight into the clustering mechanisms and to get hands-on experience with a graph library (NetworkX was chosen here).
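To give a feel for the split-then-merge idea, here is a minimal sketch using NetworkX. It is not the code from the notebook or from BEAGLE: the function names (`build_tree`, `merge_levels`) and the toy haplotypes are made up for illustration, and the merge rule is a deliberate simplification that only merges nodes whose downstream haplotypes are exactly identical, whereas the published method relies on a similarity criterion described in the articles listed below.

```python
from collections import defaultdict

import networkx as nx


def build_tree(haplotypes):
    """Splitting step: one tree level per marker, one node per observed prefix."""
    tree = nx.DiGraph()
    tree.add_node(())  # root = empty prefix
    for hap in haplotypes:
        for i, allele in enumerate(hap):
            tree.add_edge(tuple(hap[:i]), tuple(hap[: i + 1]), allele=allele)
    return tree


def merge_levels(tree, haplotypes):
    """Merging step (simplified assumption): nodes on the same level are merged
    when the sets of haplotype suffixes observed after them are identical."""
    # Collect the downstream suffixes seen after every prefix.
    suffixes = defaultdict(set)
    for hap in haplotypes:
        for i in range(len(hap) + 1):
            suffixes[tuple(hap[:i])].add(tuple(hap[i:]))

    # Pick one representative node per (level, suffix set) group.
    representative = {}
    for node in sorted(tree.nodes):
        key = (len(node), frozenset(suffixes[node]))
        representative.setdefault(key, node)

    def rep(node):
        return representative[(len(node), frozenset(suffixes[node]))]

    # Rebuild the graph on representatives; the edge key is the allele, so two
    # alleles leading to the same merged node remain two distinct edges.
    merged = nx.MultiDiGraph()
    merged.add_node(rep(()))
    for parent, child, allele in tree.edges(data="allele"):
        merged.add_edge(rep(parent), rep(child), key=allele)
    return merged


# Toy data: four haplotypes over three biallelic markers.
haplotypes = ["000", "011", "100", "111"]
tree = build_tree(haplotypes)
merged = merge_levels(tree, haplotypes)
print(tree.number_of_nodes(), merged.number_of_nodes())
```

On this toy input, the first marker carries no information about the following ones, so both level-1 nodes collapse into a single cluster, while the two level-2 clusters stay separate.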
Additional references and explanations are provided in the notebook itself. In particular, these two articles describe the original implementation of the haplotype clustering algorithm:
- Browning, S.R.: Multilocus association mapping using variable-length Markov chains. Am. J. Hum. Genet. 78, 903–913 (2006)
- Browning, S.R., Browning, B.L.: Rapid and accurate haplotype phasing and missing-data inference for whole-genome association studies by use of localized haplotype clustering. Am. J. Hum. Genet. 81, 1084–1097 (2007)