SpecHap

SpecHap is an ultrafast phasing algorithm based on spectral graph analysis. It currently supports general WGS sequencing, Hi-C, 10X linked reads, PacBio SMRT, and Oxford Nanopore.

Getting Started

To build SpecHap, run the following command:

cd /path/to/SpecHap/
mkdir build
cd build
cmake ..
make && make install

To see the options SpecHap supports, run the following command:

SpecHap --help

A modified version of the ExtractHair utility (the extractHAIRS command), originally from HapCUT2, is needed for fragment processing. To install it, run:

cd /path/to/SpecHap/
git submodule init && git submodule update
cd submodules/htslib
git checkout 26229a3
cd ../samtools
git checkout 255f97d
cd ../../hairs-src
make

Prerequisites

SpecHap relies on ARPACK for eigenvalue computation. For stability, we recommend arpack-ng.

arpack-ng is easy to compile (it supports CMake as well as the autotools steps shown here). Ensure that BLAS and LAPACK are installed before compiling.

To build arpack-ng, run:

cd /path/to/arpack-ng/
sh bootstrap
./configure --enable-icb
make && make install

htslib is also required. To install it, run:

cd /path/to/htslib
autoheader  #required if htslib is cloned from github
autoconf    #required if htslib is cloned from github
./configure
make && make install
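
Before building, it can help to confirm that the tools used in the commands above are available on the PATH. The following is a minimal sketch; the function name check_tools is hypothetical and not part of SpecHap.

```shell
# Hypothetical helper (not part of SpecHap): report which of the given
# commands are missing from PATH. Returns non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_tools git cmake make autoconf autoheader` checks the build tools named in this README.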

Using SpecHap

Data preprocessing

SpecHap requires at least a fragment file and a bgzipped, indexed VCF to perform phasing. Ensure that your VCF is sorted by position. The fragment file must be sorted in the same order.
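
One way to sanity-check the ordering requirement is to scan the VCF positions before phasing. The sketch below is an illustration only: vcf_is_sorted is a hypothetical helper, not a SpecHap tool, and it assumes an uncompressed, tab-separated VCF on standard input.

```shell
# Hypothetical helper (not part of SpecHap): succeed if the uncompressed VCF
# on stdin has non-decreasing positions within each run of a chromosome.
# It does not detect a chromosome that reappears later in the file.
vcf_is_sorted() {
    awk -F '\t' '
        BEGIN { bad = 0 }
        /^#/ { next }                          # skip header lines
        $1 != chrom { chrom = $1; last = 0 }   # new chromosome: reset
        ($2 + 0) < last { bad = 1; exit }      # position went backwards
        { last = $2 + 0 }
        END { exit bad }
    '
}
```

Because bgzip output is gzip-compatible, a compressed VCF can be checked with `gzip -dc /your/gzvcf/file | vcf_is_sorted`.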

To generate the fragment file, run the following command:

extractHAIRS --bam /your/bam/file --VCF /your/vcf/file --out fragment_file

For Hi-C data, use:

extractHAIRS --bam /your/bam/file --VCF /your/vcf/file --out fragment_file --hic 1

For 10X linked reads, use:

extractHAIRS --bam /your/bam/file --VCF /your/vcf/file --out fragment_file --10x 1

You also need a BED file giving each barcode's inferred spanning range. You can generate it with BarcodeExtract:

BarcodeExtract /your/bam/file barcode_spanning.bed
bgzip -c barcode_spanning.bed > barcode_spanning.bed.gz
tabix -p bed barcode_spanning.bed.gz

With PacBio SMRT:

extractHAIRS --pacbio 1 --bam /your/bam/file --VCF /your/vcf/file --out fragment_file --ref /reference/file

For Oxford Nanopore, use the same command with --ont in place of --pacbio.

Once you have a fragment file, sort it with the following command if you are phasing 10X or Hi-C data, or if the new format is specified:

sort -n -k6 in.frag > sorted.frag

If paired-end NGS, PacBio SMRT, or Oxford Nanopore data is used with the default format, use:

sort -n -k3 in.frag > sorted.frag
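
The two sorting cases above can be wrapped in one small helper. This is a sketch only: sort_fragments and its "new"/"default" labels are hypothetical names chosen for illustration, while the sort keys are the ones given above.

```shell
# Hypothetical helper (not part of SpecHap).
#   $1: "new"     for 10X, Hi-C, or the new fragment format (sort on column 6)
#       "default" for paired-end NGS, PacBio, or Nanopore (sort on column 3)
#   $2: input fragment file
#   $3: sorted output file
sort_fragments() {
    case "$1" in
        new)     key=6 ;;
        default) key=3 ;;
        *) echo "unknown fragment format: $1" >&2; return 1 ;;
    esac
    sort -n -k"$key" "$2" > "$3"
}
```

For example, `sort_fragments default in.frag sorted.frag` runs the same command as the default-format case.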

Run SpecHap

Detailed usage is available via:

SpecHap --help

For instance, to phase PacBio SMRT reads:

SpecHap --vcf /your/gzvcf/file --frag /your/fragment/file --out /phased/vcf --pacbio
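
The PacBio steps from this README (fragment extraction, sorting, phasing) can be chained as in the sketch below. The helpers run and phase_pacbio are hypothetical, and all paths are placeholders. The README passes a VCF to extractHAIRS and a bgzipped VCF to SpecHap without saying whether they can be the same file, so the sketch takes both as separate arguments. Setting DRY_RUN=1 prints the commands instead of executing them.

```shell
# Sketch only: chains the PacBio commands shown in this README.
# run() and phase_pacbio() are hypothetical helpers, not part of SpecHap.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"      # dry run: print the command
    else
        "$@"           # real run: execute it
    fi
}

phase_pacbio() {
    bam=$1       # aligned reads
    vcf=$2       # VCF given to extractHAIRS
    vcf_gz=$3    # bgzipped, indexed VCF given to SpecHap
    ref=$4       # reference file
    out=$5       # phased VCF to write

    run extractHAIRS --pacbio 1 --bam "$bam" --VCF "$vcf" \
        --out "$out.frag" --ref "$ref" || return 1
    # Default-format PacBio fragments are sorted on column 3;
    # "-o file" writes the result to a file instead of stdout.
    run sort -n -k3 -o "$out.sorted.frag" "$out.frag" || return 1
    run SpecHap --vcf "$vcf_gz" --frag "$out.sorted.frag" \
        --out "$out" --pacbio
}
```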

Script for VCF handling.

The ./scripts folder contains a number of scripts that we use to benchmark the accuracy and completeness of the assembled haplotypes.

Author

SpecHap is developed by the DeepOmics lab under the supervision of Dr. Li Shuaicheng, City University of Hong Kong, Hong Kong, China.

To contact us, send email to [email protected]

Built With

License

This project is licensed under the MIT License - see the LICENSE.md file for details
