starts and ends should go to different files #1

Open
nsheff opened this issue Oct 20, 2020 · 3 comments
Labels
bug Something isn't working

Comments

@nsheff
Member

nsheff commented Oct 20, 2020

The number of bases in the output seems to be off.

./uniwig test2.bed 1 5 | wigToBigWig stdin `refgenie seek hg19/fasta.chrom_sizes` out.bw
There's more than one value for chr11 base 50 (in coordinates that start with 1).
cat test2.bed
chr11   10      50
chr11   20      76
nsheff added the "bug" label Oct 20, 2020
@nsheff
Member Author

nsheff commented Oct 20, 2020

Yeah, something's not right here. Why would this example output 2 headers?

cat test3.bed
chr1    1       10
chr1    8       15
chr1    9       22
./uniwig test3.bed 1 0
fixedStep chrom=chr1 start=1 step=1
1
0
0
0
0
0
0
1
1
0
0
0
0
0
0
0
0
0
0
0
0
0
fixedStep chrom=chr1 start=10 step=1
0
0
0
0
0
0
0
0
0
1
0
0
0
0
1
0
0
0
0
0
0
1
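
One way to confirm what wigToBigWig is complaining about is to check which bases receive a value from more than one fixedStep block. The sketch below is not part of uniwig; `duplicated_positions` is a hypothetical helper that assumes plain fixedStep wiggle text with no span= setting. In the output above, both blocks hold 22 values and start at 1 and 10, so bases 10 through 22 are written twice.

```python
from collections import Counter


def duplicated_positions(wig_text):
    """Return sorted (chrom, base) pairs that get more than one value.

    Hypothetical helper: handles only fixedStep blocks without span=.
    """
    seen = Counter()
    chrom, pos, step = None, None, 1
    for line in wig_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("fixedStep"):
            # Header fields look like chrom=chr1 start=1 step=1
            fields = dict(kv.split("=", 1) for kv in line.split()[1:])
            chrom = fields["chrom"]
            pos = int(fields["start"])
            step = int(fields.get("step", 1))
            continue
        if chrom is None:
            raise ValueError("data line before any fixedStep header")
        seen[(chrom, pos)] += 1  # one data line is one base
        pos += step
    return sorted(key for key, n in seen.items() if n > 1)


# Tiny example: the second block restarts at base 2, so bases 2 and 3 repeat.
example = (
    "fixedStep chrom=chr1 start=1 step=1\n1\n0\n0\n"
    "fixedStep chrom=chr1 start=2 step=1\n0\n1\n"
)
print(duplicated_positions(example))
```

An empty list from a check like this would mean no base is assigned twice, which is the condition wigToBigWig is enforcing here.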

@nsheff
Member Author

nsheff commented Oct 20, 2020

I see now. It appears to be spitting out the starts and then the ends right afterwards!

nsheff changed the title from "bug in fixedstep calculation, number of bases is incorrect" to "starts and ends should go to different files" Oct 20, 2020
@nsheff
Member Author

nsheff commented Oct 20, 2020

So, starts and ends need to go to different files somehow.

It's nice that it prints to stdout, though, so I can just pipe to wigToBigWig. But that's not compatible with 2 outputs.

Is there an efficiency gain in doing both at the same time? If so, why is one printed after the other? If you're really going through the file twice, then I'd suggest just introducing CLI flags, --starts and --ends, and running it twice.

There are efficiency gains to be had from doing them simultaneously, but then you have to print to 2 places; you can't print one after the other without retaining everything in memory. In that case I'd do --starts starts.wig --ends ends.wig. Then you only have to go through the file once, and just print to 2 separate file handles as you go, for each start/end.
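
A rough sketch of the single-pass idea, in Python rather than in uniwig's own code, with `write_block` and `split_starts_ends` as hypothetical names. It assumes the BED input is grouped by chromosome, so only one chromosome's counts are held at a time, and it writes positions exactly as they appear in the BED file (no coordinate shift, no smoothing).

```python
import io
from collections import Counter
from itertools import groupby


def write_block(handle, chrom, counts):
    """Write one fixedStep block from the lowest to the highest counted base."""
    if not counts:
        return
    first, last = min(counts), max(counts)
    handle.write(f"fixedStep chrom={chrom} start={first} step=1\n")
    for pos in range(first, last + 1):
        handle.write(f"{counts[pos]}\n")  # Counter gives 0 for uncounted bases


def split_starts_ends(bed_lines, starts_out, ends_out):
    """Read BED lines once; send start counts and end counts to separate handles.

    Assumption: lines for the same chromosome are adjacent in the input.
    """
    records = (line.split() for line in bed_lines if line.strip())
    for chrom, group in groupby(records, key=lambda fields: fields[0]):
        starts, ends = Counter(), Counter()
        for fields in group:
            starts[int(fields[1])] += 1
            ends[int(fields[2])] += 1
        # Flush both outputs before moving on to the next chromosome.
        write_block(starts_out, chrom, starts)
        write_block(ends_out, chrom, ends)


# The three intervals from test3.bed, written to in-memory handles.
bed = ["chr1\t1\t10", "chr1\t8\t15", "chr1\t9\t22"]
starts_out, ends_out = io.StringIO(), io.StringIO()
split_starts_ends(bed, starts_out, ends_out)
print(starts_out.getvalue())
print(ends_out.getvalue())
```

With real files, the two handles would just be two `open(..., "w")` calls. Each output then carries a single header per chromosome, so neither file assigns a base twice.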
