Advanced FASTQ processing, Biostar handbook #315

Open
Meshesha78 opened this issue Jan 3, 2024 · 2 comments

@Meshesha78

I was trying to download the viral reference genome while working through the 'Advanced FASTQ processing' section of the Biostar Handbook. Here are the commands I ran:

wget -nc http://data.biostarhandbook.com/reads/duplicated-reads.fq.gz
wget -nc ftp://ftp.ncbi.nih.gov/refseq/release/viral/viral.2.1.genomic.fna.gz
wget -nc ftp://ftp.ncbi.nih.gov/refseq/release/viral/viral.2.protein.faa.gz

The first download (duplicated-reads) succeeds, but the last two, the viral genomic and protein sequences, keep failing with an error. The output looks like this:

$ wget ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/viral.2.1.genomic.fna.gz
--2024-01-02 16:46:27-- ftp://ftp.ncbi.nlm.nih.gov/refseq/release/viral/viral.2.1.genomic.fna.gz
=> 'viral.2.1.genomic.fna.gz'
Resolving ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)... 130.14.250.13, 130.14.250.12, 2607:f220:41e:250::10, ...
Connecting to ftp.ncbi.nlm.nih.gov (ftp.ncbi.nlm.nih.gov)|130.14.250.13|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done. ==> PWD ... done.
==> TYPE I ... done. ==> CWD (1) /refseq/release/viral ... done.
==> SIZE viral.2.1.genomic.fna.gz ... done.

==> PASV ... done. ==> RETR viral.2.1.genomic.fna.gz ...
No such file 'viral.2.1.genomic.fna.gz'.

Also, I don't see the files at https://ftp.ncbi.nlm.nih.gov/refseq/release/viral/

Thanks,

Mes

@ara-rdgz

ara-rdgz commented Jan 5, 2024

Hello!

I was having exactly the same problem and, just like you, I checked the NCBI FTP site for those files and could not find them. So I did the next best thing and downloaded the files with the most similar names. If you go to the FTP site, you'll see that viral.2.1.genomic.fna.gz and viral.2.protein.faa.gz do not exist, but viral.1.1.genomic.fna.gz and viral.1.protein.faa.gz do. I downloaded those and used them for the subsequent steps.
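For anyone who lands here later, the workaround can be sketched in shell roughly as follows. This is only a sketch under assumptions: `list_viral_files` is a hypothetical helper name, the pattern it matches is inferred from the four file names mentioned in this thread, and it assumes the HTTPS directory page linked above returns a listing that contains the file names as plain text. The network calls are left commented out so that nothing is downloaded by accident.

```shell
#!/usr/bin/env bash
# Sketch: check which viral RefSeq FASTA files currently exist instead of
# hard-coding a volume number (such as viral.2.1) that may have been retired.

# Directory URL as quoted earlier in this thread.
REFSEQ_VIRAL_URL="https://ftp.ncbi.nlm.nih.gov/refseq/release/viral/"

# list_viral_files (hypothetical helper): read a directory listing on stdin
# and print the unique genomic (.fna.gz) and protein (.faa.gz) file names.
# The pattern is an assumption based on the names seen in this thread.
list_viral_files() {
    grep -oE 'viral\.[0-9]+(\.[0-9]+)?\.(genomic\.fna|protein\.faa)\.gz' \
        | LC_ALL=C sort -u
}

# Intended use (requires network access, so left commented out):
#   wget -q -O - "$REFSEQ_VIRAL_URL" | list_viral_files
#
# Then fetch whatever the listing reports, for example the files that
# existed when this comment was written:
#   wget -nc "${REFSEQ_VIRAL_URL}viral.1.1.genomic.fna.gz"
#   wget -nc "${REFSEQ_VIRAL_URL}viral.1.protein.faa.gz"
```

Listing first and downloading second avoids depending on a specific volume number in the handbook commands, since the set of files in that directory evidently changes between RefSeq releases.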

@Meshesha78
Author

Thanks for the tip! I saw that viral.2.1.genomic.fna.gz and viral.2.protein.faa.gz were also used in some published work, but they no longer exist on the FTP site. I used the available 1.1 versions as you suggested, and they work fine for this purpose.
Thanks again,

Mes
