Data input #1

Open
Shellfishgene opened this issue Aug 21, 2018 · 5 comments
Comments

@Shellfishgene

Hi,
I don't understand how to prepare data for MicroPheno. Typically, raw 16S data comes as two fastq files per sample, R1 and R2. How do these need to be processed? Joined, with one fasta/fastq file per sample? Or like the "Ecological environments" example, with a single fasta file and the samples denoted in the fasta headers?

@ehsanasgari
Owner

Hi there, thank you for your interest in MicroPheno. Yes, one file per sample. You can then give the directory containing those files as the input to the pipeline.
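
As a rough illustration of the "one file per sample" preparation, here is a minimal sketch that pairs up R1/R2 files by sample name before they are joined with a read-merging tool of your choice. The file naming pattern (`<sample>_R1.fastq` / `<sample>_R2.fastq`), the function name `group_read_pairs`, and the example sample names are assumptions made for illustration; none of this is part of MicroPheno itself.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed naming convention (hypothetical): <sample>_R1.fastq / <sample>_R2.fastq,
# optionally gzipped. Adjust the pattern to match your own file names.
PAIR_PATTERN = re.compile(r"^(?P<sample>.+)_R(?P<read>[12])\.(?:fastq|fq)(?:\.gz)?$")


def group_read_pairs(filenames):
    """Map each sample name to its (R1, R2) file names.

    Raises ValueError if a sample is missing one of its two read files.
    """
    found = defaultdict(dict)
    for name in filenames:
        match = PAIR_PATTERN.match(Path(name).name)
        if match is None:
            continue  # not a paired-end read file under the assumed naming
        found[match["sample"]][match["read"]] = name

    pairs = {}
    for sample, reads in sorted(found.items()):
        if set(reads) != {"1", "2"}:
            raise ValueError(f"sample {sample!r} is missing R1 or R2")
        pairs[sample] = (reads["1"], reads["2"])
    return pairs


if __name__ == "__main__":
    example = ["coralA_R1.fastq", "coralA_R2.fastq", "coralB_R2.fastq", "coralB_R1.fastq"]
    for sample, (r1, r2) in group_read_pairs(example).items():
        # Each pair would then be merged into a single file named after the sample.
        print(sample, r1, r2)
```

Each pair would be merged into a single per-sample file, and the directory holding those merged files is what gets passed to the pipeline.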

However, the ECO datasets in the paper are different. They are not metagenomic samples but representative sequences.

Please let me know if you have further questions or if the answer was not clear enough.

@Shellfishgene
Author

It seems to work with the input files, but after the k-mer steps there is an error (see below). Also I think the requirements file is missing tensorflow and biopython.

6 -mer bootstrapping completed
Traceback (most recent call last):
  File "/sfs/fs6/home-geomar/smomw240/miniconda2/envs/micropheno/lib/python3.6/site-packages/matplotlib/texmanager.py", line 393, in make_dvi
    stderr=subprocess.STDOUT)
  File "/miniconda2/envs/micropheno/lib/python3.6/subprocess.py", line 336, in check_output
    **kwargs).stdout
  File "/miniconda2/envs/micropheno/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['latex', '-interaction=nonstopmode', '850c700a4a598c767be1824e3c42fb00.tex']' returned non-zero exit status 1.
RuntimeError: LaTeX was not able to process the following string:
b'(i) \\\\textbf{Self-inconsistency $\\\\bar{D_S}$,} with respect to sample size (N)\\\\\\\\ demonstrated for different k values in the coral_test dataset'

Here is the full report generated by LaTeX:
This is pdfTeX, Version 3.1415926-2.5-1.40.14 (TeX Live 2013)
 restricted \write18 enabled.
entering extended mode
(./850c700a4a598c767be1824e3c42fb00.tex
LaTeX2e <2011/06/27>
Babel <v3.8m> and hyphenation patterns for english, dumylang, nohyphenation, lo
aded.
(/usr/share/texlive/texmf-dist/tex/latex/base/article.cls
Document Class: article 2007/10/19 v1.4h Standard LaTeX document class
(/usr/share/texlive/texmf-dist/tex/latex/base/size10.clo))
(/usr/share/texlive/texmf-dist/tex/latex/type1cm/type1cm.sty)
(/usr/share/texlive/texmf-dist/tex/latex/base/textcomp.sty
(/usr/share/texlive/texmf-dist/tex/latex/base/ts1enc.def))
(/usr/share/texlive/texmf-dist/tex/latex/geometry/geometry.sty
(/usr/share/texlive/texmf-dist/tex/latex/graphics/keyval.sty)
(/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifpdf.sty)
(/usr/share/texlive/texmf-dist/tex/generic/oberdiek/ifvtex.sty)
(/usr/share/texlive/texmf-dist/tex/generic/ifxetex/ifxetex.sty)

Package geometry Warning: Over-specification in `h'-direction.
    `width' (5058.9pt) is ignored.


Package geometry Warning: Over-specification in `v'-direction.
    `height' (5058.9pt) is ignored.

)
No file 850c700a4a598c767be1824e3c42fb00.aux.
(/usr/share/texlive/texmf-dist/tex/latex/base/ts1cmr.fd)
*geometry* driver: auto-detecting
*geometry* detected driver: dvips
! Missing $ inserted.
<inserted text>
                $
l.12 ...rated for different k values in the coral_
                                                  test dataset}
! Extra }, or forgotten $.
l.12 ...ferent k values in the coral_test dataset}

! Missing $ inserted.
<inserted text>
                $
l.13 \end{document}

[1] (./850c700a4a598c767be1824e3c42fb00.aux) )
(\end occurred inside a group at level 1)

### simple group (level 1) entered at line 12 ({)
### bottom level
(see the transcript file for additional information)
Output written on 850c700a4a598c767be1824e3c42fb00.dvi (1 page, 560 bytes).
Transcript written on 850c700a4a598c767be1824e3c42fb00.log.

@ehsanasgari
Owner

This error seems to be related to the use of LaTeX in the bootstrapping plots. You can simply remove the LaTeX markup from the plot labels if you want.
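
For context, the LaTeX log in the report stops at the underscore in `coral_test`: an unescaped `_` outside math mode is what triggers `Missing $ inserted`. One possible workaround, sketched below, is to escape the dataset name before it goes into the caption. The helper names `escape_latex_text` and `build_caption` are hypothetical and not part of MicroPheno; the caption text is modelled on the string shown in the traceback.

```python
def escape_latex_text(text):
    """Escape characters that LaTeX treats specially in text mode.

    Only meant for plain labels such as dataset names, not for strings
    that already contain LaTeX markup.
    """
    specials = {"_": r"\_", "%": r"\%", "&": r"\&", "#": r"\#"}
    return "".join(specials.get(ch, ch) for ch in text)


def build_caption(dataset_name):
    # Hypothetical caption builder, modelled on the string in the traceback.
    return (
        r"(i) \textbf{Self-inconsistency $\bar{D_S}$,} with respect to sample size (N)"
        r"\\ demonstrated for different k values in the "
        + escape_latex_text(dataset_name)
        + " dataset"
    )


if __name__ == "__main__":
    # The underscore in the dataset name is escaped before LaTeX sees it.
    print(build_caption("coral_test"))
```

Renaming the dataset so that it contains no underscore should avoid the problem as well. Alternatively, setting `matplotlib.rcParams["text.usetex"] = False` switches off external LaTeX rendering, in which case commands such as `\textbf` would need to be removed from the labels because they would otherwise be drawn literally.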

@Shellfishgene
Author

I ran it again with fewer samples and now it worked for some reason. Next question: What's supposed to be in labels_phenotypes.txt? I can't find info on that.

@FaalkL

FaalkL commented Jun 7, 2021

Hello @ehsanasgari,
I am a bit late, but I don't understand what goes in the "labels_phenotypes.txt" file either. Could we have an example?
Thank you for your attention.

3 participants