External Annotations

Julian Dosch edited this page Sep 16, 2022 · 3 revisions

fas.parseAnno

This script parses .tsv files (for example, from InterProScan) and creates a .json file for FAS:

fas.parseAnno -i INPUT.tsv -o OUTPUT.json -t TOOLNAME -f PROTEIN_ID_COLUMN PROTEIN_LENGTH_COLUMN FEATURE_ID_COLUMN START_COLUMN END_COLUMN

[-t|--tool_names] This option defines which database/tool (pfam, smart, ...) the features belong to.

[-f|--feature_columns] This takes 5 integer values that point to the columns in the tsv file that contain: (1) the protein id, (2) the protein length, (3) the feature id, (4) the start position of the feature, (5) the end position of the feature. The column indices start at 0 so the first column has index 0, the second index 1, etc.
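To illustrate the 0-based indexing, the sketch below prints every field of a single row together with the index that would be passed to -f. The row is a made-up example that assumes the usual InterProScan TSV column order (protein accession, MD5, sequence length, analysis, signature accession, description, start, stop, ...); the accession and values are hypothetical.

```shell
# Hypothetical InterProScan-style row (tab-separated); all values are made up.
row=$(printf 'P12345\tmd5sum\t350\tPfam\tPF00069\tProtein kinase domain\t10\t260\t1.2E-50\tT\t16-09-2022')

# awk numbers fields from 1, so subtract 1 to get the 0-based index used by -f.
indexed=$(printf '%s\n' "$row" | awk -F'\t' '{ for (i = 1; i <= NF; i++) printf "%d\t%s\n", i - 1, $i }')
printf '%s\n' "$indexed"
```

For a row like this, the protein id sits at index 0, the length at 2, the feature id at 4, and start/end at 6 and 7.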

If one tsv file contains features from multiple databases/tools (as InterProScan output does), you can use [-c|--tool_column] to set the index of the column that contains the database/tool name. In that case, all tool names must be given with [-t].
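One way to collect the names to pass to -t is to list the distinct values in the tool column. This is a minimal sketch using standard Unix tools, not part of FAS; the sample rows are hypothetical, and it assumes the names must be given as they appear in the file.

```shell
# Hypothetical input file; rows are made-up examples in InterProScan column order.
tsv=$(mktemp)
printf 'P1\tmd5\t350\tPfam\tPF00069\tdesc\t10\t260\n'  >  "$tsv"
printf 'P1\tmd5\t350\tSMART\tSM00220\tdesc\t12\t258\n' >> "$tsv"
printf 'P2\tmd5\t120\tPfam\tPF00018\tdesc\t5\t60\n'    >> "$tsv"

# -c uses 0-based indices while cut uses 1-based fields: tool column 3 is cut field 4.
tool_column=3
tools=$(cut -f "$((tool_column + 1))" "$tsv" | LC_ALL=C sort -u | tr '\n' ' ')
tools=${tools% }   # drop the trailing space left by tr
echo "$tools"
rm -f "$tsv"
```

The printed list can then be pasted after -t, with the same column index given to -c.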

[--ignore_lines] With this you can tell the parser to ignore the first n lines.
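Before choosing n, it can help to preview which rows remain once the leading lines are skipped. The sketch below does this with tail on a hypothetical file that has one header line; the file layout, the tool name, and the exact way the value is passed to --ignore_lines in the commented command are assumptions for illustration.

```shell
# Hypothetical custom TSV with one header line; columns: id, length, feature, start, end.
tsv=$(mktemp)
printf 'protein_id\tlength\tfeature\tstart\tend\n' >  "$tsv"
printf 'P1\t350\tPF00069\t10\t260\n'               >> "$tsv"
printf 'P2\t120\tPF00018\t5\t60\n'                 >> "$tsv"

# Show the rows that are left when the first n lines are skipped.
n=1
data_rows=$(tail -n "+$((n + 1))" "$tsv")
printf '%s\n' "$data_rows"
rm -f "$tsv"

# A matching call might then look like this (assumed syntax, tool name chosen for illustration):
# fas.parseAnno -i INPUT.tsv -o OUTPUT.json -t pfam -f 0 1 2 3 4 --ignore_lines 1
```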

InterProScan example

fas.parseAnno -i INPUT.tsv -o OUTPUT.json -t CDD Hamap PANTHER Pfam PIRSF SUPERFAMILY PRINTS Gene3D ProDom SMART TIGRFAM ProSiteProfiles ProSitePatterns SFLD Coils MobiDBLite Phobius SignalP_GRAM_POSITIVE SignalP_GRAM_NEGATIVE SignalP_EUK TMHMM -f 0 2 4 6 7 -c 3