External Annotations
This script can parse .tsv files (for example from InterProScan) and create a .json file for FAS:
fas.parseAnno -i INPUT.tsv -o OUTPUT.json -t TOOLNAME -f PROTEIN_ID_COLUMN PROTEIN_LENGTH_COLUMN FEATURE_ID_COLUMN START_COLUMN END_COLUMN
[-t|--tool_names] This option defines which database/tool (pfam, smart, ...) the features belong to.
[-f|--feature_columns] This takes 5 integer values that point to the columns in the tsv file that contain: (1) the protein id, (2) the protein length, (3) the feature id, (4) the start position of the feature, (5) the end position of the feature. The column indices start at 0 so the first column has index 0, the second index 1, etc.
If you have multiple databases/tools in one tsv file (like in InterPro), you can use [-c|--tool_column] to set the index of the column that contains the database/tool name. If you do this, all tool names need to be given with [-t].
[--ignore_lines] With this you can tell the parser to ignore the first n lines of the tsv file.
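To make the 0-based column indices concrete, here is a minimal Python sketch of how five indices and a number of skipped lines could select values from a tsv file. It is not the fas.parseAnno implementation; the function name read_features, the dictionary keys and the sample data are all hypothetical and chosen only for illustration.

```python
import csv
import io


def read_features(tsv_text, feature_columns, ignore_lines=0):
    """Hypothetical helper (not part of FAS): pick five 0-based columns per row."""
    pid_col, len_col, fid_col, start_col, end_col = feature_columns
    rows = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    features = []
    for line_no, row in enumerate(rows):
        # Skip the first `ignore_lines` lines (e.g. a header) and blank lines.
        if line_no < ignore_lines or not row:
            continue
        features.append({
            "protein_id": row[pid_col],
            "length": int(row[len_col]),
            "feature_id": row[fid_col],
            "start": int(row[start_col]),
            "end": int(row[end_col]),
        })
    return features


# Made-up input: one header line followed by one feature line.
example = "id\tlength\tfeature\tstart\tend\nP1\t250\tPF00001\t10\t120\n"
print(read_features(example, (0, 1, 2, 3, 4), ignore_lines=1))
```

In this sketch, the tuple (0, 1, 2, 3, 4) plays the role of the five values given to -f, and ignore_lines=1 skips the header line.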
fas.parseAnno -i INPUT.tsv -o OUTPUT.json -t CDD Hamap PANTHER Pfam PIRSF SUPERFAMILY PRINTS Gene3D ProDom SMART TIGRFAM ProSiteProfiles ProSitePatterns SFLD Coils MobiDBLite Phobius SignalP_GRAM_POSITIVE SignalP_GRAM_NEGATIVE SignalP_EUK TMHMM -f 0 2 4 6 7 -c 3
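The indices in the command above match the InterProScan tsv layout: column 0 is the protein accession, 2 the sequence length, 3 the analysis (tool) name, 4 the signature accession, 6 the start and 7 the stop position. The following Python sketch shows how one such row could be read with these indices and grouped by tool. The row values, the helper group_by_tool and the decision to skip rows from unlisted tools are assumptions for illustration, not the documented behaviour of fas.parseAnno.

```python
from collections import defaultdict

# One InterProScan-style tsv row; all values are made up for illustration.
row = (
    "P12345\t0123abcd\t300\tPfam\tPF00069\tProtein kinase domain"
    "\t25\t280\t1.2E-50\tT\t01-01-2024"
).split("\t")

feature_columns = (0, 2, 4, 6, 7)  # the five values passed to -f
tool_column = 3                    # the value passed to -c
tool_names = {"Pfam", "SMART"}     # a subset of the names passed to -t


def group_by_tool(rows, feature_columns, tool_column, tool_names):
    """Hypothetical helper (not part of FAS): bucket features by tool name."""
    pid, length, fid, start, end = feature_columns
    grouped = defaultdict(list)
    for r in rows:
        tool = r[tool_column]
        if tool not in tool_names:
            continue  # assumption: rows from tools not listed are skipped
        grouped[tool].append(
            (r[pid], int(r[length]), r[fid], int(r[start]), int(r[end]))
        )
    return dict(grouped)


grouped = group_by_tool([row], feature_columns, tool_column, tool_names)
print(grouped)
```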