File Formats
Vinh Tran edited this page Nov 7, 2023 · 16 revisions
The featuretypes file lists the feature types that are used for the calculation and specifies whether each of them should be linearized. Note that for non-core feature types, you need to provide the annotation.
#linearized
Pfam
SMART
#normal
fLPS
COILS2
SEG
TMHMM
SignalP
#checked
#class
pfam 0.3
fLPS 0.1
#type
pfam_WASH_WAHD 0.1
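The section layout shown above (a line starting with `#` followed by the entries that belong to it) can be read with a few lines of Python. This is only a minimal sketch, not the tool's own parser: the function name `parse_sections` is hypothetical, and it assumes that every `#` line opens a new section and that entries are separated by whitespace.

```python
def parse_sections(text):
    """Group the lines of a '#section' style file by their section header.

    Assumption: a line starting with '#' opens a section; every following
    non-empty line is an entry of that section, split on whitespace.
    """
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith("#"):
            current = line[1:].strip()
            sections.setdefault(current, [])
        elif current is not None:
            sections[current].append(line.split())
    return sections


# Example content copied from the featuretypes listing above.
FEATURETYPES = """#linearized
Pfam
SMART
#normal
fLPS
COILS2
SEG
TMHMM
SignalP
"""

featuretypes = parse_sections(FEATURETYPES)
linearized = [entry[0] for entry in featuretypes["linearized"]]
normal = [entry[0] for entry in featuretypes["normal"]]
```

Entries are kept as lists of fields so that lines with a second column, such as `pfam 0.3`, keep their value next to the name.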
The pairwise file is a simple tab-separated file with:
seed_id_1 query_id_1
seed_id_2 query_id_2
There is one line for each pair that you want to calculate the score for. All protein IDs must be present in the seed/query input files.
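Before starting a run, it can help to check that every ID in the pairwise file really occurs in the seed and query inputs. The following is a minimal sketch under stated assumptions: `parse_pairs` and `missing_ids` are hypothetical helper names, and the seed and query IDs are assumed to have been collected into Python sets beforehand (for example from the input FASTA headers).

```python
def parse_pairs(text):
    """Read tab-separated 'seed_id<TAB>query_id' lines into a list of tuples."""
    pairs = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        if not raw.strip():
            continue  # ignore blank lines
        fields = raw.split("\t")
        if len(fields) != 2:
            raise ValueError(f"line {line_no}: expected 2 columns, got {len(fields)}")
        pairs.append((fields[0].strip(), fields[1].strip()))
    return pairs


def missing_ids(pairs, seed_ids, query_ids):
    """Return the sorted seed and query IDs that are absent from the inputs."""
    missing_seed = sorted({s for s, _ in pairs if s not in seed_ids})
    missing_query = sorted({q for _, q in pairs if q not in query_ids})
    return missing_seed, missing_query


# Example: seed_id_2 is listed in the pairwise file but not in the seed input.
PAIRWISE = "seed_id_1\tquery_id_1\nseed_id_2\tquery_id_2\n"
pairs = parse_pairs(PAIRWISE)
result = missing_ids(pairs, {"seed_id_1"}, {"query_id_1", "query_id_2"})
```

Here `result` holds one list of missing seed IDs and one list of missing query IDs; both lists are empty when the pairwise file is consistent with the inputs.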
Tab-separated file with 4 columns:
id_A taxon_1 id_A taxon_2
id_B taxon_1 id_B taxon_2
id_A taxon_1 id_C taxon_3
Note: with the current output file format, protein IDs must be unique across taxa (e.g. two taxa cannot contain proteins with the same ID).
The phyloprofile mapping file is a tab-separated file that contains the NCBI ID of the source proteome of each query protein.
A2P2R3 ncbi559292
A5Z2X5 ncbi559292
D6VPM8 ncbi559292
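A mapping file in this layout can be loaded into a dictionary for quick lookups. This is a minimal sketch, assuming exactly two tab-separated columns per line and the `ncbi` prefix seen in the example above; the names `parse_mapping` and `taxon_number` are hypothetical.

```python
def parse_mapping(text):
    """Map each protein ID to its taxon label, e.g. 'A2P2R3' -> 'ncbi559292'."""
    mapping = {}
    for line_no, raw in enumerate(text.splitlines(), start=1):
        if not raw.strip():
            continue  # ignore blank lines
        fields = raw.split("\t")
        if len(fields) != 2:
            raise ValueError(f"line {line_no}: expected 2 columns, got {len(fields)}")
        protein_id, taxon = (field.strip() for field in fields)
        mapping[protein_id] = taxon
    return mapping


def taxon_number(taxon):
    """Strip the assumed 'ncbi' prefix and return the numeric taxonomy ID."""
    prefix = "ncbi"
    if not taxon.startswith(prefix):
        raise ValueError(f"unexpected taxon label: {taxon!r}")
    return int(taxon[len(prefix):])


# Example content copied from the mapping listing above.
MAPPING = "A2P2R3\tncbi559292\nA5Z2X5\tncbi559292\nD6VPM8\tncbi559292\n"
mapping = parse_mapping(MAPPING)
tax_id = taxon_number(mapping["A2P2R3"])
```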