<?xml version="1.0"?>
<tool id="clean_fasta_header_1" name="Clean fasta header">
<description>Removes description fields from FASTA header lines</description>
<command>sed 's/^\(>[^[:space:]]*\).*/\1/' '$inputFastaFile' > '$fasta_outputfile'</command>
<inputs>
<param format="fasta" name="inputFastaFile" type="data" label="FASTA file"/>
</inputs>
<outputs>
<data format="fasta" name="fasta_outputfile" />
</outputs>
<help>
.. class:: infomark
**TIP**
This tool requires input in *fasta* format.
It removes everything after the sequence identifier on each header line (the description fields), so that the file can be passed to tools that cannot handle them.
----
**Example**
--Query sequence
::
>contig00001 gene=isogroup00001 length=2159
tttAaGCATTTAACACTGCATATTGATTGATATAGTTGTTCAGTACAAGCCAATTACATT
GTAGACATAAAACAAAGCATTCGAAACAGTTGAAATTTTGATTCCTCTATACTGGATCAG
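GCGGTAATCAGGGGAAGGAAACCATGGTGTAAGGCTGCATCCCATACTTTATCTATGTCA
>contig00003 gene=isogroup00001 length=2206
ggTGGCTGCTTTCTCAAATCCACCCCTTCCCAAGGAAACCCTAAACTCGCAGATAAATTT
GTAGGGTTTCTATGTCGACCGAGCGCCGTCGGAAAGTGAGCCTTTTCGACGTAGTTGAC
GAGACCTCAGTCTCTG...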
--Output
::
>contig00001
tttAaGCATTTAACACTGCATATTGATTGATATAGTTGTTCAGTACAAGCCAATTACATT
GTAGACATAAAACAAAGCATTCGAAACAGTTGAAATTTTGATTCCTCTATACTGGATCAG
GCGGTAATCAGGGGAAGGAAACCATGGTGTAAGGCTGCATCCCATACTTTATCTATGTCA
>contig00003
ggTGGCTGCTTTCTCAAATCCACCCCTTCCCAAGGAAACCCTAAACTCGCAGATAAATTT
GTAGGGTTTCTATGTCGACCGAGCGCCGTCGGAAAGTGAGCCTTTTCGACGTAGTTGAC
GAGACCTCAGTCTCTG...
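For illustration only, the header cleanup shown above can be sketched as a small shell function. The function name clean_fasta_header and the file names are hypothetical, and the pattern assumes that the identifier is everything between the leading > and the first whitespace character. It is not necessarily the exact command this tool runs.

```shell
# Hypothetical helper (not part of the tool wrapper).
# Header lines start with ">"; keep the identifier (all non-whitespace
# characters after ">") and drop the rest of the line.
# Sequence lines do not match the pattern and pass through unchanged.
clean_fasta_header() {
    sed 's/^\(>[^[:space:]]*\).*/\1/' "$1"
}
```

A possible invocation, with placeholder file names, would be: clean_fasta_header input.fasta > cleaned.fasta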
</help>
</tool>