Here, I outline the set-up for CARD RGI (including the RGI tool, the CARD database, and the wildCARD and k-mer databases). This is based on the instructions at .

Installing RGI Tool

note: this is how I initially installed the tool, following the recommended method

conda create -n rgienv  
conda activate rgienv  
conda install --channel conda-forge --channel bioconda --channel defaults rgi=5.2.0 # SUCCESS!  
conda install -c bioconda tbb=2020.2 # to mitigate issue with bowtie2 downstream

However, at some point I ran into an error with the tool that required modification of the code. Therefore, I opted to install using this method:

git clone
conda env create -f conda_env.yml
conda activate rgi
# modify the RGI code at this point (see note below)
python setup.py build
python setup.py test
python setup.py install

note: modification of the code involved the following changes...

Make the script parse based on ___ instead of __ and modify the function get_bwt_sequences(self) to deal with kma inputs.

line 101: updates to ___ parsing

    .format(orf=contig_id, hsp=hsp, model=model, type_hit=type_hit, dna=dna))

line 124: updates to ___ and additional elif option

if aligner == "":
    os.system("""samtools view -F 4 -F 2048 {bam} | while read line; do awk '{cmd}'; done > {out}"""
      .format(bam=self.input_bam_file, cmd="""{print ">"$1"___"$4"___"$3"___"$5"\\n"$11}""", out=self.fasta_file))
# added by KK for dealing with kma
elif "ID:KMA" in aligner:
    print("inside ID:KMA aligner")
    os.system("""samtools view -F 4 -F 2048 {bam} | while read line; do awk '{cmd}'; done > {out}"""
      .format(bam=self.input_bam_file, cmd="""{print ">"$1"___"$4"___"$3"___"$5"\\n"$11}""", out=self.fasta_file))
# end section added by KK
else:
    # any other aligner: different column layout
    os.system("""samtools view -F 4 -F 2048 {bam} | while read line; do awk '{cmd}'; done > {out}"""
      .format(bam=self.input_bam_file, cmd="""{print ">"$1"___"$3"___"$2"___"$5"\\n"$10}""", out=self.fasta_file))
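To illustrate the switch from __ to ___ as the separator, here is a minimal sketch of splitting such a header back into its fields. The function name and the example header are hypothetical and not taken from the RGI code; the example only shows that a field containing a double underscore survives a split on the triple underscore.

```python
def split_header(header, sep="___"):
    """Split a '>'-prefixed FASTA header into its separator-delimited fields."""
    return header.lstrip(">").split(sep)


# Hypothetical header: the third field itself contains a double underscore.
example = ">read_1___100___gene__A___42"

# Splitting on the triple underscore keeps 'gene__A' intact.
print(split_header(example))

# Splitting on the double underscore breaks the same header into extra pieces.
print(split_header(example, sep="__"))
```

Whether real read or reference names contain double underscores depends on the input data, so treat this as an illustration of the parsing change rather than a description of any particular dataset.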

Installing associated CARD DBs

mkdir CARD_DBs  
cd CARD_DBs  

obtain CARD data

tar -xvf data ./card.json

load the CARD data

rgi card_annotation -i /home/ubuntu/comp-bio-secure/AMR-research/CARD_DBs/card.json > card_annotation.log 2>&1  
rgi load -i /home/ubuntu/comp-bio-secure/AMR-research/CARD_DBs/card.json --card_annotation card_database_v3.1.4.fasta  

obtain wildcard data:

wget -O wildcard_data.tar.bz2  
mkdir -p wildcard  
tar -xjf wildcard_data.tar.bz2 -C wildcard  
gunzip wildcard/*.gz  
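The unpacking steps above (mkdir -p, tar -xjf, gunzip) can also be sketched in Python using only the standard library. The function name unpack_wildcard is hypothetical, and the sketch assumes the archive has already been downloaded from a trusted source.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def unpack_wildcard(archive, dest="wildcard"):
    """Extract a .tar.bz2 archive into dest, then decompress every *.gz in dest."""
    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)       # mkdir -p wildcard
    with tarfile.open(archive, "r:bz2") as tar:   # tar -xjf <archive> -C wildcard
        tar.extractall(dest)
    for gz_path in sorted(dest.glob("*.gz")):     # gunzip wildcard/*.gz
        out_path = gz_path.with_suffix("")        # drop the trailing .gz
        with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        gz_path.unlink()                          # gunzip removes the compressed file
    # Return the names now present in dest, for a quick sanity check.
    return sorted(p.name for p in dest.iterdir())
```

Calling unpack_wildcard("wildcard_data.tar.bz2") from inside CARD_DBs would mirror the three shell commands.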

load the wildcard data:

rgi wildcard_annotation -i wildcard --card_json /home/ubuntu/comp-bio-secure/AMR-research/CARD_DBs/card.json -v 3.0.9 > wildcard_annotation.log 2>&1 # note: version number comes from the web downloads page

#local install:
rgi load --wildcard_annotation wildcard_database_v3.0.9.fasta --wildcard_index /home/ubuntu/comp-bio-secure/AMR-research/CARD_DBs/wildcard/index-for-model-sequences.txt --card_annotation card_database_v3.1.4.fasta --local

#system install:
rgi load --wildcard_annotation wildcard_database_v3.0.9.fasta --wildcard_index /home/ubuntu/comp-bio-secure/AMR-research/CARD_DBs/wildcard/index-for-model-sequences.txt --card_annotation card_database_v3.1.4.fasta
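The local and system commands above differ only in the --local flag. As a minimal sketch, the argument list could be assembled once and the flag appended on request; the helper name rgi_load_wildcard_cmd is hypothetical, and only flags that appear in the commands above are used.

```python
import shlex


def rgi_load_wildcard_cmd(wildcard_fasta, wildcard_index, card_fasta, local=False):
    """Build the argument list for 'rgi load' with wildCARD data."""
    cmd = [
        "rgi", "load",
        "--wildcard_annotation", wildcard_fasta,
        "--wildcard_index", wildcard_index,
        "--card_annotation", card_fasta,
    ]
    if local:
        cmd.append("--local")  # load into the local directory instead of system-wide
    return cmd


# Print the command line rather than running it; file names are those used above.
print(shlex.join(rgi_load_wildcard_cmd(
    "wildcard_database_v3.0.9.fasta",
    "wildcard/index-for-model-sequences.txt",
    "card_database_v3.1.4.fasta",
    local=True,
)))
```

The returned list can be passed to subprocess.run(cmd, check=True) once the rgi environment is active.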

Enabling “pathogen-of-origin analysis”

#local install:
rgi load --kmer_database /home/ubuntu/comp-bio-secure/AMR-research/CARD_DBs/wildcard/61_kmer_db.json --amr_kmers /home/ubuntu/comp-bio-secure/AMR-research/CARD_DBs/wildcard/all_amr_61mers.txt --kmer_size 61 --local --debug > kmer_load.61.log 2>&1

#system install:
rgi load --kmer_database /home/ubuntu/comp-bio-secure/AMR-research/CARD_DBs/wildcard/61_kmer_db.json --amr_kmers /home/ubuntu/comp-bio-secure/AMR-research/CARD_DBs/wildcard/all_amr_61mers.txt --kmer_size 61 --debug > kmer_load.61.log 2>&1
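Since the load commands reference several files by path, it can help to confirm they exist before running them. The sketch below assumes the directory layout implied by the paths above (card.json in CARD_DBs, the index and k-mer files under CARD_DBs/wildcard); the function name missing_inputs is hypothetical.

```python
from pathlib import Path

# File names as they appear in the commands above, relative to CARD_DBs.
EXPECTED = [
    "card.json",
    "wildcard/index-for-model-sequences.txt",
    "wildcard/61_kmer_db.json",
    "wildcard/all_amr_61mers.txt",
]


def missing_inputs(db_dir, expected=EXPECTED):
    """Return the expected files (relative paths) that are not present under db_dir."""
    db_dir = Path(db_dir)
    return [rel for rel in expected if not (db_dir / rel).is_file()]
```

An empty list from missing_inputs("CARD_DBs") suggests the inputs for the load steps are in place; any names returned point to a download or extraction step that needs to be repeated.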

Close the environment

mkdir run-rgi
conda deactivate