This repository contains the GlobalCR catalog, the largest and most comprehensive collection of chemoreceptors (CRs) to date. It includes over 1.4 million CR sequences from more than 800,000 metagenome-assembled genomes (MAGs) and 229,000 CRs from 41,000 representative reference genomes.
The GlobalCR catalog enables the exploration of the diversity, evolution, and ecological roles of chemoreceptors across bacterial and archaeal lineages. By analyzing this dataset, we highlight the immense diversity of extracellular sensing domains and their strong links to specific natural habitats, which suggests that these domains have significant adaptive value.
The repository includes the following files:
globalCR_MCPsignal_metadata.csv
#Metadata associated with MCPsignal domains, including taxonomic, functional, and ecological annotations.
globalCR_MCPsignal.faa
#FASTA file containing all MCPsignal domain protein sequences.
globalCR_LBDs_metadata.csv
#Metadata for ligand-binding domains (LBDs), including domain type, taxonomic classification, and habitat associations.
globalCR_LBDs.faa
#FASTA file containing sequences of all identified LBDs.
globalCR_CRs_orf.faa
#FASTA file with open reading frame (ORF) sequences corresponding to the entire set of chemoreceptors in the GlobalCR catalog.
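
As a starting point for working with these files, the sketch below reads a FASTA file and a metadata CSV using only the Python standard library. It is a minimal example, not an official loader for the catalog: the helper names `read_fasta` and `read_metadata` are hypothetical, and because the CSV column names are not documented here, the script prints the header row instead of assuming any particular column.

```python
import csv
from pathlib import Path


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle.

    Hypothetical helper: multi-line sequences are joined, blank lines skipped.
    """
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Emit the final record.
    if header is not None:
        yield header, "".join(chunks)


def read_metadata(handle):
    """Return all CSV rows as dictionaries keyed by the header row."""
    return list(csv.DictReader(handle))


if __name__ == "__main__":
    # File names are taken from the list above; adjust the paths as needed.
    fasta_path = Path("globalCR_LBDs.faa")
    meta_path = Path("globalCR_LBDs_metadata.csv")
    if fasta_path.exists() and meta_path.exists():
        with fasta_path.open() as fh:
            n_sequences = sum(1 for _ in read_fasta(fh))
        with meta_path.open(newline="") as fh:
            # Column names are not assumed; inspect them first.
            columns = csv.DictReader(fh).fieldnames
        print("LBD sequences:", n_sequences)
        print("Metadata columns:", columns)
```

`read_fasta` is a generator, so the larger files can be streamed record by record without loading every sequence into memory. The same two helpers should apply to the MCPsignal files, provided they follow the same FASTA and CSV conventions.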