Resfams is a curated database of protein families and associated profile hidden Markov models (HMMs), confirmed for antibiotic resistance function and organized by ontology.
Gibson MK, Forsberg KJ, Dantas G. Improved annotation of antibiotic resistance functions reveals microbial resistomes cluster by ecology. The ISME Journal. 2014, doi:ISMEJ.2014.106
The core database of Resfams profile HMMs was trained using unique antibiotic resistance protein sequences from the Comprehensive Antibiotic Resistance Database (CARD), the Lactamase Engineering Database (LacED), and Jacoby and Bush's collection of curated beta-lactamase proteins. To generate the full Resfams profile HMM database, the core database was supplemented with additional profile HMMs from the Pfam and TIGRFam databases. The additional profile HMMs included in this version of Resfams have been curated and demonstrated to identify protein families that commonly contribute to antibiotic resistance, such as acetyltransferases, AraC transcriptional regulators, Major Facilitator Superfamily (MFS) transporters, and ATP-binding cassette (ABC) efflux pumps. These supplementary profile HMMs were verified using functional assays, including functional metagenomic selections of the soil microbiota and the human gut microbiota. The full version of the Resfams database should be used only when there is prior functional evidence of antibiotic resistance activity, such as from functional metagenomic selections.
Following extensive curation and functional validation, the first version of the full Resfams HMM database contains 166 profile HMMs representing major antibiotic resistance gene classes, including resistance genes against beta-lactams, aminoglycosides, fluoroquinolones, glycopeptides, macrolides, and tetracyclines, as well as efflux pumps and transcription factors modulating antibiotic resistance. To improve Resfams prediction accuracy, we optimized profile-specific gathering thresholds, which set an inclusion bit score cut-off for a protein sequence alignment on a profile-by-profile basis. After optimization, we achieved perfect precision and high recall (99 ± 0.02%) for all AR proteins used to train the Resfams profile HMMs. We further tested the prediction accuracy of these optimized Resfams profile HMMs on independent antibiotic resistance proteins from the Antibiotic Resistance Database (ARDB) that were not used in training the original profile HMMs. We predicted the antibiotic resistance function of 2,454 unique sequences, representing 54 Resfams protein families, and achieved perfect precision and high recall (98 ± 0.03%) for all protein families.
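To make the idea of a profile-specific gathering threshold concrete, here is a minimal Python sketch. It is not the optimization procedure used to build Resfams, which is not described here; it assumes a simple rule (take the lowest bit score among true family members that still exceeds every non-member score) and uses hypothetical function names and made-up scores purely for illustration.

```python
def gathering_threshold(positive_scores, negative_scores):
    """Return the lowest bit-score cutoff that admits no negative sequence.

    Assumed rule (illustrative only): the cutoff is the smallest positive
    score that is strictly greater than the highest negative score.
    """
    highest_negative = max(negative_scores, default=float("-inf"))
    passing = [s for s in positive_scores if s > highest_negative]
    if not passing:
        raise ValueError("no cutoff separates positives from negatives")
    return min(passing)


def precision_recall(positive_scores, negative_scores, threshold):
    """Compute precision and recall for hits scoring at or above the cutoff."""
    tp = sum(s >= threshold for s in positive_scores)
    fp = sum(s >= threshold for s in negative_scores)
    fn = len(positive_scores) - tp
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up bit scores for one hypothetical profile HMM.
family_members = [310.5, 250.0, 120.2, 45.0]   # true members of the family
non_members = [60.1, 30.0]                      # sequences that should be excluded

cutoff = gathering_threshold(family_members, non_members)
precision, recall = precision_recall(family_members, non_members, cutoff)
print(cutoff, precision, recall)
```

In this toy example one true member scores below the best non-member, so keeping precision perfect costs some recall. That trade-off is why the cutoff is set separately for each profile rather than globally.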
The prediction sensitivity and specificity of Resfams were evaluated using recent functional metagenomic studies investigating the resistomes of cultured soil microbiota (Forsberg et al., 2012) and human gut microbiota (Moore et al., 2013). We compared Resfams HMMs to pairwise sequence alignment (BLAST) against the ARDB and CARD databases for their ability to predict AR function. Resfams demonstrated improved sensitivity: 64% of AR genes identified using Resfams in both the soil and the human gut microbiota were not identified by BLAST (Figure 1). Focusing on the hand-curated, gold standard set of AR genes identified in the set of 95 multidrug resistant (MDR) cultured soil isolates, we show that Resfams was able to predict over 95% of the full-length AR genes (>90 amino acids) (Figure 1A). In contrast, pairwise sequence alignment to AR-specific databases using widely accepted identity thresholds for AR genes annotated fewer than 34% of full-length AR genes. Importantly, Resfams did not identify any false positive AR proteins (i.e., proteins not predicted by intensive hand-curation), ensuring that the AR potential of microbial communities is not overestimated. Generating the gold standard set of AR genes involved extensive hand-curation, including functional characterization, phylogenetic analysis, and primary literature validation to identify causative AR genes. This time-intensive process is prohibitive for large-scale studies of the antibiotic resistance potential of microbial communities. In comparison, Resfams enables automated, rapid, accurate, high-resolution predictions of AR genes.
PLANNED RESFAMS UPDATES
Resfams was originally built for use with functional metagenomic selections and covers important antibiotic classes, including beta-lactams, aminoglycosides, fluoroquinolones, glycopeptides, macrolides, and tetracyclines. Future updates of Resfams will include additional important antibiotic resistance gene classes and mechanisms, such as macrolide, rifampin, and streptogramin resistance proteins. In addition, Resfams will continue to be updated by incorporating functionally identified resistance genes in order to keep up with rapidly evolving bacterial antibiotic resistance.
- Resfams HMM Database (Core) - v1.2, updated 2015-01-27
Database version for annotation of microbial proteins in the absence of any functional confirmation for antibiotic resistance.
- Resfams HMM Database (Full) - v1.2, updated 2015-01-27
Database version for annotation of microbial proteins when functional confirmation for antibiotic resistance is available (such as functional metagenomic selections).
- Resfams profile HMM Metadata - v1.2.1, updated 2017-03-25
Metadata on profile HMMs in Resfams, including description, ARO identifiers, HMM database source, and mechanism classification. Updated to reflect the latest CARD (v.1.1.5).
- Resfams AR Proteins - v1.2, updated 2015-01-27
Proteins used to build the Resfams base HMM database.
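
As a rough illustration of how the HMM databases listed above might be applied, the Python sketch below builds an `hmmscan` command that uses each profile's gathering threshold (`--cut_ga`) and parses the resulting `--tblout` table. It assumes the databases are distributed in HMMER3 format with gathering thresholds embedded and have been indexed with `hmmpress`. The file names, profile names, and helper functions are hypothetical.

```python
def hmmscan_command(hmm_db, proteins_fasta, tblout_path):
    """Build an hmmscan command that applies per-profile gathering thresholds."""
    return ["hmmscan", "--cut_ga", "--tblout", tblout_path, hmm_db, proteins_fasta]


def parse_tblout(text):
    """Map each query protein to its best-scoring (profile, bit score) hit.

    hmmscan --tblout columns are whitespace-separated: target (profile) name,
    target accession, query name, query accession, full-sequence E-value,
    full-sequence score, bias, and further per-domain fields.
    """
    best = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        profile, query, score = fields[0], fields[2], float(fields[5])
        if query not in best or score > best[query][1]:
            best[query] = (profile, score)
    return best


# Hypothetical file names; pass the list to subprocess.run to execute it.
command = hmmscan_command("Resfams.hmm", "proteins.faa", "hits.tbl")

# Made-up table rows in the tblout column order (truncated after 'bias').
example = """# target name  accession  query name  accession  E-value  score  bias
HypotheticalFamA  -  protein_1  -  1.2e-80  270.5  0.1
HypotheticalFamB  -  protein_1  -  3.0e-20   75.0  0.0
HypotheticalFamB  -  protein_2  -  4.5e-40  140.3  0.2
"""
print(parse_tblout(example))
```

Keeping only the best-scoring profile per protein is a simplifying choice made for this sketch. A protein that passes the gathering threshold of several profiles may warrant closer inspection instead.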