Explore lists of top-scoring Pfam domains ranked by pathogen association and other criteria.
PathFams is a database of pre-computed analyses of Pfam domain families. The abundance of all 17,929 families in Pfam v. 32.0 was examined in:
Environments (human gut, marine, soil)
Pathogens vs non-pathogens
Additionally, all Pfam families were analyzed to predict:
Co-occurring gene families across the bacterial tree of life using PhyloCorrelate
Feasibility for structure determination
The results of the above analyses and other collected information about the domain families (not including PhyloCorrelate data) are available to browse in several different ways:
The sequence input section performs a domain search on your protein sequence query and displays pathogen-association statistics for each domain match. Each displayed domain match also links to its individual page in the Domain database (see below).
Each page in the Domain database contains visualizations of the data for a single domain family from the analyses listed above, and shows how that family's rankings and statistics in each category compare with those of the other domain families.
The Domain rankings section offers eight lists, each produced by filtering and ranking the collected domain information. Each list displays only the pertinent columns, but the full dataset can be downloaded from the top of every list page and from the main page of the website (see below).
The download section contains only a link to the full dataset used for the Domain database and the Domain rankings. The dataset is a copy of Supplemental Data S3 from our manuscript.
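For readers who want to build a custom ranking from the downloaded dataset, the filter-then-rank idea behind the list pages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tab-delimited layout, the column names (`domain_id`, `pathogen_score`, `n_genomes`), and the toy rows are hypothetical placeholders, not the actual schema or contents of the PathFams download. Check the header of the real file and substitute its column names.

```python
import csv
import io

# Toy stand-in for the downloaded table. The column names and values are
# hypothetical placeholders, NOT the real PathFams schema or data.
SAMPLE_TSV = (
    "domain_id\tpathogen_score\tn_genomes\n"
    "DOMAIN_A\t0.91\t120\n"
    "DOMAIN_B\t0.15\t300\n"
    "DOMAIN_C\t0.78\t8\n"
    "DOMAIN_D\t0.66\t45\n"
)


def rank_domains(handle, score_col, count_col, min_count, top_n):
    """Keep rows whose count_col is at least min_count, then return the
    top_n rows sorted by score_col in descending order."""
    rows = csv.DictReader(handle, delimiter="\t")  # assumes a tab-delimited file
    kept = [r for r in rows if int(r[count_col]) >= min_count]
    kept.sort(key=lambda r: float(r[score_col]), reverse=True)
    return kept[:top_n]


# With a real download, replace io.StringIO(SAMPLE_TSV) with open(path, newline="").
top = rank_domains(
    io.StringIO(SAMPLE_TSV),
    score_col="pathogen_score",
    count_col="n_genomes",
    min_count=10,
    top_n=2,
)
for row in top:
    print(row["domain_id"], row["pathogen_score"])
```

The filter runs before the sort so that sparsely observed families are excluded from the ranking instead of being trimmed from its end. The threshold and the ranking column are arbitrary choices made for this sketch.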
For more information, please see our manuscript.
Lobb B, Tremblay BJ, Moreno-Hagelsieb G, Doxey AC. PathFams: statistical detection of pathogen-associated protein domains. Forthcoming.