A comprehensive aligned nifH gene database: A multipurpose tool for studies of nitrogen-fixing bacteria

Abstract

We describe a nitrogenase gene sequence database that facilitates analysis of the evolution and ecology of nitrogen-fixing organisms. The database contains 32 954 aligned nitrogenase nifH sequences linked to phylogenetic trees and associated sequence metadata. The database includes 185 linked multigene entries including full-length nifH, nifD, nifK and 16S ribosomal RNA (rRNA) gene sequences. Evolutionary analyses enabled by the multigene entries support an ancient horizontal transfer of nitrogenase genes between Archaea and Bacteria and provide evidence that nifH has a different history of horizontal gene transfer from the nifDK enzyme core. Further analyses show that lineages in nitrogenase cluster I and cluster III have different rates of substitution within nifD, suggesting that nifD is under different selection pressure in these two lineages. Finally, we find that that the genetic divergence of nifH and 16S rRNA genes does not correlate well at sequence dissimilarity values used commonly to define microbial species, as stains having textless3% sequence dissimilarity in their 16S rRNA genes can have up to 23% dissimilarity in nifH. The nifH database has a number of uses including phylogenetic and evolutionary analyses, the design and assessment of primers/probes and the evaluation of nitrogenase sequence diversity. Database URL: http://www.css.cornell.edu/faculty/buckley/nifh.htm.

Publication
Database : the journal of biological databases and curation