Scientists have identified 12,566 new species of microbes from DNA samples, expanding the known phylogenetic diversity of bacteria and archaea by 44%.
A large international team led by the DOE Joint Genome Institute in the US unravelled the jumbled-up cocktail of DNA in thousands of samples from soil, oceans, animals and humans across the globe, assembling 52,000 genomes in all.
The work, described in the journal Nature Biotechnology, demonstrates incredible collaborative science, and the power of genome databases.
Accessible genomic databases quickly become fixtures for anyone who uses genetics in their research, especially when considering how diverse microbiota are. Researchers can use them to identify proteins, viruses and processes in an ecosystem using DNA found in metagenomes.
A metagenome is the entire mix of genetic material in a sample from an environment. The DNA is a hodgepodge of sequences, which takes very careful analysis to untangle and identify. Instead of building a jigsaw, it is more like sorting through a pile of pieces from a thousand different unknown jigsaws and finding which box they came from.
This has been one major roadblock in understanding the relationship between hosts and bacteria, as the right bacterial species – or jigsaw box – may not even have been identified yet.
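To make the jigsaw analogy concrete, here is a minimal toy sketch in Python. It is not the method used in the study; real metagenomic pipelines rely on dedicated assembly and binning software. The sequences, species names, function names and thresholds below are all invented for illustration. Each DNA fragment (a "piece") is matched to the reference genome (a "box") it shares the most short subsequences with, and fragments with no good match stay unassigned, just like pieces from a box nobody has found yet.

```python
def kmers(seq, k=4):
    """Return the set of all length-k substrings (k-mers) of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def assign_read(read, references, k=4, min_shared=2):
    """Return the name of the reference sharing the most k-mers with the read.

    Returns None when no reference shares at least `min_shared` k-mers,
    i.e. the read has no usable reference genome.
    """
    read_kmers = kmers(read, k)
    best_name, best_shared = None, 0
    for name, genome in references.items():
        shared = len(read_kmers & kmers(genome, k))
        if shared > best_shared:
            best_name, best_shared = name, shared
    return best_name if best_shared >= min_shared else None


# Hypothetical reference genomes (the "jigsaw boxes"); sequences are made up.
references = {
    "species_A": "ATGCGTACGTTAGCCGATAG",
    "species_B": "TTGACCGGTAACCTTGGACA",
}

# Hypothetical reads from a mixed sample (the "jigsaw pieces").
reads = ["CGTACGTTAG", "CGGTAACCTT", "GGGGGCCCCC"]

assignments = {read: assign_read(read, references) for read in reads}
for read, species in assignments.items():
    print(read, "->", species or "no reference genome")
```

In this toy data the third read matches neither reference, which is the situation the roadblock above describes: the fragment is real, but the species it came from has not been catalogued.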
To make it even more complicated, each environmental sample has a different assortment of species, which tells us a lot about how a specific environment affects bacteria and vice versa. For humans, we can also use this information to understand how our microbiome helps us and whether the human microbiome is affected by where we live.
Another way to use this data is in medical research. Secondary metabolites are small molecules produced by micro-organisms. While they aren’t essential for growth, they help with survival in many ways, and can be used in medicine. A notable example is antibiotics.
In the latest study the team found over 100,000 regions – called gene clusters – that work together to build these secondary metabolites, some of which could be useful in medical research.
Interestingly, 75% of the newly defined species were uncultured; that is, they can’t be grown on a petri dish yet. This is a reminder of how complex the relationship between microorganisms is, as some cannot survive without each other. This makes it very difficult to understand the relationship between a virus and a host – a relationship that is extremely important to know in medical science.
However, the researchers used metagenomics to predict bacterial hosts of viruses and more than doubled the number of viruses that have a potentially known host. This is useful in learning where viruses come from and how to isolate or destroy them.
They also found that the more than 12,000 new species mostly revealed new orders, families and genera rather than new phyla. In simpler terms, the researchers didn’t find previously unknown deep branches in the family tree of bacteria and archaea; instead, the new species fell neatly into the tree as it was already known.
“Despite an overall 44% increase in phylogenetic diversity of bacteria and archaea, we found little evidence of new deep-branching lineages representing new phyla, consistent with recent studies of microbial diversity,” they say.
Of course, the family of microbiota is so large that there are plenty more species to discover. For this reason, the authors recognise that more metagenomics research is required to get a complete picture of species, secondary metabolite genes and viral relationships found in bacteria and archaea.
“Despite a 3.6-fold increase in recruitment of metagenomic reads, over two-thirds of metagenome reads still lack a mappable reference genome. Thus, continued efforts to capture the genomes of new species- and strain-level representatives will further improve metagenomic resolution,” they say.