How do you begin to study microbes that live in extreme conditions like hydrothermal vents in the deep sea, or hypersaline lakes in Antarctica? Even if you can collect them, what if you can’t grow them in culture? Metagenomics allows scientists to study these tricky organisms using only a water or soil sample, and to draw conclusions about how they live from their DNA.
Traditional microbiology relies on isolated cultures of microbes to draw conclusions about an individual organism’s biology, physiology and overall ecology. Since the first days of microbiology, we have been growing our microbes in the lab and looking at them under the microscope. Scientists were making discoveries about genetic pathways and metabolism well before the age of genetic sequencing, but the availability of genomic data since the 1970s has led to an explosion of discoveries that have changed microbiology. We are now well into the age of “high-throughput sequencing”, and as of 2014, over 30,000 bacterial genomes were available through the National Center for Biotechnology Information (NCBI) database. Thanks to falling costs and quicker results, genetic sequencing has become a strong partner to in-lab microbiology methods.
Growing and isolating microbes in the lab and then extracting their DNA for sequencing is a standard protocol. However, for scientists looking at organisms in extreme and challenging environments, studying the physiology of these organisms can prove quite difficult, motivating the search for new methods. Metagenomics, or the study of community DNA as opposed to the DNA of a single organism, is one way we can begin to study organisms from challenging environments, such as the deep sea and polar regions. Sequencing the total DNA from a water or soil sample results in a large dataset of DNA, which scientists organize and sift through using bioinformatic pipelines. Processing these giant datasets is made possible by new computing technologies and by a community of coders who share their tools for analyzing metagenomic DNA for the greater good.
Metagenomic data can be used in a variety of ways. Whether it’s discovering novel viruses or bacteria in Antarctic lakes, identifying methanotrophic bacteria in alkaline springs, or establishing the timeline of smelt migrations in rivers in Maine, environmental DNA provides functional insights into communities, or into an individual organism’s impact, in complicated and ever-changing environments. Here at Bigelow Laboratory, we are using water sampled from a lake in Antarctica, and the metagenomic libraries created from those samples, to discover what viruses are present in the lake. Since the whole community is contained in the sample, we can also look for the algal hosts of the viruses, and start to draw conclusions about the interactions between these groups. In particular, we are searching for double-stranded DNA viruses that infect eukaryotic hosts such as algae and fungi. There are many other organisms present in the lake, so we use a series of filters with a variety of pore sizes to limit the organisms that show up in our DNA samples. Our work focuses specifically on these eukaryotic viruses, virophages (viruses that infect other viruses), and polinton-like viruses (self-replicating transposable elements that can infect either viruses or eukaryotic hosts). However, collaborators around the world are also searching for bacteria and bacteriophages present in the lake, using the exact same DNA collected for our experiment!
Bioinformatics is a fast-moving and constantly changing field. We are lucky to work in a time when online collaboration has become prevalent, allowing anyone interested to start using these tools even if they’ve never used them before, including students like myself.
Emily Haggett is a Southern Maine Community College student in the Research Experience for Undergraduates program at Bigelow Laboratory for Ocean Sciences. This intensive experience provides an immersion in ocean research with an emphasis on hands-on, state-of-the-art methods and technologies.