Functional Metagenomics: Sequence Everything and Let DNA Sort The Functions Out

One of the cool things you can do with high-throughput DNA sequencing methods like pyrosequencing is to collect a sample from the environment, isolate the DNA from everything in it and sequence it. Then you can match the DNA up with known sequences and see what sort of microbes you had. Dinsdale and a long list of coauthors pooled the data from a bunch of such studies. They managed to find 45 bacterial samples and 42 viral samples from 9 broad environmental classifications. You can see all the different samples the authors pooled together on the map below (circles are microbial and squares are viral).

Locations of metagenomic samples from Dinsdale et al.

The interesting thing about this study was that instead of looking at the taxonomy of the critters as usual, they looked at the functions of the genes. By simply looking at what the genes do, the researchers hoped to get a feel for what activities were going on in that environment without necessarily having to identify the species of the bacteria and viruses. To do this, they fed their 14.5 million sequences (pyrosequencing sure can generate data) into the SEED database, a big collection of genes which have been assigned to functions (for example membrane transport or sulfur metabolism) by experts. They were able to match 1 million of the bacterial and 500,000 of the viral sequences to previously identified gene functions.
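To make the bookkeeping concrete, here is a minimal sketch of turning sequence-to-function matches into a functional profile for one sample. The read names and matches are invented for illustration (they are not data from the paper), and `functional_profile` is a hypothetical helper, not part of SEED or any real tool.

```python
from collections import Counter

# Hypothetical (read id, functional category) matches; None means no match.
# These values are made up for illustration, not taken from the paper.
hits = [
    ("read_1", "Membrane transport"),
    ("read_2", "Sulfur metabolism"),
    ("read_3", None),
    ("read_4", "Membrane transport"),
    ("read_5", "Respiration"),
    ("read_6", None),
]

def functional_profile(hits):
    """Return the percentage of matched reads falling in each category."""
    counts = Counter(category for _, category in hits if category is not None)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}

profile = functional_profile(hits)
for category, pct in sorted(profile.items()):
    print(f"{category}: {pct:.1f}%")
```

Unmatched reads are left out of the denominator here, so the percentages describe only the sequences that could be assigned a function.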

It might seem odd that they would look at viral DNA since viruses are rather simple and have only a few basic genes. But the researchers were actually looking at bacterial genetic sequences being carried inside viruses. This of course brings up the question of what bacterial DNA is doing inside viruses. It turns out there are a lot of bacteriophages (viruses that infect bacteria), and sometimes these viruses capture some of the DNA of their bacterial hosts and carry it to their next host. Looking at the bacterial DNA present in a viral population gives an interesting look at what types of genes are being passed around between individual bacteria (and even between bacterial species).

So here are the high-level classifications of the gene functions they found for each environment.

Percentages of bacterial and viral gene functions from Dinsdale et al.

It’s pretty cool that the viruses were carrying around such a variety of bacterial DNA. The authors suggest that motility genes coding for things like flagella and cilia (which could help the bacterial host spread the virus further) were enriched in the viral samples, but it seems hard to say that for certain without a bit more analysis.

A useful way to look at huge masses of data, like their 1.5 million matches, is to try to reduce all the different counts in the functional categories into a couple of condensed variables. This can be seen in the next couple of plots, which could use a little explaining. Bacterial sequences are on top and viral sequences on the bottom. Lines show how the various functional categories have been condensed into the x and y variables. For example, bacterial samples that contained lots of genes for making cell walls will tend to be at the top of the plot and tend not to have many genes for respiration.

Canonical discriminant function analysis of bacterial and viral gene function from Dinsdale et al.
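To give a feel for what "condensing" means, here is a minimal sketch of the simplest relative of the analysis in the plots: Fisher's linear discriminant for just two environments and two functional categories. The paper's canonical discriminant analysis handles many categories and nine environments at once, so this is a stripped-down stand-in rather than the authors' method, and all of the percentages below are invented.

```python
# Hypothetical per-sample percentages of two functional categories:
# (cell wall genes, respiration genes). Values are invented for illustration.
samples = {
    "coral":     [(12.0, 6.0), (11.0, 7.0), (13.0, 6.5)],
    "fish farm": [(7.0, 9.0), (8.0, 10.0), (6.5, 9.5)],
}

def mean(points):
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def within_scatter(groups):
    """Pooled 2x2 within-group scatter matrix, returned as (sxx, sxy, syy)."""
    sxx = sxy = syy = 0.0
    for points in groups:
        mx, my = mean(points)
        for x, y in points:
            sxx += (x - mx) ** 2
            sxy += (x - mx) * (y - my)
            syy += (y - my) ** 2
    return sxx, sxy, syy

def fisher_direction(group_a, group_b):
    """Direction w = Sw^-1 (mean_a - mean_b) that best separates two groups."""
    sxx, sxy, syy = within_scatter([group_a, group_b])
    det = sxx * syy - sxy * sxy
    (ax, ay), (bx, by) = mean(group_a), mean(group_b)
    dx, dy = ax - bx, ay - by
    # Multiply the mean difference by the inverse of the 2x2 scatter matrix.
    return ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)

w = fisher_direction(samples["coral"], samples["fish farm"])

# Each sample's two percentages are condensed into one discriminant score.
scores = {
    env: [w[0] * x + w[1] * y for x, y in points]
    for env, points in samples.items()
}
for env, values in scores.items():
    print(env, [round(v, 2) for v in values])
```

The weights in `w` play the same role as the lines in the plots: they say how much each functional category contributes to the condensed variable.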

It’s pretty cool to see how the various environments clustered with other samples from the same environment. For example, all the yellow diamond fish farm samples ended up on the right side of the bacteria graphs even though they were sampled independently. Gene functions appear to correlate with environmental conditions. For example, the fish food at the fish farms contained a lot of sulfur supplements and the bacteria from those samples were rich in sulfur metabolism genes. Likewise, the bacteria from corals contained many different respiration genes to deal with the highly variable oxygen concentrations found there. Dinsdale and her coauthors go so far as to suggest that gene function may provide a better indicator of environment than the taxonomy of the bacteria present.

The paper did have a little trouble with the math in one part but the authors already have a correction in for it, so it’s really not worth worrying about. Overall, it was a pretty interesting story and a good example of stuff to do with a sequencing machine (it also must have taken a good bit of work to pull all that data together from all those authors).


Elizabeth A. Dinsdale, Robert A. Edwards, Dana Hall, Florent Angly, Mya Breitbart, Jennifer M. Brulc, Mike Furlan, Christelle Desnues, Matthew Haynes, Linlin Li, Lauren McDaniel, Mary Ann Moran, Karen E. Nelson, Christina Nilsson, Robert Olson, John Paul, Beltran Rodriguez Brito, Yijun Ruan, Brandon K. Swan, Rick Stevens, David L. Valentine, Rebecca Vega Thurber, Linda Wegley, Bryan A. White, Forest Rohwer (2008). Functional metagenomic profiling of nine biomes. Nature, 452(7187), 629-632. DOI: 10.1038/nature06810




Cancer Fighting Bacteria

I was doing a bit of background reading and came across an interesting paper about engineering normal bacteria into cancer-fighting bacteria. The paper centers around a single gene called inv (short for invasin, the protein it encodes) that can give an otherwise mild-mannered noninfectious bacterium the ability to invade cells.

Now this might seem like a pretty bad idea since there are probably enough infectious bacteria in the world already, but this was only the first step of the research. Anderson and colleagues attached inv to a genetic switch (normally used for bacterial metabolism control) that turns on when arabinose (a type of sugar) is present. Unfortunately this switch was a little leaky, so even bacteria without arabinose were still infectious. Not ones to let that stop them, the researchers took out the ribosome binding region of the gene (where the protein-making ribosome attaches), randomly mutated it and tested to find bacteria that were off by default but still able to turn on.
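Here is a toy sketch of why weakening the ribosome binding region can fix a leaky switch. It assumes, purely for illustration, that the amount of inv made is the switch's activity multiplied by the strength of the ribosome binding region, and that invasion needs inv above some threshold. Every number and name below is invented and none of it comes from the paper.

```python
# Toy model: inv expression = promoter activity x RBS strength.
# All numbers are invented for illustration; none come from the paper.
LEAK = 0.1        # promoter activity without arabinose (relative units)
INDUCED = 1.0     # promoter activity with arabinose
THRESHOLD = 0.5   # hypothetical inv level needed to invade a cell

def is_invasive(rbs_strength, promoter_activity):
    """True if this much expression would be enough to invade."""
    return rbs_strength * promoter_activity >= THRESHOLD

def screen(rbs_library):
    """Keep RBS variants that are off without inducer and on with it."""
    return [
        rbs for rbs in rbs_library
        if not is_invasive(rbs, LEAK) and is_invasive(rbs, INDUCED)
    ]

rbs_library = [0.2, 0.6, 1.0, 3.0, 8.0]  # hypothetical mutant strengths
usable = screen(rbs_library)
print(usable)
```

In this picture the strongest variant is invasive even on the leak alone, the weakest never turns on, and the screen keeps the ones in between.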

Once they got that working, they decided to attach a sensor to the infective gene. Bacteria often do things like switch metabolisms when they run out of oxygen. The researchers picked one of the bacterial genes that turns on when oxygen is low and replaced the arabinose switch from the previous bit with the oxygen-sensing switch from this gene. Again the switch was leaky and they had to mutate it so it stayed off by default. Once that was done they had a bacterium that was only invasive in anaerobic environments. That’s pretty cool because tumors are often anaerobic (since they’re big lumps of fast-growing dense tissue).

Plasmid for density dependent infectious bacteria

To go even further, the researchers tried to create bacteria that only turn on when there are many bacteria in one location. This is useful because tumors often have higher concentrations of bacteria due to leaking nutrients and poor immune response. By creating a switch that only turns on when a bunch of bacteria are present, the bacteria can be further targeted to cancerous cells. To do this they used a gene from an ocean-dwelling bacterium that only turns on when many bacteria are present (the ocean bacterium uses the gene to detect when it has reached the light organ of a squid). It seems odd that bacteria can communicate but it comes down to a simple mechanism made up of two genes. One gene encodes an enzyme that makes a chemical, called AI-1, that easily diffuses in and out of the cell through the membrane. The second encodes a gene activator that is turned on by high concentrations of AI-1. When there are many bacteria, the environment becomes rich in AI-1 and the gene activator turns on even more production of AI-1 and gene activators. This positive feedback creates a sensitive switch that flips quickly from all off to all on when bacterial concentration crosses a certain level. By linking these genes to the infectious inv gene, the researchers created a bacterium that was only infectious at high concentrations.
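The positive feedback can be sketched with a toy simulation. This is not the model from the paper: it assumes a single shared pool of AI-1, a small basal production rate per cell, extra production that follows a Hill curve once the activator is switched on, and simple first-order loss of AI-1. The parameter values are invented to show the switching behavior.

```python
def hill(a, k=1.0, n=2):
    """Fraction of activator switched on at AI-1 concentration a."""
    return a**n / (k**n + a**n)

def steady_state_activation(density, basal=0.05, decay=1.0, dt=0.01, steps=5000):
    """Euler-integrate d[AI-1]/dt = density * (basal + hill) - decay * [AI-1].

    Starts with no AI-1 and returns the activator fraction at the end.
    All parameters are illustrative assumptions, not measured values.
    """
    ai1 = 0.0
    for _ in range(steps):
        production = density * (basal + hill(ai1))
        ai1 += dt * (production - decay * ai1)
    return hill(ai1)

for density in (1, 2, 4, 6, 8, 10):
    print(density, round(steady_state_activation(density), 3))
```

With these made-up parameters, a sparse population settles with the activator almost entirely off, while a dense one ends up almost fully on, which is the all-or-nothing behavior described above.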

So now we have bacteria that might be able to selectively infect tumor cells. By combining this selective invasiveness with cell-killing or immune-response-activating mechanisms, bacteria could become helpful tools for treating cancer (although there is still a pretty long way to go). The paper makes it look easy but it must have taken a good bit of work to get it all working so nicely. They ended up using DNA from three different bacterial species and many different bacterial systems. It’s always really cool to see how scientists can take DNA “parts” and combine them to create new and useful functions and even edit the DNA directly when the parts don’t fit correctly.

I guess the next step in the research is to figure out how to get a bacterium to sense both an anaerobic and a high-density environment. This might be a bit tricky since the two sensors would have to interact, but I see some of the same researchers also have a paper on creating bacterial AND gates so I’ll have to give that one a read too.


Anderson, J., Clarke, E., Arkin, A., Voigt, C. (2006). Environmentally Controlled Invasion of Cancer Cells by Engineered Bacteria. Journal of Molecular Biology, 355(4), 619-627. DOI: 10.1016/j.jmb.2005.10.076

