A consortium of U.S. and international researchers, including a group at the University of California, Davis, Genome Center, has completed a detailed study of a piece of the human genome. The project yielded some surprises and developed technology for exploring the rest of the genome.
A team led by Peggy Farnham, professor at the Genome Center and the Department of Pharmacology at UC Davis, looked for sites where factors bind to DNA to turn genes on or off. These "transcription factors" play a key role in deciding what cells do and when, and are important in stem cell development and cancer.
In a group paper published in the June 14 issue of Nature and in 28 companion papers published in the June issue of Genome Research, the ENCyclopedia Of DNA Elements (ENCODE) consortium, which is organized by the National Human Genome Research Institute (NHGRI), part of the National Institutes of Health (NIH), reported results of its exhaustive, four-year effort to build a parts list of all biologically functional elements in one percent of the human genome. The work was carried out by 35 groups from 80 organizations around the world.
"This impressive effort has uncovered many exciting surprises and blazed the way for future efforts to explore the functional landscape of the entire human genome," said Francis Collins, director of NHGRI. "Because of the hard work and keen insights of the ENCODE consortium, the scientific community will need to rethink some long-held views about what genes are and what they do, as well as how the genome's functional elements have evolved. This could have significant implications for efforts to identify the DNA sequences involved in many human diseases."
One finding of the study so far is that while some transcription factors attach to DNA close to the gene they regulate, many others bind far away. Looking across the genome allows researchers to study these patterns, Farnham said.
"If we want to know how a gene is turned on, we need to fill in the grid," Farnham said. About 30 transcription factors have been looked at so far, but there are thought to be about 2,000 in the entire genome.
The three billion base pairs or "letters" of DNA in the human genome form the instruction manual needed to make the human body. Researchers still must learn how to read the manual, identify every part and understand how the parts work together to contribute to health and disease.
Farnham's lab uses chromatin immunoprecipitation chips, or "ChIP chips," developed by Nimblegen Systems of Madison, Wis., to look for binding sites for transcription factors. At the beginning of the ENCODE study in 2004, "homemade" chips carried 20,000 spots of DNA on a glass slide. Slides now carry 380,000 spots each, and the newest chips carry more than 2 million spots.
"Over the course of the project we've gone from proof of principle to 2 million-feature arrays," Farnham said. She began the collaboration with Nimblegen while on the faculty at the University of Wisconsin-Madison, before joining 51³Ô¹ÏºÚÁÏ Davis in 2004.
The work has applications in areas such as cancer and stem cell research. Cells of different types -- for example breast or liver cells -- have different parts of their genome switched off or "silenced." This silencing tells cells what they should be as they grow from stem cells into mature cells. In cancer, some of this silencing may come undone. Farnham's lab is building maps of the genome to show which parts are silenced or active in different cell types.
DNA can be silenced in different ways. As part of the ENCODE studies, Farnham has found that different modes of silencing affect distinct parts of the genome and are tied to distinct classes of transcription factors.
Overall, a major finding of the consortium was that most of the DNA in the human genome is transcribed into functional molecules, called RNA, and that these transcripts extensively overlap one another. This challenges the long-standing view that the human genome consists of a relatively small set of discrete genes, along with a vast amount of so-called junk DNA that is not biologically active. The new data indicate the genome contains very little unused sequence and is, in fact, a complex, interwoven network. Genes are just one of many types of DNA sequences that have a functional impact.
The data from the ENCODE project are available through public databases, primarily through a Web site run by UC Santa Cruz. In the next phase of the project, the NIH will issue grants to apply the techniques and knowledge from ENCODE to the rest of the genome.
Media Resources
Andy Fell, Research news (emphasis: biological and physical sciences, and engineering), 530-752-4533, ahfell@ucdavis.edu
Peggy Farnham, Genome Center, (530) 754-4988, pjfarnham@ucdavis.edu