A new computational method has been shown to quickly assign, order and orient DNA sequencing information along entire chromosomes. The method may help overcome a major obstacle that has delayed progress in designing rapid, low-cost, but still accurate, ways to assemble genomes from scratch. Data gleaned through this new method can also validate certain types of chromosomal abnormalities in cancer, research findings indicate.
The advance was reported Nov. 3 by several University of Washington scientists led by Dr. Jay Shendure, associate professor of genome sciences.
Existing technologies can quickly produce billions of “short reads” of segments of DNA at very low cost. Various approaches are currently used to put the pieces together to see how DNA segments line up to form larger stretches of the genetic code.
However, current methods produce a highly fragmented genome assembly, lacking long-range information about what sequences are near what other sequences, making further biological analysis difficult.
“Genome science has remained remarkably distant from routinely assembling genomes to the standards set by the Human Genome Project,” said the researchers. They noted that the Human Genome Project tapped into many different techniques to achieve its end result. Many of these are too expensive, technically difficult, and impractical for large-scale initiatives such as the Genome 10K Project, which aims to sequence and assemble the genomes of 10,000 vertebrate species.
The members of the Shendure lab who developed what they hope will be a more scalable strategy were Joshua N. Burton, Andrew Adey, Rupali P. Patwardhan, Ruolan Qiu, and Jacob O. Kitzman.
To more completely assemble genomes, they tapped into a technology called Hi-C, which measures the three-dimensional architecture and physical territories of chromosomes within the nuclei of cells. Hi-C maps the physical interactions between regions of the chromosomes in a genome, including contact within a chromosome and with other chromosomes. The results indicate which regions tend to occur near each other within three-dimensional space in a cell’s nucleus.
The researchers speculated that this interaction data, because it offers clues about the position of and distances between various regions of the chromosome, might reveal how DNA sequences are grouped and lined up along entire chromosomes. They wondered if the interaction data could show them which regions of the genome are near each other on each chromosome.
Their investigation of this possibility led them to create what they named LACHESIS, an acronym for “ligating adjacent chromatin enables scaffolding in situ” and also the name of the Fate who, in Greek mythology, measures the thread of destiny.
The map of physical interactions generated by Hi-C was interpreted by the LACHESIS computational program to assign, order and orient genomic sequences into their correct position along chromosomes, including DNA positioned close to the centromere, the “pinch waist” gap in the chromosome shape.
The researchers combined their new approach with other cheap and widely used sequencing methods to generate chromosome-scale assemblies of the human, mouse and fruit fly genomes. The researchers were able to cluster nearly all scaffolds — collections of short DNA segments whose position relative to each other is unknown — into groups that corresponded to individual chromosomes.
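The grouping step can be pictured with a toy example. The sketch below is not the LACHESIS program itself. It is a minimal illustration that assumes a small, made-up symmetric table of contact counts between six hypothetical scaffolds and merges the groups with the highest average contact until the requested number of groups remains. The function name `cluster_scaffolds` and all of the numbers are invented for illustration.

```python
from itertools import combinations

def cluster_scaffolds(contacts, n_groups):
    """Greedy average-linkage clustering on a symmetric contact-count matrix."""
    # Start with every scaffold in its own group.
    clusters = [[i] for i in range(len(contacts))]
    while len(clusters) > n_groups:
        best_pair, best_score = None, -1.0
        for a, b in combinations(range(len(clusters)), 2):
            # Average contact count between members of the two groups.
            total = sum(contacts[i][j] for i in clusters[a] for j in clusters[b])
            score = total / (len(clusters[a]) * len(clusters[b]))
            if score > best_score:
                best_pair, best_score = (a, b), score
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]  # merge the best-connected pair
        del clusters[b]
    return [sorted(c) for c in clusters]

# Made-up contact counts: scaffolds 0-2 touch each other often, as do 3-5.
toy_contacts = [
    [0, 9, 7, 1, 0, 1],
    [9, 0, 8, 0, 1, 0],
    [7, 8, 0, 1, 0, 1],
    [1, 0, 1, 0, 9, 6],
    [0, 1, 0, 9, 0, 8],
    [1, 0, 1, 6, 8, 0],
]

groups = cluster_scaffolds(toy_contacts, n_groups=2)
print(groups)
```

In this toy table, scaffolds that share many contacts end up in the same group, which mirrors the idea that sequences on the same chromosome tend to interact more often than sequences on different chromosomes.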
They then ordered and oriented the scaffolds assigned to each chromosome group, and validated their results by comparing them to the high-quality reference genomes for these species that were generated by the Human Genome Project. In the case of the human genome, they achieved 98 percent accuracy in assigning tens of thousands of sequences of contiguous DNA to chromosome groups and 99 percent accuracy in ordering and orienting these sequences within chromosome groups.
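The ordering idea can also be sketched with toy numbers. The example below is a simplified greedy heuristic, not the procedure the researchers used. It assumes that scaffolds lying closer together on a chromosome share more contacts, starts from the most strongly linked pair, and repeatedly attaches the remaining scaffold with the strongest contact to either end of the growing chain. The function name `order_scaffolds` and the contact counts are hypothetical, and the sketch ignores orientation.

```python
def order_scaffolds(contacts, group):
    """Greedy chain building: extend whichever end has the strongest contact."""
    remaining = sorted(group)
    # Seed the chain with the most strongly linked pair of scaffolds.
    pairs = [(i, j) for i in remaining for j in remaining if i < j]
    a, b = max(pairs, key=lambda p: contacts[p[0]][p[1]])
    path = [a, b]
    remaining = [s for s in remaining if s not in (a, b)]
    while remaining:
        # Best candidate for the left end and for the right end of the chain.
        left = max(remaining, key=lambda s: contacts[path[0]][s])
        right = max(remaining, key=lambda s: contacts[path[-1]][s])
        if contacts[path[0]][left] > contacts[path[-1]][right]:
            path.insert(0, left)
            remaining.remove(left)
        else:
            path.append(right)
            remaining.remove(right)
    return path

# Made-up counts for four scaffolds whose true order is 2, 0, 3, 1.
toy_counts = [
    [0, 5, 12, 10],
    [5, 0, 1, 11],
    [12, 1, 0, 4],
    [10, 11, 4, 0],
]

order = order_scaffolds(toy_counts, group=[0, 1, 2, 3])
print(order)
```

Contact counts alone cannot tell which end of a chromosome comes first, so a recovered order that reads as the true order backwards is equally valid.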
“We think the method may fundamentally change how we approach the assembly of new genomes with next-generation sequencing technologies,” noted Shendure.
While he and his team cite many areas in which the computational and experimental methods can be improved, the approach is an important step in his lab’s long-term goal to facilitate the assembly, for a variety of species, of low-cost, high-quality genomes that meet the rigorous standards set by the Human Genome Project.
The research was supported by grants HG006283 and T32HG000035 from the National Human Genome Research Institute, and graduate research fellowships from the National Science Foundation.