UCSC-CRL-02-30: SCORE FUNCTIONS FOR ASSESSING CONSERVATION IN LOCALLY ALIGNED REGIONS OF DNA FROM TWO SPECIES

07/13/2002 09:00 AM
Biomolecular Engineering
Dozens of universities and research centers around the world are sequencing and assembling the genomes of many species. One important use of the data from these genome projects is to aid the discovery of genes and other functional regions in the human genome. Most exciting are the projects to sequence the genomes of other mammals: because their genes share most of their functions with human genes, they are recognizably similar at the DNA level. The mouse genome will be the second mammalian genome to be completely sequenced, after the human genome. It is now 95% complete, with more than 2.6 billion bases of DNA sequenced. In this paper we construct several score functions for human-mouse aligned genomic regions. These score functions are derived from properties of neutrally evolving sites in the mouse and human genomes. Their aim is to identify regions of the human genome that are conserved by evolutionary selection, because they have an important function, rather than by chance. Only by looking at which parts of the genome are conserved over long periods of evolution can we find the regions that give a selective advantage because they contain key functional elements. In this way we can use the mouse genome as a key to decoding the human genetic code.
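
To make the idea of scoring an aligned region against a neutral background concrete, here is a minimal sketch of one possible score. It is not taken from the report, whose abstract does not specify the score functions. The sketch assumes each aligned column matches independently with a fixed probability under neutral evolution, and reports how many standard deviations the observed match count lies above that expectation. The function name `window_identity_zscore` and the parameter `neutral_identity` are hypothetical, and any value passed for the neutral rate is a placeholder to be estimated from real neutral sites.

```python
import math


def window_identity_zscore(human, mouse, neutral_identity):
    """Z-score of the match count in an aligned window versus a neutral model.

    human, mouse: equal-length aligned strings; '-' marks a gap.
    neutral_identity: assumed probability (strictly between 0 and 1) that an
        aligned column matches under neutral evolution. This is an
        illustrative assumption, not a value from the report.
    """
    if len(human) != len(mouse):
        raise ValueError("aligned sequences must have equal length")
    if not 0.0 < neutral_identity < 1.0:
        raise ValueError("neutral_identity must be strictly between 0 and 1")

    # Keep only columns where both species have a base (skip gapped columns).
    pairs = [
        (h, m)
        for h, m in zip(human.upper(), mouse.upper())
        if h != "-" and m != "-"
    ]
    n = len(pairs)
    if n == 0:
        raise ValueError("window contains no ungapped columns")

    matches = sum(1 for h, m in pairs if h == m)

    # Binomial mean and standard deviation under the neutral model.
    expected = n * neutral_identity
    sd = math.sqrt(n * neutral_identity * (1.0 - neutral_identity))
    return (matches - expected) / sd


if __name__ == "__main__":
    # Toy window; 0.5 is an arbitrary placeholder for the neutral match rate.
    print(window_identity_zscore("ACGTACGTAC", "ACGTACG-AC", 0.5))
```

A large positive value suggests the window is more similar than the neutral model predicts, which is the kind of signal a conservation score is meant to highlight. The independence assumption is a simplification, since neighboring sites in real alignments are not independent.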
