Where genome sequences come from

Note that the path leading to Stockholm described in the previous set of notes relied on the existence of Drosophila genes and proteins in an accessible database. Before 2000, no database contained entries for more than a small fraction of the genes and proteins of Drosophila. Before 1995, no database contained entries for more than a small fraction of the genes of any organism. GenBank and similar databases are so rich a source of information because of the hundreds of genome sequencing projects that have sprung up since 1995.

One can break up a genome project in many ways. Here's one:

In these notes we'll examine the first step, using as an example the elucidation of the Drosophila genome, as described in:

Myers EW et al (2000). A whole-genome assembly of Drosophila.

You'll want to read this article lots of times, at least the first four pages (up to but not including Characteristics of the Drosophila assembly). To get the article, go to NCBI (perhaps through the Links page on the course web site) and use PubMed to search for it. If you have a problem getting to the full article, solve it! Librarians are very helpful people. I also know how to do it, as do Sterling and many of your peers. If you choose to print out the article, choose the Full Text (PDF) link under Article Views.

As you read the article, generate questions, particularly on issues that are essential for understanding how genome sequences are elucidated. I have tried to anticipate some questions and provide some ways for you to answer them.
What is shotgun sequencing?