Where genome sequences come from

Note that the path leading to Stockholm described in the previous set of notes relied on the existence of Drosophila genes and proteins in an accessible database. Before 2000, no database contained entries for more than a small fraction of genes and proteins from Drosophila. Before 1995, no database contained entries for more than a small fraction of genes from any organism. GenBank and similar databases are so rich a source of information because of the hundreds of genome sequencing projects that have sprung up since 1995.

One can break up a genome project in many ways. Here's one:

In the next set of notes we'll examine the first step, using as an example the elucidation of the Drosophila genome, as described in: Myers EW et al (2000). A whole-genome assembly of Drosophila.

You'll eventually want to read this article many times, at least the first four pages (up to but not including Characteristics of the Drosophila assembly). For now, just skim those pages to get an idea of what's in store.

To get the article, go to NCBI (perhaps through the Links page on the course web site) and use PubMed to search for it. If you have a problem getting to the full article, solve it! Librarians are very helpful people. I also know how to do it, as do many of your peers. If you choose to print out the article, choose the Full Text (PDF) link under Article Views.

As you read the article, generate questions, particularly on issues that are essential for you to understand how genome sequences are elucidated. I have tried to anticipate some questions and provide some ways for you to answer them.
The main task for today is to understand some of the techniques used in the paper. I know you are capable of finding background on the web, but I've saved you some trouble by gathering together some useful links (I won't always be so helpful). Use them or anything else you like to get the basic idea.

What is shotgun sequencing?
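If it helps to have something concrete in mind while you read, here is a toy sketch of the basic idea: break a sequence into short reads taken from random, unrecorded positions, then rebuild it by merging reads that overlap. Everything in it is my own illustration and an assumption, not the method of Myers et al.: the function names (shotgun_reads, overlap, greedy_assemble), the 14-letter "genome", and the simple greedy merging rule are all hypothetical. The assembler in the paper has to cope with sequencing errors, repeats, and millions of reads, none of which this sketch attempts.

```python
import random

def shotgun_reads(genome, read_len, n_reads, seed=0):
    """Sample n_reads substrings of length read_len from random positions.
    The start positions are thrown away, as in real shotgun sequencing."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(genome) - read_len + 1) for _ in range(n_reads)]
    return [genome[s:s + read_len] for s in starts]

def overlap(a, b, min_len):
    """Length of the longest suffix of a that is also a prefix of b.
    Returns 0 if that length is less than min_len."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Toy greedy assembly: repeatedly merge the two contigs that share
    the largest suffix/prefix overlap until no overlap of min_len remains."""
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        # Discard any contig wholly contained in another contig.
        contigs = [c for c in contigs
                   if not any(c != d and c in d for d in contigs)]
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            return contigs  # nothing left to merge
        merged = contigs[best_i] + contigs[best_j][best_k:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]

if __name__ == "__main__":
    genome = "ACGTTGCAAGGCTA"  # made-up sequence with no repeated 3-letter words
    # Four overlapping reads, deliberately listed out of order.
    reads = ["CAAGGC", "ACGTTG", "AGGCTA", "TTGCAA"]
    print(greedy_assemble(reads))
    # Randomly sampled reads; whether they cover the whole sequence is up to chance.
    print(greedy_assemble(shotgun_reads(genome, read_len=6, n_reads=20)))
```

On the four hand-picked reads the merging recovers the original sequence as a single contig. With randomly sampled reads you may get one contig or several, depending on whether chance left any stretch uncovered. Try adding a repeated stretch to the made-up sequence and see what happens to the assembly; that is a good question to carry into the paper.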