Questionnaire: Genome Analysis I

BNFO 301	Introduction to Bioinformatics Questionnaire on Genome Analysis - Tour of Rohwer and Edwards (2002) II (please press SUBMIT button when finished) (and please don't use quotation marks in your answers. They break the questionnaire program!)	Spring 2010

I. Basic Information

A. Your name

II. Tour of Mendel (1866) (and statistics in general)

In bioinformatics, the road to perdition generally goes right through a black box. In any field, if you use a formula you don't understand or rely on a mysterious tool that "just works" (except when it doesn't), you'll sometimes get burned. In bioinformatics, where big data sets magnify the effects of small misunderstandings, you'll find yourself frequently in a conflagration. Don't do it! Repent before it's too late!
Statistics are designed to help us keep our feet on firm ground, but to most people they are themselves black boxes, invitations to use formulas you don't understand. Some of us (not many, and certainly not me!) gain insight into statistical formulas by deriving them from first principles. If you're in the other camp, connecting them to concrete simulations may help you see what they do and when to use them (and when not to).
This is why you've gone through the Mendel tour constructing a simulation that cranks out a chi-squared result, and why you've made an analogous simulation for the t-test.
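
For anyone who wants to see the idea outside the tour, here is a minimal sketch in Python of a simulation that cranks out a chi-squared result. It is not the code from the tour: the function names are made up for illustration, the counts (305 dominant, 95 recessive) are hypothetical rather than Mendel's, and the 3:1 expectation is passed in as an assumed probability of 0.75. It simulates many crosses under that expectation and reports how often the simulated chi-squared statistic is at least as large as the observed one.

```python
import random


def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def simulate_cross(n_offspring, p_dominant, rng):
    """Simulate one cross; return [dominant count, recessive count]."""
    dominant = sum(1 for _ in range(n_offspring) if rng.random() < p_dominant)
    return [dominant, n_offspring - dominant]


def simulated_p_value(observed, p_dominant=0.75, trials=2000, seed=0):
    """Estimate a p-value by simulation instead of a chi-squared table.

    Returns the observed statistic and the fraction of simulated crosses
    whose statistic is at least as large as the observed one.
    """
    rng = random.Random(seed)  # seeded so that reruns give the same answer
    n = sum(observed)
    expected = [n * p_dominant, n * (1 - p_dominant)]
    observed_stat = chi_squared(observed, expected)
    hits = 0
    for _ in range(trials):
        simulated = simulate_cross(n, p_dominant, rng)
        if chi_squared(simulated, expected) >= observed_stat:
            hits += 1
    return observed_stat, hits / trials


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the calculation.
    stat, p = simulated_p_value([305, 95])
    print("chi-squared statistic:", round(stat, 4))
    print("simulated p-value:", p)
```

With enough trials, the simulated fraction should land close to the p-value a computed chi-squared test gives for the same counts (one degree of freedom here), which is the comparison the second question below asks about.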

How far were you able to get in simulating Mendel's experiment?
Have you compared the results of your simulation with a computed chi-squared test?
How far were you able to get in simulating a t-test (either comparing the frequencies of "ATG" in overlapping vs nonoverlapping genes or analyzing coral reef phages in problem set 6)?
Do you see how the two simulations are inherently different (because the circumstances in which chi-squared tests are appropriate are inherently different from the circumstances in which t-tests are appropriate)?
Any other questions on your mind about the tour or statistics?

III. Tour of Rohwer and Edwards (2002)

Do you understand the utility of constructing phylogenetic trees?
Are you convinced (by computational experiments you have performed) that there is no single protein found in all phage sequences?
Were you able to use BlastP (SEQUENCE-SIMILAR-TO) in order to measure the relatedness of phage genomes in the M13 family?
Can you detect any problems with the Blast method? ... perhaps problems that the authors addressed with the second method?
Any other questions on your mind about the article or related matters?

IV. Miscellaneous

Put here any miscellaneous comments, questions, suggestions, concerns you may have.
Doing anything interesting over break?

Thanks!

REMEMBER TO CLICK SUBMIT! (or click RESET to start over)