Biol 591
Fall 2002
Scenario: Identification of a possible regulatory site in genomic DNA
Our Story
Nitrogen-fixing cyanobacteria:
Eat air and prosper! Certain cyanobacteria, including Nostoc PCC 7120, are among the few creatures on earth able to survive on CO2 as a source of carbon, N2 as a source of nitrogen, water as a source of electrons, and sunlight as a source of energy. This is quite a trick, because the process of fixing carbon with electrons from water necessarily produces O2 as a byproduct, and the process of fixing N2 is irreversibly inactivated by tiny amounts of oxygen. Nostoc is able to protect the machinery of nitrogen fixation from inactivation by producing specialized cells, called heterocysts, that rigorously exclude oxygen from within them.
The cost of fixing nitrogen: How to pay only when necessary?
Heterocysts are expensive to make and maintain, however,
and you are interested in studying the mechanism by which Nostoc regulates
the appearance of heterocysts. When an alternative source of nitrogen is
present, Nostoc makes no heterocysts. When that source is consumed or removed,
vegetative cells differentiate into heterocysts within about 18 hours. How
do the cells sense nitrogen deprivation and translate that perception into
the induction of the genes necessary for heterocyst differentiation? At present,
the answer to this question is not known.
The discovery: starvation ==> *** NtcA-BINDING *** ==> heterocyst differentiation
You are studying the regulation of the gene hetR, whose
product is known to be critical in controlling heterocyst differentiation.
You're focusing on the protein HetQ, which you believe regulates the expression
of hetR. Your plan is to make random mutations in hetQ (which encodes HetQ),
hoping to understand from the resulting mutant protein how the regulation
is achieved. In examining the sequence upstream of hetQ, you happen to notice
the presence of the sequence:
atctGTAacatgagaTACacaatagcatttatatttgcttTAgtaTctct
The capital letters, you recognize, meet all the requirements of a binding site for the protein NtcA, known to mediate the expression of many genes sensitive to nitrogen deprivation. Maybe, just maybe, you have accidentally discovered the missing link that connects nitrogen deprivation to the regulation of heterocyst genes!
The discovery? How do you know?
Unfortunately, you need hard evidence that NtcA actually
binds to that site before anyone will believe your theory. And hard evidence
means spending the better part of a year measuring the binding of NtcA to
your sequence in the test tube. If it DOESN'T bind, then you've wasted a lot
of time. Is there any way to assess the LIKELIHOOD that NtcA will bind to
your sequence without actually having to do time-consuming experiments? How
can you tell whether the sequence you found might simply have arisen by chance,
without regard to function?
Problem
Use bioinformatic tools to assess the likelihood of encountering
a specific DNA sequence by chance.
Tools
Simulation
Generate a large number of random sequences. Ask in each case
whether the sequence fits the criteria for an NtcA binding site. Count how
many times it does and how many times it doesn't.
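The simulation could be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a definitive method. The pattern (GTA, 8 bases, TAC, 22 bases, TA, 3 bases, T) is simply read off the capital letters and spacing of the sequence found upstream of hetQ. The four bases are assumed equally likely unless other weights are supplied. The names NTCA_PATTERN, random_sequence, fits_ntca_site, and simulate are made up for illustration.

```python
import random
import re

# ASSUMPTION: pattern inferred from the capital letters in the scenario's
# sequence (GTA N8 TAC N22 TA N3 T); not an authoritative NtcA consensus.
NTCA_PATTERN = re.compile(r"GTA[ACGT]{8}TAC[ACGT]{22}TA[ACGT]{3}T")
SITE_LENGTH = 3 + 8 + 3 + 22 + 2 + 3 + 1  # 42 bases in total


def random_sequence(length, rng, weights=(0.25, 0.25, 0.25, 0.25)):
    """Return a random DNA string; weights give frequencies of A, C, G, T.

    ASSUMPTION: equal base frequencies by default.
    """
    return "".join(rng.choices("ACGT", weights=weights, k=length))


def fits_ntca_site(sequence):
    """True if the whole sequence fits the assumed NtcA pattern."""
    return NTCA_PATTERN.fullmatch(sequence.upper()) is not None


def simulate(trials, seed=0):
    """Generate random 42-base sequences; return (fits, does_not_fit)."""
    rng = random.Random(seed)
    hits = sum(
        fits_ntca_site(random_sequence(SITE_LENGTH, rng)) for _ in range(trials)
    )
    return hits, trials - hits


if __name__ == "__main__":
    hits, misses = simulate(100_000)
    print("fits:", hits, "does not fit:", misses)
    # Expected fraction with 9 fixed positions and equal base frequencies
    print("expected fraction:", 0.25 ** 9)
```

With nine fixed positions and equal base frequencies, the expected fraction of random 42-base sequences that fit is (1/4)^9, or about 4 in a million, so a run needs millions of trials before the count of fits becomes meaningful. Supplying base frequencies estimated from the Nostoc genome through the weights argument would likely make the simulation more realistic.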
Pattern recognition
Scan the genome of Nostoc PCC 7120 and count how
many sequences fit the pattern of an NtcA binding site.
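The genome scan could be sketched in Python along these lines. Again this is only a sketch under stated assumptions. The pattern is the one read off the capital letters of the sequence upstream of hetQ. The genome is assumed to be available as a FASTA file with a single sequence (the file name in the example is hypothetical). The names read_fasta, reverse_complement, and find_sites are made up for illustration. Both strands are searched, and a lookahead is used so that overlapping sites are all counted.

```python
import re

# ASSUMPTION: same inferred pattern (GTA N8 TAC N22 TA N3 T), wrapped in a
# lookahead so that overlapping matches are reported as well.
NTCA_PATTERN = re.compile(r"(?=(GTA[ACGT]{8}TAC[ACGT]{22}TA[ACGT]{3}T))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def read_fasta(path):
    """Read a FASTA file holding one sequence; header lines (>) are skipped."""
    with open(path) as handle:
        return "".join(
            line.strip() for line in handle if not line.startswith(">")
        )


def reverse_complement(sequence):
    """Return the reverse complement of an upper-case DNA string."""
    return sequence.translate(COMPLEMENT)[::-1]


def find_sites(genome):
    """Return (strand, start, site) for every fit to the assumed pattern.

    Starts are 0-based. For the '-' strand they are positions within the
    reverse-complemented sequence, not within the original one.
    """
    genome = genome.upper()
    hits = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for match in NTCA_PATTERN.finditer(seq):
            hits.append((strand, match.start(), match.group(1)))
    return hits


if __name__ == "__main__":
    # Hypothetical file name; substitute the actual genome file.
    genome = read_fasta("nostoc_pcc7120.fasta")
    sites = find_sites(genome)
    print("sequences fitting the pattern:", len(sites))
```

Comparing this count with the number expected by chance for a genome of the same length and base composition gives a rough sense of whether sequences fitting the pattern occur more often than chance alone would predict.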