Introduction to Bioinformatics - Biol 591             

 

SCENARIO
Prediction of gene class turned on by same stimulus
How to predict binding sites for key regulatory protein?

Exploiting all available information
That consensus sequence threw away a lot of information. The alignment of proven NtcA-binding sites shows that there are biases, some subtle some very evident, at most positions near the consensus sequence. This is illustrated graphically at the right.

In principle, it should be possible to take this snapshot of nucleotide bias and run it along the genome of Anabaena, saving those sequences that best fit the profile. That's easy to imagine as a picture, but computers deal with numbers.

How to quantitate and use positional frequency information?

back to main page     back