The computational algorithm:
Since small RNAs need to be transcribed, we expected them to have transcriptional
signals. When investigating the 10 known small RNAs, we found that
they are all located in regions without annotated open reading frames on
either of the two DNA strands. We also found that 9 of the 10 known small
RNAs are conserved in other bacteria.
Our algorithm is therefore composed of the following steps:
1) Extract the "empty" regions (regions without an ORF or a known small
RNA, tRNA or rRNA gene on either of the two DNA strands). This was done
based on the Colibri database.
2) Search the empty regions of the genome for transcriptional signals.
We searched for sigma70 promoters and Rho-independent terminators.
3) Extract sequences with a predicted terminator 50-400 bases downstream
of a predicted promoter.
4) Check the sequences for conservation and select only those with good
conservation.
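
Steps 1 and 3 can be illustrated with a minimal Python sketch. This is not the code used in the study. The function names, the 1-based inclusive coordinates, the treatment of the genome as linear (ignoring wrap-around at the origin of the circular chromosome) and the use of a single coordinate per promoter or terminator are all assumptions made for illustration. Promoter and terminator prediction (step 2) and the conservation check (step 4) are not shown.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive; an assumed convention


def empty_regions(genome_length: int, annotated: List[Interval]) -> List[Interval]:
    """Step 1 (sketch): return stretches not covered by any annotated feature.

    `annotated` pools features from both strands (ORFs, known small RNAs,
    tRNA and rRNA genes), so a region is "empty" only if both strands are free.
    The genome is treated as linear here, which is a simplification.
    """
    regions = []
    cursor = 1  # first position not yet known to be covered
    for start, end in sorted(annotated):
        if start > cursor:
            regions.append((cursor, start - 1))
        cursor = max(cursor, end + 1)
    if cursor <= genome_length:
        regions.append((cursor, genome_length))
    return regions


def pair_signals(promoters: List[int], terminators: List[int], strand: str = "+",
                 min_gap: int = 50, max_gap: int = 400) -> List[Tuple[int, int]]:
    """Step 3 (sketch): pair each promoter with terminators 50-400 bases downstream.

    Positions are single coordinates on one strand. How the distance is
    measured (which end of each signal) is an assumption of this sketch.
    """
    pairs = []
    for p in sorted(promoters):
        for t in sorted(terminators):
            # "Downstream" means higher coordinates on "+", lower on "-".
            gap = t - p if strand == "+" else p - t
            if min_gap <= gap <= max_gap:
                pairs.append((p, t))
    return pairs
```

In practice one would call pair_signals once per empty region and per strand, passing only the predicted signals that fall inside that region, and then hand the resulting candidates to the conservation check.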
In total, 24 candidates were predicted. Of these, 23 were tested in the lab and 17 were shown to be true small RNAs, 14 of which were characterized.
A schematic representation of the Escherichia coli circular genome with locations of the small RNAs. The previously known small RNAs are colored orange. Novel small RNAs are colored red.