As above, the prediction was based on 80 experimentally determined Rho-independent
terminators, 75 from a compilation by d’Aubenton Carafa, and five from
the known sRNA genes: oxyS, micF, dsrA, spf (spot42) and csrB. We
characterized the properties of these terminators and applied this knowledge
in the prediction. Rho-independent terminators are known to form a
stem-loop structure. The sequences of the known terminators were therefore
folded with the Mfold program of GCG, an RNA folding program based on
free-energy minimization, which provided both the predicted secondary
structure and its stability. Almost all of the
known terminators formed a stem 5 to 10 base-pairs in length with a loop
of 3 to 8 bases. The stems were GC-rich, and most of them had at
least 60% GC base-pairs. For most structures the calculated free energy
was below –7 kcal/mol. Another known feature of Rho-independent
terminators is a uridine stretch that follows the stem. The average
number of uridine residues in the known terminators was four. In
most cases this stretch was located immediately downstream of the end of
the stem. Based on these properties, the search for terminators in the “empty”
regions involved the following steps: 1) A search for sequences that could form
a GC-rich stem with a loop, followed by at least four U residues. This
was done by searching for two sequences of the same length (5-10 bases),
each composed of at least 60% G/C, separated by three to eight bases
(the loop), and followed by a stretch of at least
four U residues. 2) The candidate sequences were folded with the Mfold
program, and structures with free energy values of at most –7 kcal/mol
were selected. These structures were then re-examined to confirm that the
fold produced by Mfold retained the structural characteristics defined in
the first step.
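
The two-step search described above can be illustrated with a minimal Python sketch. It is not the code used in this work. The function names (find_candidates, filter_by_energy) and the Candidate record are hypothetical, and the sketch assumes that the two stem arms must be perfect Watson-Crick reverse complements, a stricter requirement than the text states. The energy function is supplied by the caller and stands in for a folding program such as Mfold; no folding is performed here.

```python
from typing import Callable, Iterator, List, NamedTuple

# Watson-Crick pairs only; G-U wobble pairs are ignored (an assumption).
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


class Candidate(NamedTuple):
    start: int   # 0-based index of the first base of the 5' stem arm
    stem5: str   # 5' arm of the stem
    loop: str    # unpaired loop bases
    stem3: str   # 3' arm of the stem
    u_run: str   # U stretch immediately downstream of the stem


def gc_fraction(seq: str) -> float:
    """Fraction of G or C bases in a non-empty sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def is_reverse_complement(a: str, b: str) -> bool:
    """True if every base of a pairs with the mirrored base of b."""
    return len(a) == len(b) and all(
        PAIR.get(x) == y for x, y in zip(a, reversed(b))
    )


def find_candidates(
    seq: str,
    stem_range=(5, 10),   # stem length in bases, as in step 1
    loop_range=(3, 8),    # loop length in bases, as in step 1
    min_gc: float = 0.6,  # minimum G/C fraction of each stem arm
    min_u: int = 4,       # minimum number of U residues after the stem
) -> Iterator[Candidate]:
    """Step 1: scan for GC-rich stem-loops followed by a U stretch."""
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA input
    for start in range(len(rna)):
        for stem_len in range(stem_range[0], stem_range[1] + 1):
            stem5 = rna[start:start + stem_len]
            if len(stem5) < stem_len or gc_fraction(stem5) < min_gc:
                continue
            for loop_len in range(loop_range[0], loop_range[1] + 1):
                s3 = start + stem_len + loop_len
                stem3 = rna[s3:s3 + stem_len]
                tail = rna[s3 + stem_len:s3 + stem_len + min_u]
                if len(stem3) < stem_len or tail != "U" * min_u:
                    continue
                if gc_fraction(stem3) < min_gc:
                    continue
                if not is_reverse_complement(stem5, stem3):
                    continue
                yield Candidate(start, stem5, rna[start + stem_len:s3],
                                stem3, tail)


def filter_by_energy(
    candidates: List[Candidate],
    energy_fn: Callable[[Candidate], float],  # hypothetical folding wrapper
    max_energy: float = -7.0,                 # kcal/mol threshold of step 2
) -> List[Candidate]:
    """Step 2: keep candidates whose folding energy is at most max_energy."""
    return [c for c in candidates if energy_fn(c) <= max_energy]
```

For example, scanning the made-up sequence AAGCCGCTTCGGCGGCTTTTAA with find_candidates yields a single candidate with the stem arms GCCGC and GCGGC, the loop UUCG and a four-residue U stretch. In practice energy_fn would wrap a call to a folding program, and the re-examination of the folded structure described in the text would still have to be carried out separately.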