Prediction of Rho-independent terminators

As above, the prediction was based on 80 experimentally determined Rho-independent terminators: 75 from a compilation by d’Aubenton Carafa and five from the known sRNA genes oxyS, micF, dsrA, spf (spot42) and csrB. We characterized the properties of these terminators and applied this knowledge in the prediction.

Rho-independent terminators are well known to form a stem-loop structure. The sequences of the known terminators were therefore folded with an RNA folding algorithm based on free-energy considerations, which provides both the predicted secondary structure and its stability. This was carried out with the Mfold program of GCG. Almost all of the known terminators formed a stem of 5 to 10 base pairs with a loop of 3 to 8 bases. The stems were GC-rich, and in most of them at least 60% of the base pairs were GC. For most structures the calculated free energy was below –7 kcal/mol. Another known feature of Rho-independent terminators is a uridine stretch that follows the stem. The known terminators contained four uridine residues on average, and in most cases this stretch was located immediately downstream of the end of the stem.

Based on the above, the search for terminators in the “empty” regions involved two steps:
1) A search for sequences that could form a GC-rich stem with a loop, followed by at least four U residues. This was done by searching for two sequences of the same length (5 to 10 bases), composed of at least 60% G/C, separated by three to eight bases (the loop) and followed by a stretch of at least four U residues.
2) The candidate sequences were folded with the Mfold program, and structures with free energy values of at most –7 kcal/mol were selected. These structures were re-examined to confirm that the Mfold fold retained the characteristics of the structure identified in the first step.
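
The first search step can be illustrated with a short Python sketch. It is not the implementation used in this work. The function names and the returned tuple layout are hypothetical, and the sketch makes two assumptions that the text does not state: the two arms must be perfect Watson-Crick reverse complements, and the U stretch must begin immediately after the 3' arm. The second step (folding and the free energy cutoff) is not covered.

```python
# Illustrative sketch of step 1 only; names and matching rules are assumptions.
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def gc_fraction(seq):
    """Fraction of G or C bases in a non-empty sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq):
    """Watson-Crick reverse complement of an RNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_candidates(seq, stem_range=(5, 10), loop_range=(3, 8),
                    min_gc=0.6, min_u=4):
    """Return (start, stem_len, loop_len) for every candidate stem-loop.

    A candidate is a 5' arm of stem_len bases with at least min_gc G/C,
    a loop of loop_len bases, a 3' arm that is the exact reverse
    complement of the 5' arm (assumption), and min_u U residues
    directly after the 3' arm (assumption).
    """
    rna = seq.upper().replace("T", "U")  # accept DNA or RNA input
    n = len(rna)
    hits = []
    for start in range(n):
        for stem_len in range(stem_range[0], stem_range[1] + 1):
            left = rna[start:start + stem_len]
            if len(left) < stem_len:
                break  # longer arms cannot fit either
            if gc_fraction(left) < min_gc:
                continue
            for loop_len in range(loop_range[0], loop_range[1] + 1):
                right_start = start + stem_len + loop_len
                right_end = right_start + stem_len
                if right_end + min_u > n:
                    break  # longer loops only push the end further out
                # A perfect complement has the same GC content as the 5' arm,
                # so the GC threshold does not need to be checked again.
                if rna[right_start:right_end] != reverse_complement(left):
                    continue
                if rna[right_end:right_end + min_u] == "U" * min_u:
                    hits.append((start, stem_len, loop_len))
    return hits
```

For example, `find_candidates("AAGCCGCGAAAGCGGCUUUUAA")` reports a single candidate starting at position 2 (0-based) with a 5-base stem and a 4-base loop. Requiring exact complementarity keeps the sketch short. A search that tolerates mismatches or G-U pairs in the stem would return more candidates, which the folding step would then have to filter.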