Opengenome.net β

ORF prediction program

Revision as of 17:27, 17 June 2006 by 210.218.222.82 (talk)

ORF prediction programs are bioinformatics tools for predicting open reading frames (ORFs).

Numerous computational methods have been developed for predicting ORFs in prokaryotic genomes.
Most of them are ab initio prediction methods.
They report the genomic coordinates of candidate ORFs using their own statistical and mathematical algorithms:
GeneMark
ECOPARSE
GeneHacker
GeneMark.hmm
GLIMMER
GeneMarkS
EasyGene
ZCURVE
GeneLook
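
To make "genomic coordinates of candidate ORFs" concrete, the following is a minimal Python sketch of a naive six-frame ORF scan. It is not the algorithm of any program listed above, which rely on statistical models; it only looks for an ATG start codon followed in-frame by a stop codon. The function names `reverse_complement` and `find_orfs` and the `min_codons` parameter are assumptions made for this illustration.

```python
# Naive six-frame ORF scan (illustrative only; real predictors use statistical models).
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=2):
    """Return sorted (start, end, strand) tuples for candidate ORFs.

    Coordinates are 1-based, inclusive, and always given on the forward
    strand. The stop codon is included in the reported span, and
    min_codons counts the start and stop codons as well.
    """
    seq = seq.upper()
    n = len(seq)
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # index of the first ATG seen since the last stop
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START_CODON:
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    end = i + 3  # exclusive end, stop codon included
                    if (end - start) // 3 >= min_codons:
                        if strand == "+":
                            orfs.append((start + 1, end, "+"))
                        else:
                            # Map reverse-strand indices back to forward coordinates.
                            orfs.append((n - end + 1, n - start, "-"))
                    start = None
    return sorted(orfs)


if __name__ == "__main__":
    print(find_orfs("CCATGAAATAGCC"))
```

A scan like this reports every start-to-stop stretch, including many that are not real genes, which is why the programs above add statistical scoring on top of it.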

However, the ORFs predicted by such programs usually require further time-consuming manual processing, mainly because the predictions have low accuracy and lack supporting evidence. Moreover, their accuracy usually depends on the quality of the training sets and/or ‘seed’ ORFs, which need manual validation for better performance. Even though GeneMarkS and GeneLook have increased prediction accuracy, and GeneLook and the modified EasyGene have automated those manual steps, they still do not provide comprehensive information on the predicted ORFs. Information such as frame-shifts, homology-based gene evidence, and best pairwise matches against other prokaryotes is invaluable for professional curation and large-scale comparative analysis.

To complement such ab initio prediction methods, some ORF prediction programs add homology-based methods: ORPHEUS, Critica, FrameD (17), and YACOP (18). ORPHEUS uses the DPS (DNA-Protein Search) program (19) to compare a given genomic sequence with a non-redundant protein sequence databank. FrameD can use BLASTX (20) output supplied by the user, and it reports predicted frame-shifts and regions conserved with other proteins. Critica uses the BLASTN program (20) to align a given genomic sequence with related sequences chosen from DNA databases. YACOP combines three gene-predicting programs: Critica, GLIMMER, and ZCURVE.
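
The idea of combining the output of several predictors can be sketched as a simple voting scheme. The Python example below is only an illustration under stated assumptions and is not YACOP's actual combination rule: it groups predictions that share a strand and stop position (different tools often agree on the stop but disagree on the start) and keeps those supported by at least `min_support` tools. The function name `consensus_orfs`, the tool labels, and the coordinates are all hypothetical.

```python
from collections import defaultdict


def consensus_orfs(predictions, min_support=2):
    """Combine ORF predictions from several tools by simple voting.

    predictions maps a tool name to a list of (start, end, strand) tuples
    with 1-based inclusive forward-strand coordinates. Predictions are
    grouped by (strand, stop position); for each group with enough
    support, the longest predicted ORF is kept together with the sorted
    list of tools that support it.
    """
    support = defaultdict(dict)  # (strand, stop) -> {tool: orf}
    for tool, orfs in predictions.items():
        for start, end, strand in orfs:
            # On the reverse strand the stop codon lies at the lower coordinate.
            stop = end if strand == "+" else start
            support[(strand, stop)][tool] = (start, end, strand)

    result = []
    for by_tool in support.values():
        if len(by_tool) >= min_support:
            longest = max(by_tool.values(), key=lambda orf: orf[1] - orf[0])
            result.append((longest, sorted(by_tool)))
    return sorted(result)


if __name__ == "__main__":
    # Hypothetical predictions from three unnamed tools.
    example = {
        "toolA": [(100, 400, "+"), (900, 1200, "-")],
        "toolB": [(130, 400, "+")],
        "toolC": [(100, 400, "+"), (2000, 2300, "+")],
    }
    for orf, tools in consensus_orfs(example):
        print(orf, tools)
```

Keeping the longest ORF per group is one arbitrary design choice; a curation pipeline might instead prefer the start site backed by homology evidence.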