Fgenesh++ rice gene prediction

Martin Triska
Victor Solovyev
Ancha Baranova
Alexander Kel
Tatiana V. Tatarinova

Fgenesh++ (Find genes using Hidden Markov Models) [58–60] is an HMM-based ab initio gene prediction program [61]. We used the rice chromosomes (version MSU 7, [29]) to make the initial gene prediction set, applying the Fgenesh gene finder with generic parameters for monocot plants. From this set, we selected a subset of predicted genes encoding proteins that are highly homologous (BLAST E-value cut-off 1.0E-10) to known plant proteins from the NCBI non-redundant (NR) database. Based on this subset, we computed gene-finding parameters optimized for the rice genome and executed the Fgenesh++ pipeline to annotate the genes in the genomic scaffolds. The Fgenesh++ pipeline used all available supporting data, such as known transcripts and homologous protein sequences. NR plant transcripts and, specifically, rice transcripts were mapped to the rice genomic sequences, thereby identifying a set of potential splice sites. Plant proteins were mapped to the rice genomic contigs, and the high-scoring matches were selected to generate protein-supported gene predictions, so that only highly homologous proteins were used in gene identification.
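The homology-based selection of the training subset can be illustrated with a minimal sketch. It assumes the BLAST results are available in the standard 12-column tabular format (query ID in column 1, E-value in column 11); the function name, the example records, and the choice to treat the cut-off as inclusive are assumptions made for illustration and are not taken from the original pipeline.

```python
from typing import Iterable, Set

# E-value cut-off stated in the text; treating it as inclusive is an assumption.
EVALUE_CUTOFF = 1.0e-10


def queries_with_strong_hits(blast_lines: Iterable[str],
                             cutoff: float = EVALUE_CUTOFF) -> Set[str]:
    """Return IDs of predicted genes with at least one hit at or below the cut-off.

    Hypothetical helper: expects tab-separated lines in the standard
    12-column BLAST tabular layout.
    """
    supported = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed records
        query_id = fields[0]
        evalue = float(fields[10])
        if evalue <= cutoff:
            supported.add(query_id)
    return supported


# Made-up example records, for illustration only.
example = [
    "gene1\tprotA\t85.0\t300\t45\t0\t1\t300\t1\t300\t1e-50\t400",
    "gene2\tprotB\t35.0\t120\t78\t2\t1\t120\t5\t124\t1e-03\t45",
]
print(queries_with_strong_hits(example))
```

In this example only gene1 passes the cut-off, so it alone would enter the subset used to compute rice-specific parameters.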

Amino acid sequences from predicted rice genes were then compared to the protein sequences from the plant NR database using the 'bl2seq' routine. A similarity was considered significant if it had BLAST percent identity ≥ 50, BLAST score ≥ 100, coverage of the predicted protein ≥ 80% and coverage of the homologous protein ≥ 80%. BLAST analysis of the predicted sequences was also carried out against the O. sativa mRNA dataset, using an identity cutoff of >90%. Predictions that had both NR plant RefSeq and O. sativa mRNA support, as well as a 5′ UTR longer than 20 and shorter than 1000 nucleotides, were selected for the analysis.
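The selection criteria above can be summarized as a minimal sketch. The thresholds are those stated in the text; the `Prediction` record, its field names, and the helper functions are hypothetical and only illustrate how the three conditions combine.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    """Hypothetical per-gene summary; field names are illustrative assumptions."""
    pct_identity: float      # BLAST percent identity to the best plant NR protein
    blast_score: float       # BLAST score of that alignment
    query_coverage: float    # percent of the predicted protein covered
    subject_coverage: float  # percent of the homologous protein covered
    mrna_identity: float     # percent identity to the best O. sativa mRNA hit
    utr5_length: int         # 5' UTR length in nucleotides


def has_protein_support(p: Prediction) -> bool:
    # Identity >= 50, score >= 100, both coverages >= 80%.
    return (p.pct_identity >= 50 and p.blast_score >= 100
            and p.query_coverage >= 80 and p.subject_coverage >= 80)


def has_mrna_support(p: Prediction) -> bool:
    # Identity cutoff of > 90% against the O. sativa mRNA dataset.
    return p.mrna_identity > 90


def has_valid_utr(p: Prediction) -> bool:
    # 5' UTR longer than 20 and shorter than 1000 nucleotides.
    return 20 < p.utr5_length < 1000


def is_selected(p: Prediction) -> bool:
    # A prediction is kept only if all three conditions hold.
    return has_protein_support(p) and has_mrna_support(p) and has_valid_utr(p)


# Made-up values, for illustration only.
kept = Prediction(72.5, 350.0, 95.0, 88.0, 98.2, 140)
dropped = Prediction(72.5, 350.0, 95.0, 88.0, 98.2, 15)  # 5' UTR too short
print(is_selected(kept), is_selected(dropped))
```

Keeping the three checks as separate functions makes it easy to report which criterion excluded a given prediction.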

A GFF file with the Fgenesh++ gene predictions is available as a Supplemental Data file.
