When we perform a BLAST search against a sequence database with a query sequence, the BLAST program returns the target-database sequences that produce significant alignments, which we refer to as hits. Between the query and a hit sequence there may be several locally optimal gapped pairwise alignments, which we refer to as high-scoring segment pairs, or HSPs. These definitions of hits and HSPs differ slightly from those used by Altschul and colleagues [14], but follow the structure of the XML output format produced by NCBI BLAST. In the XML output, the “Sequences Producing Significant Alignments” are presented as Iteration hits, and the “significant alignments” are listed as HSPs, even though they are gapped alignments. One hit can consist of many HSPs. Each HSP is scored from its aligned symbols using a scoring scheme (a substitution matrix and gap penalties). The score of a hit is the score of its highest-scoring HSP. The e-value of an HSP is computed from its score, the database size, and other statistical parameters. The reported e-value of a hit is the e-value of its highest-scoring HSP [15].
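
The hit-level rule described above (a hit takes the score and e-value of its highest-scoring HSP) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in this work: the function name `summarize_hits`, the sample identifiers, and all numbers are hypothetical, the XML fragment is heavily trimmed, and the bit score (`Hsp_bit-score`) is assumed to be the score used for ranking HSPs.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed fragment laid out like NCBI BLAST XML output.
# Identifiers and numbers are invented for illustration only.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_id>hypothetical_subject_1</Hit_id>
          <Hit_hsps>
            <Hsp><Hsp_bit-score>50.1</Hsp_bit-score><Hsp_evalue>1e-05</Hsp_evalue></Hsp>
            <Hsp><Hsp_bit-score>210.3</Hsp_bit-score><Hsp_evalue>3e-60</Hsp_evalue></Hsp>
          </Hit_hsps>
        </Hit>
        <Hit>
          <Hit_id>hypothetical_subject_2</Hit_id>
          <Hit_hsps>
            <Hsp><Hsp_bit-score>88.0</Hsp_bit-score><Hsp_evalue>2e-20</Hsp_evalue></Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def summarize_hits(xml_text):
    """Return one summary per hit, taken from its highest-scoring HSP."""
    root = ET.fromstring(xml_text)
    summaries = []
    for hit in root.iter("Hit"):
        # Collect (bit score, e-value) for every HSP that belongs to this hit.
        hsps = [
            (float(hsp.findtext("Hsp_bit-score")), float(hsp.findtext("Hsp_evalue")))
            for hsp in hit.iter("Hsp")
        ]
        if not hsps:
            continue  # a hit without HSPs has no score to report
        # The hit inherits both score and e-value from its best-scoring HSP.
        best_score, best_evalue = max(hsps, key=lambda pair: pair[0])
        summaries.append(
            {
                "hit_id": hit.findtext("Hit_id"),
                "n_hsps": len(hsps),
                "score": best_score,
                "evalue": best_evalue,
            }
        )
    return summaries


if __name__ == "__main__":
    for summary in summarize_hits(SAMPLE_XML):
        print(summary)
```

Note that the e-value is taken from the same HSP that supplies the score, rather than being minimized separately, which mirrors the definition given above.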