Each seed sequence is used as a PSI-BLAST [78] query (3 iterations, E-value cutoff of 0.005) against the organism-specific database. The full sequences of the hits (i.e., not just the regions locally aligned to part of a seed) are stored in the database together with their metadata, with one entry per accession number. These full sequences may contain additional associated domains not represented among the seed sequences. Thus, to cast an even broader net, the expanded set of sequences is clustered at 90% identity using CD-HIT, and the PSI-BLAST search is repeated once more with the resulting larger set of representative sequences as queries.
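The two-round search described above could be scripted roughly as follows. This is a minimal sketch and not the authors' code: the function names, file paths, and tabular output format are hypothetical. It also assumes that the stated 0.005 cutoff maps to the `-evalue` option of BLAST+ `psiblast` (it may instead refer to the profile inclusion threshold), and that CD-HIT is run with word size 5, the setting normally used for identity thresholds of 0.7 and above. Retrieval of the full hit sequences is left out.

```python
from typing import Dict, Iterable, List, Tuple


def psiblast_cmd(query_fasta: str, db: str, out_path: str,
                 iterations: int = 3, evalue: float = 0.005) -> List[str]:
    """Build a BLAST+ psiblast command line for one query file.

    Assumption: the 0.005 cutoff in the text is applied via -evalue.
    """
    return [
        "psiblast",
        "-query", query_fasta,
        "-db", db,
        "-num_iterations", str(iterations),
        "-evalue", str(evalue),
        "-outfmt", "6",  # tabular output; an illustrative choice
        "-out", out_path,
    ]


def cdhit_cmd(in_fasta: str, out_fasta: str, identity: float = 0.9) -> List[str]:
    """Build a CD-HIT command line that clusters at the given identity."""
    return [
        "cd-hit",
        "-i", in_fasta,
        "-o", out_fasta,
        "-c", str(identity),
        "-n", "5",  # word size suited to thresholds of 0.7 and above
    ]


def merge_by_accession(store: Dict[str, str],
                       hits: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Add (accession, full_sequence) pairs, keeping one entry per accession."""
    for accession, sequence in hits:
        # setdefault keeps the first sequence seen for an accession
        store.setdefault(accession, sequence)
    return store
```

With BLAST+ and CD-HIT installed, each command list can be passed to `subprocess.run(cmd, check=True)`: first once per seed sequence, then `cdhit_cmd` on the merged hit sequences, then `psiblast_cmd` again for each representative sequence.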
