Kmers were identified following the definitions established in [18]. Nullomers and nullpeptides were detected as previously described in [18] for each species at each kmer length.
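As an illustration only, nullomer detection at a given kmer length can be sketched as a set difference between all possible kmers and the kmers observed in a sequence. This is a minimal sketch and not the procedure of [18]: the function names `observed_kmers` and `nullomers` are hypothetical, the sequence is assumed to contain only characters from the given alphabet, and reverse complements are not considered. Exhaustive enumeration is practical only for small k, since the DNA kmer space grows as 4^k.

```python
from itertools import product


def observed_kmers(sequence, k):
    """Return the set of length-k substrings that occur in `sequence`."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def nullomers(sequence, k, alphabet="ACGT"):
    """Return every length-k string over `alphabet` that is absent from `sequence`.

    Hypothetical helper for illustration; pass the 20 amino acid letters
    as `alphabet` to obtain nullpeptides from a protein sequence instead.
    """
    all_kmers = {"".join(p) for p in product(alphabet, repeat=k)}
    return all_kmers - observed_kmers(sequence, k)


if __name__ == "__main__":
    # Toy sequence: three of the sixteen possible 2-mers are observed.
    missing = nullomers("ACGT", 2)
    print(len(missing))
```

For genome-scale data and longer kmers, a dedicated kmer counter or a bit-array index would be needed in place of Python sets.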
Identification of nucleic and peptide quasi-primes.
For each reference genome, DNA quasi-primes were identified as kmers that were present in that genome and absent (i.e., nullomers) in every other reference genome. Similarly, for each reference proteome, peptide quasi-primes were identified as kmers that were present in that proteome and absent (i.e., nullpeptides) in every other reference proteome.
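The definition above amounts to selecting kmers that occur in exactly one species. A minimal sketch, assuming the kmers of each species have already been collected into a set (the function name `quasi_primes` and the input layout are hypothetical, not taken from the original pipeline):

```python
from collections import Counter


def quasi_primes(kmers_by_species):
    """Map each species to the kmers found in it and in no other species.

    `kmers_by_species` is assumed to be a dict of species name -> iterable
    of kmers (DNA or peptide) observed in that reference genome/proteome.
    """
    kmer_sets = {sp: set(kmers) for sp, kmers in kmers_by_species.items()}

    # Count in how many species each kmer appears (once per species at most).
    species_count = Counter()
    for kmers in kmer_sets.values():
        species_count.update(kmers)

    # A quasi-prime is present in one species and a nullomer in all others.
    return {
        sp: {kmer for kmer in kmers if species_count[kmer] == 1}
        for sp, kmers in kmer_sets.items()
    }


if __name__ == "__main__":
    toy = {
        "species_a": {"AAA", "AAC"},
        "species_b": {"AAC", "CCC"},
        "species_c": {"AAC", "GGG", "CCC"},
    }
    print(quasi_primes(toy))
```

Counting species membership once, rather than comparing every pair of species, keeps the work proportional to the total number of distinct kmers.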
Nucleic quasi-primes were identified at a kmer length of 16 bp, the shortest kmer length at which we observed DNA quasi-primes. Similarly, peptide quasi-primes were identified at kmer lengths of six and seven amino acids, since these were the shortest peptide lengths at which we observed quasi-primes.
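One way to find the shortest kmer length at which quasi-primes appear is to scan candidate lengths in increasing order and stop at the first length where some kmer occurs in exactly one species. The sketch below is an assumption about how such a scan could be written, not the authors' implementation. The names `kmer_set` and `shortest_quasi_prime_length` are hypothetical, and each species is represented by a single toy sequence string.

```python
from collections import Counter


def kmer_set(sequence, k):
    """Return the set of length-k substrings of `sequence`."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def shortest_quasi_prime_length(sequences_by_species, k_values):
    """Return the smallest k in `k_values` with at least one quasi-prime.

    Returns None if no tested length yields a kmer unique to one species.
    """
    for k in sorted(k_values):
        species_count = Counter()
        for sequence in sequences_by_species.values():
            species_count.update(kmer_set(sequence, k))
        # Any kmer seen in exactly one species is a quasi-prime at this k.
        if any(n == 1 for n in species_count.values()):
            return k
    return None


if __name__ == "__main__":
    toy = {"x": "ACAC", "y": "CACA"}
    print(shortest_quasi_prime_length(toy, range(1, 5)))
```

In this toy input the two sequences share all of their kmers up to length three, so the scan only succeeds at length four.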