k-mer counts.

Nathan B. Pincus
Egon A. Ozer
Jonathan P. Allen
Marcus Nguyen
James J. Davis
Deborah R. Winter
Chih-Hsien Chuang
Cheng-Hsun Chiu
Laura Zamorano
Antonio Oliver
Alan R. Hauser

k-mer counts (using either 8- or 10-bp k-mers) were determined for each genome using KMC3 (v3.0.0) (56). All k-mers occurring at least once in each genome’s FASTA file were identified using the kmc application, and a count file was then generated using the kmc_dump application. The unique k-mers identified across the training set of 115 P. aeruginosa genomes were used to construct a data frame of k-mer counts for each genome, which served as the k-mer feature set in subsequent machine learning analyses. For the 25 test set isolates, the feature set was defined as the counts of the k-mers previously identified in the training set.
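The feature-set construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: it assumes each count file holds one k-mer and its count per line separated by whitespace (the plain-text layout kmc_dump writes), and the function names `parse_dump` and `build_feature_table`, the genome labels, and the toy counts are all hypothetical. Plain lists stand in for the data frame so the sketch needs only the standard library.

```python
from collections import Counter


def parse_dump(text):
    """Parse 'kmer<whitespace>count' lines into a Counter (assumed dump layout)."""
    counts = Counter()
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        kmer, count = line.split()
        counts[kmer] = int(count)
    return counts


def build_feature_table(train, test):
    """Build count rows for training and test genomes over training k-mers only."""
    # Feature columns: every k-mer seen in at least one training genome.
    features = sorted(set().union(*train.values()))

    def row(counts):
        # A k-mer absent from a genome gets a count of 0.
        return [counts.get(kmer, 0) for kmer in features]

    train_rows = {name: row(counts) for name, counts in train.items()}
    # Test genomes are projected onto the training columns; k-mers that
    # occur only in test genomes are dropped.
    test_rows = {name: row(counts) for name, counts in test.items()}
    return features, train_rows, test_rows


# Toy data (hypothetical labels and counts, 8-bp k-mers).
train = {
    "train_A": parse_dump("AAAAAAAA\t3\nAAAAAAAC\t1\n"),
    "train_B": parse_dump("AAAAAAAC\t2\nAAAAAAAG\t5\n"),
}
test = {"test_X": parse_dump("AAAAAAAA\t4\nTTTTTTTT\t7\n")}

features, train_rows, test_rows = build_feature_table(train, test)
print(features)
print(train_rows)
print(test_rows)
```

In this toy case the test-only k-mer TTTTTTTT does not become a column, which mirrors how the test set features are restricted to k-mers seen during training. With real data, the rows could be loaded into a data frame keyed by genome name.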
