k-mer counts.

Nathan B. Pincus; Egon A. Ozer; Jonathan P. Allen; Marcus Nguyen; James J. Davis; Deborah R. Winter; Chih-Hsien Chuang; Cheng-Hsun Chiu; Laura Zamorano; Antonio Oliver; Alan R. Hauser

Concise Method

This method is extracted from research article: mBio, Aug 2020

A Genome-Based Model to Predict the Virulence of Pseudomonas aeruginosa Isolates

DOI: 10.1128/mBio.01527-20

k-mer counts (using either 8- or 10-bp k-mers) were determined for each genome using KMC3 (v3.0.0) (56). All k-mers occurring at least once in each genome’s FASTA file were identified using the kmc application, and a count file was generated using the kmc_dump application. The unique k-mers identified across the training set of 115 P. aeruginosa genomes were then used to construct a data frame of k-mer counts for each genome. This data frame served as the k-mer feature set in subsequent machine learning analyses. For the 25 test set isolates, the feature set was restricted to the k-mers previously identified in the training set, using each test isolate’s counts of those k-mers.
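The feature-set construction described above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that each count file contains one k-mer per line followed by whitespace and its count, and that a k-mer absent from a genome is recorded as 0. The function names (`parse_dump`, `build_vocabulary`, `to_matrix`), the genome labels, and the toy 3-bp k-mers are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, List, Tuple


def parse_dump(lines: Iterable[str]) -> Dict[str, int]:
    """Parse 'kmer<whitespace>count' lines (assumed count-file layout)."""
    counts: Dict[str, int] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        kmer, count = line.split()
        counts[kmer] = int(count)
    return counts


def build_vocabulary(training: Dict[str, Dict[str, int]]) -> List[str]:
    """Return every unique k-mer seen in any training genome, sorted."""
    vocab = set()
    for counts in training.values():
        vocab.update(counts)
    return sorted(vocab)


def to_matrix(
    genomes: Dict[str, Dict[str, int]], vocab: List[str]
) -> Tuple[List[str], List[List[int]]]:
    """Build one row per genome and one column per vocabulary k-mer.

    k-mers outside the vocabulary are ignored; k-mers missing from a
    genome are filled with 0 (an assumption of this sketch).
    """
    names = sorted(genomes)
    rows = [[genomes[name].get(kmer, 0) for kmer in vocab] for name in names]
    return names, rows


# Toy data standing in for parsed count files (hypothetical values).
train = {
    "train_A": parse_dump(["AAA\t2", "AAC\t1"]),
    "train_B": parse_dump(["AAC\t3", "ACG\t1"]),
}
test = {"test_X": parse_dump(["AAA\t1", "TTT\t5"])}

vocab = build_vocabulary(train)                  # columns come from training only
train_names, train_rows = to_matrix(train, vocab)
test_names, test_rows = to_matrix(test, vocab)   # 'TTT' is not a training k-mer
```

Defining the columns from the training genomes alone keeps the test matrix the same shape as the training matrix, so a model fitted on one can be applied to the other without seeing k-mers unique to the test isolates.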

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license.
