gkm-SVM

Dongwon Lee; Seong Kyu Han; Or Yaacov; Hanna Berk-Rauch; Prabhu Mathiyalagan; Santhi K. Ganesh; Aravinda Chakravarti

This method is extracted from research article: Cell Rep, Nov 2023

Tissue-specific and tissue-agnostic effects of genome sequence variation modulating blood pressure

DOI: 10.1016/j.celrep.2023.113351

We built gkm-SVM models following our previously established pipeline with minor modifications.^26,27 For each high-quality sample, as determined by gkmQC, we defined the positive training set starting from the top 100,000 open chromatin regions, ranked by the MACS2 p values obtained from our optimized pipeline described above. From these, we removed peaks with >1% N bases, peaks with >70% repeats, and commonly open regions (defined as regions active in at least 30% of samples across all ENCODE datasets), as previously described.^12,27 We further restricted the open chromatin regions to those overlapping H3K27ac peaks from the same tissue (Table S5). As the negative training set, we used an equal number of random genomic regions matched for length, GC content, and repeat fraction. To prevent potential bias caused by variable sequence length, we used fixed-length 600-bp regions for training, obtained by extending ±300 bp from peak summits.

We trained the models with the LS-GKM^26 software using l = 11, k = 7, d = 3, and t = 4 (weighted gkm kernels). For each sample, we averaged ten models, each trained with a different random sampling of the negative training set. After training, we combined the models from different samples (i.e., biological replicates) to generate one model per tissue. Training sets and final models are provided in Data S5.
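The region preparation and model averaging described above can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, coordinates are assumed to be 0-based and half-open, repeats are assumed to be marked by lowercase letters (a soft-masked genome), and averaging is assumed to mean taking the mean of per-sequence scores across models; the text does not specify any of these details.

```python
from statistics import mean

WINDOW_HALF = 300        # extend ±300 bp from the summit, giving 600-bp regions
MAX_N_FRAC = 0.01        # regions with >1% N bases are removed
MAX_REPEAT_FRAC = 0.70   # regions with >70% repeats are removed


def summit_window(chrom, summit, half=WINDOW_HALF):
    """Return a fixed-length (chrom, start, end) interval centered on a summit.

    Assumes 0-based, half-open coordinates. Windows that would run past a
    chromosome end are not handled here.
    """
    return chrom, summit - half, summit + half


def passes_sequence_filters(seq):
    """Apply the N-base and repeat-fraction thresholds to one sequence.

    Assumption: lowercase letters mark repeats (soft-masked sequence).
    """
    if not seq:
        return False
    n_frac = sum(base in "Nn" for base in seq) / len(seq)
    repeat_frac = sum(base.islower() for base in seq) / len(seq)
    return n_frac <= MAX_N_FRAC and repeat_frac <= MAX_REPEAT_FRAC


def average_scores(per_model_scores):
    """Average scores element-wise across models.

    per_model_scores holds one list of scores per model, all in the same
    sequence order. Assumption: this is one plausible reading of "averaged
    ten models"; the source does not say how the averaging was done.
    """
    return [mean(column) for column in zip(*per_model_scores)]
```

For example, `summit_window("chr1", 1000)` gives the interval from 700 to 1300 on chr1, and `average_scores` would be called with ten score lists, one per model trained on a different negative set.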

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
