ATAC-seq data analysis

Nicole Oatman; Nupur Dasgupta; Priyanka Arora; Kwangmin Choi; Mruniya V. Gawali; Nishtha Gupta; Sreeja Parameswaran; Joseph Salomone; Julie A. Reisz; Sean Lawler; Frank Furnari; Cameron Brennan; Jianqiang Wu; Larry Sallans; Gary Gudelsky; Pankaj Desai; Angelo D’Alessandro; Kakajan Komurov

This method is extracted from research article: Sci Adv, Feb 2021

Mechanisms of stearoyl CoA desaturase inhibitor sensitivity and acquired resistance in cancer

DOI: 10.1126/sciadv.abd7459

FASTQ files, comprising 222,242,982 and 154,469,827 total sequencing reads for the parental and acquired resistance cell ATAC-seq experiments, respectively, were analyzed using the MARIO pipeline (69). Briefly, the pipeline first runs quality control (QC) on the FASTQ files containing the sequence reads using FastQC (v0.11.2) (www.bioinformatics.babraham.ac.uk/projects/fastqc/). If FastQC detects adapter sequences, the pipeline runs the FASTQ files through Trim Galore (v0.4.2) (www.bioinformatics.babraham.ac.uk/projects/trim_galore/), a wrapper script that runs cutadapt (v1.8.1) (70) to remove the detected adapter sequences from the reads. The quality-controlled reads were then aligned to the human reference genome (hg19/GRCh37) using bowtie2 (v2.3.4.1) (71). The aligned reads (in BAM format) were then sorted using samtools (v1.8.0) (72), and duplicate reads were removed using picard (v1.89) (https://broadinstitute.github.io/picard/). Last, peaks were called using MACS2 (v2.1.0) (https://github.com/taoliu/MACS), resulting in 54,415 and 60,339 ATAC-seq peaks for the parental and acquired resistance datasets, respectively.
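The order of the steps above can be sketched as a list of command lines. This is only an illustration, not the MARIO pipeline itself: the source does not give the exact options used, so the single-end read layout, the file names, the `build_commands` helper, the `adapters_detected` switch, and the per-tool picard jar name are all assumptions made for the example.

```python
from pathlib import Path


def build_commands(fastq, sample, bt2_index, outdir="out",
                   adapters_detected=True, picard_jar="MarkDuplicates.jar"):
    """Return the command lines for one sample, in execution order.

    Assumptions (not from the source): single-end reads, and an input named
    <sample>.fastq.gz so that Trim Galore writes <sample>_trimmed.fq.gz.
    """
    out = Path(outdir)
    sam = str(out / f"{sample}.sam")
    sorted_bam = str(out / f"{sample}.sorted.bam")
    dedup_bam = str(out / f"{sample}.dedup.bam")
    metrics = str(out / f"{sample}.dup_metrics.txt")

    # Step 1: read-level QC report.
    cmds = [["fastqc", fastq, "-o", outdir]]

    # Step 2 (conditional): Trim Galore calls cutadapt to strip adapters.
    reads = fastq
    if adapters_detected:
        cmds.append(["trim_galore", fastq, "-o", outdir])
        reads = str(out / f"{sample}_trimmed.fq.gz")

    cmds += [
        # Step 3: align to a prebuilt hg19 bowtie2 index.
        ["bowtie2", "-x", bt2_index, "-U", reads, "-S", sam],
        # Step 4: coordinate-sort the alignments into BAM.
        ["samtools", "sort", "-o", sorted_bam, sam],
        # Step 5: remove duplicates. Older picard releases ship one jar per
        # tool; newer ones use "picard.jar MarkDuplicates" instead.
        ["java", "-jar", picard_jar, f"INPUT={sorted_bam}",
         f"OUTPUT={dedup_bam}", f"METRICS_FILE={metrics}",
         "REMOVE_DUPLICATES=true"],
        # Step 6: call peaks against the human genome size preset.
        ["macs2", "callpeak", "-t", dedup_bam, "-f", "BAM", "-g", "hs",
         "-n", sample, "--outdir", outdir],
    ]
    return cmds
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` in turn. Building the commands separately from running them makes it easy to inspect what would be executed before committing compute time.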

The ATAC-seq experimental design consisted of replicate experiments of parental cells and acquired resistance cells. After independently analyzing the four datasets using the MARIO pipeline, we concluded that the replicates were highly similar (based on peak overlap). The FASTQ files for the replicates were therefore concatenated into a single set of reads for each of the parental and acquired resistance experiments, and alignment and peak calling were performed as described above.
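One way to quantify replicate similarity by peak overlap, and then merge the replicate reads, is sketched below. The source does not state which overlap metric or threshold was used, so the "fraction of peaks in one replicate that overlap any peak in the other" measure, and both function names, are assumptions for illustration.

```python
import bisect
import shutil


def fraction_overlapping(peaks_a, peaks_b):
    """Fraction of peaks in peaks_a overlapping at least one peak in peaks_b.

    Peaks are (chrom, start, end) tuples with half-open coordinates.
    Assumes the peaks within peaks_b do not overlap each other, which is
    what makes the single-candidate check below sufficient.
    """
    index = {}
    for chrom, start, end in sorted(peaks_b):
        starts, ends = index.setdefault(chrom, ([], []))
        starts.append(start)
        ends.append(end)

    hits = 0
    for chrom, start, end in peaks_a:
        if chrom not in index:
            continue
        starts, ends = index[chrom]
        # Last peak in peaks_b that starts before this peak ends.
        i = bisect.bisect_left(starts, end) - 1
        if i >= 0 and ends[i] > start:
            hits += 1
    return hits / len(peaks_a) if peaks_a else 0.0


def concatenate_fastq(paths, merged_path):
    """Append replicate FASTQ files byte for byte into one file.

    Works for plain text and for gzip files, because concatenated gzip
    members form a valid gzip stream.
    """
    with open(merged_path, "wb") as merged:
        for path in paths:
            with open(path, "rb") as part:
                shutil.copyfileobj(part, merged)
```

Computing the fraction in both directions (A against B, then B against A) gives a fuller picture, since one replicate may simply have more peaks than the other.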

To identify regions of differential chromatin accessibility between the parental and acquired resistance ATAC-seq datasets, we used MAnorm (61) with default parameter settings for read shift size (100), peak width (1000), and distance cutoff (500). To identify peaks unique to each cell type, we used a P value cutoff of 0.01 and a fold change cutoff of 1. These settings resulted in 9,202 parental-specific peaks, 16,262 acquired resistance-specific peaks, and 41,727 common peaks. Each peak set was then examined for enriched TF binding site motif instances using the HOMER suite of tools (73), modified to include the set of motifs contained in the Cis-BP database (74).
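The cutoffs above can be illustrated with a small classifier over per-peak MAnorm output. This is a sketch under stated assumptions, not the authors' script: it assumes the "fold change cutoff of 1" applies to the absolute M value (the log2 ratio of normalized read densities), that positive M means higher accessibility in parental cells, and that the P value is on its natural scale. The function and label names are hypothetical.

```python
def classify_peak(m_value, p_value, p_cutoff=0.01, m_cutoff=1.0):
    """Label one peak as parental-specific, resistant-specific, or common.

    Assumption: m_value = log2(parental / resistant) after normalization.
    A peak is called specific only if it passes both cutoffs.
    """
    if p_value < p_cutoff and abs(m_value) >= m_cutoff:
        return "parental_specific" if m_value > 0 else "resistant_specific"
    return "common"


def count_classes(peaks):
    """Tally labels for an iterable of (m_value, p_value) pairs."""
    counts = {"parental_specific": 0, "resistant_specific": 0, "common": 0}
    for m_value, p_value in peaks:
        counts[classify_peak(m_value, p_value)] += 1
    return counts
```

For example, `classify_peak(1.8, 0.001)` returns `"parental_specific"` under these assumptions, while a peak with a large M value but a P value of 0.2 stays in the common set.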

Copyright and License information: ©2021 The Authors, some rights reserved; exclusive licensee American Association for the Advancement of Science. No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution License 4.0 (CC BY).

This is an open-access article distributed under the terms of the Creative Commons Attribution license, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
