Mouse embryonic stem cell data processing

Dohoon Lee, Jeewon Yang, Sun Kim

To evaluate the utility of Chromoformer for species other than human, we processed the ENCODE reference epigenome of the ES-Bruce4 mouse embryonic stem cell (mESC) line, starting from its raw histone ChIP-seq reads (Supplementary Table 2). For consistency with the human data, the processing pipeline followed that of the Roadmap Epigenomics Project, as described below. After downloading the FASTQ files, histone ChIP-seq reads were first aligned to the mm9 reference genome using bwa v0.7.17-r1188 (ref. 50). To normalize for the effect of read length, each aligned read was then truncated to at most 36 bp. Read depths were likewise normalized by subsampling the alignments to at most 3 million reads. The processed alignments were converted to genome-wide read depth signals using bedtools genomecov. In addition, to determine the promoter-pCRE interactions used for Chromoformer training, we used the normalized interaction frequencies from publicly available Hi-C interaction matrices of mESC (ref. 41). The bulk RNA-seq gene expression profile for ES-Bruce4 cells was also obtained from ENCODE under file accession ENCFF166EXS [https://www.encodeproject.org/experiments/ENCSR000CGU]. The Gencode vM1 gene annotation was used to determine the transcription start site and promoter region of each gene.
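The two normalization steps and the depth calculation described above can be illustrated with a small, pure-Python sketch. It is not the authors' pipeline, which operated on real alignment files with bwa and bedtools. The function names (`truncate_read`, `subsample`, `depth`), the tuple representation of an alignment, the fixed random seed, and the choice to keep the 5' end of each read when truncating are all assumptions made for illustration.

```python
import random
from typing import List, Tuple

# Hypothetical in-memory alignment record: (chrom, start, end, strand),
# using 0-based, half-open coordinates as in BED.
Alignment = Tuple[str, int, int, str]


def truncate_read(aln: Alignment, max_len: int = 36) -> Alignment:
    """Truncate an alignment to at most max_len bp.

    Assumption: the 5' end of the read is kept, so '+' reads keep their
    start and '-' reads keep their end.
    """
    chrom, start, end, strand = aln
    if end - start <= max_len:
        return aln
    if strand == "+":
        return (chrom, start, start + max_len, strand)
    return (chrom, end - max_len, end, strand)


def subsample(alns: List[Alignment], max_reads: int, seed: int = 0) -> List[Alignment]:
    """Return at most max_reads alignments, sampled without replacement."""
    if len(alns) <= max_reads:
        return list(alns)
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    return rng.sample(alns, max_reads)


def depth(alns: List[Alignment], chrom: str, length: int) -> List[int]:
    """Per-base read depth on one chromosome (a toy analogue of a coverage track)."""
    cov = [0] * length
    for c, start, end, _ in alns:
        if c != chrom:
            continue
        for pos in range(max(start, 0), min(end, length)):
            cov[pos] += 1
    return cov


reads = [("chr1", 0, 50, "+"), ("chr1", 10, 60, "-"), ("chr1", 20, 40, "+")]
truncated = [truncate_read(r) for r in reads]
kept = subsample(truncated, max_reads=3_000_000)
coverage = depth(kept, "chr1", 60)
```

In this toy input, the first two reads are longer than 36 bp and are shortened, while the third is left unchanged. Because only three reads are present, the subsampling cap of 3 million has no effect here.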
