Reads with the same barcode were combined, and read quality was assessed using FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Low-quality bases at the ends of reads were trimmed using Trimmomatic [66], followed by a second round of FastQC to assess read quality. High-quality reads were then aligned using Bowtie [67,68] with the following settings: -m 1 -k 1 -v 1. After alignment, SAM files were converted to BAM files and filtered to remove unmapped reads, chimeric alignments, low-quality alignments, and PCR duplicates using Picard [69] MarkDuplicates and samtools [70] with the settings -b -h -F 4 -F 1024 -F 2048 -q 30. SPP cross-correlation analysis [71] was carried out, and all samples passed the ENCODE standards for the normalized strand cross-correlation coefficient (NSC) and the relative strand cross-correlation coefficient (RSC) [72].
Peaks were then called using MACS2 [73,74] with the default settings for NKX2-1, H3K4me3, and H3K27ac, and with the broad setting for the broad histone mark H3K4me1. Peaks were filtered on the −log10 q value using the following thresholds: 10 for NKX2-1, 15 for H3K4me3 and H3K27ac, and 4 for H3K4me1. The remaining peaks were then filtered to remove sites overlapping the mm10 blacklist.
Differentially bound NKX2-1 peaks were identified using DiffBind [75], normalized for sample read depth and with a fixed peak width of 500 bp, for the following comparisons: controls versus mutants, AT1 versus AT2, and E14.5 whole lung versus other time points in AT1 and AT2 cells. Histone marks at NKX2-1-bound sites were quantified between AT1 and AT2 cells, also using DiffBind normalized for sample read depth but with a fixed peak width of 1000 bp. Foreground normalization was carried out using the fraction of reads in peaks (FRiP) calculated by DiffBind for all peaks in each sample of the same antibody type. These FRiP values were then multiplied by the post-filtering library read depth to scale MACS2 output bedgraph files as well as profile plots, tracks, and heatmaps in EA-seq [76].
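The q-value filtering and the FRiP-based foreground scaling described above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes MACS2 narrowPeak/broadPeak input, where column 9 holds the −log10 q value, and it assumes that a peak passes when its value is greater than or equal to the threshold. The text states only that FRiP is multiplied by the post-filtering read depth, so rescaling the signal to one million foreground reads is likewise an assumption. All function names are hypothetical.

```python
# Thresholds on -log10(q) taken from the text; everything else is illustrative.
QVALUE_THRESHOLDS = {
    "NKX2-1": 10.0,
    "H3K4me3": 15.0,
    "H3K27ac": 15.0,
    "H3K4me1": 4.0,
}


def filter_peaks(lines, antibody):
    """Keep MACS2 peak lines whose -log10(q) (column 9) meets the threshold.

    Assumption: a peak passes when its value is >= the threshold.
    """
    cutoff = QVALUE_THRESHOLDS[antibody]
    kept = []
    for line in lines:
        # Skip blank lines and optional header lines.
        if not line.strip() or line.startswith(("#", "track")):
            continue
        fields = line.rstrip("\n").split("\t")
        if float(fields[8]) >= cutoff:
            kept.append(line.rstrip("\n"))
    return kept


def foreground_reads(frip, read_depth):
    """Foreground read count: FRiP multiplied by post-filtering read depth."""
    return frip * read_depth


def scale_factor(frip, read_depth, per=1_000_000):
    """Assumed scaling: express signal per `per` foreground reads."""
    return per / foreground_reads(frip, read_depth)


def scale_bedgraph_line(line, factor):
    """Multiply the value column of one four-column bedgraph line by `factor`."""
    chrom, start, end, value = line.rstrip("\n").split("\t")
    return f"{chrom}\t{start}\t{end}\t{float(value) * factor:.4f}"


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    peaks = [
        "chr1\t100\t600\tp1\t200\t.\t8.5\t14.2\t12.1\t250",
        "chr1\t900\t1400\tp2\t50\t.\t3.1\t6.0\t4.5\t250",
    ]
    print(filter_peaks(peaks, "NKX2-1"))
    factor = scale_factor(frip=0.25, read_depth=20_000_000)
    print(scale_bedgraph_line("chr1\t0\t100\t10", factor))
```

With this kind of scaling, two samples with the same read depth but different enrichment efficiency receive different factors, which is the point of normalizing to foreground rather than total reads.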
HOMER motif analysis [77] was carried out to identify possible cofactors interacting with NKX2-1 for all peaks associated with the list. Additional motif analysis was carried out on the 3000 NKX2-1 peaks accessible across all cell types that had the lowest average H3K4me3 signal. NKX2-1 peaks from ChIP-seq at E14.5, in mature AT1 cells, and in mature AT2 cells were consolidated and used for differential analysis of NKX2-1 binding sites between E14.5 and AT1 or AT2 cells. This consolidated peak set was then cross-referenced with the differential NKX2-1 binding analysis of 10-week-old AT1 and AT2 cells to identify NKX2-1 peaks called only in the progenitor. These peak sets were also used to compare E14.5 and P7 samples. Raw signal averages and log2 fold change values were computed on the top 10% or 20% for each category over time and between controls and mutants. Binding sites were annotated to genes using ChIPseeker [78].
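As a rough illustration of the last quantification step, the sketch below selects the top 10% or 20% of sites and computes raw signal averages and a log2 fold change. The text does not say how sites were ranked or whether a pseudocount was used. This example therefore assumes ranking by control signal and a pseudocount of 1, and its function names are hypothetical.

```python
import math


def top_percent_indices(reference_signal, percent):
    """Indices of the top `percent` of sites, ranked by the reference signal.

    Assumption: ranking uses the reference (e.g. control) signal; the count
    is rounded up and at least one site is always returned.
    """
    n = max(1, (len(reference_signal) * percent + 99) // 100)
    order = sorted(
        range(len(reference_signal)),
        key=lambda i: reference_signal[i],
        reverse=True,
    )
    return order[:n]


def summarize(control, mutant, percent, pseudocount=1.0):
    """Return (control mean, mutant mean, log2 fold change) for the top sites.

    The log2 fold change is mutant over control; the pseudocount is an
    assumption added here to avoid division by zero.
    """
    idx = top_percent_indices(control, percent)
    control_mean = sum(control[i] for i in idx) / len(idx)
    mutant_mean = sum(mutant[i] for i in idx) / len(idx)
    lfc = math.log2((mutant_mean + pseudocount) / (control_mean + pseudocount))
    return control_mean, mutant_mean, lfc


if __name__ == "__main__":
    # Made-up per-site signal values, for illustration only.
    control = [7, 7, 1, 1, 1, 1, 1, 1, 1, 1]
    mutant = [1, 1, 5, 5, 5, 5, 5, 5, 5, 5]
    print(summarize(control, mutant, percent=20))
```

Ranking on one condition and then measuring both conditions at the same sites keeps the comparison paired; ranking each condition separately would compare different sets of sites.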