CDF curves were generated as follows. For each gene set plotted, the log2(fold change) in expression of each gene was looked up in the “all genes” table using the “VLOOKUP” function in Excel. These values were used to generate CDF curves in Prism 7 with the following workflow: Column analysis (Frequency distribution) > Create (Cumulative frequency distribution) > Tabulate [Relative frequency (fractions)] > Bin range (Auto) > Bin width (No bins. Tabulate exact cumulative frequency) > New graph (Create a new graph of the results). A histogram of the data was plotted in the final graph. Some representative CDF plots were also cross-checked in R.
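As an illustration only, the lookup and the exact (unbinned) cumulative frequency tabulation can be sketched in Python. This is not the Excel/Prism workflow itself; the function names, the dictionary standing in for the “all genes” table, and the example values are hypothetical.

```python
from collections import Counter

def lookup_fold_changes(gene_set, all_genes):
    """Mimic a spreadsheet lookup: return log2(fold change) values for the
    genes of a set that are present in the full table (hypothetical helper)."""
    return [all_genes[g] for g in gene_set if g in all_genes]

def empirical_cdf(values):
    """Tabulate the exact cumulative relative frequency, without binning.

    Returns the sorted unique values and, for each, the fraction of
    observations less than or equal to it."""
    n = len(values)
    if n == 0:
        return [], []
    counts = Counter(values)
    xs = sorted(counts)
    fractions = []
    running = 0
    for x in xs:
        running += counts[x]
        fractions.append(running / n)
    return xs, fractions

# Made-up values for illustration; "GeneX" is absent from the table and is skipped.
all_genes = {"GeneA": -1.0, "GeneB": 0.5, "GeneC": 0.5, "GeneD": 2.0}
values = lookup_fold_changes(["GeneA", "GeneB", "GeneC", "GeneX"], all_genes)
xs, fractions = empirical_cdf(values)
```

Plotting `fractions` against `xs` as a step function gives the CDF curve for that gene set; tied values collapse into a single step.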

Plots depicting enrichment of ChIP-seq sites around Bcl11b-dependent Up or Down genes were generated by counting the ChIP-seq peaks in each 1-kbp interval upstream or downstream of the TSSs of the genes in each category, within the indicated distance, and normalizing these counts against the total number of Tn/Tr-common and Tr-specific Bcl11b peaks (Fig. 4D) or Bcl11b-specific and Bcl11b-Foxp3-Overlapping sites (Fig. 4H).
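One possible reading of this normalization is sketched below in Python. Several details are assumptions not stated in the text: peaks are represented by a single position, upstream and downstream are defined relative to gene strand, the ±5 kbp limit is a placeholder for the "indicated distance", and all names and coordinates are hypothetical.

```python
from collections import Counter

BIN_SIZE = 1000  # 1-kbp intervals

def binned_peak_fraction(tss_list, peak_positions, max_dist=5000):
    """Count peaks per 1-kbp bin relative to each TSS, then divide by the
    total number of peaks in the peak class.

    tss_list:       [(chrom, tss_position, strand)] for genes in one category
    peak_positions: [(chrom, position)] for one class of peaks
    Bin b covers offsets [b * 1000, (b + 1) * 1000); negative bins are upstream.
    """
    total = len(peak_positions)
    if total == 0:
        return {}
    counts = Counter()
    for chrom, tss, strand in tss_list:
        for p_chrom, p_pos in peak_positions:
            if p_chrom != chrom:
                continue
            offset = p_pos - tss
            if strand == "-":
                offset = -offset  # assumed: flip so that negative means upstream
            if abs(offset) >= max_dist:
                continue
            counts[offset // BIN_SIZE] += 1
    return {b: c / total for b, c in sorted(counts.items())}

# Made-up coordinates for illustration.
genes = [("chr1", 10000, "+"), ("chr1", 50000, "-")]
peaks = [("chr1", 10500), ("chr1", 9200), ("chr1", 50300), ("chr2", 10100)]
profile = binned_peak_fraction(genes, peaks)
```

Here `profile` maps each bin index to the normalized peak count, so profiles computed for peak classes of different sizes can be compared on one axis.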

For motif analysis, Bcl11b binding sites (peaks) located near the TSSs of the given genes (within ±20 kb of the TSS) were used to identify transcription factor binding sites (motifs) using HOMER. The peak size was set to ±100 bp around the peak center, and repeat sequences were masked (-mask option).
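The peak selection and resizing that precede the motif search can be sketched as follows. This illustrates only the coordinate handling, not HOMER itself; taking the midpoint of each peak as its center, and all names and coordinates, are assumptions.

```python
def peaks_near_tss(peaks, tss_by_chrom, window=20_000, half_width=100):
    """Keep peaks whose center lies within +/- `window` bp of any TSS on the
    same chromosome, and resize each kept peak to +/- `half_width` bp around
    its center.

    peaks:        [(chrom, start, end)]
    tss_by_chrom: {chrom: [tss_position, ...]}
    """
    selected = []
    for chrom, start, end in peaks:
        center = (start + end) // 2  # assumed: peak center is the midpoint
        if any(abs(center - tss) <= window for tss in tss_by_chrom.get(chrom, [])):
            # Clamp at 0 so resized intervals never start before the chromosome.
            selected.append((chrom, max(0, center - half_width), center + half_width))
    return selected

# Made-up coordinates for illustration.
peaks = [("chr1", 1000, 1400), ("chr1", 90000, 90500), ("chr2", 50, 150)]
tss_by_chrom = {"chr1": [15000], "chr2": [100]}
regions = peaks_near_tss(peaks, tss_by_chrom)
```

The resulting 200-bp regions are the kind of input a motif-discovery tool would then scan, with repeat masking applied by the tool.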

