Data on de novo variants in SCZ and controls were obtained from denovo-db (accessed 11 April 2017) (61). Variants identified in patients with SCZ and in control subjects were stratified into four categories (all protein coding, missense, loss of function, and synonymous) based on the denovo-db annotations. To generate the list of variants affecting protein-coding sequence (“all protein coding”), we excluded synonymous variants as well as de novo variants outside exonic coding regions [those in 5′ untranslated regions (5′UTRs), 3′UTRs, intronic and intergenic regions, and noncoding exons]. The “loss-of-function” list included frameshift, stop-gained, stop-lost, start-lost, splice acceptor, and splice donor variants. We then compiled six lists of differentially expressed genes from the TCF4 knockdown (KD) experiments at day 3 and day 14, as follows: day3_all_sig (all differentially expressed genes, FDR < 0.05; 4898 genes), day3_up_sig (all up-regulated genes, FDR < 0.05; 2332 genes), day3_down_sig (all down-regulated genes, FDR < 0.05; 2566 genes), day14_all_sig (all differentially expressed genes, FDR < 0.05; 3156 genes), day14_up_sig (all up-regulated genes, FDR < 0.05; 1864 genes), and day14_down_sig (all down-regulated genes, FDR < 0.05; 1292 genes). Each of the de novo lists for SCZ and controls was intersected with the relevant list of differentially expressed genes from the TCF4 KD experiments to determine the number of shared genes in each comparison. A hypergeometric test was used to assess the statistical significance of each overlap (the number of genes shared between the two lists), with all protein-coding genes as the background set (20,338 genes; human genome build GRCh38.p10). The results were displayed as −log10(P value) for each comparison.
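
The overlap test described above can be illustrated with a minimal Python sketch. It is not the code used in the study: the function names (`hypergeom_upper_tail`, `overlap_test`), the toy gene symbols, and the toy background size are hypothetical, and the sketch assumes a one-sided (enrichment) test, P(X ≥ k), which the text does not specify. It also assumes that both gene lists contain only genes from the background set.

```python
from math import comb, log10


def hypergeom_upper_tail(k, N, K, n):
    """Return P(X >= k) for a hypergeometric variable X.

    N: size of the background set (e.g., all protein-coding genes)
    K: number of background genes in the differentially expressed list
    n: number of background genes in the de novo variant list
    k: observed number of genes shared between the two lists
    """
    upper = min(K, n)
    if k > upper:
        return 0.0
    # Exact integer arithmetic; the single division at the end yields a float.
    numerator = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(max(k, 0), upper + 1)
    )
    return numerator / comb(N, n)


def overlap_test(denovo_genes, de_genes, background_size):
    """Intersect two gene sets and test the overlap (assumed one-sided)."""
    shared = len(denovo_genes & de_genes)
    p_value = hypergeom_upper_tail(
        shared, background_size, len(de_genes), len(denovo_genes)
    )
    return shared, p_value


# Toy example with hypothetical gene symbols and a background of 10 genes.
# In the analysis described above, the background would be 20,338 genes.
de_genes = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
denovo_genes = {"GENE_B", "GENE_C", "GENE_X"}

shared, p_value = overlap_test(denovo_genes, de_genes, background_size=10)
neg_log10_p = -log10(p_value)  # value plotted for each comparison
print(shared, p_value, neg_log10_p)
```

In this toy case, two genes are shared and the tail probability is (C(4,2)·C(6,1) + C(4,3)·C(6,0)) / C(10,3) = 40/120 = 1/3. Summing the exact binomial terms avoids any dependence on an external statistics library; a library routine for the hypergeometric survival function would be a reasonable alternative.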

