Using bcftools (Li, 2011), we performed a locus-wise intersection of the following VCFs: 1000 Genomes (Auton et al., 2015), IGDP (Jacobs et al., 2019), the high-coverage Denisovan genome (Meyer et al., 2012), and the Altai and Vindija Neanderthal genomes (Prüfer et al., 2014). All VCFs used GRCh37/hg19 reference coordinates. Genotype matrices were constructed by parsing the output of bcftools query over 100 kbp windows. Within each window, we filtered out sites with a sample allele frequency below 5% or with more than 10% of genotypes missing, and then excluded windows with fewer than 20 segregating sites. Each genotype matrix was then resized and sorted as described for simulations. When data were considered to be phased, as for the CEU/YRI populations, we also treated the Neanderthal genotypes as if they were phased, taking the allele order from the REF/ALT columns in the VCF. Although this is equivalent to random phasing, both high-coverage Neanderthal individuals are highly inbred, so this is unlikely to be problematic in practice.
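
The per-window site filters can be illustrated with a minimal Python sketch. This is not the pipeline's code: the function names `parse_gt_line` and `filter_window`, the `-1` missing-allele code, and the input layout (one tab-separated line per site, a position followed by one GT string per sample, as a `bcftools query` format such as `%POS[\t%GT]\n` would produce) are all assumptions made for illustration. The sketch further assumes biallelic sites, measures frequency as the ALT allele frequency among called alleles, and counts missingness per allele rather than per genotype. Alleles are read in the order written, so an unphased genotype such as `0/1` is treated as if it were phased.

```python
from typing import List, Optional, Tuple

MISSING = -1  # hypothetical code for a missing allele


def parse_gt_line(line: str) -> Tuple[int, List[int]]:
    """Parse one tab-separated site line: POS, then one GT string per sample.

    Alleles are taken in the order written, so '0/1' and '0|1' both
    become [0, 1]; '.' becomes MISSING.
    """
    fields = line.rstrip("\n").split("\t")
    pos = int(fields[0])
    alleles: List[int] = []
    for gt in fields[1:]:
        for a in gt.replace("/", "|").split("|"):
            alleles.append(MISSING if a == "." else int(a))
    return pos, alleles


def filter_window(
    sites: List[List[int]],
    min_af: float = 0.05,
    max_missing: float = 0.10,
    min_sites: int = 20,
) -> Optional[List[List[int]]]:
    """Apply the site filters to one window; return None if the window is excluded.

    Each row of `sites` holds the allele codes (0 = REF, 1 = ALT) for one site.
    """
    kept = []
    for row in sites:
        called = [a for a in row if a != MISSING]
        if not called:
            continue
        # Drop sites with more than `max_missing` of alleles missing.
        if (len(row) - len(called)) / len(row) > max_missing:
            continue
        # Drop sites whose ALT allele frequency is below `min_af`.
        if sum(called) / len(called) < min_af:
            continue
        kept.append(row)
    # A site is segregating if both alleles are observed among called alleles.
    n_seg = sum(1 for row in kept if len({a for a in row if a != MISSING}) > 1)
    return kept if n_seg >= min_sites else None
```

A window is retained only if at least `min_sites` segregating sites survive the frequency and missingness filters; the resizing and sorting steps are not shown here.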

