Genotypes of each study population were imputed using the University of Michigan Imputation Server pipeline. Array-generated genotypes were imputed against the reference whole-genome sequences of 32,488 subjects in the Haplotype Reference Consortium r1.1 2016 panel [45]. Only informative bi-allelic SNPs with a minimum genotype completion rate of 98% and a subject completion rate of 98% were used as the basis for imputation. Because rarer variation is pertinent in familial prostate cancer, it was included rather than filtered. Phasing employed Eagle v2.3, and imputation employed Minimac3 [46]. The most probable genotypes for imputed variants with R² ≥ 0.75 were retained; variants of lower quality were filtered. This yielded 5,517 informative variants within the genomic interval 17:45,416,600 to 17:46,860,777 (GRCh37/hg19) with complete data in ICPCG, NFPCS, and PLCO study subjects. Hardy–Weinberg equilibrium (HWE) filters were not employed because pertinent mutations could impact population fitness. Genotype data were derived independently of trait status, so genotype error as a source of disequilibrium would bias toward a null disease association. The genetic variants in Table 2 each had HWE P ≥ 0.05 within each study population.
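The filtering thresholds described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the `ImputedVariant` record, the function names, and the use of `None` to mark a missing genotype are all hypothetical, chosen only for illustration. Only the 98% completion rate, the R² ≥ 0.75 cutoff, and the chromosome 17 interval come from the text. The same `call_rate` helper could be applied per variant or per subject, and the order in which the two completion filters were applied is not stated, so none is assumed here.

```python
from dataclasses import dataclass

# Thresholds and interval are taken from the text; all names are illustrative.
MIN_CALL_RATE = 0.98
MIN_R2 = 0.75
REGION = ("17", 45_416_600, 46_860_777)  # GRCh37/hg19


@dataclass(frozen=True)
class ImputedVariant:
    """Hypothetical record for one imputed variant."""
    chrom: str
    pos: int
    ref: str
    alt: str
    r2: float  # imputation quality estimate


def call_rate(genotypes):
    """Fraction of non-missing calls; None marks a missing genotype."""
    if not genotypes:
        return 0.0
    return sum(g is not None for g in genotypes) / len(genotypes)


def is_biallelic_snp(ref, alt):
    """True when ref and alt are distinct single bases."""
    bases = {"A", "C", "G", "T"}
    return ref in bases and alt in bases and ref != alt


def passes_pre_imputation_qc(ref, alt, genotypes):
    """Bi-allelic SNP with at least 98% of genotypes called."""
    return is_biallelic_snp(ref, alt) and call_rate(genotypes) >= MIN_CALL_RATE


def keep_imputed(variant):
    """Inside the chromosome 17 interval and R2 at or above the cutoff."""
    chrom, start, end = REGION
    return (
        variant.chrom == chrom
        and start <= variant.pos <= end
        and variant.r2 >= MIN_R2
    )


# Made-up example records, not study data.
variants = [
    ImputedVariant("17", 45_500_000, "A", "G", 0.91),  # kept
    ImputedVariant("17", 45_500_100, "C", "T", 0.60),  # R2 too low
    ImputedVariant("17", 47_000_000, "G", "A", 0.99),  # outside interval
]
kept = [v for v in variants if keep_imputed(v)]
```

No minor allele frequency filter appears in the sketch, consistent with the decision to retain rarer variation, and no HWE filter is applied, consistent with the rationale given above.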