Genotype profiling and imputation methods

María Teruel
Guillermo Barturen
Manuel Martínez-Bueno
Olivia Castellini-Pérez
Miguel Barroso-Gil
Elena Povedano
Martin Kerick
Francesc Català-Moll
Zuzanna Makowska
Anne Buttgereit
Jacques-Olivier Pers
Concepción Marañón
Esteban Ballestar
Javier Martin
Elena Carnero-Montoro
Marta E. Alarcón-Riquelme

Genomic DNA from whole blood was obtained by standard methods. The samples were genotyped using the HumanCore-12-v1-0-B, InfiniumCoreExome-24v1-2 and InfiniumCoreExome-24v1-3 arrays, all from Illumina (San Diego, CA, USA). Only genetic variants other than indels that were present on all three platforms were considered for data cleaning and analysis. Quality control (QC) was performed using PLINK v1.9 [79]. Genetic markers were removed if they had a call rate < 90%, exhibited significant differential missingness between cases and controls (P < 1 × 10⁻⁴), or showed a significant deviation from Hardy–Weinberg equilibrium (P < 0.01 in controls and P < 1 × 10⁻⁴ in cases). Variants with a minor allele frequency below 5% were excluded from the analysis. A total of 218,947 variants passed data filtering. Samples were excluded from the study if they had a call rate < 95%, and also if they had a high heterozygosity rate, i.e., if they deviated by more than 6 standard deviations from the centroid. Duplicated or related individuals were identified using identity-by-descent criteria with REAP [80]. Samples were then excluded until all remaining pairs met the kinship coefficient threshold of < 0.25.
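The marker-level filters described above can be illustrated with a minimal Python sketch. It is not the PLINK implementation: the function names and the input layout (a list of 0/1/2 allele counts with `None` for missing calls) are hypothetical, the differential-missingness test is omitted, and a simple 1-d.f. chi-square test stands in for the Hardy–Weinberg test that PLINK applies.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 d.f.) Hardy-Weinberg p-value from genotype counts.

    Simplified stand-in for illustration; not the test PLINK uses.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:  # monomorphic marker: nothing to test
        return 1.0
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square distribution with 1 d.f.
    return math.erfc(math.sqrt(chi2 / 2))


def variant_passes_qc(genotypes, is_control, min_call_rate=0.90, min_maf=0.05,
                      hwe_p_controls=0.01, hwe_p_cases=1e-4):
    """Apply the call-rate, MAF and HWE thresholds to one marker.

    genotypes: allele counts (0, 1, 2) per sample, None when missing.
    is_control: True for controls, False for cases, same order as genotypes.
    """
    called = [(g, c) for g, c in zip(genotypes, is_control) if g is not None]
    if not genotypes or len(called) / len(genotypes) < min_call_rate:
        return False
    alt_freq = sum(g for g, _ in called) / (2 * len(called))
    if min(alt_freq, 1 - alt_freq) < min_maf:
        return False
    # HWE is tested separately in controls and cases, with different thresholds
    for group, threshold in ((True, hwe_p_controls), (False, hwe_p_cases)):
        counts = [sum(1 for g, c in called if c == group and g == k)
                  for k in (0, 1, 2)]
        if hwe_chi2_pvalue(*counts) < threshold:
            return False
    return True
```

In this sketch a marker is dropped as soon as it fails any single threshold, which mirrors the usual behavior of sequential QC filters.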

Inference methods based on linkage disequilibrium structure were used to increase the number of genetic markers. Imputation was performed using the Michigan Imputation Server (https://imputationserver.sph.umich.edu/index.html) [81] with the Haplotype Reference Consortium (HRC) as the reference panel (http://www.haplotype-reference-consortium.org/) [82]. We retained imputed genotypes with an info value (Minimac R2) higher than 0.7, i.e., 70% reliability. Imputed variants were also filtered according to the protocol described above.
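As an illustration of the R2 filter, the following Python sketch selects variants from a tab-separated imputation info table. It assumes the table has a header with `SNP` and `Rsq` columns, as in Minimac-style `.info` files; the function name and the example rows are hypothetical.

```python
import csv
import io


def filter_by_rsq(info_lines, min_rsq=0.7):
    """Return IDs of variants whose imputation R2 is strictly above min_rsq.

    info_lines: any iterable of text lines (an open file, io.StringIO, ...)
    with a tab-separated header containing 'SNP' and 'Rsq' (assumed layout).
    """
    kept = []
    for row in csv.DictReader(info_lines, delimiter="\t"):
        try:
            rsq = float(row["Rsq"])
        except ValueError:
            continue  # skip placeholder entries such as "-" or "NA"
        if rsq > min_rsq:
            kept.append(row["SNP"])
    return kept


# Made-up example rows, for illustration only
example = (
    "SNP\tMAF\tRsq\n"
    "1:1000:A:G\t0.12\t0.95\n"
    "1:2000:C:T\t0.30\t0.70\n"
    "1:3000:G:A\t0.08\t0.41\n"
)
kept = filter_by_rsq(io.StringIO(example))
```

The comparison is strict, following the wording "higher than 0.7", so a variant with R2 exactly equal to 0.7 is not retained.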

To study the population structure and prevent population stratification, the admixture proportions of each individual were estimated. In addition, population stratification was analyzed by principal component analysis (PCA). The ancestry percentage of each individual was estimated using FRAPPE [83] with a set of 2,707 independent genetic variants that maximized the differences between populations, clustering with K = 5, i.e., the 5 global populations: American, African, South Asian, East Asian and European. Published European populations from 1000 Genomes phase 3 (http://www.internationalgenome.org/category/phase-3/) were included as the reference panel. Principal components were also calculated using SMARTPCA from the Eigensoft software [84] and the same independent set of markers. In this analysis, only the European population from 1000 Genomes, excluding Finns, was included as the reference population for our samples [85] (see supplementary Fig. 1). A threshold of six standard deviations from the centroid was used to filter out individuals that deviated from the main European cluster. Moreover, samples with less than 55% European ancestry were excluded from all analyses.
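The two sample-level ancestry filters can be sketched in Python as follows. This is an illustration under stated assumptions rather than the SMARTPCA or FRAPPE procedure: the function name and input dictionaries are hypothetical, the centroid and standard deviation are computed per principal component from the samples supplied, and a sample is flagged when it exceeds the threshold on any component.

```python
import statistics


def samples_to_exclude(pcs, european_fraction, n_sd=6.0, min_european=0.55):
    """Flag PCA outliers and samples with low estimated European ancestry.

    pcs: dict mapping sample ID -> list of principal component coordinates.
    european_fraction: dict mapping sample ID -> estimated European
        ancestry as a fraction between 0 and 1.
    """
    sample_ids = list(pcs)
    excluded = set()
    n_components = len(next(iter(pcs.values()))) if pcs else 0
    for k in range(n_components):
        values = [pcs[s][k] for s in sample_ids]
        if len(values) < 2:
            continue
        center = statistics.mean(values)
        spread = statistics.stdev(values)
        if spread == 0:
            continue  # no variation on this component
        excluded.update(
            s for s in sample_ids if abs(pcs[s][k] - center) > n_sd * spread
        )
    # Ancestry filter: below 55% European ancestry by default
    excluded.update(
        s for s, frac in european_fraction.items() if frac < min_european
    )
    return excluded
```

Note that with very small sample sets no point can lie more than six standard deviations from the mean, so a filter of this kind is only meaningful on cohort-sized data.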

After genotyping QC and filtering of individuals for European ancestry, classical HLA alleles were imputed for the PRECISESADS dataset (cases and controls) in the extended MHC region on chromosome 6 [86]. The SNP2HLA software [87] was used for imputation with a reference panel consisting of 5,225 European individuals from the Type 1 Diabetes Genetic Consortium [88], containing data on 8,961 variants across the MHC region and two- and four-digit resolution allelic identities of the HLA class I (HLA-A, HLA-B, and HLA-C) and class II genes (HLA-DPA1, HLA-DPB1, HLA-DQA1, HLA-DQB1, and HLA-DRB1). Genotypes were extracted as allele dosages for 298 classical HLA alleles.
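For readers unfamiliar with allele dosages, the short Python sketch below shows the standard definition: the expected number of copies of an allele given the posterior probabilities of carrying zero, one or two copies. The function name and argument layout are hypothetical and are not taken from SNP2HLA.

```python
import math


def allele_dosage(prob_zero, prob_one, prob_two):
    """Expected copy number (0 to 2) of an allele from genotype probabilities."""
    total = prob_zero + prob_one + prob_two
    if not math.isclose(total, 1.0, abs_tol=1e-3):
        raise ValueError("genotype probabilities should sum to 1")
    return prob_one + 2 * prob_two
```

A dosage keeps the imputation uncertainty that a hard genotype call would discard, which is why it is commonly used as the predictor in association models.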
