Imputation of C4 haplotypes

Nis Borbye-Lorenzen
Zhihong Zhu
Esben Agerbo
Clara Albiñana
Michael E. Benros
Beilei Bian
Anders D. Børglum
Cynthia M. Bulik
Jean-Christophe Philippe Goldtsche Debost
Jakob Grove
David M. Hougaard
Allan F. McRae
Ole Mors
Preben Bo Mortensen
Katherine L. Musliner
Merete Nordentoft
Liselotte V. Petersen
Florian Privé
Julia Sidorenko
Kristin Skogstrand
Thomas Werge
Naomi R. Wray
Bjarni J. Vilhjálmsson
John J. McGrath

C4 haplotypes were imputed from reference data [9,10] (database of Genotypes and Phenotypes [dbGaP]: phs001992) using the genotyped SNPs in the iPSYCH2012 sample. Human C4 haplotypes vary in copy number and comprise two isotypes, C4A (A) and C4B (B). Each isotype occurs in two length forms that differ by a human endogenous retroviral (HERV) insertion: a long form (L, with the HERV insertion) and a short form (S, without it). Together, the isotypic and length polymorphisms give four possible alleles for each C4 copy: AL, AS, BL and BS.

Using the genotyped SNPs, we imputed the C4 alleles and the number of C4 copies (with a maximum copy number of 4) from the C4 haplotype reference. The reference panel comprised whole genome sequencing data from 1,265 individuals of multiple ancestries, which enabled us to identify C4 alleles with high accuracy. Imputation was performed with the Beagle software [96].

The imputation results provided allele counts but could not confidently distinguish all combinations of variants, for example the haplotypes AS-BL and AL-BS. We therefore counted the combinations of C4 isotype (C4A or C4B) and HERV status in the subset of imputed results where the combinations could be confidently distinguished (details are provided in Method S4). Both these counts and previous studies [97] indicated that the C4A gene is more likely than the C4B gene to carry the HERV insertion. The ambiguous haplotype was therefore assumed to be AL-BS rather than AS-BL, consistent with the methods described by Sekar et al. [9] The imputed counts were then converted to C4 haplotypes. Eight common C4 haplotypes (allele frequency ≥ 0.01) were imputed in the iPSYCH2012 study (Table S3), and their frequencies were consistent with other studies [10,21]. We counted the copy numbers of the C4 alleles (Figure S2) for each participant.
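The conversion from imputed counts to a haplotype label can be illustrated with a minimal sketch. It is not the authors' pipeline (Method S4 holds the actual details). It assumes, hypothetically, that each imputed haplotype is summarised by three counts: C4A copies, C4B copies and HERV insertions. The function name `resolve_haplotype` and the rule "assign HERV insertions to C4A copies first" are illustrative assumptions that generalise the AL-BS over AS-BL choice described above.

```python
def resolve_haplotype(n_a: int, n_b: int, n_herv: int) -> str:
    """Build a label such as 'AL-BS' from per-haplotype counts.

    Assumption (illustrative): HERV insertions are assigned to C4A copies
    first and only then to C4B copies, so that one C4A, one C4B and one
    HERV insertion resolve to AL-BS rather than AS-BL.
    """
    n_copies = n_a + n_b
    if min(n_a, n_b, n_herv) < 0:
        raise ValueError("counts must be non-negative")
    if n_copies == 0:
        raise ValueError("a haplotype needs at least one C4 copy")
    if n_herv > n_copies:
        raise ValueError("more HERV insertions than C4 copies")

    long_a = min(n_a, n_herv)   # HERV insertions placed on C4A copies
    long_b = n_herv - long_a    # any remainder placed on C4B copies
    alleles = (
        ["AL"] * long_a
        + ["AS"] * (n_a - long_a)
        + ["BL"] * long_b
        + ["BS"] * (n_b - long_b)
    )
    return "-".join(alleles)


# The ambiguous case from the text: one C4A, one C4B, one HERV insertion.
print(resolve_haplotype(1, 1, 1))  # AL-BS
print(resolve_haplotype(0, 1, 0))  # BS
```

The same rule leaves unambiguous inputs unchanged: with two HERV insertions on one C4A and one C4B copy, the only possible label is AL-BL.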
In the iPSYCH2012 sample, 28 individuals (0.04%) carried 4 copies of C4B and 35 individuals (0.05%) carried 6 copies of the HERV insertion. We excluded these individuals because of their very rare copy numbers. The C4A copy number was strongly correlated with the C4B and HERV copy numbers (Pearson correlation between C4A and C4B = -0.52; between C4A and HERV = 0.73).

We imputed the C4 haplotypes and C4 copy numbers in the iPSYCH2015 extension study using the same method, counting the copy numbers of C4 alleles from the imputed haplotypes (details are provided in Method S5). Because the C4 haplotype imputation reference data included individuals of multiple ancestries, we also applied the method used in the European cohort to impute C4 haplotypes in the 159 individuals of African ancestry and the 101 individuals of South Asian ancestry.
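The per-participant copy-number counting, the exclusion of very rare copy numbers and the Pearson correlations described above can be sketched as follows. This is a hedged illustration only: the haplotype-label format (`"AL-BS"`), the function names and the toy cohort are assumptions made for the example, and the numbers it produces are unrelated to the iPSYCH results.

```python
from math import sqrt


def copy_numbers(hap1: str, hap2: str) -> dict:
    """Sum C4A, C4B and HERV copies over a participant's two haplotypes."""
    alleles = hap1.split("-") + hap2.split("-")
    return {
        "C4A": sum(a.startswith("A") for a in alleles),
        "C4B": sum(a.startswith("B") for a in alleles),
        "HERV": sum(a.endswith("L") for a in alleles),  # long form = HERV
    }


def is_rare(cn: dict) -> bool:
    """Flag the copy numbers the text reports as excluded."""
    return cn["C4B"] == 4 or cn["HERV"] == 6


def pearson(x: list, y: list) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Toy cohort (made up for illustration): one haplotype pair per participant.
cohort = [
    ("AL-BS", "AL-BS"),
    ("AL-AL", "AL-BS"),
    ("BS", "AL-BS"),
    ("AL-BL", "AL"),
    ("BS-BS", "BS-BS"),  # 4 copies of C4B, dropped by the filter
]

counts = [copy_numbers(h1, h2) for h1, h2 in cohort]
kept = [cn for cn in counts if not is_rare(cn)]

c4a = [cn["C4A"] for cn in kept]
c4b = [cn["C4B"] for cn in kept]
herv = [cn["HERV"] for cn in kept]

print("participants kept:", len(kept))
print("r(C4A, C4B): ", round(pearson(c4a, c4b), 2))
print("r(C4A, HERV):", round(pearson(c4a, herv), 2))
```

Counting from haplotype labels, rather than from separate allele calls, keeps the copy numbers consistent with whatever haplotype resolution was applied upstream.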
