Statistical methods

Bram P. Prins; Timothy J. Mead; Jennifer A. Brody; Gardar Sveinbjornsson; Ioanna Ntalla; Nathan A. Bihlmeyer; Marten van den Berg; Jette Bork-Jensen; Stefania Cappellani; Stefan Van Duijvenboden; Nikolai T. Klena; George C. Gabriel; Xiaoqin Liu; Cagri Gulec; Niels Grarup; Jeffrey Haessler; Leanne M. Hall; Annamaria Iorio; Aaron Isaacs; Ruifang Li-Gao; Honghuang Lin; Ching-Ti Liu; Leo-Pekka Lyytikäinen; Jonathan Marten; Hao Mei; Martina Müller-Nurasyid; Michele Orini; Sandosh Padmanabhan; Farid Radmanesh; Julia Ramirez; Antonietta Robino; Molly Schwartz; Jessica van Setten; Albert V. Smith; Niek Verweij; Helen R. Warren; Stefan Weiss; Alvaro Alonso; David O. Arnar; Michiel L. Bots; Rudolf A. de Boer; Anna F. Dominiczak; Mark Eijgelsheim; Patrick T. Ellinor; Xiuqing Guo; Stephan B. Felix; Tamara B. Harris; Caroline Hayward; Susan R. Heckbert; Paul L. Huang



This method is extracted from research article: Genome Biol, Jul 2018

Exome-chip meta-analysis identifies novel loci associated with cardiac conduction, including ADAMTS6

DOI: 10.1186/s13059-018-1457-6


All association analyses were carried out using the R package seqMeta [50]. Each study ran the “prepScores” function, adjusting for age, gender, body mass index (BMI), height, principal components, and study-specific covariates where appropriate (details in Additional file 1: Table S1). The output of this function is an R “list” object (a “prepScores object”), stored in an .RData file, in which each element corresponds to a gene and contains the scores and minor allele frequencies (MAFs) of its variants, as well as the matrix of covariances between the scores at all pairs of SNPs within the gene. All studies performed both gender-combined and gender-stratified analyses, as well as analyses separated by ancestry. Using the prepScores objects from each study, we performed meta-analyses with the seqMeta functions “singlesnpMeta” for single-variant tests and “burdenMeta” and “skatMeta” for gene-based tests. Coefficients and standard errors from seqMeta can be interpreted as a “one-step” approximation to the maximum likelihood estimates. Ancestry groups were analyzed both separately and combined at the meta-analysis level.
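To illustrate what combining per-study scores can look like, the following is a minimal sketch of a generic score-based fixed-effects meta-analysis for a single variant. It assumes the standard construction (summed scores divided by summed score variances), which matches the “one-step” interpretation above; it is not seqMeta's code, and the function name `meta_single_variant` and its inputs are hypothetical.

```python
import math

def meta_single_variant(scores, variances):
    """Combine per-study score statistics for one variant.

    scores    -- per-study score statistics U_k (hypothetical inputs)
    variances -- per-study variances V_k of those scores
    Assumption: generic score-test meta-analysis, not seqMeta's implementation.
    """
    u = sum(scores)            # pooled score
    v = sum(variances)         # pooled score variance
    beta = u / v               # one-step effect estimate
    se = 1.0 / math.sqrt(v)    # its standard error
    z = beta / se              # equals u / sqrt(v)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P value
    return beta, se, p

beta, se, p = meta_single_variant([2.0, 1.0], [4.0, 5.0])
```

Because only sums of scores and variances are needed, each study can share summary statistics rather than individual-level data.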

For single-variant meta-analyses, we included all variants with a minor allele count (MAC) ≥ 10 in order to have well-calibrated type I error rates [51]. Statistical significance was defined using Bonferroni corrections. For single variants, a maximum of 162,199 variants remained after MAC filtering and were tested in five separate analyses: European and African ancestry separately and combined (n = 3), and sex-stratified analyses (n = 2). This gives a Bonferroni-corrected significance threshold of α = 0.05 / 162,199 variants / 5 analyses = 6.17 × 10⁻⁸.
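The threshold arithmetic can be checked with a few lines of Python. This is only a sketch of the calculation stated above; the helper name `bonferroni_threshold` is hypothetical.

```python
def bonferroni_threshold(alpha, *n_tests):
    """Divide alpha by each multiplicity factor in turn."""
    threshold = alpha
    for n in n_tests:
        threshold /= n
    return threshold

# 162,199 variants tested in 5 separate analyses
single_variant_alpha = bonferroni_threshold(0.05, 162_199, 5)
print(f"{single_variant_alpha:.2e}")  # prints 6.17e-08
```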

Suggestive sexually dimorphic associations were identified by performing sex-stratified meta-analyses, totaling 39,907 women and 31,702 men, including only cohorts that had both male and female samples. Variants were deemed suggestively sex-specific when they reached exome-wide significance (P < 6.17 × 10⁻⁸) in one sex and did not reach nominal significance (P > 0.05) in the other.
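The sex-specificity rule can be written as a simple predicate. The sketch below assumes the two P values for a variant come from the women-only and men-only meta-analyses; the function and constant names are hypothetical.

```python
EXOME_WIDE_P = 6.17e-8   # exome-wide significance threshold from the text
NOMINAL_P = 0.05         # nominal significance threshold from the text

def is_suggestive_sex_specific(p_women, p_men):
    """True if exome-wide significant in one sex and P > 0.05 in the other."""
    women_only = p_women < EXOME_WIDE_P and p_men > NOMINAL_P
    men_only = p_men < EXOME_WIDE_P and p_women > NOMINAL_P
    return women_only or men_only
```

For example, `is_suggestive_sex_specific(1e-9, 0.3)` is true, whereas a variant with P = 1e-9 in women and P = 0.01 in men is not flagged, because the second sex is nominally significant.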

Gene-based tests were also performed with seqMeta, using the “prepScores” objects from individual cohorts. We assigned variants to genes by annotating all variants on the Exome Chip using ANNOVAR [52], following RefSeq [53] gene definitions mapped to human genome build 37 (hg19). In the collapsed variant tests, we included only variants with MAF < 1% and only genes in which two or more such variants were present (n = 16,085). We performed both SKAT [54] and T1 burden [55] tests for three functional sets of variants: (I) all variants; (II) missense, nonsense, splice, and indel variants; (III) “damaging” variants, that is, the same variants as in set II, except that missense variants were included only if predicted to be damaging by at least two of four functional prediction algorithms (Polyphen2 [56], SIFT [57], Mutation Taster [58], and LRT [59]). For the gene-based tests, we used a Bonferroni-corrected significance threshold of α = 0.05 / 16,085 genes / 2 tests / 3 functional variant classes = 5.18 × 10⁻⁷.
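The three functional sets and the gene-based threshold can be sketched as follows. The class labels and the `n_damaging_calls` argument (how many of the four prediction algorithms call a missense variant damaging) are hypothetical encodings chosen for illustration, not ANNOVAR output fields.

```python
CODING_CLASSES = {"missense", "nonsense", "splice", "indel"}

def variant_sets(func_class, n_damaging_calls=0):
    """Return which functional sets (I, II, III) a variant belongs to."""
    sets = {"I"}                         # set I: all variants
    if func_class in CODING_CLASSES:
        sets.add("II")                   # set II: missense, nonsense, splice, indel
        # set III: as set II, but missense needs >= 2 of 4 damaging predictions
        if func_class != "missense" or n_damaging_calls >= 2:
            sets.add("III")
    return sets

# 16,085 genes x 2 tests (SKAT, T1 burden) x 3 functional sets
gene_alpha = 0.05 / 16_085 / 2 / 3
print(f"{gene_alpha:.2e}")  # prints 5.18e-07
```

Under this encoding, a missense variant with a single damaging prediction falls in sets I and II only.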

We defined a physically independent locus as the genomic region containing all variants within 250 kb on either side of an LD-independent lead SNP (exome-wide significant variants with r² < 0.1), where LD calculations were based on European ancestry. Under this definition, LD-independent lead variants are in some cases present in overlapping regions, which complicates the definition and reporting of associated loci and the genes they harbor. We therefore merged loci if LD-independent exome-wide significant variants were < 250 kb from each other. Where these regions did not contain lead SNPs from previous analyses, we considered the loci novel. LD calculations were performed on the Illumina Exome Chip genotype data from the TwinsUK cohort [60] (n = 1194), using PLINK 1.9 [61].
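The locus definition can be illustrated with a short sketch. It assumes lead SNPs are given as base-pair positions on a single chromosome and that merging is chained (a lead joins the current locus if it lies < 250 kb from the previous lead in that locus); the text does not spell out the chaining rule, so this is one reasonable reading. The function name `merge_loci` is hypothetical.

```python
WINDOW = 250_000  # 250 kb on either side of a lead SNP

def merge_loci(lead_positions, window=WINDOW):
    """Group LD-independent lead SNPs on one chromosome into loci.

    Assumption: leads closer than `window` to the previous lead are merged.
    """
    loci = []
    for pos in sorted(lead_positions):
        if loci and pos - loci[-1]["leads"][-1] < window:
            # Lead lies < 250 kb from the previous one: extend that locus
            loci[-1]["leads"].append(pos)
            loci[-1]["end"] = pos + window
        else:
            loci.append({"start": max(0, pos - window),
                         "end": pos + window,
                         "leads": [pos]})
    return loci

loci = merge_loci([1_000_000, 1_200_000, 2_000_000])
```

Here the first two leads are 200 kb apart and form one locus spanning 750,000 to 1,450,000 bp, while the third lead forms its own locus.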

Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
