Contiguous segment analysis

Daniel Taliun; Daniel N. Harris; Michael D. Kessler; Jedidiah Carlson; Zachary A. Szpiech; Raul Torres; Sarah A. Gagliano Taliun; André Corvelo; Stephanie M. Gogarten; Hyun Min Kang; Achilleas N. Pitsillides; Jonathon LeFaive; Seung-been Lee; Xiaowen Tian; Brian L. Browning; Sayantan Das; Anne-Katrin Emde; Wayne E. Clarke; Douglas P. Loesch; Amol C. Shetty; Thomas W. Blackwell; Albert V. Smith; Quenna Wong; Xiaoming Liu; Matthew P. Conomos; Dean M. Bobo; François Aguet; Christine Albert; Alvaro Alonso; Kristin G. Ardlie; Dan E. Arking; Stella Aslibekyan; Paul L. Auer; John Barnard; R. Graham Barr; Lucas Barwick; Lewis C. Becker; Rebecca L. Beer; Emelia J. Benjamin; Lawrence F. Bielak; John Blangero; Michael Boehnke; Donald W. Bowden; Jennifer A. Brody; Esteban G. Burchard; Brian E. Cade; James F. Casella; Brandon Chalazan; Daniel I. Chasman; Yii-Der Ida Chen



This method is extracted from research article: Nature, Feb 2021

Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program

DOI: 10.1038/s41586-021-03205-y


We excluded indels and multi-allelic variants, and categorized the remaining variants as common (allele frequency ≥ 0.005) or rare (allele frequency < 0.005), and as coding or noncoding based on protein-coding exons from Ensembl 94⁹². Variant counts were analysed across 2,739 non-empty (that is, containing at least one variant) contiguous 1-Mb chromosomal segments; counts in segments at the ends of chromosomes with length L < 10⁶ bp were scaled up proportionally by the factor 10⁶ × L⁻¹. For each segment, the coding proportion, C, was calculated as the proportion of bases overlapping protein-coding exons. The distribution of C is fairly narrow: 80% of segments have C ≤ 0.0195, 99% have C ≤ 0.067, and only 3 segments have C ≥ 0.10. Owing to the significant negative correlation between C and the number of variants in a segment, and to potential mapping effects, we use linear regression to adjust the variant counts per segment according to the model count = β × C + A + count_adj, where A is the proportion of segment bases overlapping the accessibility mask (Supplementary Information 1.5). Unless otherwise noted, we present analyses and results that use these adjusted count values.
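The steps described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`classify_frequency`, `scale_count`, `adjust_counts`) are hypothetical, and the regression is assumed to be an ordinary least-squares fit with an intercept and separate coefficients for C and A, which the text does not spell out. Under that assumption, the adjusted count is taken to be the residual of the fit.

```python
import numpy as np

SEGMENT_BP = 1_000_000        # nominal segment length (1 Mb), from the text
RARE_AF_THRESHOLD = 0.005     # common: AF >= 0.005; rare: AF < 0.005


def classify_frequency(allele_freq):
    """Label a variant 'common' or 'rare' using the 0.005 cutoff."""
    return "common" if allele_freq >= RARE_AF_THRESHOLD else "rare"


def scale_count(count, segment_length_bp):
    """Scale a count from a short chromosome-end segment to a 1-Mb equivalent.

    Applies the factor 10^6 / L for segments with L < 10^6 bp;
    full-length segments are returned unchanged.
    """
    if segment_length_bp >= SEGMENT_BP:
        return float(count)
    return count * SEGMENT_BP / segment_length_bp


def adjust_counts(counts, coding_prop, accessible_prop):
    """Return residuals of an OLS fit of counts on C and A.

    ASSUMPTION: the design matrix has an intercept plus one coefficient
    each for C (coding proportion) and A (accessible proportion). The
    source states the model only as count = beta x C + A + count_adj.
    """
    counts = np.asarray(counts, dtype=float)
    design = np.column_stack([
        np.ones_like(counts),
        np.asarray(coding_prop, dtype=float),
        np.asarray(accessible_prop, dtype=float),
    ])
    beta, *_ = np.linalg.lstsq(design, counts, rcond=None)
    return counts - design @ beta


# Toy data (made up for illustration only, not TOPMed values).
C = np.array([0.010, 0.020, 0.050, 0.005, 0.080])
A = np.array([0.90, 0.85, 0.95, 0.70, 0.88])
raw = np.array([12000.0, 11500.0, 9800.0, 10900.0, 8700.0])
adjusted = adjust_counts(raw, C, A)
```

Because the sketch includes an intercept, the residuals are centred on zero. Whether the published adjusted counts were re-centred on the mean count is not stated in the text, so that step is left out here.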

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
