VGP Trio Pipeline v1.0–v1.6

Arang Rhie
Shane A. McCarthy
Olivier Fedrigo
Joana Damas
Giulio Formenti
Sergey Koren
Marcela Uliano-Silva
William Chow
Arkarachai Fungtammasan
Juwan Kim
Chul Lee
Byung June Ko
Mark Chaisson
Gregory L. Gedman
Lindsey J. Cantin
Francoise Thibaud-Nissen
Leanne Haggerty
Iliana Bista
Michelle Smith
Bettina Haase
Jacquelyn Mountcastle
Sylke Winkler
Sadye Paez
Jason Howard
Sonja C. Vernes
Tanya M. Lama
Frank Grutzner
Wesley C. Warren
Christopher N. Balakrishnan
Dave Burt
Julia M. George
Matthew T. Biegler
David Iorns
Andrew Digby
Daryl Eason
Bruce Robertson
Taylor Edwards
Mark Wilkinson
George Turner
Axel Meyer
Andreas F. Kautt
Paolo Franchini
H. William Detrich, III
Hannes Svardal
Maximilian Wagner
Gavin J. P. Naylor
Martin Pippel
Milan Malinsky
Mark Mooney
Maria Simbirsky

The trio pipeline is designed similarly to the standard pipeline, except that it uses parental data (Extended Data Fig. 3b). When parental genomes are available, the child’s CLR reads are binned into maternal and paternal haplotypes and assembled separately as haplotype-specific contigs (haplotigs) using TrioCanu (ref. 20). In brief, parent-specific marker k-mers were collected from the parental Illumina WGS reads using Meryl (ref. 23). These markers were filtered and used to bin the child’s CLR reads. Each read was assigned to a haplotype on the basis of the markers observed in it, normalized by the total number of markers in each haplotype. The subsequent purging, scaffolding, and polishing steps were updated in the same way as in the standard pipeline, including the use of Purge_Dups (ref. 14) in v1.6. We extended binning to linked reads and Hi-C reads by excluding, for each haplotype, read pairs that carried any marker specific to the other parent. The binned Hi-C reads were used to scaffold the corresponding haplotype assembly, which was then polished with the binned linked reads, because haplotype switching had been observed with the standard polishing approach. During curation, the haplotype assembly with the higher QV and/or contiguity was chosen as the representative haplotype, and the heterogametic sex chromosome from the other haplotype was added to the representative assembly. However, while curating several trios, we found that in regions of low divergence between the shared parental homogametic sex chromosomes (that is, X or Z), a small fraction of offspring CLR data was assigned to the wrong haplotype. This mis-assignment resulted in a duplicate, low-coverage offspring X or Z assembly in the paternal (for mammals) or maternal (for birds) haplotype, respectively, which required removal during curation. We are working on methods to improve the binning accuracy and resolve this issue.
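The binning logic described above can be illustrated with a minimal Python sketch. It is not the Meryl or TrioCanu implementation: the function names, the toy sequences, and the k-mer size of 5 are all hypothetical. The marker-filtering step is omitted, and the "exclude pairs carrying the other parent's markers" rule is our reading of the text. The sketch shows only the idea of taking k-mers unique to each parent as markers, scoring a read by its marker hits normalized by the total markers per haplotype, and dropping read pairs that carry the other parent's markers.

```python
# Illustrative sketch only; not the Meryl or TrioCanu code.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical_kmers(seq, k):
    """Yield each k-mer of seq in canonical form, i.e. the lexicographically
    smaller of the k-mer and its reverse complement."""
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        yield min(kmer, kmer.translate(_COMPLEMENT)[::-1])


def kmer_set(reads, k):
    """Collect the distinct canonical k-mers seen in a collection of reads."""
    return {kmer for read in reads for kmer in canonical_kmers(read, k)}


def parent_specific_markers(maternal_reads, paternal_reads, k):
    """Return (maternal-only, paternal-only) k-mer sets.
    Assumption: no frequency filtering of error k-mers is applied here."""
    maternal = kmer_set(maternal_reads, k)
    paternal = kmer_set(paternal_reads, k)
    return maternal - paternal, paternal - maternal


def marker_hits(read, markers, k):
    """Count k-mer positions in the read that match a marker."""
    return sum(1 for kmer in canonical_kmers(read, k) if kmer in markers)


def bin_long_read(read, mat_markers, pat_markers, k):
    """Assign a read to the haplotype with the higher normalized marker score."""
    mat_score = marker_hits(read, mat_markers, k) / max(len(mat_markers), 1)
    pat_score = marker_hits(read, pat_markers, k) / max(len(pat_markers), 1)
    if mat_score > pat_score:
        return "maternal"
    if pat_score > mat_score:
        return "paternal"
    return "unclassified"


def keep_pair(mate1, mate2, other_parent_markers, k):
    """Keep a read pair for a haplotype only if neither mate carries
    a marker specific to the other parent."""
    return all(marker_hits(m, other_parent_markers, k) == 0 for m in (mate1, mate2))


if __name__ == "__main__":
    K = 5  # toy value; real analyses use much longer k-mers
    mat_markers, pat_markers = parent_specific_markers(
        ["AAAAACCCCC"], ["AAAAATCCCC"], K
    )
    for read in ("AAAACCCC", "AAATCCCC", "AAAAAA"):
        print(read, bin_long_read(read, mat_markers, pat_markers, K))
```

Normalizing by the size of each marker set keeps a parent with more specific k-mers (for example, because of deeper sequencing) from attracting reads simply because it has more markers to hit.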

For the female zebra finch in particular, contigs were generated before binning was automated in the Canu assembler as TrioCanu v1.7, and a manual binning process was therefore applied as described in the original Trio-binning paper (ref. 20; Supplementary Methods). Contigs were assembled for each haplotype using the binned reads, excluding unclassified reads. The contigs were polished with two rounds of Arrow polishing using the binned reads, and scaffolded following the v1.0 pipeline with no purging. Additional scaffolding rounds with Bionano (s4) and Hi-C were applied. Scaffolds were renamed according to the primary scaffold assembly of the same individual (s5), with sex chromosomes grouped as Z in the paternal assembly and W in the maternal assembly following synteny to the Z chromosome from the curated male zebra finch VGP assembly. Two rounds of short-read (SR) polishing were applied using linked reads mapped to both haplotypes. After haplotype switches were discovered, additional rounds of polishing were applied using binned linked reads (Supplementary Methods).
