Inference of gene conversion

Chendan Wei
Zhenyi Wang
Jianyu Wang
Jia Teng
Shaoqi Shen
Qimeng Xiao
Shoutong Bao
Yishan Feng
Yan Zhang
Yuxian Li
Sangrong Sun
Yuanshuai Yue
Chunyang Wu
Yanli Wang
Tianning Zhou
Wenbo Xu
Jigao Yu
Li Wang
Jinpeng Wang

To infer possible gene conversion between duplicated genes, the multiple sequence alignment program ClustalW [54] was used to compare sequence identity among the homologous genes in each quartet. Highly divergent quartets were removed to avoid inferring gene conversion from unreliable sequences: quartets were discarded if gaps in the pairwise alignments exceeded 50% of the alignment length, or if amino acid identity between homologous sequences was below 40%.
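The filtering rule above can be sketched in a few lines. This is a minimal illustration, not the authors' script: the function names are hypothetical, the input is assumed to be a list of already aligned sequence pairs (for example, parsed from ClustalW output), and identity is assumed to be computed over ungapped columns only, which the text does not specify.

```python
def gap_fraction(a, b):
    """Fraction of alignment columns in which either sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        return 1.0
    gaps = sum(1 for x, y in zip(a, b) if x == "-" or y == "-")
    return gaps / len(a)


def identity(a, b):
    """Fraction of identical residues over ungapped columns (assumed definition)."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def keep_quartet(aligned_pairs, max_gap=0.5, min_identity=0.4):
    """Keep a quartet only if every aligned pair passes both thresholds.

    aligned_pairs: list of (seq_a, seq_b) tuples of aligned amino acid
    sequences; which pairs of the quartet to include is left to the caller.
    """
    return all(
        gap_fraction(a, b) <= max_gap and identity(a, b) >= min_identity
        for a, b in aligned_pairs
    )
```

The thresholds default to the 50% gap and 40% identity cut-offs stated above; a quartet is dropped as soon as one of its pairs fails either test.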

Whole-gene conversion (WCV) inference: Since the paralogues were produced prior to subspecies divergence, the orthologues between two subspecies are expected to be more similar to each other than the paralogues within each subspecies. If the paralogues were more similar to one another than to their respective orthologues across subspecies, we inferred that gene conversion occurred after species divergence. For this comparison, phylogenetic analyses of homologous gene quartets were performed to infer potential whole-gene conversion between paralogues from changes in gene tree topology (Fig. 1b-e). To measure the similarity of the homologous genes in each quartet, we characterized the Ks values and the ratios of amino acid locus identity between paralogues and orthologues. First, the Ks values between paralogous and orthologous gene pairs were used to infer possible whole-gene conversion (called WCV-I). To assess the confidence level of the inferred conversions, bootstrap tests with 1000 resampling replicates were performed on each gene tree to produce a bootstrap frequency [28, 33]. Second, the sequences in each quartet were compared site by site to calculate the ratios of amino acid locus identity between paralogous and orthologous gene pairs. These ratios were used to infer unexpected changes in gene tree topology in quartets, depending on whether the paralogues were more similar to each other than the orthologues were [28]. This is a strict criterion for detecting whole-gene conversion (called WCV-II), as the paralogues were produced at least 100 mya, whereas the orthologues diverged more recently. The similarity between sequences representing different rice subspecies is often very high owing to the relatively close genetic relationship between these three rice genomes, as in our previous study of conversion between hexaploid wheat subgenomes [32].
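The identity-based comparison can be illustrated with a small sketch. It rests on several assumptions that are not taken from the text: the quartet is a dictionary with hypothetical keys A1 and A2 (paralogues in one subspecies) and B1 and B2 (their orthologues in the other), the sequences are already aligned, and conversion is flagged when the paralogue identity exceeds both orthologue identities. The bootstrap here resamples alignment columns and re-applies that rule, a simplified stand-in for the gene-tree bootstrap described above.

```python
import random


def identity(a, b):
    """Fraction of identical residues over ungapped alignment columns."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x == y for x, y in pairs) / len(pairs)


def paralogues_closer(quartet):
    """True if A1 and A2 are more similar to each other than to their orthologues.

    quartet: dict with keys "A1", "A2", "B1", "B2" (hypothetical layout),
    where A1-B1 and A2-B2 are the orthologous pairs.
    """
    para = identity(quartet["A1"], quartet["A2"])
    return (para > identity(quartet["A1"], quartet["B1"])
            and para > identity(quartet["A2"], quartet["B2"]))


def bootstrap_support(quartet, n_rep=1000, seed=0):
    """Fraction of column-resampled alignments that still support conversion."""
    rng = random.Random(seed)
    length = len(quartet["A1"])
    hits = 0
    for _ in range(n_rep):
        # Sample alignment columns with replacement, same columns for all genes.
        cols = [rng.randrange(length) for _ in range(length)]
        resampled = {name: "".join(seq[i] for i in cols)
                     for name, seq in quartet.items()}
        hits += paralogues_closer(resampled)
    return hits / n_rep
```

A quartet would then be reported as a candidate only if `paralogues_closer` is true and the support exceeds a chosen cut-off; the cut-off itself is not given in the text and is left to the user.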

Partial-gene conversion (PCV) inference: Quartets were used to identify possible gene conversion affecting partial gene sequences that may have occurred after species divergence. A combination of dynamic programming and phylogenetic analysis was used to record the differences between aligned bases of paralogues within genomes and of orthologues between genomes, as previously reported [28]. The main steps for inferring PCV were as follows: 1) Arrays were defined to reflect the difference or distance between the homologues. 2) The distance arrays of the orthologous gene pairs were averaged, and the averaged distance was compared between paralogues and orthologues, as the paralogues should be more distant if no PCV is involved. 3) Dynamic programming was used to reveal high-scoring sequence segments and infer the extent of conversion between paralogues; partially affected regions ≥10 nucleotides in length were then identified. 4) A bootstrap test was used to assess the identification of high-scoring segments with shorter lengths and smaller scores. 5) After masking some of the larger segments, a recursive procedure revealed shorter high-scoring fragments, which helped to reveal genes affected by multiple conversion events.
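Steps 1 to 3 and 5 can be sketched as a maximum-scoring-segment search. The scoring scheme below is an assumption made for illustration, not the scheme in the deposited scripts: each site scores the averaged orthologue mismatch minus the paralogue mismatch, so positive runs mark regions where the paralogues A1 and A2 are closer to each other than to their orthologues B1 and B2. All function and argument names are hypothetical, the nucleotide alignment is assumed to be gap-free, and the bootstrap test of step 4 is omitted.

```python
def site_scores(a1, a2, b1, b2):
    """Per-site score: averaged orthologue distance minus paralogue distance."""
    scores = []
    for w, x, y, z in zip(a1, a2, b1, b2):
        para = float(w != x)                         # paralogues A1 vs A2
        ortho = (float(w != y) + float(x != z)) / 2  # mean of A1-B1 and A2-B2
        scores.append(ortho - para)
    return scores


def best_segment(scores):
    """Dynamic programming (Kadane) search for the highest-scoring segment.

    Returns (score, start, end) with end exclusive, or (0.0, 0, 0) if no
    segment has a positive score.
    """
    best, best_start, best_end = 0.0, 0, 0
    cur, cur_start = 0.0, 0
    for i, s in enumerate(scores):
        if cur <= 0:
            cur, cur_start = 0.0, i   # restart the running segment here
        cur += s
        if cur > best:
            best, best_start, best_end = cur, cur_start, i + 1
    return best, best_start, best_end


def converted_segments(a1, a2, b1, b2, min_len=10):
    """Repeatedly take the best segment, record it if long enough, mask it."""
    scores = site_scores(a1, a2, b1, b2)
    found = []
    while True:
        total, start, end = best_segment(scores)
        if total <= 0:
            break
        if end - start >= min_len:
            found.append((start, end, total))
        for i in range(start, end):
            scores[i] = float("-inf")  # mask so later passes skip this region
    return sorted(found)
```

Masking each reported segment before searching again mirrors the recursive procedure in step 5, which lets a single gene pair yield several candidate regions. The `min_len` default follows the 10-nucleotide threshold stated above.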

The scripts for gene conversion inference have been deposited on GitHub (https://github.com/weichendan312/gene-conversion).
