To identify the causal gene underlying the mpl1-1 mutation, we crossed WT (ICCV 96029) with the mpl1-1 mutant (PI587041) as reported62 and generated an F2 population. The WT and mpl1-1 DNA pools were prepared by mixing equal amounts of genomic DNA from 28 WT F2 individuals and 28 mpl1-1 F2 individuals, respectively. Genomic DNA for the two pools and the two parental samples (the female parent ICCV 96029 and the male parent PI587041) was extracted using an EasyPure Plant Genomic DNA Kit (EE111, TransGen, China). The DNA samples were then sequenced on the Illumina HiSeq4000 platform (Novogene, Beijing, China).
To ensure reliability and remove artificial bias from the reads used in downstream analyses, we subjected the raw reads to quality control (QC) using fastp63. We then aligned the clean reads of each sample to the reference genome using BWA (Burrows-Wheeler Aligner)64. Variants were called for all samples using the UnifiedGenotyper tool in GATK (v3.8)65, and SNPs and InDels were annotated with ANNOVAR66,67 based on the GFF3 files of the chickpea reference genome (CDC Frontier v1.0).
To calculate the SNP/InDel index, we extracted the read depth of the homozygous SNPs/InDels identified above in the two extreme pools. We averaged the SNP/InDel indexes within sliding windows of 1 Mb with a step size of 10 kb (the default settings) and used the mean as the SNP/InDel index of each window. The difference in SNP/InDel index between the two pools was taken as the delta SNP/InDel (ΔSNP/InDel) index68.
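The windowed calculation described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study. It assumes that the index of a variant is the fraction of reads carrying the mutant allele, that positions are 0-based, that windows containing no variants are skipped, and that Δ is computed as the mutant pool minus the WT pool. The function names and the tuple layout are hypothetical.

```python
from typing import List, Tuple

# Hypothetical record layout (an assumption, not taken from the study):
# (position, wt_pool_alt_reads, wt_pool_depth, mut_pool_alt_reads, mut_pool_depth)
Variant = Tuple[int, int, int, int, int]


def variant_index(alt_reads: int, depth: int) -> float:
    """Fraction of reads that carry the mutant (alternative) allele."""
    return alt_reads / depth


def sliding_delta_index(
    variants: List[Variant],
    chrom_length: int,
    window: int = 1_000_000,  # 1 Mb window, as stated in the text
    step: int = 10_000,       # 10 kb step, as stated in the text
) -> List[Tuple[int, int, float, float, float]]:
    """Return (start, end, wt_index, mut_index, delta) for each non-empty window."""
    results = []
    start = 0
    while start < chrom_length:
        end = min(start + window, chrom_length)
        # Keep variants inside [start, end) that are covered in both pools.
        in_window = [
            v for v in variants
            if start <= v[0] < end and v[2] > 0 and v[4] > 0
        ]
        if in_window:
            wt = sum(variant_index(v[1], v[2]) for v in in_window) / len(in_window)
            mut = sum(variant_index(v[3], v[4]) for v in in_window) / len(in_window)
            # Assumed direction of the difference: mutant pool minus WT pool.
            results.append((start, end, wt, mut, mut - wt))
        start += step
    return results


if __name__ == "__main__":
    toy = [(5_000, 5, 10, 10, 10), (15_000, 4, 10, 9, 10)]
    for row in sliding_delta_index(toy, chrom_length=30_000, window=20_000, step=10_000):
        print(row)
```

Under these assumptions, a region linked to the mutation would show a mutant-pool index near 1 and a clearly positive Δ value, whereas unlinked regions would show Δ values near 0.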
To narrow down the candidate genes in this region, we retained only SNPs and InDels that satisfied both of the following criteria: (1) the variant is a deleterious exonic variant expected to strongly affect protein function; (2) the variant is homozygous mutant (aa) in both the mutant pool and the PI587041 parent, heterozygous (Aa) in the WT pool, and homozygous WT (AA) in the ICCV 96029 parent. Five genes in the candidate region carried variants meeting both criteria; they are listed in Supplementary Data 1.
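The two filtering criteria can be expressed as a short Python sketch. It is illustrative only and is not the script used in this study. The record layout, the genotype strings ("AA", "Aa", "aa") and the gene names are hypothetical. The effect labels follow ANNOVAR exonic annotation categories, but which of them count as "deleterious" is an assumption made here.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Assumption: these ANNOVAR exonic categories are treated as deleterious.
DELETERIOUS_EFFECTS = {
    "stopgain",
    "stoploss",
    "frameshift insertion",
    "frameshift deletion",
    "nonsynonymous SNV",
}


@dataclass
class AnnotatedVariant:
    gene: str           # gene carrying the variant (hypothetical identifier)
    effect: str         # exonic effect label from the annotation step
    mutant_pool: str    # genotype call in the mutant pool
    mutant_parent: str  # genotype call in PI587041
    wt_pool: str        # genotype call in the WT pool
    wt_parent: str      # genotype call in ICCV 96029


def passes_filters(v: AnnotatedVariant) -> bool:
    """Apply criterion (1), deleterious effect, and criterion (2), genotype pattern."""
    is_deleterious = v.effect in DELETERIOUS_EFFECTS
    genotype_pattern = (
        v.mutant_pool == "aa"
        and v.mutant_parent == "aa"
        and v.wt_pool == "Aa"
        and v.wt_parent == "AA"
    )
    return is_deleterious and genotype_pattern


def candidate_genes(variants: Iterable[AnnotatedVariant]) -> List[str]:
    """Sorted, de-duplicated list of genes with at least one passing variant."""
    return sorted({v.gene for v in variants if passes_filters(v)})


if __name__ == "__main__":
    toy = [
        AnnotatedVariant("geneA", "stopgain", "aa", "aa", "Aa", "AA"),
        AnnotatedVariant("geneB", "synonymous SNV", "aa", "aa", "Aa", "AA"),
        AnnotatedVariant("geneC", "frameshift deletion", "aa", "aa", "AA", "AA"),
    ]
    print(candidate_genes(toy))
```

In this toy input only geneA passes: geneB carries a synonymous change, and geneC does not show the heterozygous call expected in the WT pool.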