Association Analysis

Saba Naz; Vinay Kumar Nandicoori


Last updated date: Nov 23, 2023

An abbreviated version of this protocol was published in eLife in Jan, 2023

GWAS and functional studies suggest a role for altered DNA repair in the evolution of drug resistance in Mycobacterium tuberculosis


Association analysis of Mtb genome with MDR-TB phenotypes

We used a mixed linear model (MLM), as implemented in GAPIT, to associate SNPs with the multi-drug resistant phenotype. GAPIT is a package that runs in the R software environment; R can be freely downloaded from http://www.r-project.org and GAPIT from https://zzlab.net/GAPIT/. We preferred the MLM over the General Linear Model (GLM) for association mapping because the genotypes were segregated into multiple lineages.
For the covariate analysis, the simplest model (a t-test) directly tests the association between a phenotype (y) and the markers (Si) one at a time, where i = 1 to m and m is the number of markers.
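The one-marker-at-a-time test can be sketched in base R as follows. This is only an illustration on made-up data: the marker names (S1 to S3), the 0/1 allele coding, the sample size, and the effect size are all assumptions, and it is not the code GAPIT runs internally.

```r
# Hypothetical toy data: 40 isolates, 3 biallelic markers coded 0/1.
set.seed(1)
n <- 40
markers <- cbind(
  S1 = rep(c(0, 1), each = 20),
  S2 = rep(c(0, 1), times = 20),
  S3 = rep(c(0, 0, 1, 1), times = 10)
)

# Simulated phenotype that depends on marker S1 only (assumed effect of 2).
y <- 2 * markers[, "S1"] + rnorm(n)

# Test each marker separately with a two-sample t-test.
p_values <- apply(markers, 2, function(s) {
  t.test(y[s == 1], y[s == 0])$p.value
})
print(p_values)
```

Because this model ignores relatedness among isolates, it is prone to false positives when the samples fall into distinct lineages.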
The mixed linear model adds the genetic effects as random cofactor effects, with a variance structure defined by the kinship (K) among individuals. In both the Q and Q+K models, Q and K stay the same for every marker tested; no cofactors are adjusted by the marker tests.
An MLM includes both fixed and random effects. Including individuals as random effects allows an MLM to incorporate information about relationships among individuals. This information is conveyed through the kinship (K) matrix, which the MLM uses as the variance-covariance matrix between individuals. When a genetic marker-based kinship matrix (K) is used jointly with population structure (commonly called the “Q” matrix, which can be obtained through STRUCTURE or by principal component analysis), the “Q+K” approach improves statistical power compared with “Q” only.
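As a rough illustration of how a Q matrix can be obtained by principal component analysis, the sketch below runs base R's prcomp on a simulated 0/1 genotype matrix and keeps the first four components. The data, dimensions, and object names are hypothetical; GAPIT computes its own components internally, and this is not its implementation.

```r
# Hypothetical genotype matrix: 20 isolates (rows) x 100 markers (columns), coded 0/1.
set.seed(7)
geno <- matrix(rbinom(20 * 100, size = 1, prob = 0.5), nrow = 20)

# Principal component analysis on the column-centred genotypes.
pca <- prcomp(geno, center = TRUE, scale. = FALSE)

# Use the first four principal components as population-structure covariates (Q).
Q <- pca$x[, 1:4]
print(dim(Q))
```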

An MLM can be described using Henderson’s matrix notation as follows:
Y = Xβ + Zu + e, (1)

where Y is the vector of observed phenotypes; β is an unknown vector containing fixed effects, including the genetic marker, population structure (Q), and the intercept; u is an unknown vector of random additive genetic effects from multiple background QTL for individuals/lines; X and Z are the known design matrices; and e is the unobserved vector of residuals.
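To make the terms of equation (1) concrete, the sketch below simulates a phenotype from the model in base R. It assumes, as is common for this model but not stated above, that u has covariance sigma_a^2 K and e has covariance sigma_e^2 I. The kinship formula, the effect sizes, and all object names are illustrative assumptions rather than what GAPIT computes.

```r
set.seed(42)
n <- 30   # individuals
m <- 200  # markers

# Hypothetical 0/1 genotypes and a simple marker-based kinship matrix
# (centred cross-product; one of several possible definitions).
geno <- matrix(rbinom(n * m, size = 1, prob = 0.5), nrow = n)
W <- scale(geno, center = TRUE, scale = FALSE)
K <- tcrossprod(W) / m

# Design matrices: X holds the intercept and the tested marker,
# Z is the identity because each individual has one record.
X <- cbind(intercept = 1, marker = geno[, 1])
Z <- diag(n)

beta <- c(10, 1.5)  # assumed fixed effects
sigma_a <- 1        # assumed genetic standard deviation
sigma_e <- 0.5      # assumed residual standard deviation

# Draw u with covariance sigma_a^2 * K via a Cholesky factor
# (a small ridge keeps the centred K positive definite).
R <- chol(K + diag(1e-6, n))
u <- sigma_a * drop(crossprod(R, rnorm(n)))
e <- rnorm(n, sd = sigma_e)

# Equation (1): Y = X beta + Z u + e
Y <- drop(X %*% beta + Z %*% u + e)
```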
To run GAPIT, we need at minimum:
1. Genotype file: we used the HapMap file format.
2. Phenotype file: as given in Supplementary files 1 and 2.
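For orientation, the sketch below builds a tiny pair of input files in the general layout described above: a HapMap-style genotype table (11 annotation columns followed by one column per isolate) and a phenotype table (isolate names followed by a trait column). All isolate names, SNP names, positions, and values are invented. The annotation headers are written without the "#" character that some HapMap files use, because read.table treats "#" as a comment character by default.

```r
# Hypothetical isolates and SNPs, for illustration only.
taxa <- c("isolate_1", "isolate_2", "isolate_3")

hapmap <- data.frame(
  rs = c("snp_1", "snp_2"),
  alleles = c("A/G", "C/T"),
  chrom = 1,
  pos = c(1977, 4013),
  strand = "+",
  assembly = NA, center = NA, protLSID = NA,
  assayLSID = NA, panelLSID = NA, QCcode = NA,
  isolate_1 = c("A", "C"),
  isolate_2 = c("G", "C"),
  isolate_3 = c("A", "T")
)

# Phenotype table: first column is the isolate name, then the trait (1 = MDR, 0 = not).
phenotype <- data.frame(Taxa = taxa, mdr = c(1, 0, 1))

# Write both tables as tab-delimited text and read them back
# (genotype without a header row, phenotype with one).
geno_path <- tempfile(fileext = ".txt")
pheno_path <- tempfile(fileext = ".txt")
write.table(hapmap, geno_path, sep = "\t", quote = FALSE, row.names = FALSE)
write.table(phenotype, pheno_path, sep = "\t", quote = FALSE, row.names = FALSE)

genotype_file <- read.table(geno_path, header = FALSE)
phenotype_file <- read.table(pheno_path, header = TRUE)

# The isolate names in the genotype header row should match the phenotype table.
geno_taxa <- as.character(unlist(genotype_file[1, 12:ncol(genotype_file)]))
print(setdiff(geno_taxa, as.character(phenotype_file$Taxa)))
```

An empty result from setdiff indicates that every isolate in the genotype file also has a phenotype record.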
Change to the folder where the analysis is to be done:
setwd("path_to_the_folder_where_analysis_has_to_be_done")
Load the following packages in the R environment:
library("multtest")
library("gplots")
library("LDheatmap")
library("genetics")
library("compiler")
Load the GAPIT functions from the source code given below:
source("http://zzlab.net/GAPIT/GAPIT.library.R")
source("http://zzlab.net/GAPIT/gapit_functions.txt")
Load the genotype and phenotype files:
genotype_file <- read.table("path_of_file/genotype_file.txt", header = FALSE)
phenotype_file <- read.table("path_of_file/phenotype_file.txt", header = TRUE)
Check the phenotype data as follows:
str(phenotype_file)
mean(phenotype_file$mdr)
range(phenotype_file$mdr)
which(is.na(phenotype_file$mdr))
Analyse the association with the following command:
analysis <- GAPIT(Y = phenotype_file, G = genotype_file, PCA.total = 4, Major.allele.zero = TRUE)
The analysis generates a large number of output files. Parameters should be adjusted based on the results. In the manuscript, we used a corrected p-value of 10^-5 as the threshold for selecting associated genes and applied the Bonferroni method to correct for multiple testing.
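The thresholding step can be sketched with base R's p.adjust. The SNP names and raw p-values below are invented for illustration, and in practice the p-values would come from the association results table produced by the analysis.

```r
# Hypothetical raw p-values for four SNPs.
p_values <- c(snp_1 = 2e-9, snp_2 = 4e-7, snp_3 = 3e-4, snp_4 = 0.2)

# Bonferroni correction: each p-value is multiplied by the number of tests (capped at 1).
p_bonf <- p.adjust(p_values, method = "bonferroni")

# Keep SNPs whose corrected p-value falls below the 10^-5 threshold.
threshold <- 1e-5
significant <- names(p_bonf)[p_bonf < threshold]
print(significant)
```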


How to cite:

Readers should cite both the Bio-protocol preprint and the original research article where this protocol was used:

Naz, S. and Nandicoori, V. (2023). Association Analysis. Bio-protocol Preprint. bio-protocol.org/prep2509.
Naz, S., Paritosh, K., Sanyal, P., Khan, S., Singh, Y., Varshney, U. and Nandicoori, V. K. (2023). GWAS and functional studies suggest a role for altered DNA repair in the evolution of drug resistance in Mycobacterium tuberculosis. eLife. DOI: 10.7554/eLife.75860


This protocol preprint was submitted via the "Request a Protocol" track.
