Detection of outliers using LFMM

Joanna Meger; Bartosz Ulaszewski; Jaroslaw Burczyk

This method is extracted from research article: BMC Genomics, Jul 2021

Genomic signatures of natural selection at phenology-related genes in a widely distributed tree species Fagus sylvatica L

DOI: 10.1186/s12864-021-07907-5


Because phenotypic variables were represented by population averages rather than individual measurements, we applied the same approaches (LFMM and RDA) to detect loci significantly associated with geographic, climate, and phenotypic variables. To find candidate loci under selection, we used the latent factor mixed model (LFMM) approach [173]. According to de Villemereuil, Frichot, Bazin, François and Gaggiotti [76], LFMM is expected to provide the best compromise between power and error rate across different analytical scenarios. LFMM is also reported to be less susceptible to both false negatives and false positives [173, 174] than other genotype-environment association (GEA) methods, such as Bayenv2 [175], because it does not rely on a specific demographic model when accounting for population structure [76, 174].

We employed an MCMC algorithm for a regression analysis in which potentially confounding population structure is modeled with unobserved (latent) factors [176]. As missing data can reduce the power of association studies [177, 178], we imputed the missing data based on the ancestry coefficients estimated by sNMF, using the “impute” function from the R package LEA [176]. In sNMF, we set K to the number of distinct genetic clusters identified in the population genetic structure analysis and kept the best of 10 runs according to the cross-entropy criterion.

The MCMC algorithm was run separately for each of the geographic, climate, and phenotypic variables (i.e., longitude, latitude, altitude, PC1-PC3, spring and autumn phenology, and height), using 50,000 burn-in steps followed by 100,000 additional steps to compute the LFMM parameters (z-scores) for all loci. The number of latent factors was set to the identified value of K. To compensate for run-to-run variation, the analysis was repeated over 10 independent runs, and the z-scores were then combined across runs in R using the LEA package [176].

The LEA package was also used to calculate the genomic inflation factor, which rescales the z-scores so that the FDR can be controlled, and to adjust p-values for multiple testing with the Benjamini–Hochberg procedure, as described by Frichot and François [176]. A list of candidate loci at an FDR of 1% and with adjusted p-values < 0.001 was then generated for each explanatory variable.
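The post-processing described above (combining z-scores across runs, rescaling by the genomic inflation factor, and applying the Benjamini–Hochberg procedure) can be illustrated with a minimal sketch. It is written in Python rather than R and does not call LEA. It assumes that runs are combined by taking the per-locus median z-score and that the inflation factor is the median squared z-score divided by the median of a chi-square distribution with one degree of freedom; both are assumptions based on the LEA documentation, not statements from the text. All function and parameter names are hypothetical, and applying the two cutoffs jointly (FDR of 1% and adjusted p < 0.001) is only one possible reading of the selection rule.

```python
import math
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 df (square of the 75% normal quantile).
CHI2_MEDIAN_DF1 = NormalDist().inv_cdf(0.75) ** 2


def combine_z_scores(z_by_run):
    """Per-locus median of z-scores across runs (assumed combination rule)."""
    n_loci = len(z_by_run[0])
    return [median(run[i] for run in z_by_run) for i in range(n_loci)]


def genomic_inflation_factor(z):
    """Median of squared z-scores divided by the chi-square(1 df) median."""
    return median(v * v for v in z) / CHI2_MEDIAN_DF1


def adjusted_p_values(z, lam):
    """Upper-tail chi-square(1 df) probability of z^2 / lam.

    For 1 df this equals erfc(|z| / sqrt(2 * lam)).
    """
    return [math.erfc(abs(v) / math.sqrt(2.0 * lam)) for v in z]


def benjamini_hochberg(p):
    """Benjamini-Hochberg q-values, returned in the input order."""
    m = len(p)
    order = sorted(range(m), key=lambda i: p[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p[i] * m / rank)
        q[i] = running_min
    return q


def candidate_loci(z_by_run, fdr=0.01, p_max=0.001):
    """Indices of loci passing both cutoffs (hypothetical joint rule)."""
    z = combine_z_scores(z_by_run)
    lam = genomic_inflation_factor(z)
    p = adjusted_p_values(z, lam)
    q = benjamini_hochberg(p)
    return [i for i in range(len(z)) if q[i] <= fdr and p[i] < p_max]
```

Here `z_by_run` would hold one list of per-locus z-scores for each of the 10 independent runs of a given explanatory variable, and `candidate_loci` would be called once per variable.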

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
