All proteomics analyses, including data pre-processing and differential expression analysis, were conducted using R. For all downstream analyses, MaxQuant label-free quantitation (LFQ) values41 were log10-transformed and protein groups flagged as reverse or contaminant hits were excluded. To normalize for general loading effects between samples, a sample-specific scaling factor was subtracted from the log10-transformed LFQ intensity values of each sample in all data sets. This scaling factor was calculated for each sample by subtracting the overall median of log10 LFQ values from that sample's median log10 LFQ value, so that after normalization every sample has the same median.
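The normalization described above can be illustrated with a minimal sketch. It is written in Python using only the standard library and is not the R code used for the analysis. The function names, the treatment of zero intensities as missing values, and the definition of the "overall median" as the median of all observed values pooled across samples are assumptions made for illustration.

```python
import math
from statistics import median


def log10_lfq(raw_values):
    """Log10-transform raw LFQ intensities; zeros (assumed to mean 'not quantified') become None."""
    return [math.log10(v) if v and v > 0 else None for v in raw_values]


def median_normalize(log_lfq):
    """Subtract a per-sample scaling factor from log10 LFQ values.

    log_lfq maps a sample name to a list of log10 LFQ values (None = missing).
    Scaling factor = sample median - overall median (assumed here to be the
    median of all observed values pooled over all samples).
    """
    observed = {s: [v for v in vals if v is not None] for s, vals in log_lfq.items()}
    overall_median = median(v for vals in observed.values() for v in vals)

    normalized = {}
    for sample, vals in log_lfq.items():
        factor = median(observed[sample]) - overall_median
        normalized[sample] = [None if v is None else v - factor for v in vals]
    return normalized


# Hypothetical toy data: sample B was loaded at roughly ten times the amount of sample A.
toy = {
    "A": log10_lfq([100, 1000, 10000, 0]),
    "B": log10_lfq([1000, 10000, 100000, 0]),
}
normalized_toy = median_normalize(toy)
```

After this step both toy samples share the same median, while differences between protein groups within a sample are left unchanged.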
To account for potential technical variability between the two sample processing batches (B1, B2), the log10-transformed LFQ intensities were batch-adjusted separately for each tissue using the ‘ComBat’ function in the R package sva, which was run as a parametric adjustment with the processing batch as the known batch covariate42,43. Principal component analysis (PCA) was performed on normalized and batch-corrected log10 LFQ intensities, considering only protein groups with no missing values. To test for significant protein expression differences between the ‘early’ time points (1 week and 3 weeks) and the ‘transition’ (6 weeks) or ‘late’ (10 weeks and 15 weeks) time points, the R package limma44 was used to fit the following linear model to the normalized and batch-adjusted log10 LFQ intensity values separately for each tissue:
Normalised intensities ~ 0 + Time point
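In a model without an intercept and with a single categorical predictor, each fitted coefficient equals the mean of the corresponding group, and a comparison between two groups is the difference of two coefficients. The following Python sketch illustrates this for one protein group. It assumes that time point enters the model as a three-level factor (‘early’, ‘transition’, ‘late’), which the text does not state explicitly. The function names and example values are hypothetical, and the empirical Bayes moderation of the t-statistics performed by limma is not reproduced here.

```python
from statistics import mean


def fit_group_means(values, groups):
    """Least-squares coefficients of y ~ 0 + group: one mean per factor level."""
    by_group = {}
    for value, group in zip(values, groups):
        by_group.setdefault(group, []).append(value)
    return {group: mean(vals) for group, vals in by_group.items()}


def contrast(coefficients, level, reference="early"):
    """Estimated difference in log10 intensity between a level and the reference level."""
    return coefficients[level] - coefficients[reference]


# Hypothetical normalized log10 LFQ intensities of one protein group in one tissue.
intensities = [5.0, 5.2, 5.1, 5.3, 5.6, 5.8, 6.0, 6.2, 6.1, 6.1]
time_point = ["early"] * 4 + ["transition"] * 2 + ["late"] * 4

coefs = fit_group_means(intensities, time_point)
transition_vs_early = contrast(coefs, "transition")
late_vs_early = contrast(coefs, "late")
```

Because the intensities are on a log10 scale, each contrast is a log10 fold change relative to the early time points.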
Based on this linear regression model, moderated t-tests comparing the ‘transition’ or ‘late’ time points with the ‘early’ time points were performed for each protein group and tissue. The resulting p values were corrected for multiple hypothesis testing over all protein groups using the Benjamini-Hochberg procedure with an FDR threshold of 5%.
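For reference, the Benjamini-Hochberg adjustment can be sketched in a few lines of standard-library Python. This is an illustration of the procedure, not the R code used for the analysis, and the function name and p values below are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest and enforce monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Hypothetical p values for five protein groups from one contrast.
p_values = [0.2, 0.001, 0.8, 0.012, 0.008]
adjusted = benjamini_hochberg(p_values)
significant = [i for i, q in enumerate(adjusted) if q <= 0.05]
```

Protein groups whose adjusted p value does not exceed 0.05 are reported as significant at an FDR of 5%.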