Protein inference

Tim Van Den Bossche
Benoit J. Kunath
Kay Schallert
Stephanie S. Schäpe
Paul E. Abraham
Jean Armengaud
Magnus Ø. Arntzen
Ariane Bassignani
Dirk Benndorf
Stephan Fuchs
Richard J. Giannone
Timothy J. Griffin
Live H. Hagen
Rashi Halder
Céline Henry
Robert L. Hettich
Robert Heyer
Pratik Jagtap
Nico Jehmlich
Marlene Jensen
Catherine Juste
Manuel Kleiner
Olivier Langella
Theresa Lehmann
Emma Leith
Patrick May
Bart Mesuere
Guylaine Miotello
Samantha L. Peters
Olivier Pible
Pedro T. Queiros
Udo Reichl
Bernhard Y. Renard
Henning Schiebenhoefer
Alexander Sczyrba
Alessandro Tanca
Kathrin Trappe
Jean-Pierre Trezzi
Sergio Uzzau
Pieter Verschaffelt
Martin von Bergen
Paul Wilmes
Maximilian Wolf
Lennart Martens
Thilo Muth

To allow comparison at the protein group level, groups were built from the combined peptide evidence of all compared samples. Two protein grouping methods were tested, MPA [28] and PAPPSO [61], and both protein groups and subgroups were analyzed (Supplementary Note 3).

Assigning peptides to their correct protein can be difficult, notably because of the protein inference issue [3]: the same peptide can occur in several homologous proteins. This is particularly challenging in metaproteomics, where the diversity and number of homologous proteins are much higher than in single-species proteomics. To overcome this issue, most bioinformatic pipelines automatically group homologous protein sequences into protein groups. However, each tool handles protein inference and protein groups in its own way, which prevents a straightforward comparison of outputs at the protein group level. To allow a robust comparison between approaches, the PSM output files of the four bioinformatic pipelines were combined. The peptides were then assigned to protein sequences in the FASTA file, and the data were prepared for subsequent protein grouping. Two protein grouping approaches were used and evaluated in this study: PAPPSO grouping [61], which excludes proteins based on the rule of maximum parsimony, and MPA grouping [28], which does not exclude proteins. All data processing was done with a custom Java program, except for PAPPSO grouping, for which data were exported and imported in the appropriate XML format.
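The difference between excluding and keeping proteins can be illustrated with a minimal sketch. It assumes that maximum parsimony is approximated by a greedy set cover over the peptide evidence; this is an illustration only, not the PAPPSO implementation, and the function name `parsimonious_proteins` and the toy accessions are hypothetical.

```python
def parsimonious_proteins(protein_to_peptides):
    """Greedy set cover: keep proteins until every peptide is explained.

    Assumption: this greedy heuristic only approximates maximum parsimony
    and is not the algorithm used by PAPPSO.
    """
    remaining = set().union(*protein_to_peptides.values())
    kept = []
    while remaining:
        # Pick the protein explaining the most still-unexplained peptides;
        # iterating in sorted order makes ties resolve to the lowest accession.
        best = max(
            sorted(protein_to_peptides),
            key=lambda p: len(protein_to_peptides[p] & remaining),
        )
        kept.append(best)
        remaining -= protein_to_peptides[best]
    return kept


# Toy evidence (hypothetical): P2 adds nothing beyond what P1 explains.
evidence = {
    "P1": {"a", "b", "c"},
    "P2": {"b", "c"},
    "P3": {"c", "d"},
}
kept = parsimonious_proteins(evidence)
```

In this toy case the parsimony-style filter keeps P1 and P3 and drops P2, whereas an approach that does not exclude proteins would carry all three into the grouping step.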

For both methods, protein groups were created using the loose rule “share at least one peptide” (groups) and the strict rule “share a common set of peptides” (subgroups), resulting in four protein grouping analyses: (1) PAPPSO groups, (2) MPA groups, (3) PAPPSO subgroups, and (4) MPA subgroups. Finally, the resulting protein groups and subgroups were exported for further analysis (Supplementary Note 3). These algorithms are also implemented in Pout2Prot [95] for independent use.
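The two grouping rules can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study or in Pout2Prot: the loose rule is read as transitive (proteins linked through any chain of shared peptides end up in one group), the strict rule is read as "identical peptide sets", and the names `make_groups` and `make_subgroups` are hypothetical. The actual tools may handle subset relations differently.

```python
from collections import defaultdict


def make_groups(protein_to_peptides):
    """Loose rule: proteins sharing at least one peptide are merged.

    Assumption: sharing is treated as transitive, so groups are the
    connected components of the protein-peptide graph (union-find).
    """
    parent = {p: p for p in protein_to_peptides}

    def find(p):
        # Follow parent links to the root, compressing the path on the way.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    first_seen = {}  # peptide -> first protein observed with that peptide
    for protein in sorted(protein_to_peptides):
        for peptide in protein_to_peptides[protein]:
            if peptide in first_seen:
                parent[find(protein)] = find(first_seen[peptide])
            else:
                first_seen[peptide] = protein

    groups = defaultdict(list)
    for protein in sorted(protein_to_peptides):
        groups[find(protein)].append(protein)
    return sorted(groups.values())


def make_subgroups(protein_to_peptides):
    """Strict rule, read here as: proteins with identical peptide sets."""
    by_set = defaultdict(list)
    for protein in sorted(protein_to_peptides):
        by_set[frozenset(protein_to_peptides[protein])].append(protein)
    return sorted(by_set.values())


# Toy evidence (hypothetical accessions and peptides).
evidence = {
    "P1": {"a", "b"},
    "P2": {"a", "b"},
    "P3": {"b", "c"},
    "P4": {"d"},
}
groups = make_groups(evidence)
subgroups = make_subgroups(evidence)
```

With this toy input, P1, P2, and P3 form one group because they are linked through peptide "b", while only P1 and P2 form a subgroup because their peptide sets are identical. Running both functions on a parsimony-filtered and an unfiltered protein list would give the four analyses described above.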
