CPTAC human kidney tissue proteomics – data generation, quality control and quantification

Xiaoguang Xu
Chachrit Khunsriraksakul
James M. Eales
Sebastien Rubin
David Scannali
Sushant Saluja
David Talavera
Havell Markus
Lida Wang
Maciej Drzal
Akhlaq Maan
Abigail C. Lay
Priscilla R. Prestes
Jeniece Regan
Avantika R. Diwadkar
Matthew Denniff
Grzegorz Rempega
Jakub Ryszawy
Robert Król
John P. Dormer
Monika Szulinska
Marta Walczak
Andrzej Antczak
Pamela R. Matías-García
Melanie Waldenberger
Adrian S. Woolf
Bernard Keavney
Ewa Zukowska-Szczechowska
Wojciech Wystrychowski
Joanna Zywiec
Pawel Bogdanski
A. H. Jan Danser
Nilesh J. Samani
Tomasz J. Guzik
Andrew P. Morris
Dajiang J. Liu
Fadi J. Charchar
Maciej Tomaszewski

Protein abundance data for human kidney tissue were obtained from the CPTAC pre-processed protein-level assembly136 and downloaded from the Proteomic Data Commons (accessed May 2022, https://pdc.cancer.gov/pdc/study/PDC000127). The biochemical analyses are reported in full in the original publication136. In brief, we used the normal adjacent tissue (NAT) samples, which were taken from regions of the kidney adjacent to renal clear cell tumours and reviewed by a board-certified pathologist to confirm their histological status136. Each kidney tissue sample was homogenised, lysed, digested and trypsinised. Samples were then multiplexed using tandem mass tag (TMT) labelling and fractionated by basic reversed-phase liquid chromatography (bRPLC)136. Peptides were then separated by ultra-high-performance liquid chromatography (UHPLC) and analysed using the Thermo Fusion Lumos mass spectrometer136. The protein-level assembly of spectra and peptides into estimates of protein abundance was performed by a software workflow using the Philosopher pipeline172, including spectral search by MSFragger173 and refinement by PeptideProphet174. Peptide spectral match data for each sample were then normalised by log2-transformed reference intensities determined for each TMT channel. Of the 83 NAT samples with raw protein-level abundance data, 72 had genotype information available from whole-genome sequencing, and 65 of those 72 also had poly-A RNA-sequencing data. All samples used in this analysis were collected from patients of European ancestry. A total of 7291 proteins were quantifiable in all 72 samples; for 7036 of these, the source gene also had measurable expression in the RNA-sequencing data available for 65 samples.
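
The normalisation and filtering steps described above can be illustrated with a minimal Python sketch. It is not the CPTAC or Philosopher implementation. It assumes that normalisation amounts to a log2 ratio against a per-plex reference channel, and that "quantifiable" means a non-missing value in every sample. All function names, identifiers and values below are hypothetical toy examples.

```python
import math


def log2_ratio_to_reference(intensities, reference_channel):
    """Return log2(sample / reference) for each non-reference channel of one TMT plex.

    Assumption: a simple ratio-to-reference scheme; the actual pipeline may differ.
    """
    ref = intensities[reference_channel]
    return {
        channel: math.log2(value / ref)
        for channel, value in intensities.items()
        if channel != reference_channel
    }


def proteins_quantified_in_all(abundance, samples):
    """Keep proteins with a non-missing abundance in every listed sample."""
    return sorted(
        protein
        for protein, values in abundance.items()
        if all(values.get(sample) is not None for sample in samples)
    )


# Toy TMT plex: one reference channel and two sample channels (hypothetical values).
plex = {"ref": 1000.0, "S1": 2000.0, "S2": 500.0}
ratios = log2_ratio_to_reference(plex, "ref")

# Toy protein-by-sample abundance table; None marks a missing measurement.
abundance = {
    "P1": {"S1": 0.4, "S2": -0.1, "S3": 0.2},
    "P2": {"S1": 1.1, "S2": None, "S3": 0.3},
    "P3": {"S1": -0.5, "S2": 0.7, "S3": 0.0},
}
genotyped_samples = ["S1", "S2", "S3"]

# Step 1: proteins quantifiable in all genotyped samples.
complete = proteins_quantified_in_all(abundance, genotyped_samples)

# Step 2: of those, keep proteins whose source gene is expressed in RNA-seq.
protein_to_gene = {"P1": "G1", "P2": "G2", "P3": "G3"}  # hypothetical mapping
expressed_genes = {"G1", "G2"}                           # hypothetical RNA-seq result
with_rna = [p for p in complete if protein_to_gene.get(p) in expressed_genes]

print(ratios, complete, with_rna)
```

In this toy example the second filter is applied only to proteins that pass the first, mirroring how the 7036 proteins are described as a subset of the 7291.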
