4.1. Data Collection, Preprocessing, and Identification of Differentially Expressed Genes

Shamim Sarhadi
Mehdi Damaghi
Nosratollah Zarghami
Hedayatollah Hosseini

Raw gene expression microarray data were retrieved from the Gene Expression Omnibus (GEO) database (see GEO accession numbers in Figure S1a). Low-quality samples were discarded (102 samples were excluded) so that the analysis could focus, without extraneous interference, on the changes that trigger the transitions from normal to DCIS and from DCIS to IDC or metastasis. All samples without clear clinical annotation were also removed. AffyBatch objects were then created with the affy package in R [41], and the GCRMA package [42] was used to preprocess the datasets (background correction, normalization, log transformation, and quality control). Differentially expressed genes (DEGs) were identified for four states: normal vs. DCIS, normal vs. IDC, DCIS vs. IDC, and primary vs. metastasis, using limma [43] through NetworkAnalyst (https://www.networkanalyst.ca/faces/home.xhtml) [44]. For each state, combined p-values for DEG identification were obtained with Fisher's method (the adjusted p-values for DEGs from each study were combined by Fisher's method). The DEGs for these states were named DEG1, DEG2, DEG3, and DEG4 for normal vs. DCIS, normal vs. IDC, DCIS vs. IDC, and primary vs. metastasis, respectively (see all DEGs in Table S1). To check the microarray results against RNA-seq data, the DEG2 signature was compared with the analogous DEG list from the largest available RNA-sequencing dataset in breast cancer, and the significance of the overlap was calculated via the exact hypergeometric probability [26].
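The two statistical steps named above, Fisher's method for combining per-study p-values and the exact hypergeometric test for the overlap of two gene lists, can be illustrated with a minimal Python sketch. It is not the authors' implementation (the analysis was run through NetworkAnalyst); the function names, the choice of gene universe, and all example numbers are hypothetical, and Fisher's method is applied under its usual assumption that the studies are independent.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values for one gene with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis. For even degrees
    of freedom the upper tail has the closed form
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    """
    k = len(p_values)
    if k == 0:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(math.log(p) for p in p_values)  # this is X / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i  # (X/2)^i / i!
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def overlap_p_value(n_universe, n_list_a, n_list_b, n_shared):
    """Exact hypergeometric probability of sharing at least n_shared genes.

    n_universe: number of genes measured on both platforms (an assumption
                the analyst has to make explicit)
    n_list_a:   size of the first DEG list (e.g. the microarray list)
    n_list_b:   size of the second DEG list (e.g. the RNA-seq list)
    n_shared:   observed number of genes present in both lists
    """
    upper = min(n_list_a, n_list_b)
    if n_shared > upper:
        return 0.0
    # Sum the numerators as exact integers, then divide once.
    numerator = sum(
        math.comb(n_list_a, i) * math.comb(n_universe - n_list_a, n_list_b - i)
        for i in range(max(n_shared, 0), upper + 1)
    )
    return numerator / math.comb(n_universe, n_list_b)


if __name__ == "__main__":
    # Hypothetical p-values for one gene across three studies.
    print(fisher_combine([0.04, 0.20, 0.01]))
    # Hypothetical list sizes: 20000 genes, lists of 1500 and 1200, 400 shared.
    print(overlap_p_value(20000, 1500, 1200, 400))
```

Both functions use only the standard library, so the arithmetic is easy to inspect; with SciPy available, `scipy.stats.combine_pvalues` and `scipy.stats.hypergeom` cover the same calculations.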
