4.5. Analysis of Enriched Transcription Factor Binding Sites in Promoters of Differentially Expressed Genes

Alexander N. Orekhov
Vasily N. Sukhorukov
Nikita G. Nikiforov
Igor A. Sobenin
Sergey Pintus
Alexander Kel
Anastasia V. Poznyak
Artem S. Kasianov
Vsevolod Y. Makeev

Transcription factor-binding sites in the promoters of the identified differentially expressed genes were analyzed with the Match™ algorithm, which searches for overrepresented known DNA-binding motifs [68]. The motifs were defined by position weight matrices selected from the TRANSFAC® database.
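The idea of scanning a promoter with a position weight matrix can be illustrated with a minimal sketch. It uses a generic log-odds score rather than the scoring scheme of Match™, scans the forward strand only, and relies on a hypothetical toy matrix (`TOY_COUNTS`), pseudocount, and threshold that are assumptions for illustration, not TRANSFAC® data.

```python
import math

# Hypothetical toy count matrix for a 4-bp motif (consensus TGAC); not a TRANSFAC matrix.
TOY_COUNTS = [
    {"A": 1, "C": 0, "G": 0, "T": 9},
    {"A": 0, "C": 1, "G": 9, "T": 0},
    {"A": 9, "C": 0, "G": 0, "T": 1},
    {"A": 0, "C": 9, "G": 1, "T": 0},
]


def build_pwm(counts, pseudocount=0.5, background=0.25):
    """Convert per-position base counts into log2-odds weights."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm, threshold):
    """Return (start, score) for each forward-strand window scoring at or above threshold."""
    hits = []
    width = len(pwm)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, score))
    return hits


toy_pwm = build_pwm(TOY_COUNTS)
hits = scan("AATGACGG", toy_pwm, threshold=5.0)  # threshold is an arbitrary choice
```

In a real analysis the reverse strand would normally be scanned as well, and the score cut-off would come from the matrix profile rather than a fixed number.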

The frequencies of transcription factor-binding sites were then compared between the promoters of differentially expressed genes (the foreground sequence set, denoted the “yes” set) and the promoters of genes that were not differentially expressed (the background sequence set, denoted the “no” set). Standard-length promoter sequences spanning -1000 bp to +100 bp relative to the transcription start site were used for the analysis. The site enrichment error rate was controlled with the Benjamini–Hochberg correction procedure, using an adjusted p-value threshold of 0.01.
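A minimal sketch of such a comparison follows. The text does not state which statistical test produced the raw p-values, so the one-sided Fisher exact test on promoters with at least one site is an assumption made here for illustration; the motif names and counts are hypothetical. The Benjamini–Hochberg step-up adjustment itself is standard.

```python
from math import comb


def enrichment_p(yes_hits, yes_total, no_hits, no_total):
    """One-sided Fisher exact p-value for over-representation in the "yes" set.

    Assumption: each promoter is counted once if it carries at least one site.
    """
    total = yes_total + no_total
    hits = yes_hits + no_hits
    upper = min(hits, yes_total)
    tail = sum(
        comb(hits, k) * comb(total - hits, yes_total - k)
        for k in range(yes_hits, upper + 1)
    )
    return tail / comb(total, yes_total)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical counts: motif -> (promoters with a site in "yes", in "no").
counts = {"motif_A": (40, 60), "motif_B": (12, 110), "motif_C": (25, 200)}
YES_TOTAL, NO_TOTAL = 100, 1000

names = list(counts)
raw = [enrichment_p(counts[n][0], YES_TOTAL, counts[n][1], NO_TOTAL) for n in names]
adjusted = benjamini_hochberg(raw)
enriched = [n for n, q in zip(names, adjusted) if q < 0.01]
```

Counting promoters rather than individual sites is one of several reasonable choices; comparing site densities per kilobase would require a different test.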

Specific combinations of transcription factor-binding sites that cluster within promoter regions, so-called composite modules, were then analyzed in the promoters of the “yes” and “no” sets using the Composite Module Analyst algorithm [69]. The search target was a composite module consisting of clusters of sites for at most 10 transcription factors within a sliding window of 200–300 bp that significantly separated the sequences of the “yes” and “no” sets, as assessed by minimizing the Wilcoxon p-value.
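The objective described above can be sketched for a single candidate module. This is not the Composite Module Analyst implementation, which also searches over factor combinations and parameters. The sketch assumes a simple module score (the best sum of site scores in any window) and a one-sided rank-sum p-value from the normal approximation without tie correction. The function names and the example promoter are hypothetical.

```python
import math


def best_window_score(sites, module_tfs, window=200):
    """Highest summed site score for the module's factors within any `window` bp.

    `sites` is a list of (position, factor, score) tuples for one promoter.
    """
    selected = sorted((pos, score) for pos, tf, score in sites if tf in module_tfs)
    best = 0.0
    for i, (start, _) in enumerate(selected):
        total = 0.0
        for pos, score in selected[i:]:
            if pos >= start + window:
                break  # remaining sites fall outside this window
            total += score
        best = max(best, total)
    return best


def rank_sum_p(yes_scores, no_scores):
    """One-sided Wilcoxon rank-sum p-value that "yes" scores exceed "no" scores.

    Uses the normal approximation without tie correction (an assumption).
    """
    n1, n2 = len(yes_scores), len(no_scores)
    u = sum(
        1.0 if y > n else 0.5 if y == n else 0.0
        for y in yes_scores
        for n in no_scores
    )
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical promoter: (position, factor, site score).
promoter = [
    (10, "TF1", 1.0), (50, "TF2", 2.0), (120, "TF3", 9.0),
    (400, "TF1", 5.0), (450, "TF2", 1.5),
]
module_score = best_window_score(promoter, {"TF1", "TF2"}, window=200)

# Hypothetical module scores for promoters in each set.
p_value = rank_sum_p([6.5, 7.0, 8.0], [1.0, 2.0, 3.0])
```

A search procedure would repeat this evaluation over many candidate factor sets (up to 10 factors) and window sizes, keeping the module with the smallest p-value.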
