Determining priors for expression analysis

John C. Stansfield; Matthew Rusay; Roger Shan; Conor Kelton; Daria A. Gaykalova; Elana J. Fertig; Joseph A. Califano; Michael F. Ochs

This method is extracted from research article: Cancer Inform, Feb 2016

Toward Signaling-Driven Biomarkers Immune to Normal Tissue Contamination

DOI: 10.4137/CIN.S32468

To set priors on the potential expression of genes that are targets of the HNSCC network shown in Figure 1, information on protein activity is needed. To obtain it, an outlier analysis was performed on the methylation and copy number data. Outliers were counted for hypomethylation of promoters or amplification of genes that code for signaling proteins. A rank outlier method was used,¹⁴ in which a tumor was scored as an outlier relative to a normal sample for a given gene if its methylation was below that of the normal by at least 0.1 or its copy number was above that of the normal by at least 0.5. For each gene, this gave a count, C, for each tumor, capturing how many normals it exceeded in methylation or copy number. We converted this count to an empirical P-value with P = (N − C + 1) / N, so the more normals a tumor exceeded, the lower the P-value. We did this separately for methylation and copy number and then counted, for each gene, the number of P-values significant at α = 0.05 across the 44 tumors and the two molecular data types. This outlier-counting method was previously shown to be robust to changes in the minimum difference for copy number and methylation level.¹⁴ The number of outliers was then linearly scaled to give each protein a value between 0.5 (no outliers) and 0.9 (many outliers).
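The outlier counting and scaling described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: N is taken to be the number of normal samples, significance is tested as P ≤ α, and the linear scaling divides by the largest outlier count among the genes supplied. All function names are hypothetical.

```python
def outlier_count(tumor_value, normal_values, min_diff, direction):
    """Count the normals that a tumor exceeds by at least min_diff.

    direction = +1 tests tumor above normal (copy number amplification);
    direction = -1 tests tumor below normal (promoter hypomethylation).
    """
    return sum(
        1 for normal in normal_values
        if direction * (tumor_value - normal) >= min_diff
    )


def empirical_p(count, n_normals):
    """P = (N - C + 1) / N as stated in the text.

    Assumption: N is the number of normal samples. Note that this
    expression is slightly above 1 when C = 0.
    """
    return (n_normals - count + 1) / n_normals


def significant_outliers(tumor_values, normal_values, min_diff, direction,
                         alpha=0.05):
    """Number of tumors whose empirical P-value is at or below alpha
    for one gene and one data type (assumption: P <= alpha counts)."""
    n = len(normal_values)
    return sum(
        1 for tumor in tumor_values
        if empirical_p(
            outlier_count(tumor, normal_values, min_diff, direction), n
        ) <= alpha
    )


def scale_counts(counts, low=0.5, high=0.9):
    """Linearly map per-gene outlier counts onto [low, high].

    Assumption: the gene with the most outliers maps to `high`;
    a gene with no outliers maps to `low`.
    """
    top = max(counts.values())
    if top == 0:
        return {gene: low for gene in counts}
    return {gene: low + (high - low) * c / top for gene, c in counts.items()}
```

For a single gene, one would call `significant_outliers` once with the methylation values (`min_diff=0.1`, `direction=-1`) and once with the copy number values (`min_diff=0.5`, `direction=+1`), add the two counts, and pass the per-gene totals to `scale_counts`.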

These values were then propagated through the network of Figure 1 to the transcription factors (TFs) as follows. For receptors and other root nodes with no parents, the relative probability of activity was set equal to the outlier-derived value. For any node x with only activating parents pa(x),

where p_pa(x) is the maximum relative probability of all parent nodes and p_p is the value calculated from outliers. For cases including the repressors of x, which compete with the activators, the relative probability was given by

where p_pr(x) is the maximum relative probability of the repressors being active. This formulation lets repressors dominate activators overall and lets a single activation or repression step tend to have a dominant effect.
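The traversal order implied by this propagation can be sketched as follows. This is a hypothetical outline only: the combining formulas for activated and repressed nodes are not reproduced here, so they are passed in as caller-supplied functions (`combine_act` and `combine_rep`, both invented names) rather than guessed. The sketch assumes the network is acyclic and that every node has an outlier-derived value.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def propagate(values, activators, repressors, combine_act, combine_rep):
    """Propagate relative activity probabilities from root nodes to TFs.

    values:      node -> outlier-derived value (p_p in the text)
    activators:  node -> list of activating parent nodes
    repressors:  node -> list of repressing parent nodes
    combine_act: placeholder f(p_p, p_pa) for nodes with only activators
    combine_rep: placeholder f(p_p, p_pa, p_pr) for nodes with repressors
    """
    # Map every node to all of its parents so that parents are visited first.
    parents = {
        node: set(activators.get(node, [])) | set(repressors.get(node, []))
        for node in values
    }
    prob = {}
    for node in TopologicalSorter(parents).static_order():
        acts = activators.get(node, [])
        reps = repressors.get(node, [])
        if not acts and not reps:
            # Root node: relative probability equals its own value.
            prob[node] = values[node]
            continue
        # Maximum relative probability over activating parents;
        # None if the node has repressors only (a case the text does not cover).
        p_pa = max((prob[a] for a in acts), default=None)
        if not reps:
            prob[node] = combine_act(values[node], p_pa)
        else:
            # Maximum relative probability over repressing parents.
            p_pr = max(prob[r] for r in reps)
            prob[node] = combine_rep(values[node], p_pa, p_pr)
    return prob
```

Keeping the two combining rules as parameters separates the part the text does specify (roots keep their own value, each node sees only the maximum over its activators and the maximum over its repressors) from the exact formulas, which should be taken from the original article.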

Finally, the relative probability of a TF being active was used as the prior relative probability of a target being expressed. The implementation of the prior scaled all values so that each pattern was assigned equal overall prior probability, so these values effectively set only the relative probability within one pattern (one column of the A matrix; see next section).

This is an open-access article distributed under the terms of the Creative Commons CC-BY-NC 3.0 License.
