The modeling approach implemented in this study is thermodynamic, similar to models used in previous studies (Zinzen et al., 2006; Segal et al., 2008; Fakhouri et al., 2010; He et al., 2010). These models are derived from the law of mass action under thermodynamic equilibrium assumptions. They take as input the number and arrangement of TF binding sites, together with TF concentrations, and output predicted levels of gene expression.
Here, we use thermodynamic models that assume RNA polymerase (RNAP) is recruited by bound TFs, and thus model transcriptional output as proportional to the probability of the enhancer being in an ‘active state’. Other assumptions used by all models tested in this manuscript include:
An ‘active state’ is defined as any state of the enhancer in which at least one activator is bound and no bound repressor is actively repressing (quenching) the bound activator(s),
TF binding affinities are directly proportional to PWM scores obtained using MAST, with one proportionality (scaling) constant per TF,
interactions (i.e. cooperativity and quenching) only occur between adjacently bound TFs,
and TFs cannot bind simultaneously to overlapping binding sites; that is, binding to overlapping sites is competitive.
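The assumptions above can be illustrated with a minimal sketch that enumerates enhancer states and computes the probability of the active state. Everything specific here is an assumption made for illustration, not the model used in the study: the toy site coordinates and scores, the statistical weight of a bound site taken as scaling constant × concentration × PWM score, the absence of cooperativity terms, and a "hard" quenching rule in which a repressor always silences an activator bound directly next to it, regardless of distance.

```python
from itertools import combinations

# Hypothetical toy enhancer: (TF name, start, end, PWM score). Illustrative values only.
SITES = [
    ("Dorsal", 0, 10, 0.8),
    ("Twist", 8, 16, 0.6),   # overlaps the Dorsal site, so the two compete
    ("Snail", 30, 40, 0.9),
]
ACTIVATORS = {"Dorsal", "Twist"}
REPRESSORS = {"Snail"}


def overlaps(s1, s2):
    """True if two sites share at least one base pair (half-open intervals)."""
    return s1[1] < s2[2] and s2[1] < s1[2]


def valid_states(sites):
    """Yield every subset of sites in which no two bound sites overlap."""
    for k in range(len(sites) + 1):
        for combo in combinations(sites, k):
            if all(not overlaps(a, b) for a, b in combinations(combo, 2)):
                yield combo


def state_weight(state, conc, scale):
    """Assumed weight: product over bound sites of scale * concentration * score."""
    w = 1.0
    for tf, _, _, score in state:
        w *= scale[tf] * conc[tf] * score
    return w


def is_active(state):
    """Assumed hard quenching: an activator is silenced by any adjacent bound repressor."""
    bound = sorted(state, key=lambda s: s[1])
    for i, site in enumerate(bound):
        if site[0] not in ACTIVATORS:
            continue
        neighbours = bound[max(i - 1, 0):i] + bound[i + 1:i + 2]
        if not any(n[0] in REPRESSORS for n in neighbours):
            return True  # at least one activator is bound and unquenched
    return False


def p_active(sites, conc, scale):
    """Probability of the active state: active weight over the partition function."""
    z_total = 0.0
    z_active = 0.0
    for state in valid_states(sites):
        w = state_weight(state, conc, scale)
        z_total += w
        if is_active(state):
            z_active += w
    return z_active / z_total


if __name__ == "__main__":
    conc = {"Dorsal": 1.0, "Twist": 1.0, "Snail": 1.0}
    scale = {"Dorsal": 1.0, "Twist": 1.0, "Snail": 1.0}
    print(p_active(SITES, conc, scale))
```

Exhaustive enumeration grows exponentially with the number of sites, so this form is only practical for small examples; it is shown because it makes the competitive-binding and active-state rules explicit.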
To test different hypotheses about the biochemical mechanisms of transcription factor activity on enhancers, we implemented several schemes for transcription factor cooperativity and short-range repression (referred to as quenching). We considered all possible pair-wise combinations of the fifteen cooperativity and eight quenching schemes, generating 120 different models.
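As a small illustration of how the 120 models arise, the pairing of schemes can be written as a Cartesian product. The labels C1 to C15 and Q1 to Q8 follow the scheme names used in this section; the variable names are hypothetical.

```python
from itertools import product

# Scheme labels as used in the text: fifteen cooperativity and eight quenching schemes.
COOPERATIVITY_SCHEMES = [f"C{i}" for i in range(1, 16)]  # C1 .. C15
QUENCHING_SCHEMES = [f"Q{i}" for i in range(1, 9)]       # Q1 .. Q8

# Each model is one (cooperativity scheme, quenching scheme) pair.
MODELS = list(product(COOPERATIVITY_SCHEMES, QUENCHING_SCHEMES))

if __name__ == "__main__":
    print(len(MODELS))  # 15 * 8 pairs
```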
For short-range repression, we used three continuous functions (Linear-Q2, Logistic-Q3 and Gaussian Decay-Q4) to describe the change in repressor activity, i.e. the percentage of time that the repressor is actively repressing (or quenching) an adjacently bound activator, as a function of the distance, d, in base pairs, from the repressor binding site to the activator binding site.
(1) Linear: $f(d) = a + bd$
(2) Logistic Decay: $f(d) = \dfrac{2a}{1 + e^{d/b}}$
(3) Gaussian Decay: $f(d) = a\,e^{-d^{2}/b}$
When implemented as quenching functions, a is fixed at 1 and b > 0 is a model parameter. For cooperativity functions, a and b are both model parameters. An alternative approach involved 'binning' the distances between activators and repressors and fitting a quenching parameter (Q) for each bin. We also used the non-monotonic 'quenching' function (Q1) derived from our analysis of short-range repression by the Giant protein in synthetic enhancer constructs (Hansen et al., 2003; Fakhouri et al., 2010; Suleimenov et al., 2013).
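The three continuous distance functions can be sketched in a few lines. This sketch assumes the exponents read as e^(d/b) for the logistic form and e^(-d²/b) for the Gaussian form; the function names are hypothetical and the parameter values in the usage lines are arbitrary, not fitted values.

```python
import math


def linear(d, a, b):
    """Linear form: f(d) = a + b*d."""
    return a + b * d


def logistic_decay(d, a, b):
    """Logistic decay: f(d) = 2a / (1 + exp(d/b)); equals a at d = 0."""
    return 2.0 * a / (1.0 + math.exp(d / b))


def gaussian_decay(d, a, b):
    """Gaussian decay: f(d) = a * exp(-d^2 / b); equals a at d = 0."""
    return a * math.exp(-d * d / b)


if __name__ == "__main__":
    # Quenching use: a fixed at 1, b > 0 (arbitrary illustrative values of b).
    for d in (0, 25, 50, 100):
        print(d, logistic_decay(d, 1.0, 20.0), gaussian_decay(d, 1.0, 2000.0))
```

With a = 1, both decay forms start at full quenching for d = 0 and fall toward zero as the distance grows, with b setting how quickly.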
The binned quenching schemes are described as follows. Distances between binding sites were measured from the center of each binding site. Given the positions of the sites in the wild-type rho enhancer sequence, the smallest center-to-center distance that actually occurs between a Snail site and a Twist or Dorsal site is 11 bp.
Scheme Q5: q1: 1–25 bp, q2: 26–50 bp, q3: 51–75 bp, q4: 76–100 bp
Scheme Q6: q1: 1–35 bp, q2: 36–70 bp, q3: 71–105 bp, q4: 106–140 bp
Scheme Q7: q1: 1–45 bp, q2: 46–90 bp, q3: 91–135 bp, q4: 136–180 bp
Scheme Q8: q1: 1–10 bp, q2: 11–20 bp… q9: 81–90 bp, q10: 91–100 bp
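Because every binned quenching scheme uses equal-width bins, the lookup from a distance to its bin reduces to integer arithmetic. The sketch below assumes integer distances in bp and, as a labeled assumption not stated in the text, treats distances beyond the last bin as unquenched (value 0.0). The function names and the example parameter values are hypothetical.

```python
# (bin width in bp, number of bins) for each binned quenching scheme listed above.
QUENCH_BINS = {"Q5": (25, 4), "Q6": (35, 4), "Q7": (45, 4), "Q8": (10, 10)}


def quench_bin(scheme, d):
    """Return the 0-based bin index for distance d (bp), or None past the last bin."""
    width, n_bins = QUENCH_BINS[scheme]
    if d < 1:
        raise ValueError("distance must be at least 1 bp")
    idx = (d - 1) // width  # bins start at 1 bp, so shift before dividing
    return idx if idx < n_bins else None


def quench_value(scheme, d, q_params):
    """Look up the per-bin quenching parameter.

    Assumption: no quenching (0.0) when d lies beyond the last bin.
    """
    idx = quench_bin(scheme, d)
    return 0.0 if idx is None else q_params[idx]


if __name__ == "__main__":
    q5_params = [0.9, 0.6, 0.3, 0.1]  # hypothetical values for q1..q4
    for d in (11, 25, 26, 100, 101):
        print(d, quench_bin("Q5", d), quench_value("Q5", d, q5_params))
```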
For cooperativity functions, we use the same functions as above (1–3) to describe the multiplicative effect of cooperative binding between two adjacently bound activators, as a function of the distance, d, in base pairs, between the activator binding sites. When implemented as cooperativity functions, a>0 and b>0 are both model parameters.
We considered two different ways of estimating cooperativity between transcription factors: heterotypic (between Dorsal and Twist) and homotypic (Dorsal-Dorsal, Twist-Twist, or Snail-Snail). We tested three different continuous functions (Linear-C1, Logistic-C2 and Gaussian Decay-C3), which were parameterized with a single pair of parameters for all homotypic interactions, and separate values for Dorsal-Twist cooperativity. Additional models with 'binned' distances were also considered. For each of the binned schemes, we used a simpler form in which all homotypic interactions are parameterized with the same values, and a more complex form where each type of protein interaction for a given bin size receives distinct parameters. Each of these schemes therefore generates two model forms – binned and protein-binned respectively.
Schemes C4 and C10: c1: 1–25 bp, c2: >25 bp
Schemes C5 and C11: c1: 1–50 bp, c2: >50 bp
Schemes C6 and C12: c1: 1–75 bp, c2: >75 bp
Schemes C7 and C13: c1: 1–50 bp, c2: 51–100 bp, c3: >100 bp
Schemes C8 and C14: c1: 1–60 bp, c2: 61–120 bp, c3: >120 bp
Schemes C9 and C15: c1: 1–70 bp, c2: 71–140 bp, c3: >140 bp
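The binned cooperativity schemes differ from the quenching bins in that the last bin is open-ended. A minimal sketch of the lookup follows, taking the last bin to begin immediately after the upper edge of the previous one. The helper `param_key` is hypothetical: it only illustrates the difference between sharing one parameter set across all homotypic pairs and giving each protein pair its own set, and does not reflect how the study's code stores parameters.

```python
from bisect import bisect_left

# Upper edges (bp) of the closed bins; the final bin is everything above the last edge.
COOP_EDGES = {
    "C4": [25], "C10": [25],
    "C5": [50], "C11": [50],
    "C6": [75], "C12": [75],
    "C7": [50, 100], "C13": [50, 100],
    "C8": [60, 120], "C14": [60, 120],
    "C9": [70, 140], "C15": [70, 140],
}


def coop_bin(scheme, d):
    """Return the 0-based bin index (c1 -> 0, c2 -> 1, ...) for distance d in bp."""
    return bisect_left(COOP_EDGES[scheme], d)


def param_key(tf_a, tf_b, bin_idx, protein_specific):
    """Hypothetical key selecting which fitted parameter applies to a pair of TFs.

    protein_specific=False: all homotypic pairs share one parameter per bin.
    protein_specific=True: each homotypic pair type gets its own parameter per bin.
    Heterotypic (Dorsal-Twist) pairs always get a separate parameter.
    """
    if tf_a != tf_b:
        kind = "heterotypic"
    elif protein_specific:
        kind = f"{tf_a}-{tf_b}"
    else:
        kind = "homotypic"
    return (kind, bin_idx)


if __name__ == "__main__":
    for d in (50, 51, 100, 101):
        print(d, coop_bin("C7", d))
    print(param_key("Twist", "Twist", coop_bin("C7", 80), protein_specific=True))
```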
For a summary of parameters in each model, see Table 1.