Motif design

Ludovica Vanzan
Hadrien Soldati
Victor Ythier
Santosh Anand
Simon M. G. Braun
Nicole Francis
Rabih Murr

WT TF motifs were chosen according to the following criteria: (1) when available, motifs identified from ChIP-Seq data were selected; (2) when no such data were available, Position Frequency Matrices (PFMs) were obtained from the JASPAR Core Vertebrate 2016 database [21] or, alternatively, from the UniProbe [22] or TRANSFAC [23] databases. WT motifs were chosen mainly as the consensus sequence found in the JASPAR database. To minimize cross-matching between motifs, we checked that the WT core motifs (e.g. GAATGTTTGTTT) and the combination “restriction site–barcode–motif–barcode–restriction site” (e.g. catgtaGCATGCtgagaaGAATGTTTGTTTtgagaaGCTAGCcatgta) did not match any JASPAR motif other than the intended one. This was done using the countPWM() function of the R Biostrings package with min.score = "90%".

Scrambled (Sc) motifs were created by randomly shuffling the WT motif, except for the CG dinucleotides. The number and position of CG dinucleotides were maintained between WT and Sc motifs. For example, WT: CCGTAGTCGA and Sc: TCGAGCAGTC. Sc motif–barcode combinations were checked for cross-matching with other JASPAR motifs in the same way as the WT sequences.

To ascertain how closely the WT or Sc sequences match the respective motif’s PFM, a “normalized score” was defined. At each position in the WT or Sc sequence, the probability of the corresponding nucleotide in the PFM was taken as the match score for that position. The average of the match scores over all positions was defined as the “normalized score”. Normalized scores of WT sequences were high (>0.7), and only those Sc motifs whose normalized score was at least 0.3 lower than that of the corresponding WT motif were used.
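The scrambling rule and the normalized score described above can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the rejection-sampling strategy, and the representation of a PFM as a list of per-position probability dictionaries are all assumptions made for illustration. The sketch keeps every CG dinucleotide at its original position, shuffles the remaining bases, and discards shuffles that create additional CG dinucleotides.

```python
import random


def cg_positions(seq):
    """Return the start index of every CG dinucleotide in seq."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def scramble_keep_cg(wt, rng, max_tries=10000):
    """Shuffle the non-CG bases of wt while keeping CG dinucleotides in place.

    Hypothetical helper: candidates are drawn by rejection sampling until the
    shuffled sequence differs from wt and has exactly the same CG positions.
    """
    wt = wt.upper()
    cg = cg_positions(wt)
    locked = {j for i in cg for j in (i, i + 1)}
    free = [i for i in range(len(wt)) if i not in locked]
    for _ in range(max_tries):
        bases = [wt[i] for i in free]
        rng.shuffle(bases)
        out = list(wt)
        for i, base in zip(free, bases):
            out[i] = base
        candidate = "".join(out)
        if candidate != wt and cg_positions(candidate) == cg:
            return candidate
    raise ValueError("no valid scramble found")


def normalized_score(seq, pfm):
    """Average, over all positions, of the PFM probability of the observed base.

    pfm is assumed to be a list of dicts, one per position, mapping each of
    A, C, G, T to its probability at that position.
    """
    if len(seq) != len(pfm):
        raise ValueError("sequence and PFM must have the same length")
    return sum(col[base] for base, col in zip(seq.upper(), pfm)) / len(seq)


def sc_passes_filter(wt_score, sc_score, min_drop=0.3):
    """True if the Sc score is at least min_drop lower than the WT score."""
    return wt_score - sc_score >= min_drop
```

A call such as `scramble_keep_cg("CCGTAGTCGA", random.Random(0))` returns one shuffled candidate; in practice each candidate would still need the cross-matching check and the normalized-score filter described in the text.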

To confirm the specificity of the chosen motifs and to verify that adding the barcode and the restriction sites used for insertion did not create any unwanted TFBSs, each motif–barcode–restriction site combination was screened for predicted TFBSs using HOMER tools. The non-redundant TF list from the JASPAR 2018 database was used, and the log-odds score threshold was set to 6; motifs scoring above this threshold were considered significant. The TFBSs predicted in WT sequences were then normalized to those found in scrambled sequences.
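To make the thresholding step concrete, the following Python sketch scores every window of a construct against a position probability matrix and keeps windows whose log-odds score exceeds 6. It is an illustration only and does not call HOMER: the function names, the uniform background of 0.25, the use of the natural logarithm, the probability floor, and the forward-strand-only scan are assumptions.

```python
import math


def log_odds(window, pfm, background=0.25, floor=0.001):
    """Sum over positions of ln(p(base) / background).

    pfm is assumed to be a list of dicts mapping A, C, G, T to probabilities.
    Probabilities are floored (an assumption) to avoid taking the log of zero.
    """
    return sum(
        math.log(max(col[base], floor) / background)
        for base, col in zip(window, pfm)
    )


def scan(seq, pfm, threshold=6.0):
    """Return (position, score) for each forward-strand window above threshold.

    The sequence is assumed to contain only A, C, G and T (any case).
    """
    seq = seq.upper()
    width = len(pfm)
    hits = []
    for i in range(len(seq) - width + 1):
        score = log_odds(seq[i:i + width], pfm)
        if score > threshold:
            hits.append((i, score))
    return hits
```

The same scan could be run on a WT construct and on its scrambled counterpart, so that the predicted sites in the two can be compared.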
