High-throughput CRISPR-Cas9 splicing reporter system

Wilson Louie
Max W. Shen
Zakir Tahiry
Sophia Zhang
Daniel Worstell
Christopher A. Cassa
Richard I. Sherwood
David K. Gifford

To model exon splicing changes representative of the human genome, we curated human intron-exon junction sequences for the lib-SA target library. We selected exons that contain at least one pathogenic indel in the Human Gene Mutation Database (HGMD, professional release 2018.1) [31] and that meet the following criteria: a basal frameshift rate of 66% or more, which is likely to disrupt normal protein function; a length evenly divisible by 3, so that the reading frame is preserved when the exon is skipped; non-constitutive status, defined as presence in less than 100% of Ensembl (release 92) transcripts [32]; and no annotated protein domain in Pfam (release 31) [33]. This yielded 6,805 intron-exon sequences. Sequences without a suitable gRNA for effective targeting (described in the next section) were filtered out, giving the final lib-SA target library of 1,927 human intron-exon sequences (S1 Table).
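The selection criteria above can be sketched as a simple filter. This is only an illustration under stated assumptions: the `ExonRecord` fields and function names are hypothetical, and the per-exon annotations (HGMD indel counts, basal frameshift rate, Ensembl transcript fraction, Pfam overlap) are assumed to have been computed beforehand. It is not the authors' actual pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExonRecord:
    """Hypothetical per-exon annotation record (field names are assumptions)."""
    exon_id: str
    length_bp: int
    n_pathogenic_indels: int      # pathogenic indels listed in HGMD
    basal_frameshift_rate: float  # fraction in [0, 1]
    transcript_fraction: float    # fraction of Ensembl transcripts containing the exon
    has_pfam_domain: bool         # overlaps an annotated Pfam domain


def passes_exon_filters(exon: ExonRecord) -> bool:
    """Apply the four criteria from the text plus the HGMD indel requirement."""
    return (
        exon.n_pathogenic_indels >= 1
        and exon.basal_frameshift_rate >= 0.66  # 66% or more
        and exon.length_bp % 3 == 0             # skipping preserves the reading frame
        and exon.transcript_fraction < 1.0      # not constitutive
        and not exon.has_pfam_domain
    )


def select_exons(exons):
    """Return the exons that satisfy every criterion, in input order."""
    return [e for e in exons if passes_exon_filters(e)]
```

Each criterion is a separate boolean clause, so a threshold such as the 66% frameshift cutoff can be changed without touching the others.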

For each candidate intron-exon sequence, candidate gRNAs were identified by requiring a suitable CRISPR-Cas9 cut site, defined by the existence of an NGG PAM sequence in a 6-bp window surrounding and including the AG splice acceptor motif (or the intron-exon boundary if the splice site was not a canonical AG). Azimuth [41] and BOTM (see S2 Text) were then used to retain only gRNAs predicted to have high on-target editing efficiency (Azimuth score above 0.2 and BOTM score above 0.65).
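A minimal sketch of the PAM search and efficiency filter described above follows. Several details are assumptions made for illustration and are not specified in the text: the 6-bp window is taken to be the AG dinucleotide plus 2 bp on each side, a PAM counts only if all three of its nucleotides lie inside the window, and a reverse-strand PAM is detected as `CCN` on the forward strand. The function names are hypothetical, and the Azimuth and BOTM scores are assumed to be precomputed by those tools.

```python
def pam_sites_in_window(seq: str, ag_index: int, window: int = 6):
    """Find NGG PAMs (either strand) lying fully inside a window around the AG.

    seq      -- intron-exon sequence, written 5'->3' on the forward strand
    ag_index -- 0-based index of the 'A' in the acceptor AG (or of the boundary)
    Returns a list of (start_index, strand) tuples.
    """
    seq = seq.upper()
    flank = (window - 2) // 2                   # assumed symmetric flanks around AG
    start = max(0, ag_index - flank)
    end = min(len(seq), ag_index + 2 + flank)   # exclusive end of the window
    hits = []
    for i in range(start, end - 2):             # triplet seq[i:i+3] must fit in window
        triplet = seq[i:i + 3]
        if triplet[1:] == "GG":                 # NGG on the forward strand
            hits.append((i, "+"))
        if triplet[:2] == "CC":                 # CCN = NGG on the reverse strand
            hits.append((i, "-"))
    return hits


def passes_efficiency_filter(azimuth_score: float, botm_score: float) -> bool:
    """Thresholds from the text: Azimuth above 0.2 and BOTM above 0.65."""
    return azimuth_score > 0.2 and botm_score > 0.65


# Toy usage: the acceptor AG starts at index 7 and is followed by a G,
# so the window 'TCAGGT' contains the forward-strand PAM 'AGG'.
example_hits = pam_sites_in_window("ACGTTTCAGGTACG", ag_index=7)
```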

This candidate set of gRNAs was then heuristically reduced using inDelphi [22] and MaxEntScan [29]. inDelphi, initialized with its mESC models, was used to predict the frequency distribution of repair genotypes for each gRNA. As per inDelphi’s default settings, only predictions for 1-bp insertions and 1-bp to 60-bp microhomology deletions were considered. For each predicted repair genotype, MaxEntScan’s score3ss module was used to estimate the strength of the splice acceptor motif. A genotype was classified as motif-disrupting if its MaxEntScan score was less than 0.9 and as having no effect otherwise, following a previously studied classification rule [43]. The summed inDelphi-predicted frequency of all motif-disrupting repair genotypes was taken as the predicted frequency of splice site disruption. The 4,000 gRNAs with the highest predicted disruption frequencies, together with their 1,927 associated intron-exon sequences, were selected for our library (S1 Table).
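The scoring and ranking step above can be illustrated with a short sketch. It assumes that, for each gRNA, the inDelphi genotype frequencies and the matching MaxEntScan score3ss scores have already been computed and are supplied as `(frequency, score)` pairs with frequencies expressed as fractions. Neither tool is called here, and the function names are hypothetical.

```python
DISRUPTION_THRESHOLD = 0.9  # MaxEntScan score below this counts as motif-disrupting


def predicted_disruption_frequency(genotypes, threshold=DISRUPTION_THRESHOLD):
    """Sum the predicted frequencies of genotypes whose score is below threshold.

    genotypes -- iterable of (predicted_frequency, maxentscan_score) pairs
    """
    return sum(freq for freq, score in genotypes if score < threshold)


def top_grnas(per_grna_genotypes, k):
    """Rank gRNAs by predicted disruption frequency and keep the top k.

    per_grna_genotypes -- dict mapping a gRNA identifier to its genotype pairs
    Returns a list of (grna_id, disruption_frequency), highest first.
    """
    scored = {
        grna_id: predicted_disruption_frequency(genotypes)
        for grna_id, genotypes in per_grna_genotypes.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)[:k]


# Toy data: gRNA "g1" has two disrupting genotypes (scores -2.0 and 0.3).
toy = {
    "g1": [(0.5, -2.0), (0.25, 5.1), (0.25, 0.3)],
    "g2": [(0.25, 0.1), (0.75, 8.0)],
}
ranking = top_grnas(toy, k=1)
```

In the library design described above, `k` would correspond to 4,000.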

lib-SA consists of a CAGGS (CAG) promoter [44] driving a fixed Exon A with a strong splice donor site, an intron, a variable Exon B, a fixed Exon C, and a polyA sequence (S1 Fig). The 61-bp intron-exon sequences of lib-SA span the intron-Exon B boundary, with 37 nt of intronic sequence and 24 nt of exonic sequence. A highly diverse 15-bp barcode is embedded in Exon C. A U6 promoter on the same DNA molecule drives the corresponding SpCas9 gRNA spacer. The nucleotide-resolution description of this construct is provided in S1 Text.
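As a small illustration of the target layout, the sketch below splits a 61-bp library sequence into its 37-nt intronic and 24-nt exonic parts and reports whether the intron ends in a canonical AG acceptor. The function name and the example sequence are hypothetical; the full construct is specified in S1 Text, not here.

```python
INTRON_NT = 37  # intronic nucleotides upstream of the intron-Exon B boundary
EXON_NT = 24    # exonic nucleotides downstream of the boundary


def split_target(seq: str):
    """Split a 61-bp target into (intron, exon, has_canonical_ag)."""
    if len(seq) != INTRON_NT + EXON_NT:
        raise ValueError(f"expected {INTRON_NT + EXON_NT} bp, got {len(seq)}")
    intron, exon = seq[:INTRON_NT], seq[INTRON_NT:]
    # A canonical acceptor has AG as the last two intronic nucleotides.
    has_canonical_ag = intron[-2:].upper() == "AG"
    return intron, exon, has_canonical_ag


# Toy sequence: 35 T's, the acceptor AG, then 24 C's of "exon".
toy_intron, toy_exon, toy_is_canonical = split_target("T" * 35 + "AG" + "C" * 24)
```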

We used a library cloning and cell culture procedure similar to that of the inDelphi study [22]. lib-SA was constructed through a multistep process and cloned into a plasmid backbone allowing Tol2 transposon-based integration into the genome (S3 Text). The library was then integrated into the genomes of mESCs by Lipofectamine 3000 transfection together with equimolar Tol2 transposase, followed by one week of hygromycin selection to ensure genomic integration.

One week after library integration, the mESCs were transfected with p2T CAG Cas9 BlastR (Addgene 107190) and Tol2 transposase using Lipofectamine 3000, followed by one week of blasticidin selection to maximize Cas9 activity. After one week, genomic DNA (gDNA) (PureLink Genomic DNA Mini Kit) and RNA (Qiagen RNeasy Maxi Kit) were extracted from separate aliquots of each replicate culture. gDNA and RNA from cells prior to Cas9 treatment were also extracted as controls. Samples were prepared for Illumina NextSeq sequencing using PCR-based methods (S3 Text), and paired-end high-throughput sequencing (Illumina NextSeq 2 x 75-nt kit) was then performed on the gDNA and RNA samples (see S1 Fig for primer locations, and S3 Text). Technical replicate sequencing was performed on samples from Cas9-exposed cells.
