METHOD DETAILS

SungHee Park
Mattia Brugiolo
Martin Akerman
Shipra Das
Laura Urbanski
Adam Geier
Anil K. Kesarwani
Martin Fan
Nathan Leclair
Kuan-Ting Lin
Leo Hu
Ian Hua
Joshy George
Senthil K. Muthuswamy
Adrian R. Krainer
Olga Anczuków

T7-tagged SRSF2, SRSF3, SRSF4, SRSF6, SRSF9, HNRNPA1 and TRA2β cDNAs were subcloned from pcGT plasmids (Cáceres et al., 1997) into a pWZL-Hygro retroviral vector (a gift from S. Lowe, Cold Spring Harbor Laboratory) as described (Anczuków et al., 2012). pWZL-T7-SRSF1 (Anczuków et al., 2012) and pBabe-Puro-MYC.ER (Eilers et al., 1989) were described previously. SF-targeting shRNAs were designed using DSIR and sensor rules (Table S7A), and were cloned into TRMPV-Neo retroviral doxycycline-inducible shRNA expression vectors as described (Zuber et al., 2011). All vectors and inserts were verified by Sanger sequencing.

Populations of MCF-10A cells expressing either T7-tagged SRSF1, SRSF2, SRSF3, SRSF4, SRSF6, SRSF9, HNRNPA1 or TRA2β cDNA, with or without MYC.ER overexpression, were generated by retroviral transduction and selection with 2 μg/ml puromycin or 200 μg/ml hygromycin as described (Anczuków et al., 2012). Populations of MDA-MB231-luciferase-GFP or SUM-159PT cells expressing rtTA3-puro and SF-shRNA-TRMPV-Neo were generated by retroviral transduction and selection with 2 μg/ml puromycin or 1 mg/ml G418 as described (Zuber et al., 2011).

MCF-10A or MDA-MB231 stable cell lines were seeded on an 8-well glass chamber slide coated with Matrigel Growth Factor Reduced (BD Biosciences) as described (Debnath et al., 2003a) at a density of 5,000 cells per well and maintained in their respective media. At least 100 acini or structures were imaged at the indicated time points using a Zeiss Axiovert 200M microscope and AxioVision 4.5 software (Zeiss). MYC.ER acini were stimulated with 1 μM 4-hydroxy-tamoxifen (Sigma) on day 3, and the medium with inducer was replaced every 3 days, as described (Zhan et al., 2008). For shRNA induction, MDA-MB231 or SUM159PT cell media was supplemented with 1-2 mg ml−1 doxycycline at the indicated time points, and the media was replaced every 3 days. For high-content imaging of 3D structures, 6,000 MDA-MB231 or 8,000 SUM159PT cells were plated per well in 48-well plates, and fluorescent images were acquired using a high-content imaging Opera Phenix instrument (Perkin Elmer) and assembled using the Harmony software (Perkin Elmer) as a maximum projection image composed of 30-35 Z stack images taken every 55 μm. For downstream RNA or protein extraction, 3D cultured MCF-10A or MDA-MB231 cells were washed with PBS, and Matrigel was dissolved by incubating slides at 4°C in Cell Recovery Solution (BD).

All procedures were performed as described (Anczuków et al., 2012). Microscopy was performed on a Zeiss Axiovert 200M instrument using the ApoTome imaging system (Zeiss) or the Dragonfly Spinning Disk system (Andor). T7 (CSHL antibody facility), cleaved caspase-3 (Cell Signaling) and Ki67 (Zymed) primary antibodies were used at 1/50, 1/100 and 1/100 dilutions, respectively. Alexa Fluor 568 anti-mouse and 488 anti-rabbit secondary antibodies (Invitrogen) were used at 1/500 dilution. An acinus was scored positive for Ki67 when at least five of its cells were stained, and positive for cleaved caspase-3 when at least one cell in its lumen was stained.

3D cultured MCF-10A cells were harvested as described above and RNA was extracted using an RNeasy kit (QIAGEN) including DNase I treatment. 1 μg of RNA was reverse-transcribed with Superscript III reverse transcriptase (Invitrogen). qPCR was used to amplify endogenous transcripts with the SF-specific primers listed in Table S7B using cDNA corresponding to 5 ng of RNA. qPCR was performed with iTaq Universal SYBR green Supermix (Bio-Rad) in 384-well plates (Life Technologies) using a ViiA7 Real-Time PCR system (Life Technologies) per the manufacturer's instructions. Results were analyzed with QuantStudio Real-Time PCR software, and SF expression was normalized to the housekeeping genes GAPDH and HPRT.

Cells were lysed in Laemmli buffer (50 mM Tris-HCl pH 6.2, 5% (v/v) β-mercaptoethanol, 10% (v/v) glycerol, 3% (w/v) SDS). Equal amounts of total protein were loaded on a 12% SDS-polyacrylamide gel, transferred onto a nitrocellulose membrane (Millipore) and blocked in 5% (w/v) milk in Tween 20-TBST (50 mM Tris pH 7.5, 150 mM NaCl, 0.05% (v/v) Tween 20). Blots were incubated with anti-T7 (EMD Millipore #69522-3), anti-SRSF4 (Bethyl Laboratories #A303-670A), anti-SRSF6 (CSHL antibody facility AK9-156), anti-SRSF9 (CSHL antibody facility #AK251-24), anti-TRA2B (Abcam #ab31353), or anti-β Tubulin III (GenScript #A01203-40) primary antibodies. IR-Dye 680 anti-mouse or IR-Dye 800 anti-rabbit immunoglobulin G (IgG) secondary antibodies (LI-COR) were used for infrared detection and quantification with an Odyssey imaging system (LI-COR).

3D cultured MCF-10A or MDA-MB231 cells were harvested as described above and total RNA was extracted using an RNeasy kit (QIAGEN) including DNase I treatment. RNA libraries were prepared and barcoded using a TruSeq stranded mRNA kit with polyA selection (Illumina), and quantified using a Bioanalyzer DNA 1000 chip (Agilent). Equal amounts of libraries were pooled (3 libraries per lane) and sequenced as 101 bp (MCF-10A) or 150 bp (MDA-MB231) paired-end reads on an Illumina HiSeq instrument at >80-100 million reads per library. At least 3 independent libraries were generated for each experimental condition.

MCF-10A, MDA-MB231 or SUM159PT cells were harvested as described above and RNA was extracted using an RNeasy kit (QIAGEN) including DNase I treatment. 1 μg of total RNA was reverse-transcribed with Superscript III reverse transcriptase (Invitrogen). Semiquantitative PCR was used to amplify endogenous transcripts with the primers listed in Table S7C with cDNA from 5-20 ng of RNA. Optimal PCR conditions were defined for each primer pair by testing amplification from 26-30 cycles to select semiquantitative conditions. PCR products were separated on a 2% agarose gel stained with SYBRSafe (Invitrogen), and bands were quantified with a ChemiDoc MP Imaging System (Bio-Rad). The ratio of each isoform was first normalized to the sum of the different isoforms, and changes were then expressed as the fold increase compared to the levels obtained for cells or acini expressing the control vector.
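The two-step normalization described above (isoform fraction, then fold change relative to the control vector) can be sketched as follows. This is a minimal illustration only: the function names and band intensities are hypothetical and are not taken from the original analysis.

```python
def isoform_fractions(band_intensities):
    """Normalize each isoform band to the sum of all isoform bands in the lane."""
    total = sum(band_intensities.values())
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return {name: value / total for name, value in band_intensities.items()}


def fold_change_vs_control(sample, control):
    """Fold change of each isoform fraction relative to the control-vector lane."""
    sample_frac = isoform_fractions(sample)
    control_frac = isoform_fractions(control)
    return {name: sample_frac[name] / control_frac[name] for name in sample_frac}


# Hypothetical band intensities (arbitrary units) for a two-isoform event.
control_lane = {"inclusion": 60.0, "skipping": 40.0}
sf_lane = {"inclusion": 30.0, "skipping": 70.0}
fold = fold_change_vs_control(sf_lane, control_lane)
print(fold)
```

In this toy example the inclusion isoform drops from 60% to 30% of the total signal, i.e., a 0.5-fold change, whereas the skipping isoform rises from 40% to 70%.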

250,000 MCF-10A or 100,000 MDA-MB231 cells were grown in 6-well plates for 48 hours. For shRNA induction, the media was supplemented with 2 μg ml−1 doxycycline 24 hours before the start of the experiment and maintained until the end. A scratch was made in the confluent monolayer with a P200 tip and the wells were washed twice with media. The same field was imaged with a Zeiss Observer microscope at 0 h and 16 h after induction of the wound. The size of the gap was measured using the Axiovision digital image processing software (Zeiss).

150,000 MCF-10A or 25,000 MDA-MB231 cells in serum-free media were seeded on top of an 8-μm PET membrane transwell (BD Biosciences) in a 24-well format and allowed to migrate into the lower compartment containing complete media with serum and growth factors for 16 hours (MCF-10A) or 4 hours (MDA-MB231). For shRNA induction, the media was supplemented with 2 μg ml−1 doxycycline 24 hours before the start of the experiment and maintained until the end. After removal of the cells on top of the filter, the remaining cells were fixed with 5% formalin, permeabilized with 0.5% Triton X-100 and stained with DAPI. DAPI-positive nuclei were imaged using a Zeiss Observer microscope, and counted using the ImageJ digital image processing software.

Collagen-Matrigel invasion assays were performed as described (Xiang and Muthuswamy, 2006). Briefly, 5,000 MCF-10A cells were seeded on a 1:2 mix of collagen:Matrigel in an 8-well glass chamber slide. Media was replaced every 4 days, and acini were imaged at day 8 and day 16 using a Zeiss Observer microscope.

Cell proliferation assays were performed with an ATCC MTT Assay kit (ATCC). Briefly, 2,000 MDA-MB231 cells were plated into 96-well plates in triplicate in cell growth media with or without doxycycline. Cells were grown for 1-4 days and stained with the MTT reagent as recommended by the manufacturer. Absorbance was read on a SpectraMax M plate reader.

Animal experiments were carried out in the Cold Spring Harbor Laboratory Animal Shared Resource in accordance with Cold Spring Harbor Laboratory Institutional Animal Care and Use Committee-approved procedures. 0.5-1 × 10^6 MDA-MB231-luciferase-GFP control or TRA2β shRNA inducible cells were injected either into the tail vein or into the mammary fat pad of 8-week-old NOD.Cg-Prkdcscid Il2rgtm1Wjl/SzJ (The Jackson Laboratory #5557) female mice. For shRNA induction, half of the mice were treated with doxycycline in both drinking water (1.5 mg ml–1 with 2% sucrose; RPI Corporation and Sigma-Aldrich) and food (625 mg kg–1, Harlan Laboratories). Doxycycline water was replaced every 3 days. Whole-body bioluminescent imaging was performed using an IVIS100 system (Caliper LifeSciences) as described (Zuber et al., 2011). Briefly, primary tumor growth and metastasis formation were monitored every 3-4 days by bioluminescence imaging following intraperitoneal injection of D-Luciferin (25 mg kg–1, Goldbio). Animal weight and primary tumor size were recorded bi-weekly. Animals were euthanized 50-60 days post-injection by cardiac perfusion of a 4% paraformaldehyde (PFA) solution. Primary tumors were collected and flash-frozen for RNA and protein extraction, or fixed in 4% PFA, washed with PBS, embedded in paraffin, sectioned, and stained with hematoxylin & eosin. Following a detailed necropsy, lung and liver tissues were fixed in 4% PFA, cryo-protected in a sucrose gradient, embedded in OCT solution (Tissue-Tek) and frozen. Serial transverse sections of lung and liver were then cut through the whole tissue, every 2 mm, re-embedded horizontally to obtain serial sections of the entire organ, and frozen in fresh OCT. Lung and liver sections were stained with hematoxylin & eosin and slides were imaged using an Aperio slide scanner (Leica). Micro- and macro-metastasis areas were then quantified relative to the whole organ area using an Aperio Scanscope (Leica).

We used the SpliceCore® software platform (https://www.envisagenics.com/platform/) to identify cancer-related differential splicing events (DSEs). SpliceCore is operated through a user interface built on the Microsoft Azure cloud to facilitate data trafficking, storage, HIPAA compliance and compute scalability. To identify DSE changes co-occurring in cancer, SpliceCore utilizes a reference database called TXdb™, which incorporates over 5M exon-trio models (Wu et al., 2011) derived from the entire TCGA database, including ~1.5K breast cancer datasets. To prioritize reproducible DSEs, we complemented SpliceCore with additional in-house data analysis. Our in-house pipeline implemented STAR (v.2.5.1b) (Dobin et al., 2013), Cufflinks Suite (v.2.2.1) (Trapnell et al., 2012) and rMATS (v.3.2.5) (Shen et al., 2014). Paired-end reads were preprocessed for trimming of low-quality regions by Trimmomatic (v.0.36) (Bolger et al., 2014) and mapped to the human reference genome using STAR in 2-pass mode with the Gencode GRCh37 v.25 reference transcript annotation. To include novel exons and introns in the analysis, we also performed an annotation-guided transcriptome reconstruction and merged the resulting transcriptome from each sample into a comprehensive transcript annotation (GTF) with Cufflinks and Cuffmerge (Cufflinks suite v.2.2.1) (Trapnell et al., 2012) using the “--multi-read-correct” and “--library-type fr-firststrand” (strand-specific library for dUTP protocol) parameters. Each RNA-seq replicate was processed independently in SpliceCore and in-house to produce individual “percent spliced in” (PSI) scores. In this manner we increased the sensitivity of the analysis by ensuring that a larger number of AS events would be detected by at least one method. Next, we integrated both SpliceCore’s and the in-house individual PSI scores to compute DSEs as “differential percent spliced in” (ΔPSI) scores.
Global ΔPSI scores were estimated as the difference of the mean PSIs detected across RNA-seq replicates by at least one method: ΔPSI_global = mean(PSI_case) − mean(PSI_control). To eliminate inconsistent results, we performed a quality control whereby we estimated an individual ΔPSI for each “i”, where “i” runs over all case datasets analyzed by every method (i = 1, …, n). For each “i” we estimated the individual ΔPSI_i = PSI_i − mean(PSI_control). If the sign of the geometric mean of all individual ΔPSI_i did not equal sgn(ΔPSI_global), then the splicing event was called inconsistent and filtered out. For each analyzed dataset (i.e., TCGA, MCF-10A and MDA-MB231) we applied further filtering criteria explained below.
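The ΔPSI computation and consistency check can be illustrated with the short sketch below. It is not the SpliceCore or in-house implementation. In particular, because the exact form of the geometric-mean criterion cannot be recovered from the text, the sketch substitutes a simpler and stricter rule, stated plainly here as an assumption: an event is kept only if every individual ΔPSI_i has the same sign as ΔPSI_global. All function names and PSI values are hypothetical.

```python
from statistics import mean


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def delta_psi_global(psi_case, psi_control):
    """Difference between the mean PSI of case and control replicates."""
    return mean(psi_case) - mean(psi_control)


def individual_delta_psi(psi_case, psi_control):
    """One delta-PSI per case dataset, each relative to the control mean."""
    control_mean = mean(psi_control)
    return [psi_i - control_mean for psi_i in psi_case]


def is_consistent(psi_case, psi_control):
    """Assumed rule (stricter than the text): all individual signs match the global sign."""
    global_sign = sign(delta_psi_global(psi_case, psi_control))
    return all(sign(d) == global_sign
               for d in individual_delta_psi(psi_case, psi_control))


# Hypothetical PSI values for one splicing event.
control = [0.50, 0.55, 0.45]
concordant_case = [0.80, 0.75, 0.70]   # every replicate above the control mean
discordant_case = [0.80, 0.40, 0.75]   # one replicate moves the other way

print(delta_psi_global(concordant_case, control))
print(is_consistent(concordant_case, control), is_consistent(discordant_case, control))
```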

The RNA-seq data from TCGA breast tumors were retrieved and processed on the ISB Cancer Genomics Cloud Platform (https://isb-cgc.appspot.com/). Sample IDs are listed in Table S2A.

CNVs and expression changes in SFs were assessed by DNA- and RNA-seq, respectively, from a collection of human breast tumors from The Cancer Genome Atlas (TCGA) dataset (n = 960) (The Cancer Genome Atlas Network, 2012) as described previously (Cerami et al., 2012; Gao et al., 2013). Graphical Oncoprint representations were generated using cBioPortal (http://www.cbioportal.org/) for all breast tumors, as well as for samples annotated as triple negative breast tumors. Pre-computed Z-scores and clinical information, e.g., PAM50 signatures, were retrieved from cBioPortal for downstream analysis.

Human TCGA breast tumors (n = 960) (The Cancer Genome Atlas Network, 2012) were classified into SF-low (Z-score ≤ 0) or SF-high (Z-score ≥ 1.5) tumors according to their SF expression for SRSF1, SRSF2, SRSF3, SRSF4, SRSF5, SRSF6, SRSF7, SRSF8, SRSF9, SRSF10, SRSF11, SRSF12, TRA2β and HNRNPA1 using pre-computed Z-score values from cBioPortal (http://www.cbioportal.org/) (Cerami et al., 2012; Gao et al., 2013) (Table S2A). Splicing events were defined using both splice junction read counts and alternatively spliced exon body counts for each event to calculate a PSI score for each local event, as well as a ΔPSI in SF-low versus SF-high tumors using SpliceCore and in-house methods as described above. The in-house pipeline was implemented using the corresponding Docker images (https://www.docker.com/products/docker-hub) on a cloud instance. Significant DSEs were selected as follows: i) ∣ΔPSI∣ = ∣mean PSI_SF-high − mean PSI_SF-low∣ ≥ 0.1; ii) FDR ≤ 0.05; and iii) at least an average of 5 reads per dataset detected in both SF-low and SF-high tumors that support either exon skipping or inclusion, i.e., (inclusion count ≥ 5 in either control OR case) AND (skipping count ≥ 5 in either control OR case). To correct for missing data due to sequencing depth below 100 million reads per sample in the TCGA dataset, we focused only on DSEs detected in at least 10 tumors from both groups.
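The Z-score stratification into SF-low and SF-high tumors can be sketched as below. The cutoffs come from the text; the function names, sample IDs and Z-scores are hypothetical, and tumors with intermediate Z-scores are simply left unassigned in this sketch.

```python
def classify_tumor(z_score, low_cutoff=0.0, high_cutoff=1.5):
    """Return 'SF-low', 'SF-high', or None for tumors between the two cutoffs."""
    if z_score <= low_cutoff:
        return "SF-low"
    if z_score >= high_cutoff:
        return "SF-high"
    return None


def group_tumors(z_scores):
    """Split a {sample_id: z_score} mapping into SF-low and SF-high sample lists."""
    groups = {"SF-low": [], "SF-high": []}
    for sample_id, z in z_scores.items():
        label = classify_tumor(z)
        if label is not None:
            groups[label].append(sample_id)
    return groups


# Hypothetical sample IDs and Z-scores for one splicing factor.
example = {"tumor_1": -0.8, "tumor_2": 0.0, "tumor_3": 0.9,
           "tumor_4": 1.5, "tumor_5": 2.3}
groups = group_tumors(example)
print(groups)
```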

DSEs between two groups, such as control versus SF-OE/KD, were determined using SpliceCore and in-house methods as described above. Significant DSEs were selected as follows: i) ∣ΔPSI∣ = ∣mean PSI_SF-OE/KD − mean PSI_control∣ ≥ 0.1; ii) FDR ≤ 0.05; and iii) at least 5 reads (averaged across all biological replicates) detected in both the control and SF-OE/KD that support either exon skipping or inclusion, i.e., (inclusion count ≥ 5 in either control OR case) AND (skipping count ≥ 5 in either control OR case). For MDA-MB231 cells, the significant DSEs from TRA2βsh +DOX versus −DOX were discarded if they also appeared as significant and altered in the same direction in CTLsh +DOX versus −DOX cells.
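A minimal sketch of these selection criteria, including the removal of events that also respond to doxycycline in control shRNA cells, is given below. The `Event` record and its field names are hypothetical conveniences for illustration and do not reflect the output format of SpliceCore or rMATS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    event_id: str
    delta_psi: float          # mean PSI (case) minus mean PSI (control)
    fdr: float
    inclusion_control: float  # read counts averaged across replicates
    inclusion_case: float
    skipping_control: float
    skipping_case: float


def is_significant(e, min_dpsi=0.1, max_fdr=0.05, min_reads=5):
    """Apply the |delta-PSI|, FDR and read-support thresholds from the text."""
    enough_inclusion = max(e.inclusion_control, e.inclusion_case) >= min_reads
    enough_skipping = max(e.skipping_control, e.skipping_case) >= min_reads
    return (abs(e.delta_psi) >= min_dpsi and e.fdr <= max_fdr
            and enough_inclusion and enough_skipping)


def drop_control_shared(target_events, control_events):
    """Keep significant target DSEs unless the control shRNA comparison shows
    the same event as significant and changed in the same direction."""
    control_up = {e.event_id: e.delta_psi > 0
                  for e in control_events if is_significant(e)}
    return [e for e in target_events
            if is_significant(e) and control_up.get(e.event_id) != (e.delta_psi > 0)]


# Hypothetical events: TRA2B shRNA (+DOX vs -DOX) and control shRNA (+DOX vs -DOX).
tra2b = [Event("ev1", 0.25, 0.01, 20, 40, 30, 10),
         Event("ev2", -0.30, 0.02, 50, 15, 8, 40),
         Event("ev3", 0.05, 0.01, 20, 22, 30, 28)]
ctl = [Event("ev2", -0.20, 0.03, 45, 20, 10, 30)]
kept = [e.event_id for e in drop_control_shared(tra2b, ctl)]
print(kept)
```

Here `ev3` fails the ∣ΔPSI∣ threshold and `ev2` is discarded because it changes in the same direction in the control shRNA cells.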

To compare the overlap in DSEs between two studies, A and B, we performed a two-sided Fisher’s exact test in which: i) DA and DB are all the detected AS events in study A or study B, respectively; ii) SA and SB are the statistically significant DSEs detected in study A or study B, respectively (∣ΔPSI∣ ≥ 0.1; FDR ≤ 0.05; ≥ 5 reads averaged across all replicates); iii) the intersection of significant DSEs with ΔPSI in the same direction (i.e., either both ≥ 0.1 or both ≤ −0.1) is defined as iS = SA ∩ SB; and iv) the total splicing “universe” is defined by the union of splicing events as D = DA ∪ DB. Thus, the strength of overlap of DSEs in two studies is calculated using a Fisher’s exact test computed from the corresponding 2×2 contingency table, and is reported along with the odds ratio and P-value.
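The overlap test can be sketched in pure Python as follows. Because the contingency table itself is not reproduced in this text, the layout used here is an assumption: a = ∣iS∣, b = ∣SA∣ − a, c = ∣SB∣ − a and d = ∣D∣ − a − b − c. The two-sided P-value sums the hypergeometric probabilities of all tables that are no more likely than the observed one. Function names and event IDs are hypothetical.

```python
from math import comb


def overlap_table(d_a, d_b, s_a, s_b):
    """Build (a, b, c, d) from detected-event sets and {event_id: delta_psi} dicts.
    The table layout is an assumption of this sketch, not taken from the source."""
    universe = d_a | d_b
    shared = {e for e in s_a.keys() & s_b.keys() if (s_a[e] > 0) == (s_b[e] > 0)}
    a = len(shared)
    b = len(s_a) - a
    c = len(s_b) - a
    d = len(universe) - a - b - c
    return a, b, c, d


def fisher_exact_two_sided(a, b, c, d):
    """Return (odds_ratio, p_value) for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    p_value = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                  if p <= p_obs * (1 + 1e-7))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(p_value, 1.0)


# Hypothetical toy data: 8 detected events in each study.
detected = {f"e{i}" for i in range(1, 9)}
sig_a = {"e1": 0.20, "e2": 0.30, "e3": -0.20, "e4": 0.15}
sig_b = {"e1": 0.25, "e2": 0.20, "e3": -0.30, "e5": 0.20}
table = overlap_table(detected, detected, sig_a, sig_b)
odds_ratio, p_value = fisher_exact_two_sided(*table)
print(table, odds_ratio, round(p_value, 4))
```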

To compare the similarity in DSEs between two studies, the Jaccard similarity index was calculated as J = ∣SA ∩ SB∣ / ∣SA ∪ SB∣. The analysis was conducted either on splicing events (DSEs) or on genes that exhibit at least one DSE (DSGs). P-values were corrected using Bonferroni’s method.
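For illustration, the Jaccard index and a Bonferroni correction can be computed as in the sketch below. The function names and inputs are hypothetical, and returning 0 for two empty sets is a convention chosen for this sketch.

```python
def jaccard_index(s_a, s_b):
    """Size of the intersection divided by the size of the union of two sets."""
    union = s_a | s_b
    return len(s_a & s_b) / len(union) if union else 0.0


def bonferroni(p_values):
    """Multiply each P-value by the number of tests, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical sets of significant events (or genes) from two studies.
study_a = {"e1", "e2", "e3"}
study_b = {"e2", "e3", "e4"}
j = jaccard_index(study_a, study_b)
adjusted = bonferroni([0.01, 0.04, 0.5])
print(j, adjusted)
```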

Preprocessing of reads and mapping steps were performed as described for the differential splicing analysis. Using mapped files (BAM), differential expression of annotated genes (Gencode GRCh37) in control versus SF-OE/KD was determined with Cuffdiff (v.2.2.1) (Trapnell et al., 2012) using the “--multi-read-correct” and “--library-type fr-firststrand” (strand-specific library) parameters. Significant genes were selected as follows: i) FDR ≤ 0.05; ii) ∣log2 fold change∣ ≥ 0.3; and iii) test status = “OK.” For MDA-MB231 cells, genes with significant differential expression in TRA2βsh +DOX versus −DOX were discarded if they also appeared as significant and altered in the same direction in CTLsh +DOX versus −DOX cells.
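The gene-level filter can be sketched as follows. The column names `gene`, `status`, `log2(fold_change)` and `q_value` are assumed to match the tab-delimited Cuffdiff gene-level output and should be checked against the actual file; the gene names and values are hypothetical.

```python
import csv
import io


def significant_genes(diff_text, max_fdr=0.05, min_abs_log2fc=0.3):
    """Return gene names passing the status, FDR and fold-change thresholds."""
    reader = csv.DictReader(io.StringIO(diff_text), delimiter="\t")
    kept = []
    for row in reader:
        if row["status"] != "OK":
            continue
        if float(row["q_value"]) > max_fdr:
            continue
        if abs(float(row["log2(fold_change)"])) < min_abs_log2fc:
            continue
        kept.append(row["gene"])
    return kept


# Hypothetical rows using the assumed column names.
rows = [
    ["gene", "status", "log2(fold_change)", "q_value"],
    ["GENE_A", "OK", "1.20", "0.001"],
    ["GENE_B", "OK", "0.10", "0.001"],      # fold change too small
    ["GENE_C", "NOTEST", "2.00", "0.001"],  # test status not OK
    ["GENE_D", "OK", "-0.80", "0.200"],     # FDR too high
]
diff_text = "\n".join("\t".join(r) for r in rows)
selected = significant_genes(diff_text)
print(selected)
```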

Protein domains were annotated using the InterProScan Pfam predictive module (Zdobnov and Apweiler, 2001). SpliceCore’s exon trios were translated to protein in three different reading frames and the optimal one was chosen for domain annotation. Optimal reading frames were selected based on alignment score to known protein sequences expressed from the coding locus.

Genes associated with breast carcinoma based on GWAS, functional genomics, and literature data mining were retrieved from the Open Targets Platform (https://www.opentargets.org/).

Gene enrichment analysis was performed using a Bioconductor package, enrichR (Chen et al., 2013; Kuleshov et al., 2016) for i) genes that exhibit altered splicing (significant DSEs), ii) genes that exhibit differential splicing (DSGs), and iii) genes that exhibit changes in expression in SF-OE versus control MCF-10A cells or SF-KD versus control MDA-MB231 cells. We reported the top 100 enriched targets sorted by FDR.

MYC and TRA2β amplification status and expression were calculated using cBioPortal (http://www.cbioportal.org/) for human tumor RNA-seq data from The Cancer Genome Atlas Project as described (Cerami et al., 2012; Gao et al., 2013) or using microarray data from GEO GSE2109 (https://www.ncbi.nlm.nih.gov/projects/geo/query/acc.cgi?acc=GSE2109) as described (Anczuków et al., 2012). The association between MYC and TRA2β was computed using a two-tailed Fisher’s exact test and corrected for multiple testing.

MYC ChIP-seq datasets from MCF-7 (ENCSR000DMJ) and MCF-10A (ENCSR000DOS) cells were analyzed as described on https://www.encodeproject.org/ and visualized using http://genome.ucsc.edu. Plotted tracks are fold changes over controls for pooled replicates, as well as representative ChIP-seq peaks called using conservative IDR thresholds.

Patients were stratified into TRA2β-high or -low expression groups using the minimum P-value approach as described (Mizuno et al., 2009). Correlations between TRA2β expression and overall survival or distant-metastasis-free survival for multiple breast cancer patient cohorts were retrieved from the PrognoScan database (Mizuno et al., 2009) (http://gibk21.bse.kyutech.ac.jp/PrognoScan/index.html).

Graphs were generated using GraphPad Prism 5 software. Figures were generated using Adobe CC 2018 Photoshop and Illustrator software in compliance with the Nature Publishing Group policy concerning image integrity.
