Building isoform coexpression networks

Jun Ma; Jenny Wang; Laleh Soltan Ghoraie; Xin Men; Linna Liu; Penggao Dai


Concise Method


This method is extracted from research article: Sci Rep, Sep 2019

Network-based method for drug target discovery at the isoform level

DOI: 10.1038/s41598-019-50224-x


We created a coexpression network at the isoform level using the steps introduced in our previous publications²⁴,³⁶. First, isoform expression data for the cell lines of the same cancer type that overlap between the CCLE and gCSI datasets were downloaded from the PharmacoGx platform (version 1.12.0)³⁷, which comprises pharmacological profiles for several hundred cell lines. The updated CCLE and gCSI PharmacoSets contain isoform-level expression data processed from raw RNA-seq profiles extracted from CGHub³⁸ and NCBI GEO³⁹. Zhaleh et al.⁴⁰ aligned the RNA-seq reads to the Ensembl Genome Reference Consortium release GRCh38⁴¹ using HISAT2⁴², then annotated the isoforms and calculated their expression with StringTie⁴³. A total of 58,037 genes, including 19,950 protein-coding genes, 15,767 long noncoding RNAs (lncRNAs) and 14,650 pseudogenes, were annotated by GENCODE (version 25)⁴⁴. The FPKM values (fragments per kilobase of transcript per million mapped reads) were then converted to log2(FPKM + 1) to obtain the expression values of the isoforms. Noncoding isoforms were removed based on their Ensembl identifiers using the R package biomaRt (version 2.34.3)⁴⁵. For each dataset, we calculated the Pearson correlation coefficient between the expression values of each pair of isoforms as follows:
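Written out in standard notation, the sample Pearson correlation coefficient takes the form below. The index $k$ over cell lines, the count $n$ and the bar for the per-isoform mean are notation assumed here; the text itself names only $E$, $i$ and $j$.

$$
r_{ij} = \frac{\sum_{k=1}^{n} \left(E_{i,k} - \bar{E}_i\right)\left(E_{j,k} - \bar{E}_j\right)}{\sqrt{\sum_{k=1}^{n} \left(E_{i,k} - \bar{E}_i\right)^2}\,\sqrt{\sum_{k=1}^{n} \left(E_{j,k} - \bar{E}_j\right)^2}}
$$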

where E is the expression value of protein isoforms i and j. Only isoforms with log2(FPKM + 1) ≥ 1 in at least 30 cancer cell line types in each dataset were considered, and isoforms i and j had to be present in both the gCSI and CCLE datasets.
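The transformation, expression filter and pairwise correlation described above can be sketched in plain Python for a single dataset. This is a minimal illustration and not the authors' code: the function names, the dictionary layout (isoform identifier mapped to a list of FPKM values, one per cell line) and the reading of "at least 30 cancer cell line types" as "at least 30 columns" are all assumptions.

```python
import math


def log2_fpkm(fpkm):
    """Convert a raw FPKM value to log2(FPKM + 1)."""
    return math.log2(fpkm + 1.0)


def is_expressed(values, min_value=1.0, min_samples=30):
    """True if the expression value is >= min_value in at least min_samples columns."""
    return sum(v >= min_value for v in values) >= min_samples


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant profile
    return sxy / math.sqrt(sxx * syy)


def coexpression(fpkm_by_isoform, min_value=1.0, min_samples=30):
    """Return {(isoform_a, isoform_b): r} for every pair passing the filter.

    fpkm_by_isoform is a hypothetical input layout: isoform id -> list of
    FPKM values, with the same cell-line order for every isoform.
    """
    expr = {
        iso: [log2_fpkm(v) for v in values]
        for iso, values in fpkm_by_isoform.items()
    }
    kept = sorted(
        iso for iso, values in expr.items()
        if is_expressed(values, min_value, min_samples)
    )
    return {
        (a, b): pearson(expr[a], expr[b])
        for idx, a in enumerate(kept)
        for b in kept[idx + 1:]
    }
```

Running `coexpression` once per dataset and keeping only the isoform pairs present in both results would correspond to the requirement that isoforms i and j are common to gCSI and CCLE.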

Interactions between isoforms of the same gene were removed. To balance removing weak interactions against keeping more isoforms in the network, the isoform network was filtered at the threshold s = 0.5, which was calculated as follows:
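The two filtering steps can be illustrated with a short Python sketch. It takes a precomputed score per isoform pair as input, because the expression that defines s is not reproduced in this text. The function name, the `gene_of` lookup and the choice of `>=` rather than `>` at the threshold are assumptions made for illustration.

```python
def filter_network(scores, gene_of, threshold=0.5):
    """Keep edges between isoforms of different genes that reach the threshold.

    scores    -- hypothetical input: {(isoform_a, isoform_b): score}
    gene_of   -- hypothetical input: isoform id -> parent gene id
    threshold -- 0.5 as stated in the text; inclusive comparison is assumed
    """
    return {
        (a, b): s
        for (a, b), s in scores.items()
        if gene_of[a] != gene_of[b] and s >= threshold
    }
```

For example, with edges `("a1", "b1"): 0.8`, `("a1", "a2"): 0.9` and `("a1", "c1"): 0.3`, where `a1` and `a2` belong to the same gene, only the `("a1", "b1")` edge is kept.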

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
