Additional miRNA Data Sources: the Intrinsic miRNA and Pleiotropy Properties (IMAPP) database

K. Rowan Wang
Julian Hecker
Michael J. McGeachie

We also combined the number of diseases each miRNA caused with other public data sources into a single database for use in our analyses. These factors, shown in Table 1, comprised the maximum and average expression levels and the number of tissues in which the miRNA is expressed (miRMine17); validated targets and predicted targets from three different miRNA-target providers (miRanda, TargetScan, and miRDB; accessed through multiMiR1821); evolutionary conservation; year discovered (miRBase9); and percentage cancer (miRAIDD).

IMAPP

To obtain lists of validated and predicted targets, we used multiMiR to retrieve results from miRanda, TargetScan, and miRDB, which were treated as three separate variables. Because of formatting differences, we first queried multiMiR with the pre-miRNA name (without the −3p or −5p suffix). If this returned no results for a particular target database, we then queried multiMiR with both the −3p and −5p mature miRNA names and summed the results. Queries included both conserved and non-conserved target sites in target genes. The top 20% of genes returned were retained as high-confidence targets. We also used multiMiR to obtain a count of validated gene targets, that is, published, experimentally validated miRNA-target interactions.
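The name-fallback step described above can be sketched as follows. This is a minimal illustration in Python, not the authors' code: multiMiR itself is an R package, so the `query` callable, the toy lookup table, the miRNA identifiers, and the gene names are all hypothetical stand-ins. The sketch also assumes that mature names are formed by appending `-3p` or `-5p` to the pre-miRNA name, and it does not cover the top-20% filtering.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical query signature: (miRNA name, target database) -> target genes.
QueryFn = Callable[[str, str], List[str]]


def count_predicted_targets(pre_mirna: str, database: str, query: QueryFn) -> int:
    """Count targets for one pre-miRNA in one target database."""
    # First attempt: the pre-miRNA name without an arm suffix.
    hits = query(pre_mirna, database)
    if hits:
        return len(hits)
    # Fallback: query both mature arms and sum the two counts.
    return sum(len(query(f"{pre_mirna}-{arm}", database)) for arm in ("3p", "5p"))


# Toy in-memory stand-in for the real service; every entry is made up.
_TOY_TARGETS: Dict[Tuple[str, str], List[str]] = {
    ("mir-example1", "targetscan"): ["GENE_A", "GENE_B"],
    ("mir-example2-3p", "mirdb"): ["GENE_C"],
    ("mir-example2-5p", "mirdb"): ["GENE_D", "GENE_E"],
}


def toy_query(name: str, database: str) -> List[str]:
    return _TOY_TARGETS.get((name, database), [])


if __name__ == "__main__":
    for mirna, db in [("mir-example1", "targetscan"), ("mir-example2", "mirdb")]:
        print(mirna, db, count_predicted_targets(mirna, db, toy_query))
```

Keeping the query function as a parameter lets the same fallback logic run once per target database, which matches the treatment of miRanda, TargetScan, and miRDB as three separate variables.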

To obtain expression levels and the number of tissues in which each miRNA is expressed, we queried miRMine using the pre-miRNA name to get a list of expression values and tissues associated with each pre-miRNA. If no expression values were found, we imputed a value of 0.
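As a rough sketch of how the three expression features could be derived from such a list, consider the Python function below. The `(tissue, value)` record layout and the function and key names are assumptions made for illustration, and counting distinct tissue labels is only one plausible reading of "number of tissues"; the source does not specify these details.

```python
from statistics import mean
from typing import Dict, List, Tuple


def summarize_expression(records: List[Tuple[str, float]]) -> Dict[str, float]:
    """Summarize (tissue, expression value) records for one pre-miRNA."""
    values = [value for _, value in records]
    if not values:
        # No expression data found: impute 0 for every feature.
        return {"max_expression": 0.0, "mean_expression": 0.0, "n_tissues": 0}
    return {
        "max_expression": max(values),
        "mean_expression": mean(values),
        # Assumption: a tissue counts once, however many samples it has.
        "n_tissues": len({tissue for tissue, _ in records}),
    }


if __name__ == "__main__":
    print(summarize_expression([("lung", 2.0), ("liver", 4.0), ("lung", 6.0)]))
    print(summarize_expression([]))
```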

We computed miRNA evolutionary conservation as the sum of the miRNA family size in miRBase, following our previous work22. This is a count of the number of species in which the miRNA has been observed without sequence changes, and it was previously found to be predictive of disease causality.

The year a miRNA was discovered was set to the date of the seminal publication about that miRNA in miRBase. Percentage cancer was calculated as the percentage of diseases caused by the miRNA in miRAIDD whose names contain cancer-related words (lymphoma, carcinoma, etc.). The full list of qualifying cancer-related terms can be found in the Supplementary Appendix.
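The percentage cancer calculation amounts to a keyword match over disease names, which might look like the Python sketch below. The term list here is illustrative only: "lymphoma" and "carcinoma" come from the text, "cancer" is an assumed addition, and the authoritative list is the one in the Supplementary Appendix. Case-insensitive substring matching and returning 0 for a miRNA with no diseases are also assumptions.

```python
from typing import Iterable, Sequence

# Illustrative terms only; the full list is in the Supplementary Appendix.
# "cancer" is an assumed entry, not taken from the source text.
CANCER_TERMS: Sequence[str] = ("lymphoma", "carcinoma", "cancer")


def percentage_cancer(diseases: Iterable[str],
                      terms: Sequence[str] = CANCER_TERMS) -> float:
    """Percentage of disease names containing any cancer-related term."""
    names = list(diseases)
    if not names:
        return 0.0  # Assumption: no diseases means 0% cancer.
    n_cancer = sum(
        any(term in name.lower() for term in terms) for name in names
    )
    return 100.0 * n_cancer / len(names)


if __name__ == "__main__":
    example = ["Hepatocellular carcinoma", "Asthma",
               "B-cell lymphoma", "Type 2 diabetes"]
    print(percentage_cancer(example))
```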

The above data were combined with each miRNA’s list of diseases caused, pleiotropy count, and number of papers mentioning the miRNA from miRAIDD into our Intrinsic miRNA and Pleiotropy Properties (IMAPP) database, which is freely available for download at https://github.com/Wanff/miraidd.
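To illustrate the combination step, the following Python sketch joins several per-miRNA feature tables into one row per miRNA and serializes the result as CSV. It is a generic outer join under stated assumptions, not the actual IMAPP build script: the column names, miRNA identifiers, and values are hypothetical, and leaving missing values empty is a choice made here for illustration.

```python
import csv
import io
from typing import Any, Dict, List


def combine_features(feature_tables: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Join {column name: {miRNA: value}} tables into one row per miRNA."""
    mirnas = sorted({m for table in feature_tables.values() for m in table})
    rows = []
    for mirna in mirnas:
        row: Dict[str, Any] = {"mirna": mirna}
        for column, table in feature_tables.items():
            row[column] = table.get(mirna)  # None marks a missing value
        rows.append(row)
    return rows


def to_csv(rows: List[Dict[str, Any]]) -> str:
    """Serialize rows to CSV text; None is written as an empty field."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    tables = {
        "pleiotropy": {"mir-example1": 3, "mir-example2": 1},
        "n_tissues": {"mir-example1": 5},
    }
    print(to_csv(combine_features(tables)))
```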
