MATCHA seeks to prioritize transcription factors (TFs) that regulate arbitrary, user-specified gene programs, while leveraging multi-omic information on context-specific regulatory relationships (e.g., cell type- or tissue-specific gene-enhancer regulation).
As inputs, MATCHA accepts: 1) one or more arbitrary gene programs; 2) a TF motif database (e.g., JASPAR 2020); 3) multi-omic sc/snRNA-seq data; and, optionally, 4) external datasets with relevance to the user’s context (e.g., prior bulk or sc/snRNA-seq atlases).
As outputs, MATCHA provides: 1) prioritization scores reflecting the predicted strength and direction of each TF’s regulatory effect on each gene program (along with the contribution of each input dataset to the overall prioritization score); 2) a bipartite network showing which TFs regulate which gene programs and in what direction (along with network centrality metrics for TFs and gene programs); and 3) rankings of TFs that may co-regulate multiple gene programs simultaneously (if multiple gene programs were provided).
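To make the shape of these three outputs concrete, the following is a minimal sketch in Python. It is not MATCHA's implementation: the TF, program, and dataset names, the toy scores, the plain sum used to combine datasets, the 0.5 threshold, and node degree as the centrality measure are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical evidence: (TF, gene program, dataset) -> signed score.
# Positive means predicted activation, negative means predicted repression.
evidence = {
    ("TF_A", "program_1", "multiome"): 0.75,
    ("TF_A", "program_1", "external_atlas"): 0.5,
    ("TF_A", "program_2", "multiome"): -0.5,
    ("TF_B", "program_1", "multiome"): 0.125,
    ("TF_B", "program_2", "multiome"): -0.75,
    ("TF_B", "program_2", "external_atlas"): -1.0,
}


def prioritize(evidence):
    """Output 1: overall signed score per (TF, program), plus the
    contribution of each dataset. A plain sum is an assumption."""
    parts = defaultdict(dict)
    for (tf, program, dataset), score in evidence.items():
        parts[(tf, program)][dataset] = score
    return {
        pair: {"score": sum(by_ds.values()), "contributions": dict(by_ds)}
        for pair, by_ds in parts.items()
    }


def bipartite_edges(scores, threshold=0.5):
    """Output 2: signed TF -> program edges whose absolute score
    meets an (assumed) threshold."""
    return [
        (tf, program, "+" if rec["score"] > 0 else "-")
        for (tf, program), rec in scores.items()
        if abs(rec["score"]) >= threshold
    ]


def degree(edges):
    """Node degree, used here as a stand-in centrality metric."""
    deg = defaultdict(int)
    for tf, program, _sign in edges:
        deg[tf] += 1
        deg[program] += 1
    return dict(deg)


def coregulators(edges):
    """Output 3: TFs linked to more than one program, ranked by the
    number of programs they are linked to (ties broken by name)."""
    programs = defaultdict(set)
    for tf, program, _sign in edges:
        programs[tf].add(program)
    ranked = sorted(programs.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    return [(tf, sorted(progs)) for tf, progs in ranked if len(progs) > 1]


scores = prioritize(evidence)
edges = bipartite_edges(scores)
centrality = degree(edges)
shared = coregulators(edges)
```

In this toy input, `TF_B` falls below the threshold for `program_1`, so only `TF_A` is reported as a candidate co-regulator of both programs. A weighted or rank-based combination across datasets could be substituted for the plain sum without changing the structure of the outputs.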
To infer gene program – co-accessible enhancer – causal TF triads, MATCHA relies on the principle that robust, strong regulatory relationships should be reflected across multiple -omic layers and across datasets. MATCHA therefore proceeds through the following steps: