Computational Biology and Bioinformatics


Protocols in Current Issue
May 20, 2025

Normative mapping is a framework for mapping population-level features of health-related variables. It is widely used in neuroscience research, but the literature lacks established protocols for modalities that do not support healthy-control measurements, such as intracranial electroencephalography (icEEG). An icEEG normative map would allow researchers to study population-level brain activity and enable the comparison of individual data against these norms to identify abnormalities. Currently, no standardised guide exists for transforming clinical data into a normative, regional icEEG map. Papers often cite different software and numerous articles to summarise the lengthy method, making it laborious for other researchers to understand or apply the process. Our protocol seeks to fill this gap by providing a dataflow guide and key decision points that summarise existing methods. This protocol has been used extensively in published work from our own lab (twelve peer-reviewed journal publications). Briefly, we take as input the icEEG recordings and neuroimaging data of people with epilepsy who are undergoing evaluation for resective surgery. As final outputs, we obtain a normative icEEG map comprising signal properties localised to brain regions. Optionally, we can also process new subjects through the same pipeline and obtain their z-scores (or centiles) in each brain region for abnormality detection and localisation. To date, a single, cohesive dataflow pipeline for generating normative icEEG maps, along with abnormality mapping, has not been available. We envisage that this dataflow guide will not only increase understanding and application of normative mapping methods but will also improve the consistency and quality of studies in the field.
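
As an illustration of the final abnormality-mapping step, the sketch below computes regional z-scores and centiles for a new subject against a normative map. It assumes the map is stored as per-region means and standard deviations of a signal property (e.g., relative band power); the region names and values are hypothetical, not taken from the protocol.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical normative map: per-region mean and standard deviation of a
# signal property (e.g., log relative band power), estimated from the cohort.
normative_map = {
    "hippocampus_L": {"mean": -1.20, "sd": 0.35},
    "amygdala_L":    {"mean": -0.95, "sd": 0.40},
    "insula_L":      {"mean": -1.05, "sd": 0.30},
}

# A new subject's regional values, localised to the same atlas regions.
subject = {"hippocampus_L": -0.10, "amygdala_L": -1.00, "insula_L": -1.90}

for region, value in subject.items():
    mu, sd = normative_map[region]["mean"], normative_map[region]["sd"]
    z = (value - mu) / sd          # z-score relative to the normative cohort
    centile = norm.cdf(z) * 100    # centile under a normal assumption
    print(f"{region}: z = {z:+.2f}, centile = {centile:5.1f}")
```

Regions whose |z| exceeds a chosen threshold (e.g., 2) would then be flagged as candidate abnormal regions.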

May 5, 2025

The accurate quantification of nucleic acid–based biomarkers, including long non-coding RNAs (lncRNAs), messenger RNAs (mRNAs), and microRNAs (miRNAs), is essential for disease diagnostics and risk assessment across the biological spectrum. Quantitative reverse transcription PCR (qRT-PCR) is the gold-standard assay for measuring RNA expression levels, but its reliability depends on selecting stable reference targets for normalization. Yet there is no consensus on a universally accepted reference gene for a given sample type or species, even though such a reference is necessary for accurate quantification; this presents a challenge to the broad application of such biomarkers. Various tools are currently used to identify a stably expressed gene from qRT-PCR data for a small panel of candidate normalizer genes. However, existing tools for normalizer gene selection suffer from both statistical limitations and inadequate graphical user interfaces for data visualization. gQuant, the tool presented here, overcomes these limitations. The tool is structured in two key components: a preprocessing component and a data analysis component. The preprocessing component addresses missing values in the dataset through imputation strategies. After preprocessing, normalizer genes are ranked using democratic strategies that integrate predictions from multiple statistical methods. The effectiveness of gQuant was validated on publicly available data as well as on in-house urinary exosomal miRNA expression datasets. Comparative analysis against existing tools demonstrated that gQuant delivers more stable and consistent rankings of normalizer genes. With its promising performance, gQuant enhances precision and reproducibility in the identification of normalizer genes across diverse research scenarios, addressing key limitations of RNA biomarker–based translational research.
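
gQuant's exact statistical methods are not detailed in this abstract, but the "democratic" ranking idea can be sketched: score each candidate normalizer under several stability measures, rank genes per measure, and aggregate the ranks. The two measures below (standard deviation and coefficient of variation of Cq values) are illustrative stand-ins, not gQuant's actual method set.

```python
import numpy as np

# Rows: candidate normalizer genes; values: Cq measurements across samples.
cq = {
    "GAPDH": np.array([20.1, 20.5, 19.8, 21.0, 20.3]),
    "ACTB":  np.array([18.2, 18.3, 18.1, 18.4, 18.2]),
    "RNU6":  np.array([25.0, 23.8, 26.1, 24.5, 25.7]),
}

# Two illustrative stability measures (lower = more stable).
def sd_score(x): return np.std(x, ddof=1)
def cv_score(x): return np.std(x, ddof=1) / np.mean(x)

methods = [sd_score, cv_score]
genes = list(cq)

# Rank genes under each method, then aggregate by mean rank ("democratic" vote).
ranks = np.zeros((len(methods), len(genes)))
for i, m in enumerate(methods):
    scores = np.array([m(cq[g]) for g in genes])
    ranks[i] = scores.argsort().argsort() + 1   # rank 1 = most stable

mean_rank = ranks.mean(axis=0)
for g, r in sorted(zip(genes, mean_rank), key=lambda t: t[1]):
    print(f"{g}: mean rank {r:.1f}")
```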

May 5, 2025

Quantitative proteomic analysis plays a crucial role in understanding microbial co-culture systems. Traditional techniques, such as label-free quantification (LFQ) and label-based proteomics, provide valuable insights into the interactions and metabolic exchanges of microbial species. However, the complexity of microbial co-culture systems often leads to challenges in data normalization, especially for comparative LFQ data where the ratios of the different organisms can vary across experiments. This protocol describes the application of LFQRatio normalization, a novel normalization method designed to improve the reliability and accuracy of quantitative proteomics data obtained from microbial co-cultures. The method was developed following an analysis of the factors that affect both protein identification and quantitative accuracy in co-culture proteomics: peptide physicochemical characteristics such as isoelectric point (pI), molecular weight (MW), and hydrophobicity, as well as dynamic range, proteome size, and peptides shared between species. We then created a normalization method based on LFQ intensity values, named LFQRatio normalization. The approach was demonstrated by analysis of a synthetic co-culture of two bacteria, Synechococcus elongatus cscB/SPS and Azotobacter vinelandii ΔnifL. Results showed improved accuracy in identifying differentially expressed proteins, allowing more reliable biological interpretation. This protocol provides a reliable and effective tool that can be applied to other co-culture systems to study microbial interactions.
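
The published LFQRatio formula is not reproduced in this abstract; the sketch below only illustrates the underlying idea under a simple assumption: scale each protein's LFQ intensity by the inverse of its organism's share of the total LFQ signal in that sample, so that shifts in the co-culture ratio are not mistaken for differential expression. All protein, species, and column names are hypothetical.

```python
import pandas as pd

# Hypothetical LFQ intensities for one sample: proteins from two organisms.
df = pd.DataFrame({
    "protein": ["SeP1", "SeP2", "AvP1", "AvP2"],
    "species": ["S_elongatus", "S_elongatus", "A_vinelandii", "A_vinelandii"],
    "lfq":     [2.0e8, 1.0e8, 4.0e8, 2.0e8],
})

# Each organism's share of the total LFQ signal in this sample.
total = df["lfq"].sum()
share = df.groupby("species")["lfq"].sum() / total

# Divide each protein by its organism's share so intensities remain comparable
# across samples with different co-culture ratios (illustrative assumption).
df["lfq_ratio_norm"] = df["lfq"] / df["species"].map(share)
print(df)
```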

May 5, 2025

RNA sequencing (RNA-Seq) has transformed transcriptomic research, enabling researchers to perform large-scale inspection of mRNA levels in living cells. With the growing applicability of this technique to many scientific investigations, the analysis of next-generation sequencing (NGS) data becomes an important yet challenging task, especially for researchers without a bioinformatics background. This protocol offers a beginner-friendly, step-by-step guide to analyzing NGS data (starting from raw .fastq files), providing the required code with an explanation of the different steps and software used. We outline a computational workflow that includes quality control, read trimming, read alignment to the genome, and gene quantification, ultimately enabling researchers to identify differentially expressed genes and gain insights into mRNA levels. Multiple approaches to visualizing these data using statistical and graphical tools in R are also described, allowing the generation of heatmaps and volcano plots to represent genes and gene sets of interest.
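
The protocol's visualization steps are written in R; purely to illustrate the volcano-plot logic (shown here in Python for consistency), the sketch below assumes a differential-expression results table with log2 fold changes and adjusted p-values, such as one exported from DESeq2. The gene names and values are made up.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical DE results table (e.g., exported from DESeq2 as CSV).
res = pd.DataFrame({
    "gene":           ["g1", "g2", "g3", "g4", "g5"],
    "log2FoldChange": [2.5, -3.1, 0.2, 1.8, -0.4],
    "padj":           [1e-8, 1e-6, 0.60, 0.03, 0.45],
})

# Significant = small adjusted p-value AND large effect size.
sig = (res["padj"] < 0.05) & (res["log2FoldChange"].abs() > 1)

plt.scatter(res["log2FoldChange"], -np.log10(res["padj"]),
            c=np.where(sig, "red", "grey"))
plt.axhline(-np.log10(0.05), ls="--", lw=0.8)
plt.axvline(-1, ls="--", lw=0.8)
plt.axvline(1, ls="--", lw=0.8)
plt.xlabel("log2 fold change")
plt.ylabel("-log10 adjusted p-value")
plt.title("Volcano plot (illustrative data)")
plt.savefig("volcano.png", dpi=150)
```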

May 5, 2025

Chromatin immunoprecipitation with high-throughput sequencing (ChIP-seq) is a widely used technique for genome-wide analyses of protein–DNA interactions. This protocol provides a guide to ChIP-seq data processing in Saccharomyces cerevisiae, with a focus on signal normalization to address data biases and enable meaningful comparisons within and between samples. Designed for researchers with minimal bioinformatics experience, it includes practical overviews and refers to scripting examples for key tasks, such as configuring computational environments, trimming and aligning reads, processing alignments, and visualizing signals. This protocol employs the sans-spike-in method for quantitative ChIP-seq (siQ-ChIP) and normalized coverage for absolute and relative comparisons of ChIP-seq data, respectively. While spike-in normalization, which is semiquantitative, is addressed for context, siQ-ChIP and normalized coverage are recommended as mathematically rigorous and reliable alternatives.
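
As a sketch of the normalized-coverage idea (a depth-independent track enabling relative comparisons between samples): dividing per-bin read counts by the library total makes each track integrate to one, so any remaining difference reflects the distribution of signal rather than sequencing depth. The bin counts below are invented, and the siQ-ChIP absolute scale additionally requires experimental quantities (e.g., IP efficiency) not shown here.

```python
import numpy as np

# Hypothetical per-bin read counts for two ChIP samples on the same bin grid;
# sample B was sequenced roughly 2.5x deeper than sample A.
counts_a = np.array([120, 340, 90, 15, 800, 60], dtype=float)
counts_b = np.array([300, 900, 250, 40, 2100, 150], dtype=float)

def normalized_coverage(counts):
    # Fraction of reads per bin: the track sums to 1 regardless of depth.
    return counts / counts.sum()

na, nb = normalized_coverage(counts_a), normalized_coverage(counts_b)
print(np.round(na, 4))
print(np.round(nb, 4))   # nearly identical: the depth difference is removed
```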

Apr 20, 2025

Bayesian phylogenetic analysis is essential for elucidating evolutionary relationships among organisms. Traditional methods often rely on fixed models and manual parameter settings, which can limit accuracy and efficiency. This protocol presents an integrated workflow that leverages GUIDANCE2 for rigorous sequence alignment, ProtTest and MrModeltest for robust model selection, and MrBayes for phylogenetic tree estimation through Bayesian inference. By automating key steps and providing detailed command-line instructions, this protocol enhances the reliability and reproducibility of phylogenetic studies.
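
As an illustration of the automation idea, the sketch below generates a MrBayes command block from a few parameters chosen after model selection (here nst=6 with invgamma rates, i.e., GTR+I+G, a common MrModeltest outcome). The file names, generation counts, and burn-in fraction are placeholders, not the protocol's prescribed settings.

```python
# Write a MrBayes batch block to append to a NEXUS alignment (illustrative).
settings = dict(nst=6, rates="invgamma", ngen=2_000_000,
                samplefreq=1000, nchains=4, burninfrac=0.25)

block = f"""begin mrbayes;
  set autoclose=yes nowarn=yes;
  lset nst={settings['nst']} rates={settings['rates']};
  mcmc ngen={settings['ngen']} samplefreq={settings['samplefreq']} nchains={settings['nchains']};
  sump burninfrac={settings['burninfrac']};
  sumt burninfrac={settings['burninfrac']};
end;
"""

with open("mrbayes_block.nex", "w") as fh:
    fh.write(block)
print(block)
```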

Apr 20, 2025

As genotyping costs fall, genome-wide association studies (GWAS) face growing challenges in mapping genes of interest in diverse populations with complex structure. Complex structure demands sophisticated statistical models, and increased marker density and population size require efficient computing tools. Many statistical models and computing tools have been developed, with varied statistical power, computing efficiency, and user-friendliness. Some statistical models were developed with dedicated computing tools, such as efficient mixed model analysis (EMMA), multiple loci mixed model (MLMM), fixed and random model circulating probability unification (FarmCPU), and Bayesian-information and linkage-disequilibrium iteratively nested keyway (BLINK). Other computing tools (e.g., GAPIT) implement multiple statistical models, retain a consistent user interface, and are continually enhanced for input handling and result interpretation. In this study, we developed a protocol utilizing a minimal set of software tools (BEAGLE, BLINK, and GAPIT) to perform a variety of analyses, including file format conversion, missing genotype imputation, GWAS, and interpretation of input data and results. We demonstrate the protocol by reanalyzing data from the Rice 3000 Genomes Project and highlighting advancements in GWAS model development.
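
The protocol itself runs GAPIT (an R package) and BLINK; purely to illustrate the single-marker test that underlies all of these models, the sketch below performs a naive per-SNP linear regression on simulated data. It includes no population-structure or kinship correction, so it is an explanatory toy, not a substitute for the mixed-model or BLINK analyses.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_ind, n_snp = 200, 1000

# Simulated genotypes coded 0/1/2 and a phenotype with one causal SNP (index 42).
geno = rng.integers(0, 3, size=(n_ind, n_snp)).astype(float)
pheno = 0.8 * geno[:, 42] + rng.normal(size=n_ind)

pvals = np.empty(n_snp)
for j in range(n_snp):
    # Simple regression y ~ snp_j; GWAS models add structure/kinship terms.
    result = stats.linregress(geno[:, j], pheno)
    pvals[j] = result.pvalue

print("top SNP:", pvals.argmin(), "p =", pvals.min())
```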

Apr 5, 2025

Confocal microscopy is integral to molecular and cellular biology, enabling high-resolution imaging and colocalization studies to elucidate biomolecular interactions in cells. Despite its utility, challenges in handling large datasets, particularly in preprocessing Z-stacks and calculating colocalization metrics like the Manders coefficient, limit efficiency and reproducibility. Manually processing large volumes of imaging data for colocalization analysis is prone to observer bias and inefficiency. This study presents an automated workflow integrating Python-based preprocessing with Fiji ImageJ's BIOP-JACoP plugin to streamline Z-stack refinement and colocalization analysis. We generated an executable Windows application, publicly available on GitHub (https://github.com/weiyue99/Yue-Colocalization), allowing even those without Python experience to run the Python code required in this protocol. The workflow systematically removes the signal-free Z-slices that sometimes occur at the beginning and/or end of a Z-stack using auto-thresholding, creates refined substacks, and performs batch analysis to calculate the Manders coefficient. It is designed for high-throughput applications, significantly reducing human error and hands-on time. By ensuring reproducibility and adaptability, this protocol addresses critical gaps in confocal image analysis workflows, facilitating efficient handling of large datasets and offering broad applicability in protein colocalization studies.
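
The two core computations can be sketched directly: trimming signal-free slices by auto-thresholding, then computing Manders coefficients, where M1 is the fraction of channel-1 intensity falling where channel 2 is above threshold (and symmetrically for M2). The protocol performs these steps via its packaged Python code and Fiji's BIOP-JACoP; the arrays, slice-keeping rule, and use of Otsu thresholding below are illustrative.

```python
import numpy as np
from skimage.filters import threshold_otsu

rng = np.random.default_rng(1)

# Hypothetical two-channel Z-stacks, shape (z, y, x).
ch1 = rng.random((10, 128, 128))
ch2 = rng.random((10, 128, 128))

t1, t2 = threshold_otsu(ch1), threshold_otsu(ch2)

# Step 1 (illustrative rule): keep slices where some pixels exceed a global
# auto-threshold in either channel, discarding signal-free end slices.
keep = [z for z in range(ch1.shape[0])
        if (ch1[z] > t1).mean() > 0.001 or (ch2[z] > t2).mean() > 0.001]
ch1, ch2 = ch1[keep], ch2[keep]

# Step 2: Manders coefficients on the refined substack.
m1 = ch1[ch2 > t2].sum() / ch1.sum()   # fraction of ch1 signal on ch2+ voxels
m2 = ch2[ch1 > t1].sum() / ch2.sum()   # fraction of ch2 signal on ch1+ voxels
print(f"M1 = {m1:.3f}, M2 = {m2:.3f}")
```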

Apr 5, 2025

Laboratory-developed tests (LDTs) are optimal molecular diagnostic modalities in circumstances such as public health emergencies, rare disease diagnosis, and limited budgets, or where existing commercial alternatives are unavailable, limited in supply, or withdrawn, whether temporarily or permanently. These tests reduce access barriers and promote equitable clinical practice and healthcare delivery. Despite recommendations for the development of nucleic acid amplification tests, procedural details are often insufficient, inconsistent, or arbitrary. This protocol elucidates the methodology used in the development of a fully automated real-time polymerase chain reaction (qPCR)-based test, using the Panther Fusion® Open Access™ functionality, for the detection of Streptococcus agalactiae in pregnant women from selectively enriched rectovaginal swabs. In addition, guidelines are provided for oligonucleotide design (primers and TaqMan probes), in silico and in vitro evaluation of design effectiveness, optimization of the physicochemical conditions of the amplification reaction, and result analysis based on experimental designs and acceptance criteria. Furthermore, recommendations are provided for the analytical and clinical validation of the intended use. Our approach is cost-effective, particularly during the design and optimization phases. We primarily used open-source bioinformatics software and tools for the in silico evaluations in the test design. Subsequently, the process was manually optimized using a CFX96 Dx analyzer, whose technical specifications and performance are comparable to those of the final platform (Panther Fusion®). Unlike the Panther Fusion®, the CFX96 Dx does not require excess volumes of reagents, samples, or evaluation materials (dead volume) to accommodate potential imprecision from robotic handling. Using the CFX96 Dx analyzer is therefore a strategic way to economize on resources and time during LDT optimization.
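
As a sketch of one in silico screening step, the snippet below checks candidate primers against acceptance windows for melting temperature and GC content using Biopython's nearest-neighbour Tm model. The sequences and cut-offs are illustrative placeholders, not the S. agalactiae assay's actual oligonucleotides or criteria.

```python
from Bio.SeqUtils import MeltingTemp as mt

# Hypothetical candidate primers (not the actual assay oligonucleotides).
candidates = {
    "fwd_1":   "ACGGTCATTACCGATGGTTCA",
    "rev_1":   "TTGGCCATAGGCTTCAACGT",
    "probe_1": "CCGTTAGCAGGATCCGTTACAGT",
}

# Illustrative acceptance criteria.
TM_RANGE = (58.0, 62.0)   # degrees C, nearest-neighbour model defaults
GC_RANGE = (0.40, 0.60)   # fraction GC

for name, seq in candidates.items():
    tm = mt.Tm_NN(seq)
    gc = (seq.count("G") + seq.count("C")) / len(seq)
    ok = TM_RANGE[0] <= tm <= TM_RANGE[1] and GC_RANGE[0] <= gc <= GC_RANGE[1]
    print(f"{name}: Tm = {tm:.1f} C, GC = {gc:.2f} -> {'pass' if ok else 'fail'}")
```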

Mar 5, 2025

Non-small cell lung cancer (NSCLC) is the most common type of lung cancer. According to 2020 reports, 2.2 million cases are reported globally every year, with mortality as high as 1.8 million patients. To study NSCLC, systems biology offers mathematical modeling as a tool to understand complex pathways, providing insights for the identification of biomarkers and potential therapeutic targets that aid precision therapy. Mathematical modeling, specifically with ordinary differential equations (ODEs), is used to better understand the dynamics of cancer growth and immunological interactions in the tumor microenvironment. This study highlighted the dual role of the cyclic GMP-AMP synthase–stimulator of interferon genes (cGAS-STING) pathway: its classical involvement in regulating type I interferon (IFN-I) and pro-inflammatory responses promotes tumor regression through senescence and apoptosis, whereas alternative signaling induced by nuclear factor kappa B (NF-κB), mutated tumor protein p53 (p53), and programmed death-ligand 1 (PD-L1) leads to tumor growth. We identified key regulators of cancer progression by simulating the model and validating it with local sensitivity analysis, principal component analysis, metabolite flux rates, and model reduction. Integration of multiple signaling axes revealed that cGAS-STING, phosphoinositide 3-kinase (PI3K), and Ak strain transforming (AKT) may be potential targets for validation in cancer therapy.
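
The study's full reaction network is not reproduced here; as a minimal illustration of the ODE approach, the sketch below simulates a toy two-variable model: logistic tumor growth opposed by an immune kill term whose strength stands in for cGAS-STING-driven IFN-I signaling. All equations and parameter values are hypothetical, not those of the published model.

```python
import numpy as np
from scipy.integrate import solve_ivp

# Toy tumor-immune ODE system (illustrative, not the study's model).
def model(t, y, r, K, k_kill, s, d):
    T, I = y                                    # tumor burden, IFN-I activity
    dT = r * T * (1 - T / K) - k_kill * I * T   # logistic growth minus kill
    dI = s * T - d * I                          # tumor-driven induction/decay
    return [dT, dI]

r, K, k_kill, s, d = 0.3, 1.0, 0.8, 0.2, 0.1    # hypothetical parameters
sol = solve_ivp(model, (0, 100), y0=[0.01, 0.0], args=(r, K, k_kill, s, d),
                t_eval=np.linspace(0, 100, 201))

print("final tumor burden:", round(sol.y[0, -1], 3))
```

Local sensitivity analysis, one of the validation steps named above, then amounts to perturbing each parameter slightly and recording the change in such model outputs.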



