Elsevier eBooks, Journal Year: 2024, Volume and Issue: unknown
Published: Jan. 1, 2024
Language: English
Journal Of Big Data, Journal Year: 2025, Volume and Issue: 12(1)
Published: Jan. 6, 2025
When applying data mining or machine learning techniques to large and diverse datasets, it is often necessary to construct descriptive and predictive models. Descriptive models are used to discover relationships between the attributes of the data, while predictive models identify characteristics of data that will be collected in the future. Bioinformatics data are high-dimensional, making it practically impossible to apply the majority of "classical" algorithms for classification and clustering. Even if the algorithms are useful, training with multidimensional data significantly increases processing time. The specialized algorithms for working with high-dimensional data cannot process datasets containing several thousand dimensions (features). Dimension reduction methods (such as PCA) do not provide satisfactory results, and they also obscure the meaning of the original data. For the constructed models to be usable, they must fulfill the requirement of scalability, as the amount of bioinformatics data is increasing rapidly. Furthermore, the significance of individual features can differ from source to source. This paper describes an attribute selection method for the efficient classification of high-dimensional (30,698) transcriptomics data from different sources. The proposed method was tested with 22 classification algorithms. The classification results for the selected attributes were comparable to those for the complete set.
Language: English
Citations: 1
Food Research International, Journal Year: 2025, Volume and Issue: 202, P. 115757 - 115757
Published: Jan. 16, 2025
Language: English
Citations: 0
PeerJ Computer Science, Journal Year: 2025, Volume and Issue: 11, P. e2528 - e2528
Published: Feb. 7, 2025
The computational and interpretational difficulties caused by the ever-increasing dimensionality of biological data generated by new technologies pose a significant challenge. Feature selection (FS) methods aim to reduce the dimension, and feature grouping has emerged as a foundation for FS techniques that seek to detect strong correlations among features and identify irrelevant features. In this work, we propose the Recursive Cluster Elimination with Intra-Cluster Feature Elimination (RCE-IFE) method, which utilizes feature grouping and iterates grouping and elimination steps in a supervised context. We assess the dimensionality reduction and discriminatory capabilities of RCE-IFE on various high-dimensional datasets from different biological domains. For a set of gene expression, microRNA (miRNA) expression, and methylation datasets, the performance of RCE-IFE is comparatively evaluated with RCE-IFE-SVM (the SVM-adapted version of RCE-IFE) and SVM-RCE. On average, RCE-IFE attains an area under the curve (AUC) of 0.85 on the tested expression datasets with the fewest features and the shortest running time, while RCE-IFE-SVM and SVM-RCE achieve similar AUCs of 0.84 and 0.83, respectively. RCE-IFE and SVM-RCE yield AUCs of 0.79 and 0.68, respectively, when averaged over seven metagenomics datasets, with RCE-IFE significantly reducing feature subsets. Furthermore, RCE-IFE surpasses several state-of-the-art FS methods, such as Minimum Redundancy Maximum Relevance (MRMR), Fast Correlation-Based Filter (FCBF), Information Gain (IG), Conditional Mutual Information Maximization (CMIM), SelectKBest (SKB), and eXtreme Gradient Boosting (XGBoost), obtaining an average AUC of 0.76 on five gene expression datasets. Compared with a similar tool, Multi-stage, RCE-IFE gives a similar average accuracy rate of 89.27% using fewer features on four cancer-related datasets. The comparability of RCE-IFE is also verified with other biological domain knowledge-based Grouping-Scoring-Modeling (G-S-M) tools, including mirGediNET, 3Mint, and miRcorrNet. Additionally, the biological relevance of the selected features is evaluated. The proposed method also exhibits high consistency in terms of the selected features across multiple runs. Our experimental findings imply that RCE-IFE provides robust classifier performance and significantly reduces feature size while maintaining feature relevance and consistency.
Language: English
Citations: 0
International Journal of Intelligent Systems, Journal Year: 2025, Volume and Issue: 2025(1)
Published: Jan. 1, 2025
This study addresses the pressing need for improved lung cancer diagnosis and treatment by leveraging computational methods and omics data analysis. Lung cancer remains a leading cause of cancer‐related deaths globally, highlighting the urgency of more effective diagnostic and therapeutic approaches. Current diagnostic methods, such as imaging and biopsies, suffer from limitations in sensitivity, specificity, and accessibility, often due to factors such as poor data quality, small sample sizes, and variability in data sources. These limitations highlight the necessity for the development of advanced noninvasive techniques. Computational methods utilizing omics data have shown promise in overcoming these challenges and in comprehensively understanding the molecular pathways involved in lung cancer. We propose a novel approach that utilizes RNA‐Seq data and employs LASSO regression with attention mechanisms to identify biomarkers. Our results demonstrate the effectiveness of this approach in identifying potential biomarkers for lung cancer, including well‐known genes such as TP53, EGFR, KRAS, ALK, and PIK3CA, validating the model’s ability to uncover key genes associated with lung cancer progression. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses revealed significant associations of the identified genes with critical biological processes and pathways, such as protein synthesis, folding, cell adhesion, gene regulation, and immune responses. The PPI network analysis, constructed using the STRING database and the Cytoscape application, highlighted a highly interconnected interaction landscape, with central hub genes playing pivotal roles. RPSA emerged as a crucial gene, consistently identified across different centrality measures. This study sheds light on the potential of omics data analysis for improving lung cancer diagnosis and treatment, offering new insights and future research directions for personalized medicine strategies.
Language: English
Citations: 0
Cytotechnology, Journal Year: 2025, Volume and Issue: 77(2)
Published: Feb. 19, 2025
Language: English
Citations: 0
American Journal of Translational Research, Journal Year: 2025, Volume and Issue: 17(2), P. 913 - 926
Published: Jan. 1, 2025
To investigate the cellular function of SOX18 in nasopharyngeal carcinoma (NPC) by analyzing its effects on tumor cell proliferation, apoptosis, migration, and invasion, and to verify its expression and prognostic significance in clinical samples, thereby providing a basis for precise diagnosis and treatment. SOX18 expression was analyzed in NPC cell lines and clinical samples. Gene silencing techniques were utilized to reduce SOX18 expression in NPC cells, followed by assays to evaluate proliferation, apoptosis, migration, and invasion. Additionally, changes in the Wnt/β-catenin signaling pathway were examined. High SOX18 expression correlated with poor survival in NPC patients. Silencing SOX18 significantly inhibited proliferation, increased apoptosis, and suppressed migration and invasion capabilities. Furthermore, SOX18 silencing downregulated key genes and proteins associated with the Wnt/β-catenin pathway. SOX18 plays a critical role in NPC progression by affecting tumor cell behaviors. Targeting SOX18 may offer new therapeutic strategies and improve prognostic assessments for NPC patients, highlighting its potential as a valuable molecular marker in cancer
Language: English
Citations: 0
International Journal of Molecular Sciences, Journal Year: 2025, Volume and Issue: 26(5), P. 2347 - 2347
Published: March 6, 2025
Ischemic stroke is a multifactorial disease that leads to brain tissue damage and severe neurological deficit. Transient middle cerebral artery occlusion (tMCAO) models are actively used for the molecular and genetic study of stroke. Previously, using high-throughput RNA sequencing (RNA-Seq), we revealed 3774 differentially expressed genes (DEGs) in the penumbra-associated region of the frontal cortex (FC) of rats 24 h after applying the tMCAO model. Here, we studied the gene expression pattern in the striatum, which contained an ischemic focus. Striatum samples were obtained from the same rats from which we previously obtained FC samples. Therefore, we compared DEG profiles between the two rat brain tissues 24 h after tMCAO. Tissues were selected based on magnetic resonance imaging (MRI) and histological examination (HE) data. As a result, 4409 DEGs were identified in the striatum. Among them, 2609 DEGs overlapped between the striatum and the FC, whereas more than one thousand DEGs were specific to each tissue. Furthermore, 54 DEGs exhibited opposite changes at the mRNA level in the two tissues. Thus, the spatial regulation of the ischemic process in the ipsilateral hemisphere of the rat brain at the transcriptome level was revealed. We believe that targeted adjustment of genome responses can be the key to the induction of regeneration processes in brain cells
Language: English
Citations: 0
Briefings in Bioinformatics, Journal Year: 2025, Volume and Issue: 26(2)
Published: March 1, 2025
Identifying genes causally linked to cancer from a multi-omics perspective is essential for understanding the mechanisms of cancer and improving therapeutic strategies. Traditional statistical and machine-learning methods that rely on generalized correlation approaches to identify cancer genes often produce redundant, biased predictions with limited interpretability, largely due to overlooking confounding factors, selection biases, and the nonlinear activation function in neural networks. In this study, we introduce a novel framework for identifying cancer genes across multiple omics domains, named ICGI (Integrative Causal Gene Identification), which leverages a large language model (LLM) prompted with causality contextual cues and prompts, in conjunction with data-driven causal feature selection. This approach demonstrates the effectiveness and potential of LLMs in uncovering cancer genes and comprehending disease mechanisms, particularly at the genomic level. However, our findings also highlight that current LLMs may not capture comprehensive information across all omics levels. When the proposed causal feature selection module is applied to transcriptomic datasets from six cancer types in The Cancer Genome Atlas and its performance is compared with state-of-the-art methods, it demonstrates a superior capability to identify cancer genes that distinguish between cancerous and normal samples. Additionally, we have developed an online service platform that allows users to input a gene of interest and a specific cancer type. The platform provides automated results indicating whether the gene plays a significant role in cancer, along with clear and accessible explanations. Moreover, the platform summarizes the inference outcomes obtained from data-driven causal learning methods.
Language: English
Citations: 0
Research Square (Research Square), Journal Year: 2025, Volume and Issue: unknown
Published: March 24, 2025
Language: English
Citations: 0
PLoS ONE, Journal Year: 2025, Volume and Issue: 20(3), P. e0319205 - e0319205
Published: March 31, 2025
COVID-19, caused by severe acute respiratory syndrome coronavirus 2, has rapidly spread worldwide. Severe and critical patients are expected to deteriorate. Although several studies have attempted to uncover the mechanisms underlying COVID-19 severity, most have focused on perturbations of single genes. However, the complex mechanism of COVID-19 involves numerous perturbed genes in a molecular network rather than a single abnormal gene. Thus, we aimed to identify COVID-19 severity-specific markers in the Japanese population using gene network analysis. In order to reveal severity-specific molecular interplays, we developed a novel computational network biology strategy that measures the dissimilarity between gene networks based on comprehensive information (i.e., expression levels and network structure) by the Kullback–Leibler divergence. Monte Carlo simulations demonstrated the effectiveness of our strategy for differential gene network analysis. We applied this method to publicly available whole blood RNA-seq data from the Japan coronavirus disease 2019 Task Force and identified differentially regulated molecular interplays between 368 severe and 105 non-severe samples. Our analysis suggests HLA class II genes, CIITA, and CD74 as COVID-19 severity-specific markers. Although the association of HLA class II with COVID-19 severity has been demonstrated, our analysis revealed that the interplay of HLA class II with its target and/or regulator is a crucial marker of severity. Our findings suggest suppression activation provide clues
Language: English
Citations: 0
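
The PLoS ONE abstract above compares gene networks through the Kullback–Leibler divergence, using both expression levels and network structure. The following is only a minimal sketch of that idea and not the authors' implementation: it assumes each network is summarized as a multivariate Gaussian over two genes, with the mean vector standing in for expression levels and the 2x2 covariance matrix standing in for structure. The function name `kl_two_gene` and the two-gene restriction are illustrative assumptions.

```python
import math


def kl_two_gene(mu0, cov0, mu1, cov1):
    """KL(N(mu0, cov0) || N(mu1, cov1)) for a toy two-gene network.

    Assumption (not from the abstract): each network is a bivariate
    Gaussian; mu* are mean expression levels, cov* are 2x2 covariances.
    """
    (a0, b0), (c0, d0) = cov0
    (a1, b1), (c1, d1) = cov1
    det0 = a0 * d0 - b0 * c0
    det1 = a1 * d1 - b1 * c1
    if a0 <= 0 or a1 <= 0 or det0 <= 0 or det1 <= 0:
        raise ValueError("covariance matrices must be positive definite")

    # Closed-form inverse of the 2x2 matrix cov1.
    inv1 = [[d1 / det1, -b1 / det1],
            [-c1 / det1, a1 / det1]]

    # trace(inv(cov1) @ cov0): covariance (structure) mismatch term.
    trace = (inv1[0][0] * a0 + inv1[0][1] * c0
             + inv1[1][0] * b0 + inv1[1][1] * d0)

    # Mahalanobis term: mean (expression level) mismatch under cov1.
    dx, dy = mu1[0] - mu0[0], mu1[1] - mu0[1]
    maha = (dx * (inv1[0][0] * dx + inv1[0][1] * dy)
            + dy * (inv1[1][0] * dx + inv1[1][1] * dy))

    # Standard Gaussian KL formula with dimension k = 2.
    return 0.5 * (trace + maha - 2 + math.log(det1 / det0))


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    coupled = [[1.0, 0.5], [0.5, 1.0]]
    # Same expression levels, different gene-gene coupling.
    print(kl_two_gene([0.0, 0.0], identity, [0.0, 0.0], coupled))
```

In this sketch the divergence is zero only when both the means and the covariances match, so a change in gene-gene coupling registers even when expression levels are identical. The divergence is asymmetric; whether the published method symmetrizes it is not stated in the abstract.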