Autoencoder-based Anomaly Detection in Streaming Data with Incremental Learning and Concept Drift Adaptation
Jin Li, Kleanthis Malialis, Marios M. Polycarpou

et al.

arXiv (Cornell University), Journal Year: 2023, Volume and Issue: unknown

Published: Jan. 1, 2023

In today's digital universe, enormous amounts of data are produced in a streaming manner in a variety of application areas. These data are often unlabelled. In this case, identifying infrequent events, such as anomalies, poses a great challenge. This problem becomes even more difficult in non-stationary environments, which can cause deterioration of the predictive performance of a model. To address the above challenges, the paper proposes an autoencoder-based incremental learning method with drift detection (strAEm++DD). The proposed strAEm++DD leverages the advantages of both incremental learning and drift detection. We conduct an experimental study using real-world and synthetic datasets with severe or extreme class imbalance, and provide an empirical analysis of strAEm++DD. We further conduct a comparative study, showing that strAEm++DD significantly outperforms existing baseline and advanced methods.

Language: English

A survey on learning from imbalanced data streams: taxonomy, challenges, empirical study, and reproducible experimental framework
Gabriel Aguiar, Bartosz Krawczyk, Alberto Cano

et al.

Machine Learning, Journal Year: 2023, Volume and Issue: 113(7), P. 4165 - 4243

Published: June 29, 2023

Language: English

Citations

70

Machine learning-assisted structure annotation of natural products based on MS and NMR data
Guilin Hu, Ming‐Hua Qiu

Natural Product Reports, Journal Year: 2023, Volume and Issue: 40(11), P. 1735 - 1753

Published: Jan. 1, 2023

This review presents a summary of the recent advancements in machine learning-assisted structure elucidation (MLASE) to establish the structures of natural products (NPs).

Language: English

Citations

17

SiameseDuo++: Active learning from data streams with dual augmented siamese networks
Kleanthis Malialis, Stylianos Filippou, Christos G. Panayiotou

et al.

Neurocomputing, Journal Year: 2025, Volume and Issue: unknown, P. 130083 - 130083

Published: March 1, 2025

Language: English

Citations

0

A novel framework for improving class imbalance learning using feature space identification and fast informative resampling techniques
Vivek Kadiyala, Sabuzima Nayak, Ripon Patgiri

et al.

International Journal of Information Technology, Journal Year: 2025, Volume and Issue: unknown

Published: May 11, 2025

Language: English

Citations

0

Integrated Digital Twin Architecture for Water Contamination Emergency Management
Δημήτριος Γ. Ηλιάδης, Stelios G. Vrachimis, Marios Kyriakou

et al.

World Environmental and Water Resources Congress 2011, Journal Year: 2025, Volume and Issue: unknown, P. 808 - 823

Published: May 15, 2025

Language: English

Citations

0

OTL-CE: Online transfer learning for data streams with class evolution
Botao Jiao, Shihui Liu

Neurocomputing, Journal Year: 2025, Volume and Issue: unknown, P. 129470 - 129470

Published: Jan. 1, 2025

Language: English

Citations

0

Autoencoder-based Anomaly Detection in Streaming Data with Incremental Learning and Concept Drift Adaptation
Jin Li, Kleanthis Malialis, Marios M. Polycarpou

et al.

2022 International Joint Conference on Neural Networks (IJCNN), Journal Year: 2023, Volume and Issue: unknown, P. 1 - 8

Published: June 18, 2023

In today's digital universe, enormous amounts of data are produced in a streaming manner in a variety of application areas. These data are often unlabelled. In this case, identifying infrequent events, such as anomalies, poses a great challenge. This problem becomes even more difficult in non-stationary environments, which can cause deterioration of the predictive performance of a model. To address the above challenges, the paper proposes an autoencoder-based incremental learning method with drift detection (strAEm++DD). The proposed strAEm++DD leverages the advantages of both incremental learning and drift detection. We conduct an experimental study using real-world and synthetic datasets with severe or extreme class imbalance, and provide an empirical analysis of strAEm++DD. We further conduct a comparative study, showing that strAEm++DD significantly outperforms existing baseline and advanced methods.

Language: English

Citations

7

Data Augmentation On-the-fly and Active Learning in Data Stream Classification
Kleanthis Malialis, Dimitris Papatheodoulou, Stylianos Filippou

et al.

2021 IEEE Symposium Series on Computational Intelligence (SSCI), Journal Year: 2022, Volume and Issue: unknown

Published: Dec. 4, 2022

There is an emerging need for predictive models to be trained on-the-fly, since in numerous machine learning applications data are arriving in an online fashion. A critical challenge encountered is that of limited availability of ground truth information (e.g., labels in classification tasks) as new data are observed one-by-one online, while another significant challenge is that of class imbalance. This work introduces the novel Augmented Queues method, which addresses the dual problem by combining in a synergistic manner online active learning, data augmentation, and a multi-queue memory to maintain separate and balanced queues for each class. We perform an extensive experimental study using image and time-series augmentations, in which we examine the roles of the active learning budget, memory size, imbalance level, and neural network type. We demonstrate two major advantages of Augmented Queues. First, it does not reserve additional memory space, as the generation of synthetic data occurs only at training times. Second, learning models have access to more labelled data without the need to increase the active learning budget and/or the original memory size. Learning on-the-fly poses major challenges which, typically, hinder the deployment of learning models. Augmented Queues significantly improves the performance in terms of learning quality and speed. Our code is made publicly available.
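The multi-queue memory idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' Augmented Queues code: the class name `MultiQueueMemory`, the `jitter` augmentation, and all parameters are hypothetical, and the active learning component (deciding which labels to request) is left out. It only shows one fixed-size queue per class, with synthetic examples generated at training time rather than stored.

```python
import random
from collections import deque


class MultiQueueMemory:
    """Hypothetical sketch: one fixed-size FIFO queue per class."""

    def __init__(self, classes, queue_size):
        # Every class gets the same capacity, so no class can crowd out another.
        self.queues = {c: deque(maxlen=queue_size) for c in classes}

    def add(self, x, label):
        # A full deque(maxlen=...) drops its oldest item on append.
        self.queues[label].append(x)

    def training_set(self, augment, copies=1):
        """Return stored examples plus synthetic copies.

        The synthetic copies exist only in the returned list, so the
        memory itself does not grow.
        """
        batch = []
        for label, queue in self.queues.items():
            for x in queue:
                batch.append((x, label))
                for _ in range(copies):
                    batch.append((augment(x), label))
        return batch


def jitter(x, rng=random.Random(0), scale=0.01):
    """Assumed augmentation: add small Gaussian noise to each feature."""
    return [v + rng.gauss(0.0, scale) for v in x]
```

Calling `training_set(jitter, copies=1)` doubles the number of labelled examples seen by the learner while each queue keeps its original size.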

Language: English

Citations

10

Learning From Few Cyber-Attacks: Addressing the Class Imbalance Problem in Machine Learning-Based Intrusion Detection in Software-Defined Networking
Seyed Mohammad Hadi Mirsadeghi, Hayretdin Bahşi, Risto Vaarandi

et al.

IEEE Access, Journal Year: 2023, Volume and Issue: 11, P. 140428 - 140442

Published: Jan. 1, 2023

The class imbalance problem negatively impacts learning algorithms' performance in minority classes, which may constitute more severe attacks than the majority ones. This study investigates the benefits of balancing strategies and imbalanced learning approaches on intrusion detection data from Software Defined Networking (SDN). Although the research community has covered the class imbalance problem in machine learning-based intrusion detection, addressing this problem in SDN is novel and powerful. Addressing class imbalance over the InSDN dataset (the only publicly available SDN intrusion detection dataset as of recent) has a significant impact on future research in the area of SDN. We address the class imbalance problem through data-level and classifier-level techniques. Our objective is to determine suitable methods and propose custom deep learning architectures based on GANs and Siamese Neural Networks for generative modeling and similarity-based intrusion detection. The paper provides benchmarking results for intrusion classification with Random Oversampling (ROS), SMOTE, GANs, weighted Random Forest, and Siamese-based one-shot learning. We have found that Random Forest (RF) outperforms the deep learning models in minority instances. This supports the notion that RF can handle class imbalance well. We also observe that the widely-used balancing techniques, ROS and SMOTE, drastically decrease the False Positive Rate (FPR) but increase the False Negative Rate (FNR) in minority classes. Conclusively, while balancing techniques improve the deep learning models, they, in fact, degrade RF's performance, i.e., cause higher numbers of false predictions. Therefore, RF does not need additional balancing techniques to get better performance. This work addresses class imbalance in SDN intrusion data, and it provides a well-designed benchmark to be exemplary for any network intrusion data. Thus, it can benefit future studies in the respective domain.
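Random Oversampling (ROS), one of the data-level techniques named in the abstract above, can be sketched in a few lines. This is a generic illustration rather than the paper's pipeline: the function name `random_oversample` and the `seed` parameter are assumptions, and the toy data stand in for real intrusion records.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until all classes match the largest one."""
    rng = random.Random(seed)

    # Group samples by class label.
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)

    target = max(len(xs) for xs in by_class.values())

    out_x, out_y = [], []
    for y, xs in by_class.items():
        # Draw with replacement from the class's own samples to fill the gap.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Toy usage: 8 "benign" records and 2 "attack" records.
X, y = random_oversample(list(range(10)), [0] * 8 + [1] * 2)
print(Counter(y))
```

Because ROS only repeats existing minority samples, it adds no new information. SMOTE and GANs instead generate synthetic samples.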

Language: English

Citations

5

A Hybrid Active-Passive Approach to Imbalanced Nonstationary Data Stream Classification
Kleanthis Malialis, Manuel Roveri, Cesare Alippi

et al.

2021 IEEE Symposium Series on Computational Intelligence (SSCI), Journal Year: 2022, Volume and Issue: unknown

Published: Dec. 4, 2022

In real-world applications, the process generating the data might suffer from nonstationary effects (e.g., due to seasonality, faults affecting sensors or actuators, and changes in the users' behaviour). These changes, often called concept drift, might induce severe (potentially catastrophic) impacts on trained learning models, which become obsolete over time and inadequate to solve the task at hand. Learning in the presence of concept drift aims at designing machine and deep learning models that are able to track and adapt to concept drift. Typically, techniques to handle concept drift are either active or passive, and traditionally, these have been considered to be mutually exclusive. Active techniques use an explicit drift detection mechanism, and re-train the learning algorithm when concept drift is detected. Passive techniques use an implicit method to deal with drift, and continually update the model using incremental learning. Differently from what is present in the literature, we propose a hybrid alternative which merges the two approaches, hence leveraging their advantages. The proposed method, called Hybrid-Adaptive REBAlancing (HAREBA), significantly outperforms strong baselines and state-of-the-art methods in terms of learning quality and speed; we also show experimentally that it is effective under severe class imbalance levels.
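The active/passive distinction described in the abstract above can be made concrete with a toy sketch. This is not HAREBA: the class `HybridMeanTracker`, its windowed mean-error detector, and every parameter value are assumptions chosen for illustration. The "model" is just a running estimate of a scalar stream. The passive part updates it incrementally on every observation, and the active part resets it when an explicit detector fires.

```python
from collections import deque


class HybridMeanTracker:
    """Toy hybrid drift handler for a scalar stream (illustrative only)."""

    def __init__(self, lr=0.1, window=20, threshold=1.0):
        self.lr = lr                        # step size of the incremental update
        self.threshold = threshold          # mean-error level that signals drift
        self.errors = deque(maxlen=window)  # recent absolute errors
        self.estimate = None
        self.drifts = 0

    def update(self, x):
        """Consume one observation; return True if drift was detected."""
        if self.estimate is None:
            self.estimate = x
            return False

        self.errors.append(abs(x - self.estimate))

        # Passive part: always nudge the model towards the new observation.
        self.estimate += self.lr * (x - self.estimate)

        # Active part: explicit detection on a full window of recent errors.
        window_full = len(self.errors) == self.errors.maxlen
        if window_full and sum(self.errors) / len(self.errors) > self.threshold:
            self.estimate = x    # "re-train" from scratch on the new concept
            self.errors.clear()
            self.drifts += 1
            return True
        return False


# Usage: an abrupt change from 0.0 to 10.0 halfway through the stream.
tracker = HybridMeanTracker(lr=0.1, window=5, threshold=1.0)
flags = [tracker.update(v) for v in [0.0] * 50 + [10.0] * 50]
```

A purely passive tracker would approach the new level only gradually at rate `lr`, while a purely active one would not adapt at all between detections. Combining the two gives continuous adaptation plus a fast reset on abrupt change.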

Language: English

Citations

7