Cited by Spatial-Temporal Information-Based Littering Action Detection in Natural Environment

A New Efficient Hybrid Technique for Human Action Recognition Using 2D Conv-RBM and LSTM with Optimized Frame Selection

Majid Joudaki, Mehdi Imani, Hamid R. Arabnia, et al.

Technologies, Journal Year: 2025, Volume and Issue: 13(2), P. 53 - 53

Published: Feb. 1, 2025

Recognizing human actions through video analysis has gained significant attention in applications like surveillance, sports analytics, and human–computer interaction. While deep learning models such as 3D convolutional neural networks (CNNs) and recurrent neural networks (RNNs) deliver promising results, they often struggle with computational inefficiencies and inadequate spatial–temporal feature extraction, hindering scalability to larger datasets or high-resolution videos. To address these limitations, we propose a novel model combining a two-dimensional convolutional restricted Boltzmann machine (2D Conv-RBM) and a long short-term memory (LSTM) network. The 2D Conv-RBM efficiently extracts spatial features such as edges, textures, and motion patterns while preserving spatial relationships and reducing parameters via weight sharing. These features are subsequently processed by the LSTM to capture temporal dependencies across frames, enabling effective recognition of both short- and long-term action patterns. Additionally, a smart frame selection mechanism minimizes redundancy, significantly lowering costs without compromising accuracy. Evaluation on KTH, UCF Sports, and HMDB51 demonstrated superior performance, achieving accuracies of 97.3%, 94.8%, and 81.5%, respectively. Compared with traditional approaches such as RBM and CNN, our method offers notable improvements in accuracy and efficiency, presenting a scalable solution for real-time security and analytics.

Language: English

A Combined Model Based on Deep Learning for Littering Detection

Cu Vinh Loc, Truong Xuan Viet, Tran Hoang Viet, et al.

Communications in computer and information science, Journal Year: 2025, Volume and Issue: unknown, P. 274 - 288

Published: Jan. 1, 2025

Language: English

HOG Ensembled Boosting Machine Learning Approach for Violent Video Classification

Snehil G. Jaiswal, Sharad W. Mohod, Dinesh Sharma, et al.

Indian Journal of Science and Technology, Journal Year: 2023, Volume and Issue: 16(34), P. 2709 - 2718

Published: Sept. 15, 2023

Background: With the proliferation of machine learning and its applications in a variety of spheres that are important to humans in their day-to-day lives, there is a pressing need for automatic detection models that can identify abnormal behaviors or acts of violence. Methods: This study examines a model that uses ensemble boosting and histograms of oriented gradients (HOG) to detect violent content from a feature vector with a single parameter. Findings: The tests performed on two benchmark datasets, the Hockey Dataset and the Peliculas dataset, reveal a high level of performance and accuracy in the classification of videos. The experiment findings show that the suggested violence detection model performs well in terms of average metrics, with accuracy, precision, and recall being 90.50%, 91.80%, and 89.70%, respectively. Novelty and applications: The proposed method is capable of striking a balance between accuracy and a limited number of parameters; as a result, it can be implemented with a minimal investment of computational resources. Keywords: Violence detection; Computer Vision; Action Recognition; Machine Learning; Histogram of Oriented Gradients (HOG); Ensemble Boosting

Language: English

CSI-F: A Human Motion Recognition Method Based on Channel-State-Information Signal Feature Fusion

Juan Niu, Xiuqing He, Bei Fang, et al.

Sensors, Journal Year: 2024, Volume and Issue: 24(3), P. 862 - 862

Published: Jan. 29, 2024

The recognition of human activity is crucial as the Internet of Things (IoT) progresses toward future smart homes. Wi-Fi-based motion recognition stands out due to its non-contact nature and widespread applicability. However, the channel state information (CSI) related to human movement in indoor environments changes with the direction of movement, which poses challenges for existing Wi-Fi movement-recognition methods. These challenges include the limited directions of movement that can be detected, short detection distances, and inaccurate feature extraction, all of which significantly constrain the wide-scale application of Wi-Fi action recognition. To address this issue, we propose a direction-independent CSI fusion and sharing model named CSI-F, one that combines Convolutional Neural Networks (CNN) and Gated Recurrent Units (GRU). Specifically, we have introduced a series of signal-processing techniques that utilize antenna diversity to eliminate random phase shifts, thereby removing noise influences unrelated to motion information. Later, by amplifying the Doppler frequency shift effect through cyclic actions and generating a spectrogram, we further enhance the impact of actions on CSI. To demonstrate the effectiveness of the method, we conducted experiments on datasets collected in natural environments. We confirmed that the superposition of periodic actions can improve the accuracy of the process. CSI-F can achieve higher accuracy compared with other methods and a monitoring coverage of up to 6 m.

Language: English

Deep Learning-Driven Culvert Monitoring: A Novel Approach for Flash Flood Mitigation through Visual Blockage Analysis

Betty Elezebeth Samuel, Sultan Alghamdi, T. K. S. Lakshmi Priya, et al.

2022 IEEE 7th International conference for Convergence in Technology (I2CT), Journal Year: 2024, Volume and Issue: unknown

Published: April 5, 2024

The primary cause of urban flash floods is often cited as trash clogging culverts. Flash floods can be avoided with the help of intelligent video analytic (IVA) methods that extract information about blockages in order to make maintenance-related decisions in a timely manner. Knowing the percentage of culverts that are visually blocked can help prioritise maintenance at heavily blocked locations. In this paper, we introduce a deep learning-powered segmentation-classification pipeline for automatically detecting and segmenting culvert openings and then classifying them into one of four categories based on the degree to which they are blocked. The deep learning models underwent training using datasets sourced from the Visual Hydraulics Blockage Dataset (VHD) and Images of Culverts (ICOB). The outcomes revealed that Mask R-CNN attains the highest performance in segmentation, achieving an mAP@75 score of 77.2%. In contrast, NASNet excelled in classification tasks, achieving a remarkable 81.2% accuracy on the test data. To underscore the practical importance of these findings, a potential application is demonstrated, which involves monitoring visual obstructions.

Language: English

A Review of Deep Learning Based Video Action Recognition Techniques

Mingyuan Zhu

Science and Technology of Engineering Chemistry and Environmental Protection, Journal Year: 2024, Volume and Issue: 1(8)

Published: Aug. 14, 2024

Video action recognition is an important research direction in the field of computer vision and pattern recognition, with extensive applications in intelligent video surveillance, human-computer interaction, and sports analysis. The development of data storage and computing hardware over the past decade has driven a shift from traditional feature extraction and machine learning algorithms to deep learning-based approaches. This paper reviews the current state of development, problems, and future directions of these techniques. Traditional methods are gradually being replaced by deep learning methods such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and long short-term memory networks (LSTMs). These methods automatically extract features and handle time-dependency, significantly improving the accuracy and robustness of recognition. In particular, models based on the attention mechanism further enhance performance by dynamically adjusting the focus of attention, and have become a hot spot of research. Despite many advances, the field still faces several challenges, including high computational resource requirements, complex model training, dataset bias issues, and variations in real-world application scenarios such as viewpoint changes, lighting, and occlusion. Future research can explore multi-modal fusion, lightweight models, self-supervised learning, and cross-domain transfer to improve accuracy, robustness, and generalization. The review provided aims to offer researchers a comprehensive perspective on the technology.

Language: English

OTM-HC: Enhanced Skeleton-Based Action Representation via One-to-Many Hierarchical Contrastive Learning

Muhammad Usman, Wenming Cao, Zhao Huang, et al.

AI, Journal Year: 2024, Volume and Issue: 5(4), P. 2170 - 2186

Published: Nov. 1, 2024

Human action recognition has become crucial in computer vision, with growing applications in surveillance, human–computer interaction, and healthcare. Traditional approaches often use broad feature representations, which may miss subtle variations in timing and movement within sequences. Our proposed One-to-Many Hierarchical Contrastive Learning (OTM-HC) framework maps the input into multi-layered vectors, creating a hierarchical contrast representation that captures various granularities of a human skeleton sequence in the temporal and spatial domains. Using sequence-to-sequence (Seq2Seq) transformer encoders and downsampling modules, OTM-HC can distinguish between multiple levels of representation, such as the instance, domain, clip, and part levels. Each level contributes significantly to a comprehensive understanding of the representations. The model design is adaptable, ensuring smooth integration with advanced Seq2Seq encoders. We tested the framework across four datasets, demonstrating improved performance over state-of-the-art models. Specifically, OTM-HC achieved improvements of 0.9% and 0.6% on NTU60, 0.4% and 0.7% on NTU120, and 0.3% on PKU-MMD I and II, respectively, surpassing previous leading approaches on these datasets. These results showcase the robustness and adaptability of our model for skeleton-based tasks.

Language: English

GCTT: Graph Convolution and Time-Frequency Integration Network for 3D Human Pose Estimation

Aolei Yang, Xiaobin Wang, Banghua Yang, et al.

Communications in computer and information science, Journal Year: 2024, Volume and Issue: unknown, P. 351 - 361

Published: Jan. 1, 2024

Language: English

Spatial-Temporal Information-Based Littering Action Detection in Natural Environment

Cu Vinh Loc, Lê Thị Kim Thoa, Truong Xuan Viet, et al.

Communications in computer and information science, Journal Year: 2023, Volume and Issue: unknown, P. 247 - 261

Published: Oct. 30, 2023

Language: English
