Communications in computer and information science, Journal Year: 2023, Volume and Issue: unknown, P. 247 - 261
Published: Oct. 30, 2023
Language: English
Technologies, Journal Year: 2025, Volume and Issue: 13(2), P. 53 - 53
Published: Feb. 1, 2025
Recognizing human actions through video analysis has gained significant attention in applications like surveillance, sports analytics, and human–computer interaction. While deep learning models such as 3D convolutional neural networks (CNNs) and recurrent neural networks (RNNs) deliver promising results, they often struggle with computational inefficiencies and inadequate spatial–temporal feature extraction, hindering scalability to larger datasets or high-resolution videos. To address these limitations, we propose a novel model combining a two-dimensional convolutional restricted Boltzmann machine (2D Conv-RBM) with a long short-term memory (LSTM) network. The 2D Conv-RBM efficiently extracts spatial features such as edges, textures, and motion patterns while preserving spatial relationships and reducing parameters via weight sharing. These features are subsequently processed by the LSTM to capture temporal dependencies across frames, enabling effective recognition of both short- and long-term action patterns. Additionally, a smart frame selection mechanism minimizes redundancy, significantly lowering computational costs without compromising accuracy. Evaluation on the KTH, UCF Sports, and HMDB51 datasets demonstrated superior performance, achieving accuracies of 97.3%, 94.8%, and 81.5%, respectively. Compared to traditional approaches such as RBM and CNN, our method offers notable improvements in accuracy and efficiency, presenting a scalable solution for real-time security and analytics.
Language: English
Citations
1
Communications in computer and information science, Journal Year: 2025, Volume and Issue: unknown, P. 274 - 288
Published: Jan. 1, 2025
Language: English
Citations
0
Indian Journal of Science and Technology, Journal Year: 2023, Volume and Issue: 16(34), P. 2709 - 2718
Published: Sept. 15, 2023
Background: With the proliferation of machine learning and its applications in a variety of spheres that are important to humans and their day-to-day lives, there is a pressing need for automatic detection models that can identify abnormal behaviors or acts of violence. Methods: This study examines a model that uses ensemble boosting and histograms of oriented gradients (HOG) to detect violent content from a feature vector with a single parameter. Findings: The tests performed on two benchmark datasets, the Hockey Dataset and the Peliculas dataset, reveal a high level of performance and accuracy in the classification of videos. The experiment findings show that the suggested violence detection model performs well in terms of average metrics, with accuracy, precision, and recall being 90.50%, 91.80%, and 89.70%, respectively. Novelty and applications: The proposed method is capable of striking a balance between performance and a limited number of parameters; as a result, it can be implemented with a minimal investment of computational resources. Keywords: Violence detection; Computer Vision; Action Recognition; Machine Learning; Histogram of Oriented Gradients (HOG); Ensemble Boosting
Language: English
Citations
1
Sensors, Journal Year: 2024, Volume and Issue: 24(3), P. 862 - 862
Published: Jan. 29, 2024
The recognition of human activity is crucial as the Internet of Things (IoT) progresses toward future smart homes. Wi-Fi-based motion recognition stands out due to its non-contact nature and widespread applicability. However, the channel state information (CSI) related to movement in indoor environments changes with the direction of movement, which poses challenges for existing Wi-Fi movement-recognition methods. These challenges include limited directions that can be detected, short detection distances, and inaccurate feature extraction, all of which significantly constrain the wide-scale application of action recognition. To address this issue, we propose a direction-independent CSI fusion and sharing model named CSI-F, one that combines Convolutional Neural Networks (CNN) and Gated Recurrent Units (GRU). Specifically, we have introduced a series of signal-processing techniques that utilize antenna diversity to eliminate random phase shifts, thereby removing noise influences unrelated to motion information. Later, by amplifying the Doppler frequency shift effect through cyclic actions and generating a spectrogram, we further enhance their impact on the CSI. To demonstrate the effectiveness of the method, we conducted experiments on datasets collected in natural environments. We confirmed that the superposition of periodic actions can improve the accuracy of the recognition process. CSI-F can achieve higher accuracy compared with other methods, with a monitoring coverage of up to 6 m.
Language: English
Citations
0
2022 IEEE 7th International conference for Convergence in Technology (I2CT), Journal Year: 2024, Volume and Issue: unknown
Published: April 5, 2024
The primary cause of urban flash floods is often cited as trash clogging culverts. Flash floods can be avoided with the help of intelligent video analytic (IVA) methods that extract information about blockages in order to make maintenance-related decisions in a timely manner. Knowing the percentage to which culverts are visually blocked can help prioritise maintenance at heavily blocked locations. In this paper, we introduce a deep learning-powered segmentation-classification pipeline for automatically detecting and segmenting culvert openings and then classifying them into one of four categories based on the degree to which they are blocked. The deep learning models underwent training using datasets sourced from the Visual Hydraulics Blockage Dataset (VHD) and Images Culverts (ICOB). The outcomes revealed that Mask R-CNN attains the highest performance in segmentation, achieving an mAP@75 score of 77.2%. In contrast, NASNet excelled in classification tasks, with a remarkable 81.2% accuracy on the test data. To underscore the practical importance of these findings, a potential application is demonstrated, which involves monitoring visual obstructions.
Language: English
Citations
0
Science and Technology of Engineering Chemistry and Environmental Protection, Journal Year: 2024, Volume and Issue: 1(8)
Published: Aug. 14, 2024
Video action recognition is an important research direction in the field of computer vision and pattern recognition, with extensive applications in intelligent video surveillance, human-computer interaction, and sports analysis. The development of data storage and computing hardware over the past decade has driven a shift from traditional feature extraction and machine learning algorithms to deep learning-based approaches. This paper reviews the current state of development, problems, and future directions of video action recognition techniques. Traditional methods are gradually being replaced by deep learning methods such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and long short-term memory networks (LSTMs). These methods automatically extract features and handle time dependency, significantly improving the accuracy and robustness of recognition. In particular, models based on the attention mechanism further enhance performance by dynamically adjusting the focus of attention, and have become a hot spot of research. Despite many advances, video action recognition still faces several challenges, including high computational resource requirements, complex model training, dataset bias issues, and variations in real-world application scenarios such as viewpoint changes, lighting, and occlusion. Future research can explore multi-modal fusion, lightweight models, self-supervised learning, and cross-domain transfer to improve accuracy, robustness, and generalization. The review provided aims to offer researchers a comprehensive perspective on video action recognition technology.
Language: English
Citations
0
AI, Journal Year: 2024, Volume and Issue: 5(4), P. 2170 - 2186
Published: Nov. 1, 2024
Human action recognition has become crucial in computer vision, with growing applications in surveillance, human–computer interaction, and healthcare. Traditional approaches often use broad feature representations, which may miss subtle variations in timing and movement within sequences. Our proposed One-to-Many Hierarchical Contrastive Learning (OTM-HC) framework maps the input into multi-layered vectors, creating a hierarchical contrast representation that captures various granularities of a human skeleton sequence in the temporal and spatial domains. Using sequence-to-sequence (Seq2Seq) transformer encoders and downsampling modules, OTM-HC can distinguish between multiple levels of representation, such as the instance, domain, clip, and part levels. Each level contributes significantly to a comprehensive understanding of the representations. The model design is adaptable, ensuring smooth integration with advanced Seq2Seq encoders. We tested OTM-HC across four datasets, demonstrating improved performance over state-of-the-art models. Specifically, OTM-HC achieved improvements of 0.9% and 0.6% on NTU60, 0.4% and 0.7% on NTU120, and 0.3% on PKU-MMD I and II, respectively, surpassing the previous leading approaches on these datasets. These results showcase the robustness and adaptability of our model for skeleton-based tasks.
Language: English
Citations
0
Communications in computer and information science, Journal Year: 2024, Volume and Issue: unknown, P. 351 - 361
Published: Jan. 1, 2024
Language: English
Citations
0
Communications in computer and information science, Journal Year: 2023, Volume and Issue: unknown, P. 247 - 261
Published: Oct. 30, 2023
Language: English
Citations
0