Cognitive science and technology, Journal Year: 2025, Volume and Issue: unknown, P. 69 - 80
Published: Jan. 1, 2025
Language: English
Scientific Reports, Journal Year: 2022, Volume and Issue: 12(1)
Published: June 9, 2022
Abstract With time, textual data is proliferating, primarily through the publication of articles. With this rapid increase in data, anonymous content is also increasing. Researchers are searching for alternative strategies to identify the author of an unknown text; there is a need for a system that attributes texts based on a given set of writing samples. This study presents a novel approach combining ensemble learning, DistilBERT, and conventional machine learning techniques for authorship identification. The proposed approach extracts valuable characteristics using a bi-gram count vectorizer and term frequency-inverse document frequency (TF-IDF). An extensive and detailed dataset, “All news”, is used for experimentation; the dataset is divided into three subsets (article1, article2, article3). We limit the scope to ten selected authors for the first experiment and twenty for the second, and the experimental results provide better performance on all of the dataset. Within this scope, the proposed approach achieves accuracy gains of 3.14% and 2.44% for 10 authors on article1 and, similarly, gains of 5.25% and 7.17% for 20 authors, which is better than previous state-of-the-art studies.
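The feature pipeline described above can be sketched with scikit-learn: bi-gram counts and TF-IDF feed separate base learners whose probabilities are averaged by a soft-voting ensemble. The toy corpus, author labels, and choice of base classifiers here are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch (assumed setup): bi-gram count and TF-IDF feature views combined
# in a soft-voting ensemble for authorship attribution.
from sklearn.pipeline import make_pipeline
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.ensemble import VotingClassifier

texts = [
    "the quick brown fox jumps over the lazy dog",
    "a stitch in time saves nine every single time",
    "the quick brown cat sleeps near the lazy dog",
    "a stitch in time is worth nine in the bush",
]
authors = ["A", "B", "A", "B"]

ensemble = VotingClassifier(
    estimators=[
        # view 1: raw bi-gram counts
        ("bigram_counts", make_pipeline(
            CountVectorizer(ngram_range=(2, 2)), MultinomialNB())),
        # view 2: unigram TF-IDF weights
        ("tfidf", make_pipeline(
            TfidfVectorizer(), LogisticRegression(max_iter=1000))),
    ],
    voting="soft",  # average predicted probabilities across feature views
)
ensemble.fit(texts, authors)
print(ensemble.predict(["a stitch in time saves nine"]))
```

Swapping in DistilBERT embeddings as a third estimator would follow the same `estimators` pattern.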
Language: English
Citations: 43
Scientific Reports, Journal Year: 2024, Volume and Issue: 14(1)
Published: Feb. 23, 2024
Abstract Speech emotion recognition (SER) has gained increased interest during the last decades as part of enriched affective computing. As a consequence, a variety of engineering approaches have been developed addressing the challenging SER problem, exploiting different features, learning algorithms, and datasets. In this paper, we propose the application of graph theory for classifying emotionally-colored speech signals. Graph theory provides tools for extracting statistical as well as structural information from any time series, and we propose to use this information as a novel feature set. Furthermore, we suggest setting a unique feature-based identity for each signal belonging to a speaker. The classification is performed by a Random Forest classifier in a Leave-One-Speaker-Out Cross Validation (LOSO-CV) scheme. The proposed method is compared with two state-of-the-art approaches involving known hand-crafted features and deep architectures operating on mel-spectrograms. Experimental results on three datasets, EMODB (German, acted), AESDD (Greek, acted), and DEMoS (Italian, in-the-wild), reveal that our method outperforms the comparative methods on these datasets. Specifically, we observe an average UAR increase of almost 18%.
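The LOSO-CV protocol above can be sketched with scikit-learn's `LeaveOneGroupOut`: each speaker is held out in turn while a Random Forest trains on the rest. The feature matrix here is a random stand-in for the graph-derived statistics; speaker counts and class counts are assumptions.

```python
# Sketch (assumed data): Leave-One-Speaker-Out cross-validation with a
# Random Forest, illustrating the evaluation protocol only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))             # stand-in graph statistics per utterance
y = rng.integers(0, 4, size=60)          # four emotion classes (illustrative)
speakers = np.repeat(np.arange(6), 10)   # six speakers, ten utterances each

logo = LeaveOneGroupOut()
fold_scores = []
for train_idx, test_idx in logo.split(X, y, groups=speakers):
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X[train_idx], y[train_idx])
    # score on the single held-out speaker's utterances
    fold_scores.append(clf.score(X[test_idx], y[test_idx]))

print(f"mean accuracy over {len(fold_scores)} speaker folds: "
      f"{np.mean(fold_scores):.3f}")
```

Grouping by speaker rather than shuffling utterances is what makes the estimate speaker-independent.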
Language: English
Citations: 10
Balkan Journal of Electrical and Computer Engineering, Journal Year: 2024, Volume and Issue: 12(1), P. 36 - 46
Published: March 1, 2024
Emotion recognition using multimodal data is a widely adopted approach due to its potential to enhance human interactions and various applications. By leveraging multiple modalities for emotion recognition, the quality of interactions can be significantly improved. We present, on the Multimodal EmotionLines Dataset (MELD), a novel method using a bi-lateral gradient graph neural network (Bi-LG-GNN) for feature extraction and pre-processing. The dataset uses fine-grained emotion labeling across textual, audio, and visual modalities. This work aims to identify affective states successfully concealed in the textual and audio data during sentiment analysis. We use pre-processing techniques to improve consistency and increase the dataset's usefulness. The process also includes noise removal, normalization, and linguistic processing to deal with variances in background discourse. Kernel Principal Component Analysis (K-PCA) is employed for feature extraction, aiming to derive valuable attributes from each modality and encode the labels into an array of values. We propose a Bi-LG-GCN-based architecture explicitly tailored to effectively fuse the modalities. The Bi-LG-GCN system takes each modality's feature-extracted, pre-processed representation as input to a generator network, generating realistic synthetic samples that capture multimodal relationships. These generated samples, reflecting those relationships, serve as inputs to a discriminator network, which has been trained to distinguish genuine from generated data. With this approach, the model can learn discriminative features and make accurate predictions regarding subsequent emotional states. Our method was evaluated on the MELD dataset, yielding notable results in terms of accuracy (80%), F1-score (81%), and precision and recall (81%). The proposed method, featuring synthetic-sample generation and discrimination, outperforms contemporary techniques, thus demonstrating its practical utility.
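The K-PCA extraction step can be sketched with scikit-learn's `KernelPCA`. The input matrix stands in for one modality's pre-processed features; the RBF kernel, its `gamma`, and the output dimensionality are assumptions, not values reported by the paper.

```python
# Sketch (assumed parameters): Kernel PCA reducing one modality's features
# to a compact representation before fusion.
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(42)
modality_features = rng.normal(size=(100, 32))  # 100 utterances x 32 dims (stand-in)

kpca = KernelPCA(n_components=8, kernel="rbf", gamma=0.05)
reduced = kpca.fit_transform(modality_features)  # nonlinear projection
print(reduced.shape)
```

Each modality would be reduced this way independently before the fused representation is passed onward.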
Language: English
Citations: 10
IEEE Access, Journal Year: 2021, Volume and Issue: 9, P. 166518 - 166530
Published: Jan. 1, 2021
Human speech is not only a verbose medium of communication but also conveys emotions. The past decade has seen a lot of research on speech data, which becomes especially important for human-computer interaction as well as healthcare, security, and entertainment. This paper proposes the TLEFuzzyNet model, a three-stage pipeline for emotion recognition from speech. The first stage includes feature extraction by augmentation of signals into Mel spectrograms, followed by the use of three pre-trained transfer-learning CNN models, namely ResNet18, Inception_v3, and GoogleNet, whose prediction scores are fed to the third stage. In the final stage, we assign fuzzy ranks using a modified Gompertz function, which gives the final prediction after considering the individual models. We have used the Surrey Audio-Visual Expressed Emotion (SAVEE), Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), and Berlin (EmoDB) datasets to evaluate the model, which achieved state-of-the-art performance and is hence a dependable framework for speech emotion recognition (SER). All codes are available at the GitHub link: https://github.com/KaramSahoo/SpeechEmotionRecognitionFuzzy.
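The fuzzy-rank fusion stage can be sketched in numpy: each model's softmax scores are mapped through a Gompertz-type function to fuzzy ranks (higher confidence gives a lower, i.e. better, rank), the ranks are summed across models, and the class with the smallest fused rank wins. The function's constants here are assumptions, not the paper's exact modified Gompertz parameters.

```python
# Sketch (assumed constants): Gompertz-style fuzzy-rank fusion of three
# classifiers' softmax outputs, in the spirit of TLEFuzzyNet's third stage.
import numpy as np

def gompertz_rank(scores):
    # Monotonically decreasing in the score: confident predictions get
    # low (good) rank values.
    return 1.0 - np.exp(-np.exp(-2.0 * scores))

# Softmax outputs of three models for one utterance over 4 emotion classes.
model_scores = np.array([
    [0.70, 0.10, 0.15, 0.05],
    [0.60, 0.20, 0.10, 0.10],
    [0.55, 0.25, 0.10, 0.10],
])

fused = gompertz_rank(model_scores).sum(axis=0)  # sum fuzzy ranks per class
prediction = int(np.argmin(fused))               # smallest fused rank wins
print(prediction)
```

Because the mapping is nonlinear, a model that is very confident influences the fused decision more than simple score averaging would allow.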
Language: English
Citations: 53
Speech Communication, Journal Year: 2022, Volume and Issue: 145, P. 21 - 35
Published: Sept. 15, 2022
Language: English
Citations: 36
Computational Intelligence and Neuroscience, Journal Year: 2022, Volume and Issue: 2022, P. 1 - 11
Published: March 10, 2022
With the continuous development of the Internet, social media based on short texts has become popular. However, the sparsity and shortness of short essays restrict the accuracy of text classification. Therefore, based on the BERT model, we capture the mental features of reviewers and apply them to short text classification to improve its accuracy. Specifically, we construct a model at the language level and fine-tune it to better embed the mental features. To verify this method, we compare it with a variety of machine learning methods, such as support vector machines, convolutional neural networks, and recurrent neural networks. The results show the following: (1) through comparison, it is found that mental features can significantly improve classification accuracy; (2) combining mental features with the input vectors provides more improvement than treating them as two separate, independent vectors; (3) mental features can be integrated into models for short text classification to improve results. This can help promote research on short text classification.
Language: English
Citations: 35
Journal of Control and Decision, Journal Year: 2022, Volume and Issue: 10(1), P. 54 - 63
Published: June 13, 2022
Automated Speech Emotion Recognition (SER) has become more popular and has increased applicability. SER concentrates on the automatic identification of the emotional state of a human being using speech signals. It mainly depends upon an in-depth analysis of the speech signal: it extracts features containing emotional details from the signal and utilises pattern recognition techniques for identification. The major problem in SER is to extract discriminative, powerful, and salient features from the acoustical content. The proposed model aims to detect and classify three emotional states: happy, neutral, and sad. The presented model makes use of a convolutional neural network–gated recurrent unit (CNN-GRU) based feature extraction technique, which derives a set of feature vectors. A comprehensive simulation takes place on the Berlin German database and the SJTU Chinese database, which comprise numerous audio files collected under different emotion labels.
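The CNN-GRU feature-extraction idea, a convolutional stage that compresses the raw signal followed by a recurrent stage that summarizes it into a fixed-length vector, can be sketched in plain numpy. Everything here (signal length, kernel size, hidden size, random weights) is an illustrative stand-in, not the paper's architecture.

```python
# Sketch (assumed shapes and random weights): a 1-D conv stage followed by
# a minimal GRU cell that yields one feature vector per utterance.
import numpy as np

rng = np.random.default_rng(3)

# Convolution stage: filter and stride-8 downsample a raw signal.
signal = rng.normal(size=256)
kernel = rng.normal(size=8)
conv = np.convolve(signal, kernel, mode="valid")[::8]
seq = conv.reshape(-1, 1)  # (time, features)

# Minimal GRU cell; H = hidden size, D = input size per step.
H, D = 4, 1
Wz, Wr, Wh = (rng.normal(size=(H, D + H)) * 0.1 for _ in range(3))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

h = np.zeros(H)
for x in seq:
    xh = np.concatenate([x, h])
    z = sigmoid(Wz @ xh)                              # update gate
    r = sigmoid(Wr @ xh)                              # reset gate
    h_tilde = np.tanh(Wh @ np.concatenate([x, r * h]))  # candidate state
    h = (1.0 - z) * h + z * h_tilde                   # blend old and new

feature_vector = h  # final hidden state summarizes the utterance
print(feature_vector.shape)
```

The final hidden state is the fixed-length feature vector a downstream classifier would consume.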
Language: English
Citations: 31
Sensors, Journal Year: 2022, Volume and Issue: 22(7), P. 2461 - 2461
Published: March 23, 2022
Machine Learning (ML) algorithms within a human–computer framework are the leading force in speech emotion recognition (SER). However, few studies explore cross-corpora aspects of SER; this work aims to explore the feasibility and characteristics of cross-linguistic and cross-gender SER. Three ML classifiers (SVM, Naïve Bayes, and MLP) are applied to acoustic features obtained through a procedure based on Kononenko's discretization and correlation-based feature selection. The system encompasses five emotions (disgust, fear, happiness, anger, and sadness), using the Emofilm database, comprised of short clips from English movies and the respective Italian and Spanish dubbed versions, for a total of 1115 annotated utterances. The results see MLP as the most effective classifier, with accuracies higher than 90% for single-language approaches, while the cross-language classifier still yields accuracies higher than 80%. The results show cross-gender tasks to be more difficult than those involving two languages, suggesting greater differences between emotions expressed by male versus female subjects than between different languages. Four feature domains, namely RASTA, F0, MFCC, and spectral energy, are algorithmically assessed as the most effective, refining existing literature approaches and standard feature sets. To our knowledge, this is one of the first studies encompassing such cross-linguistic assessments.
Language: English
Citations: 26
Knowledge-Based Systems, Journal Year: 2022, Volume and Issue: 242, P. 108360 - 108360
Published: Feb. 9, 2022
Language: English
Citations: 24
Security and Communication Networks, Journal Year: 2022, Volume and Issue: 2022, P. 1 - 10
Published: Feb. 28, 2022
In the recent past, handling the high dimensionality demonstrated in auditory features of speech signals has been a primary focus for machine learning (ML)-based emotion recognition. The incorporation of high-dimensional characteristics in training datasets during the learning phase of ML models influences contemporary approaches to emotion prediction with significant false alerting. The curse of excessive dimensionality in the corpus is addressed by the majority of models. Modern models, on the other hand, place greater emphasis on merging many classifiers, which can only increase recognition accuracy even when the corpus contains high-dimensional data points. “Ensemble Learning by High-Dimensional Acoustic Features (EL-HDAF)” is an innovative ensemble model that leverages diversity assessment of feature values spanned over diversified classes to recommend the best features. Furthermore, the proposed technique employs a one-of-a-kind clustering process to limit the impact of outlying feature values. The experimental inquiry evaluates and compares emotion forecasting using spoken audio data against current methods. Fourfold cross-validation is used for performance analysis on a standard corpus.
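The evaluation protocol, an ensemble of classifiers scored by fourfold cross-validation over high-dimensional acoustic features, can be sketched with scikit-learn. The feature matrix and base learners are stand-ins; only the protocol mirrors the description above.

```python
# Sketch (assumed data and learners): fourfold cross-validation of a
# soft-voting ensemble over high-dimensional acoustic-style features.
import numpy as np
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(7)
X = rng.normal(size=(120, 50))     # high-dimensional acoustic features (stand-in)
y = rng.integers(0, 3, size=120)   # three emotion classes (illustrative)

ensemble = VotingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("lr", LogisticRegression(max_iter=1000)),
    ],
    voting="soft",
)

scores = cross_val_score(ensemble, X, y, cv=4)  # fourfold CV, one score per fold
print(scores.shape)
```

On random stand-in data the scores hover near chance; with real acoustic features the same protocol yields the comparison reported above.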
Language: English
Citations: 23