Evaluating Privacy Compliance in Commercial Large Language Models - ChatGPT, Claude, and Gemini
Oliver Cartwright,
H. Flanders Dunbar,
Theo Radcliffe
и другие.
Research Square (Research Square),
Год журнала:
2024,
Номер
unknown
Опубликована: Июль 26, 2024
Abstract
The
integration
of
artificial
intelligence
systems
into
various
domains
has
raised
significant
privacy
concerns,
necessitating
stringent
regulatory
measures
to
protect
user
data.
Evaluating
the
compliance
commercial
large
language
models
(LLMs)
such
as
ChatGPT-4o,
Claude
Sonet,
and
Gemini
Flash
under
EU
AI
Act
presents
a
novel
approach,
providing
critical
insights
their
adherence
standards.
study
utilized
hypothetical
case
studies
assess
practices
these
LLMs,
focusing
on
data
collection,
storage,
sharing
mechanisms.
Findings
revealed
that
ChatGPT-4o
exhibited
issues
with
minimization
access
control,
while
Sonet
demonstrated
robust
effective
security
measures.
However,
showed
inconsistencies
in
collection
higher
incidence
anonymization
failures.
comparative
analysis
underscored
importance
tailored
strategies
continuous
monitoring
ensure
compliance.
These
results
provide
valuable
for
developers
policymakers,
emphasizing
necessity
multifaceted
approach
deployment
LLMs.
Язык: Английский
Exploiting Privacy Vulnerabilities in Open Source LLMs Using Maliciously Crafted Prompts
Géraud Choquet,
Aimée Aizier,
Gwenaëlle Bernollin
и другие.
Research Square (Research Square),
Год журнала:
2024,
Номер
unknown
Опубликована: Июнь 18, 2024
Abstract
The
proliferation
of
AI
technologies
has
brought
to
the
forefront
concerns
regarding
privacy
and
security
user
data,
particularly
with
increasing
deployment
powerful
language
models
such
as
Llama.
A
novel
concept
investigated
involves
inducing
breaches
through
maliciously
crafted
prompts,
highlighting
potential
for
these
inadvertently
reveal
sensitive
information.
study
systematically
evaluated
vulnerabilities
Llama
model,
employing
an
automated
framework
test
analyze
its
responses
a
variety
inputs.
Findings
significant
flaws,
demonstrating
model's
susceptibility
adversarial
attacks
that
could
compromise
privacy.
Comprehensive
analysis
provided
insights
into
types
prompts
most
effective
in
eliciting
private
demonstrates
necessity
robust
regulatory
frameworks
advanced
measures.
implications
findings
are
profound,
calling
immediate
action
enhance
protocols
LLMs
protect
against
breaches.
Enhanced
oversight
continuous
innovation
privacy-preserving
techniques
crucial
ensuring
safe
various
applications.
derived
from
this
research
contribute
deeper
understanding
LLM
urgent
need
improved
safeguards
prevent
data
leakage
unauthorized
access.
Язык: Английский
An Evaluation of the Safety of ChatGPT with Malicious Prompt Injection
Research Square (Research Square),
Год журнала:
2024,
Номер
unknown
Опубликована: Май 30, 2024
Abstract
Artificial
intelligence
systems,
particularly
those
involving
sophisticated
neural
network
architectures
like
ChatGPT,
have
demonstrated
remarkable
capabilities
in
generating
human-like
text.
However,
the
susceptibility
of
these
systems
to
malicious
prompt
injections
poses
significant
risks,
necessitating
comprehensive
evaluations
their
safety
and
robustness.
The
study
presents
a
novel
automated
framework
for
systematically
injecting
analyzing
prompts
assess
vulnerabilities
ChatGPT.
Results
indicate
substantial
rates
harmful
responses
across
various
scenarios,
highlighting
critical
areas
improvement
model
defenses.
findings
underscore
importance
advanced
adversarial
training,
real-time
monitoring,
interdisciplinary
collaboration
enhance
ethical
deployment
AI
systems.
Recommendations
future
research
emphasize
need
robust
mechanisms
transparent
operations
mitigate
risks
associated
with
inputs.
Язык: Английский
Assessing Semantic Resilience of Large Language Models to Persuasive Emotional Blackmailing Prompts
Опубликована: Июнь 3, 2024
The
application
of
artificial
intelligence
in
various
domains
has
raised
significant
concerns
regarding
the
ethical
and
safe
deployment
language
models.
Investigating
semantic
resilience
models
such
as
ChatGPT-4
Google
Gemini
to
emotionally
blackmailing
prompts
introduces
a
novel
approach
understanding
their
vulnerability
manipulative
language.
experimental
methodology
involved
crafting
charged
designed
evoke
guilt,
obligation,
emotional
appeal,
evaluating
responses
based
on
predefined
metrics
consistency,
adherence,
deviation
from
expected
behavior.
findings
revealed
that
while
both
exhibited
high
degree
resilience,
certain
deviations
highlighted
susceptibility
language,
emphasizing
necessity
for
enhanced
prompt
handling
mechanisms.
comparative
analysis
between
provided
insights
into
respective
strengths
weaknesses,
with
demonstrating
marginally
better
performance
across
several
metrics.
discussion
elaborates
implications
AI
safety,
proposing
improvements
training
datasets,
real-time
monitoring,
interdisciplinary
collaboration
bolster
robustness
Acknowledging
study's
limitations,
future
research
directions
are
suggested
address
these
challenges
further
enhance
systems.
Язык: Английский
Unveiling the Role of Feed-Forward Blocks in Contextualization: An Analysis Using Attention Maps of Large Language Models
Опубликована: Июнь 17, 2024
Transformer-based
models
have
significantly
impacted
the
field
of
natural
language
processing,
enabling
high-performance
applications
in
machine
translation,
summarization,
and
modeling.
Introducing
a
novel
analysis
feed-forward
blocks
within
Mistral
Large
model,
this
research
provides
critical
insights
into
their
role
enhancing
contextual
embeddings
refining
attention
mechanisms.
By
conducting
comprehensive
evaluation
through
quantitative
metrics
such
as
perplexity,
BLEU,
ROUGE
scores,
study
demonstrates
effectiveness
fine-tuning
improving
model
performance
across
diverse
linguistic
tasks.
Detailed
map
revealed
intricate
dynamics
between
self-attention
mechanisms
blocks,
highlighting
latter's
importance
refinement.
The
findings
demonstrate
potential
optimized
transformer
architectures
advancing
capabilities
LLMs,
emphasizing
necessity
domain-specific
architectural
enhancements.
Empirical
evidence
presented
offers
deeper
understanding
functional
contributions
informing
design
development
future
LLMs
to
achieve
superior
applicability.
Язык: Английский
Comprehensive Analysis of Machine Learning and Deep Learning models on Prompt Injection Classification using Natural Language Processing techniques
Bharat A. Jain,
Prashant Ashok Pawar,
Dhruv Gada
и другие.
International Research Journal of Multidisciplinary Technovation,
Год журнала:
2025,
Номер
unknown, С. 24 - 37
Опубликована: Фев. 25, 2025
This
study
addresses
the
prompt
injection
attack
based
vulnerability
in
large
language
models,
which
poses
a
significant
security
concern
by
allowing
unauthorized
commands
attackers
to
manipulate
outputs
produced
model.
Text
classification
methods
used
for
detecting
these
malicious
prompts
are
investigated
on
dataset
obtained
from
Hugging
Face
datasets,
utilizing
combination
of
natural
processing-based
techniques
applied
various
machine
learning
and
deep
algorithms.
Multiple
vectorization
approaches,
like
Term
Frequency-Inverse
Document
Frequency,
Word2Vec,
Bag
Words,
embeddings,
implemented
transform
textual
data
into
meaningful
representations.
The
performance
several
classifiers
is
assessed,
their
ability
identify
between
non-malicious
prompts.
Recurrent
Neural
Network
model
demonstrated
high
accuracy,
achieving
detection
rate
94.74%.
Obtained
results
indicated
that
architectures,
particularly
those
capture
sequential
dependencies,
highly
effective
identifying
threats.
contributes
evolving
field
AI
addressing
issue
defending
LLM
systems
against
adversarial
threats
form
injections.
findings
highlight
importance
integrating
dependencies
contextual
understanding
combatting
vulnerabilities.
By
application
reliable
mechanisms,
this
enhances
security,
integrity,
trustworthiness
AI-driven
technologies,
ensuring
safe
use
across
diverse
applications.
Язык: Английский
Analysis of the impact of prompt obfuscation on the effectiveness of language models in detecting prompt injections
Aleksei Sergeevich Krohin,
Maksim Mihailovich Gusev
Программные системы и вычислительные методы,
Год журнала:
2025,
Номер
2, С. 44 - 62
Опубликована: Фев. 1, 2025
The
article
addresses
the
issue
of
prompt
obfuscation
as
a
means
circumventing
protective
mechanisms
in
large
language
models
(LLMs)
designed
to
detect
injections.
Prompt
injections
represent
method
attack
which
malicious
actors
manipulate
input
data
alter
model's
behavior
and
cause
it
perform
undesirable
or
harmful
actions.
Obfuscation
involves
various
methods
changing
structure
content
text,
such
replacing
words
with
synonyms,
scrambling
letters
words,
inserting
random
characters,
others.
purpose
is
complicate
analysis
classification
text
order
bypass
filters
built
into
models.
study
conducts
an
effectiveness
bypassing
trained
for
tasks.
Particular
attention
paid
assessing
potential
implications
security
protection.
research
utilizes
different
applied
prompts
from
AdvBench
dataset.
evaluated
using
three
classifier
scientific
novelty
lies
analyzing
impact
on
detecting
During
study,
was
found
that
application
complex
increases
proportion
requests
classified
injections,
highlighting
need
thorough
approach
testing
conclusions
indicate
importance
balancing
complexity
its
context
attacks
Excessively
may
increase
likelihood
injection
detection,
requires
further
investigation
optimize
approaches
ensuring
results
underline
continuous
improvement
development
new
preventing
Язык: Английский
Automated Learning of Fine-Grained Citation Patterns in Open Source Large Language Models
Опубликована: Авг. 14, 2024
In
academic
writing,
citations
play
an
essential
role
in
ensuring
the
attribution
of
ideas,
supporting
scholarly
claims,
and
enabling
traceability
knowledge
across
disciplines.
However,
manual
process
citation
generation
is
often
time-consuming
prone
to
errors,
leading
inconsistencies
that
can
undermine
credibility
work.
The
novel
approach
explored
this
study
leverages
advanced
machine
learning
techniques
automate
process,
offering
a
significant
improvement
both
accuracy
efficiency.
Through
integration
contextual
semantic
features,
model
demonstrates
superior
ability
replicate
complex
patterns,
adapt
various
disciplines,
generate
contextually
appropriate
with
high
precision.
results
rigorous
experiments
reveal
not
only
outperforms
traditional
tools
but
also
exhibits
robust
scalability,
making
it
well-suited
for
large-scale
applications.
This
research
contributes
field
automated
providing
powerful
tool
enhances
quality
integrity
communication.
Язык: Английский