ACM Transactions on Intelligent Systems and Technology, Journal Year: 2025, Volume and Issue: unknown. Published: Feb. 18, 2025.

In the rapidly evolving field of Natural Language Processing (NLP), optimizing methods for fine-tuning Large Language Models (LLMs) is increasingly critical for improving generalization and performance. Fine-tuning LLMs is challenging due to high computational costs, overfitting, and the difficulty of adapting to diverse tasks. These challenges grow as models scale, making traditional fine-tuning approaches inefficient and expensive. To address these issues, a novel Information Bottleneck (IB) method is proposed, focusing on retaining only the most relevant information in the model's internal representations. By striking a balance between compression and predictive relevance, the IB method aims to reduce overfitting and enhance generalization. The approach also integrates reinforcement learning and continual learning to improve LLM performance further. The proposed framework considers two key metrics: (1) effectiveness, which reduces redundancy and improves generalization, and (2) task-specific relevance; the scheme achieves scalable performance across NLP tasks using a lightweight proxy model for computational efficiency. Empirical evaluations and ablation studies show that the method preserves accuracy while significantly reducing computational cost, enabling efficient, interpretable, and adaptable optimization with faster convergence.
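
For readers unfamiliar with the Information Bottleneck principle this abstract invokes, the classical formulation (due to Tishby et al.; the paper's exact objective may differ) trades off compression of the input against preservation of task-relevant information:

```latex
% Classical IB Lagrangian, stated here as an assumption about the flavor
% of objective the abstract alludes to: Z is the model's internal
% representation, X the input, Y the prediction target. Minimizing L_IB
% compresses Z (small I(X;Z)) while keeping it predictive of Y
% (large I(Z;Y)); beta sets the compression/relevance trade-off.
\mathcal{L}_{\mathrm{IB}} = I(X;Z) - \beta\, I(Z;Y)
```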

IEEE Access, Journal Year: 2024, Volume and Issue: 12, P. 54608 - 54649. Published: Jan. 1, 2024.

The Generative Pre-trained Transformer (GPT) represents a notable breakthrough in the domain of natural language processing, which is propelling us toward the development of machines that can understand and communicate using language in a manner that closely resembles that of humans. GPT is based on the transformer architecture, a deep neural network designed for natural language processing tasks. Due to their impressive performance on language tasks and their ability to converse effectively, GPT models have gained significant popularity among research and industrial communities, making them one of the most widely used and effective models in natural language processing and related fields, which motivated us to conduct this review. This review provides a detailed overview of GPT, including its working process, training procedures, enabling technologies, and its impact on various applications. In this review, we also explored the potential challenges and limitations of GPT. Furthermore, we discuss potential solutions and future directions. Overall, this paper aims to provide a comprehensive understanding of GPT, its enabling technologies and applications, emerging challenges, and potential solutions.
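
As a concrete reference point for the transformer architecture the review describes, here is a minimal sketch of scaled dot-product attention, the core operation inside GPT; the shapes and toy data are illustrative only, not drawn from the review.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention: Q, K, V are (seq_len, d_k) arrays."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # scaled similarities
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # row-wise softmax
    return weights @ V                               # weighted sum of values

# Toy usage: a sequence of 4 tokens with 8-dimensional projections.
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)          # shape (4, 8)
```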

Healthcare, Journal Year: 2023, Volume and Issue: 11(20), P. 2776 - 2776. Published: Oct. 20, 2023.

Generative artificial intelligence (AI) and large language models (LLMs), exemplified by ChatGPT, are promising for revolutionizing data and information management in healthcare and medicine. However, there is scant literature guiding their integration for non-AI professionals. This study conducts a scoping review to address the critical need for guidance on integrating generative AI and LLMs into medical practices. It elucidates the distinct mechanisms underpinning these technologies, such as Reinforcement Learning from Human Feedback (RLHF), including few-shot learning and chain-of-thought reasoning, which differentiate them from traditional, rule-based AI systems. Realizing these benefits requires an inclusive, collaborative co-design process that engages all pertinent stakeholders, including clinicians and consumers. Although global research is examining both the opportunities and the challenges, including ethical and legal dimensions, these technologies offer promising advancements in enhancing data management, information retrieval, and decision-making processes. Continued innovation in data acquisition, model fine-tuning, prompt strategy development, evaluation, and system implementation is imperative for realizing the full potential of these technologies. Organizations should proactively engage with these technologies to improve healthcare quality, safety, and efficiency, while adhering to guidelines for their responsible application.
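
To make the few-shot and chain-of-thought mechanisms mentioned in the review concrete, the sketch below assembles such a prompt in Python; the clinical wording and examples are invented for illustration and are not taken from the study.

```python
# Hypothetical few-shot, chain-of-thought prompt assembly. The clinical
# content is invented; a real deployment would use vetted, de-identified
# examples with expert-reviewed reasoning chains.
FEW_SHOT_TEMPLATE = """\
Q: A patient reports 3 days of fever and a productive cough. Which single
follow-up question is most informative?
A: Let's think step by step. Fever with productive cough suggests a lower
respiratory infection; distinguishing pneumonia from bronchitis hinges on
dyspnea and pleuritic pain. Ask: "Do you have shortness of breath or chest
pain when breathing?"

Q: {new_question}
A: Let's think step by step."""

prompt = FEW_SHOT_TEMPLATE.format(
    new_question="A patient on warfarin reports dark stools. What should be asked next?"
)
print(prompt)  # this string would then be sent to an LLM completion endpoint
```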

Journal of the American Medical Informatics Association, Journal Year: 2024, Volume and Issue: 31(9), P. 1844 - 1855. Published: Feb. 27, 2024.

Abstract
Objective: In this study, we investigate the potential of large language models (LLMs) to complement biomedical knowledge graphs in the training of semantic models for the biomedical and clinical domains.
Materials and Methods: Drawing on the wealth of the Unified Medical Language System knowledge graph and harnessing cutting-edge LLMs, we propose a new state-of-the-art approach for obtaining high-fidelity representations of biomedical concepts and sentences, consisting of 3 steps: an improved contrastive learning phase, a novel self-distillation phase, and a weight averaging phase.
Results: Through rigorous evaluations on diverse downstream tasks, we demonstrate consistent and substantial improvements over the previous state of the art for semantic textual similarity (STS), biomedical concept representation (BCR), and clinical named entity linking, across 15+ datasets. Besides our model for English, we also distill and release a multilingual model compatible with 50+ languages and finetuned on 7 European languages.
Discussion: Many clinical pipelines can benefit from our latest models. Our new approach enables a range of advancements in biomedical representation learning, opening a new avenue for bioinformatics researchers around the world. As a result, we hope to see BioLORD-2023 becoming a precious tool for future biomedical applications.
Conclusion: In this article, we introduced BioLORD-2023, a state-of-the-art model for STS and BCR designed for the clinical domain.
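
The Methods section names a contrastive learning phase as the first of the three steps. A common objective for such a phase is the InfoNCE loss over paired concept/description embeddings; the sketch below assumes that loss and in-batch negatives, which may differ from BioLORD-2023's actual recipe.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.07):
    """Contrastive InfoNCE loss over two batches of embeddings.
    anchors[i] and positives[i] form a positive pair (e.g., a concept
    name and one of its descriptions); every other row in the batch
    serves as an in-batch negative."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                     # (batch, batch)
    logits -= logits.max(axis=1, keepdims=True)        # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))                # align the i-th pair

# Toy batch: 16 embedding pairs of dimension 32.
rng = np.random.default_rng(0)
loss = info_nce_loss(rng.normal(size=(16, 32)), rng.normal(size=(16, 32)))
```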
This study presents a novel approach to enhance Large Language Models (LLMs) like Alpaca by dynamically integrating real-time information. The method addresses the issues of content hallucination and data relevancy by automatically collecting current information from credible sources into model prompts. Experiments show a significant improvement in accuracy and a decrease in hallucination, with a manageable increase in response time. The research underscores the potential of real-time data integration in making LLMs more accurate and contextually relevant, setting a foundation for future advancements in dynamic information processing for AI.
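
The pattern the abstract describes, collecting current information and injecting it into the prompt, can be sketched as below; the retrieval stub and prompt template are hypothetical stand-ins, not the paper's implementation.

```python
# Hypothetical retrieval-augmented prompting. fetch_snippets is a stub;
# the paper's actual collection pipeline and prompt wording are not
# specified here.
def fetch_snippets(query: str) -> list[str]:
    """Stand-in for a retrieval step that would query a news or search
    API and return short, source-attributed text snippets."""
    return [f"(stub) latest verified information about {query!r}"]

def build_prompt(question: str, snippets: list[str]) -> str:
    context = "\n".join(f"- {s}" for s in snippets)
    return (
        "Answer using ONLY the context below; reply 'unknown' if it is "
        f"insufficient.\n\nContext:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_prompt("Who won today's match?", fetch_snippets("today's match"))
# `prompt` would then be passed to the LLM (e.g., an Alpaca checkpoint).
```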

Research Square, Journal Year: 2024, Volume and Issue: unknown. Published: May 22, 2024.

Abstract
Recent advancements in large language models (LLMs) such as ChatGPT and LLaMA have hinted at their potential to revolutionize medical applications, yet their application in clinical settings often reveals limitations due to a lack of specialized training on medical-specific data. In response to this challenge, this study introduces Me-LLaMA, a novel LLM family that includes foundation models (Me-LLaMA 13/70B) along with their chat-enhanced versions (Me-LLaMA 13/70B-chat), developed through continual pre-training and instruction tuning of LLaMA2 using medical datasets. Our methodology leverages a comprehensive domain-specific data suite, including a large-scale continual pre-training dataset with 129B tokens, an instruction tuning dataset with 214k samples, and a new medical evaluation benchmark (MIBE) spanning six critical medical tasks with 12 datasets. Our extensive evaluation using the MIBE shows that Me-LLaMA models achieve overall better performance than existing open-source medical LLMs in zero-shot, few-shot, and supervised learning abilities. With task-specific instruction tuning, Me-LLaMA models outperform ChatGPT on 7 out of 8 datasets and GPT-4 on 5 out of 8. In addition, we investigated the catastrophic forgetting problem, and our results show that Me-LLaMA models outperform other open-source medical LLMs in mitigating this issue. Me-LLaMA is one of the largest open-source medical foundation LLM families to use both biomedical and clinical data. It exhibits superior performance on both general and medical tasks compared to existing open-source medical LLMs, rendering it an attractive choice for medical AI applications. We release our models, datasets, and evaluation scripts at: https://github.com/BIDS-Xu-Lab/Me-LLaMA.
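
As a rough sketch of what continual pre-training of a LLaMA-2 checkpoint looks like with Hugging Face Transformers (every detail here, from the corpus file to the hyperparameters, is an assumption; the authors' actual scripts are in the linked repository):

```python
# Sketch of continual pre-training a LLaMA-2 checkpoint on domain text.
# The corpus file, sequence length, and hyperparameters are placeholders,
# not Me-LLaMA's actual settings.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

base = "meta-llama/Llama-2-13b-hf"                   # gated model; needs access
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token                        # LLaMA has no pad token
model = AutoModelForCausalLM.from_pretrained(base)

ds = load_dataset("text", data_files={"train": "medical_corpus.txt"})
ds = ds.map(lambda b: tok(b["text"], truncation=True, max_length=2048),
            batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="me-llama-cpt",
                           per_device_train_batch_size=1,
                           gradient_accumulation_steps=16,
                           num_train_epochs=1, learning_rate=1e-5, bf16=True),
    train_dataset=ds["train"],
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
```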

Connection Science, Journal Year: 2024, Volume and Issue: 36(1). Published: May 16, 2024.

In 2022, OpenAI's unveiling of its generative AI Large Language Model (LLM), ChatGPT, heralded a significant leap forward in human-machine interaction through cutting-edge AI technologies. With its surging popularity, scholars across various fields have begun to delve into the myriad applications of ChatGPT. While existing literature reviews on LLMs like ChatGPT are available, there is a notable absence of systematic literature reviews (SLRs) and bibliometric analyses assessing the research's multidisciplinary and geographical breadth. This study aims to bridge this gap by synthesising and evaluating how ChatGPT has been integrated into diverse research areas, focussing on the scope and distribution of studies. Through a systematic review of scholarly articles, we chart the global utilisation of ChatGPT across scientific domains, exploring its contribution to advancing research paradigms and adoption trends among different disciplines. Our findings reveal widespread endorsement across multiple fields, with notable implementations in healthcare (38.6%), computer science/IT (18.6%), and education/research (17.3%). Moreover, our demographic analysis underscores ChatGPT's reach and accessibility, indicating participation from 80 unique countries in ChatGPT-related research, with the USA (719), China (181), and India (157) leading contributions by keyword occurrence. Additionally, it highlights the roles of institutions such as King Saud University, the All India Institute of Medical Sciences, and Taipei Medical University as pioneering contributors in the dataset. This study not only sheds light on the vast opportunities and challenges posed by ChatGPT-related pursuits but also acts as a pivotal resource for future inquiries. It emphasises the role of Large Language Models (LLMs) in revolutionising every field. The insights provided in this paper are particularly valuable for academics, researchers, and practitioners across disciplines, as well as policymakers looking to grasp the extensive impact of these technologies on the research community.

European Radiology, Journal Year: 2024, Volume and Issue: unknown. Published: Oct. 23, 2024.

Abstract
Structured reporting (SR) has long been a goal in radiology to standardize and improve the quality of radiology reports. Despite evidence that SR reduces errors, enhances comprehensiveness, and increases adherence to guidelines, its widespread adoption remains limited. Recently, large language models (LLMs) have emerged as a promising solution to automate and facilitate SR. Therefore, this narrative review aims to provide an overview of LLMs for SR in radiology and beyond. We found that the current literature on LLMs for SR is limited, comprising ten studies on the generative pre-trained transformer (GPT)-3.5 (n = 5) and/or GPT-4 (n = 8), while two studies additionally examined the performance of Perplexity and Bing Chat or IT5. All studies reported promising results and acknowledged the potential of LLMs for SR, with six out of ten demonstrating the feasibility of multilingual applications. Building upon these findings, we discuss the limitations, regulatory challenges, and further applications of LLMs in radiology report processing, encompassing four main areas: documentation, translation and summarization, clinical evaluation, and data mining. In conclusion, this review underscores the transformative potential of LLMs for improving efficiency and accuracy in SR and report processing.
Key Points
Question: How can LLMs help make SR more ubiquitous in radiology?
Findings: Current research leveraging LLMs for SR is sparse but shows promising results, including multilingual applications.
Clinical relevance: LLMs have the potential to transform radiology report processing and enable SR. However, their future role in clinical practice depends on overcoming limitations such as opaque algorithms and training data.
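
As an illustration of the kind of SR automation the reviewed studies evaluate, the sketch below asks a GPT-family model to recast a free-text report into a fixed JSON template via the OpenAI Python SDK; the template fields, prompt, and report text are assumptions, not taken from any of the ten studies.

```python
# Recasting a free-text radiology report into a fixed JSON template with
# the OpenAI Python SDK (v1.x). Report text and template fields are
# invented for illustration.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

free_text = "Chest X-ray: heart size normal. No focal consolidation or effusion."
template = '{"indication": "", "findings": {"heart": "", "lungs": ""}, "impression": ""}'

resp = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system",
         "content": "Fill the JSON template strictly from the report. "
                    "Use null for any field the report does not mention."},
        {"role": "user", "content": f"Template:\n{template}\n\nReport:\n{free_text}"},
    ],
)
print(resp.choices[0].message.content)  # the structured version of the report
```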

ACM SIGKDD Explorations Newsletter, Journal Year: 2024, Volume and Issue: 26(1), P. 34 - 48. Published: July 24, 2024.

Large Language Models (LLMs) have demonstrated remarkable success across various domains. However, despite their promising performance in numerous real-world applications, most of these algorithms lack fairness considerations. Consequently, they may lead to discriminatory outcomes against certain communities, particularly marginalized populations, prompting the extensive study of fair LLMs. On the other hand, fairness in LLMs, in contrast to fairness in traditional machine learning, entails exclusive backgrounds, taxonomies, and fulfillment techniques. To this end, this survey presents a comprehensive overview of recent advances in the existing literature concerning fair LLMs. Specifically, a brief introduction to LLMs is provided, followed by an analysis of the factors contributing to bias in LLMs. Additionally, the concept of fairness in LLMs is discussed categorically, summarizing the metrics for evaluating bias and the existing algorithms for promoting fairness. Furthermore, resources for evaluating bias in LLMs, including toolkits and datasets, are summarized. Finally, existing research challenges and open questions are discussed.
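
To give one concrete instance of the evaluation metrics such a survey catalogues, the sketch below computes the demographic parity difference over binary model decisions; the data are toy values, and the survey covers many further metrics and bias sources.

```python
import numpy as np

def demographic_parity_difference(y_pred, group):
    """Absolute gap in positive-decision rates between two groups.
    y_pred: binary model decisions; group: binary group membership.
    A value near 0 indicates parity on this one criterion only."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

# Toy example: 8 decisions across two groups of four.
print(demographic_parity_difference([1, 1, 0, 1, 0, 1, 0, 0],
                                    [0, 0, 0, 0, 1, 1, 1, 1]))  # -> 0.5
```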