medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown
Published: Oct. 30, 2024
ABSTRACT
Background
Few studies have explored the degree to which fine-tuning a large-language model (LLM) can improve its ability to answer a specific set of questions about a research study.
Methods
We created an instruction dataset comprising 250 marked-down studies on HIV drug resistance, 16 questions per study, answers to each question, and explanations for each answer. The questions were broadly relevant to pathogenic human viruses, including whether a study reported viral genetic sequences and the demographics and antiviral treatments of the persons from whom they were obtained. We fine-tuned GPT-4o-mini (GPT-4o), Llama3.1-8B-Instruct (Llama3.1-8B), and Llama3.1-70B-Instruct (Llama3.1-70B) using a quantized low-rank adapter (QLoRA).
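The abstract gives no implementation details, but as a rough sketch, QLoRA fine-tuning of one of the named open-source models might look like the following, assuming the Hugging Face transformers, datasets, peft, and trl libraries; the file name instruction_dataset.jsonl and all hyperparameters are illustrative placeholders, not the authors' settings.

```python
# Minimal QLoRA fine-tuning sketch (illustrative; not the authors' exact configuration).
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, prepare_model_for_kbit_training
from trl import SFTTrainer, SFTConfig

BASE = "meta-llama/Llama-3.1-8B-Instruct"

# 4-bit quantization of the frozen base weights (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    BASE, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Trainable low-rank adapters attached to the attention projections.
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Hypothetical file: one JSON record per study question, with answer and explanation.
dataset = load_dataset("json", data_files="instruction_dataset.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=lora_config,
    args=SFTConfig(output_dir="qlora-hiv-qa", num_train_epochs=3,
                   per_device_train_batch_size=1, learning_rate=2e-4),
)
trainer.train()
```

In this setup only the low-rank adapter weights are trained while the 4-bit base weights stay frozen, which is what makes QLoRA practical on modest hardware.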
We assessed the accuracy, precision, and recall of the fine-tuned and base models in answering the same questions on a test set of 120 different studies. Paired t-tests and Wilcoxon signed-rank tests were used to compare the fine-tuned models with one another, with their respective base models, and the base models with one another.
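As a hedged illustration of this evaluation, the sketch below computes accuracy, precision, and recall for one hypothetical set of answers and then applies paired t-tests and Wilcoxon signed-rank tests to placeholder per-study scores; none of the numbers come from the study.

```python
# Illustrative comparison of two models on the same test studies (placeholder numbers).
import numpy as np
from sklearn.metrics import accuracy_score, precision_score, recall_score
from scipy.stats import ttest_rel, wilcoxon

# Hypothetical binary outcomes for one study's 16 questions: reference vs. model answers.
reference = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0])
model_a   = np.array([1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0])

print("accuracy:",  accuracy_score(reference, model_a))
print("precision:", precision_score(reference, model_a))
print("recall:",    recall_score(reference, model_a))

# Paired comparison across 120 test studies: one score per study for each model.
rng = np.random.default_rng(0)
acc_base      = rng.uniform(0.7, 0.9, size=120)            # placeholder per-study scores
acc_finetuned = acc_base + rng.normal(0.05, 0.03, size=120)

t_stat, t_p = ttest_rel(acc_finetuned, acc_base)   # paired t-test
w_stat, w_p = wilcoxon(acc_finetuned, acc_base)    # Wilcoxon signed-rank test
print(f"paired t-test p={t_p:.3g}, Wilcoxon p={w_p:.3g}")
```

In the study itself, the per-study scores of each fine-tuned model would be paired with those of its base model on the same 120 test studies.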
Results
Prior to fine-tuning, GPT-4o displayed significantly greater performance than both Llama3.1-70B and Llama3.1-8B, due to its greater precision compared with Llama3.1-8B; there was no difference between Llama3.1-70B and Llama3.1-8B.
After fine-tuning, Llama3.1-70B, but not Llama3.1-8B, improved compared with the base models. Fine-tuning resulted in a mean 6% increase in accuracy, a 9% increase in recall, and a 15% increase in precision.
The fine-tuned Llama3.1-70B outperformed the fine-tuned Llama3.1-8B and performed as well as GPT-4o, with superior recall.
Conclusion
Fine-tuning smaller LLMs led to a marked improvement in their ability to answer questions about research papers. The process we describe will be useful to researchers studying other medical domains.
AUTHOR SUMMARY
Addressing key biomedical questions often requires systematically reviewing data from numerous studies, a process that demands time and expertise. Large language models (LLMs) have shown potential for screening papers and summarizing their content. However, few groups have used these models to enhance performance on specialized tasks. In this study, we fine-tuned three LLMs on the subject of HIV drug resistance: a proprietary LLM (GPT-4o-mini) and two open-source LLMs (Llama3.1-Instruct-70B and Llama 3.1-Instruct-8B). To fine-tune the models, we selected studies covering HIV drug resistance and asked whether each study included viral genetic sequences, patient demographics, and antiviral treatments. We then tested the fine-tuned models on an independent set of studies. Our results showed that fine-tuning improved the ability of Llama3.1-Instruct-70B to answer domain-specific questions, while Llama3.1-Instruct-8B did not improve. The process we described offers a roadmap for other fields and represents a step in our attempt towards developing LLMs capable of answering questions about studies across a range of viruses.
npj Digital Medicine, Journal Year: 2025, Volume and Issue: 8(1)
Published: March 6, 2025
Integrating Large Language Models (LLMs) into healthcare promises substantial advancements but requires careful consideration of technical, ethical, and regulatory challenges. Closed LLMs from private companies offer ease of deployment but pose risks related to data privacy and vendor dependence. Open LLMs deployed on local hardware enable greater model customization but demand more computational resources and technical expertise. Balancing these approaches, with collaboration among clinicians, researchers, and other stakeholders, is crucial to ensure effective, secure, and ethical implementation.
Evidence-Based Practice, Journal Year: 2025, Volume and Issue: 28(1), P. 1 - 4
Published: Jan. 1, 2025
Schrager, Sarina MD, MS; Seehusen, Dean A. MPH; Sexton, Sumi M. MD; Richardson, Caroline; Neher, Jon; Pimlott, Nicholas; Bowman, Marjorie; Rodríguez, José; Morley, Christopher P. PhD; Li, Li PhD; Dera, James Dom MD
Author Information
There are multiple guidelines from publishers and organizations on the use of artificial intelligence (AI) in publishing. However, none are specific to family medicine. Most journals have some basic AI recommendations for authors, but more explicit direction is needed, as not all tools are the same.
Family Medicine and Community Health, Journal Year: 2025, Volume and Issue: 13(1), P. e003238 - e003238
Published: Jan. 1, 2025
There are multiple guidelines from publishers and organisations on the use of artificial intelligence (AI) in publishing.[1–5] However, none are specific to family medicine. Most journals have some basic AI recommendations for authors, but more explicit direction is needed, as not all tools are the same.
The Annals of Family Medicine, Journal Year: 2025, Volume and Issue: unknown, P. 240575 - 240575
Published: Jan. 13, 2025
There are multiple guidelines from publishers and organizations on the use of artificial intelligence (AI) in publishing.[1][2][3][4][5] However, none are specific to family medicine. Most journals have some basic AI use recommendations for authors, but more explicit direction is needed, as not all tools are the same. As family medicine journal editors, we want to provide a unified statement about AI use in academic publishing for authors, publishers, and peer reviewers, based on our current understanding of the field. The technology is advancing rapidly. While text generated from early large language models (LLMs) was relatively easy to identify, newer versions are getting progressively better at imitating human writing and more challenging to detect. Our goal is to develop a framework for managing AI use in our journals. As this is a rapidly evolving environment, we acknowledge that any such framework will need to continue to evolve. However, we also feel it is important to offer guidance based on where we are today.

Definitions:
Artificial intelligence is a broad field in which computers perform tasks that have historically been thought to require human intelligence. LLMs are a recent breakthrough that allow computers to generate text that seems like it comes from a human. LLMs deal with text generation, while the broader term generative AI can include images or figures. ChatGPT was one of the earliest widely used LLM models, but other companies have developed similar products. LLMs "learn" by doing a multifaceted analysis of the word sequences in a massive training database and generate new words using a complex probability model. The model has a random component, so responses to the exact same prompt submitted multiple times will not be identical.
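As a toy illustration of that probabilistic generation (with made-up words and probabilities, not any real model's output), the sketch below draws a next word at random in proportion to its probability, so repeated runs of the same prompt need not match.

```python
# Toy illustration of probabilistic next-word generation (hypothetical vocabulary and probabilities).
import random

next_word_probs = {          # imagined model output for the prompt "The patient was"
    "treated": 0.40,
    "admitted": 0.25,
    "diagnosed": 0.20,
    "discharged": 0.15,
}

def sample_next_word(probs):
    # Draw a word in proportion to its probability -- the "random component".
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights, k=1)[0]

# The same prompt, submitted several times, need not produce identical continuations.
for _ in range(3):
    print("The patient was", sample_next_word(next_word_probs))
```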
In response to a prompt, an LLM can produce text that looks like a medical article, but the article's content may not be accurate. LLMs can "confabulate," generating convincing text that includes false information.6,7,8 LLMs do not search the internet for answers to questions. However, they are being paired with search engines in increasingly sophisticated ways. For the rest of this editorial, we use the term AI synonymously with LLMs.
Journal of Educational Evaluation for Health Professions, Journal Year: 2025, Volume and Issue: 22, P. 4 - 4
Published: Jan. 16, 2025
The peer review process ensures the integrity of scientific research. This is particularly important in the medical field, where research findings directly impact patient care. However, the rapid growth of publications has strained reviewers, causing delays and potential declines in quality. Generative artificial intelligence, especially large language models (LLMs) such as ChatGPT, may assist researchers with efficient, high-quality reviews. This article explores the integration of LLMs into peer review, highlighting their strengths in linguistic tasks and their challenges in assessing scientific validity, particularly in clinical medicine. Key points for integration include initial screening, reviewer matching, feedback support, and the review itself. However, implementing LLMs for these purposes will necessitate addressing biases, privacy concerns, and data confidentiality. We recommend using LLMs as complementary tools under clear guidelines to support, not replace, human expertise while maintaining rigorous standards.