Automatic Text Simplification for Lithuanian: Transforming Administrative Texts into Plain Language
Mathematics, Journal Year: 2025, Volume and Issue: 13(3), P. 465
Published: Jan. 30, 2025
In this study, we present the results of experiments on text simplification for the Lithuanian language, where we aim to simplify administrative-style texts to the Plain Language level. We selected mT5, mBART, and LT-Llama-2 as the foundational models and fine-tuned them for the text simplification task. Additionally, we evaluated ChatGPT for this purpose. We also conducted a comprehensive assessment of the simplification results provided by these models, both quantitatively and qualitatively. The results demonstrated that mBART was the most effective model for simplifying Lithuanian administrative text, achieving the highest scores across all evaluation metrics. A qualitative assessment of the simplified sentences complemented our quantitative findings. Attention analysis provided insights into model decisions, highlighting strengths in lexical and syntactic simplifications but revealing challenges with longer, complex sentences. Our findings contribute to advancing text simplification for lesser-resourced languages, with practical applications for more effective communication between institutions and the general public, which is the goal of Plain Language.
Language: English
Large corpora and large language models: a replicable method for automating grammatical annotation
Linguistics Vanguard, Journal Year: 2025, Volume and Issue: unknown
Published: April 9, 2025
Much linguistic research relies on annotated datasets of features extracted from text corpora, but the rapid quantitative growth of these corpora has created practical difficulties for linguists to manually clean and annotate large data samples. In this paper, we present a method that leverages large language models for assisting the linguist in grammatical annotation through prompt engineering, training, and evaluation. We apply this methodological pipeline to a case study of formal variation in the English evaluative verb construction “consider X (as) (to be) Y”, based on the large language model Claude 3.5 Sonnet and on Davies’s NOW and Sketch Engine’s EnTenTen21 corpora. Overall, we reach a model accuracy of over 90 % on our held-out test samples with only a small amount of training data, validating the method for the annotation of very large quantities of tokens in the future. We discuss the generalizability of our results for a wider range of case studies of grammatical constructions and of grammatical variation and change, underlining the value of AI copilots as tools for future linguistic research, notwithstanding some important caveats.
Language: English
Automatic Simplification of Lithuanian Administrative Texts
Algorithms, Journal Year: 2024, Volume and Issue: 17(11), P. 533
Published: Nov. 20, 2024
Text simplification reduces the complexity of text while preserving essential information, thus making it more accessible to a broad range of readers, including individuals with cognitive disorders, non-native speakers, children, and the general public. In this paper, we present experiments on text simplification for the Lithuanian language, aiming to simplify administrative texts to the Plain Language level. We fine-tuned mT5 and mBART models for this task and evaluated the effectiveness of ChatGPT as well. We assessed the simplification results via both quantitative metrics and qualitative evaluation. Our findings indicated that mBART performed best, as it achieved the best scores across all evaluation metrics. The qualitative analysis further supported these findings. The ChatGPT experiments showed that it responded quite well to a short and simple prompt to simplify the given text; however, it ignored most of the rules given in a more elaborate prompt. Finally, our analysis revealed that BERTScore and ROUGE aligned moderately well with human evaluations, while BLEU and readability scores indicated lower or even negative correlations.
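The comparison between automatic metrics and human evaluations described in this abstract can be illustrated with a small rank-correlation sketch. The abstract does not say which correlation coefficient the authors used, so Spearman's rank correlation is an assumption here, and the function names and all score values below are hypothetical, made up only to show a positive and a negative correlation.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the rank-transformed scores."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical per-sentence scores (NOT taken from the paper).
human = [4.5, 3.0, 2.0, 5.0, 1.5]            # human quality ratings
metric_a = [0.91, 0.85, 0.80, 0.95, 0.70]    # a metric that orders sentences like humans do
metric_b = [0.10, 0.30, 0.35, 0.05, 0.50]    # a metric that orders them in reverse

print(spearman(human, metric_a))
print(spearman(human, metric_b))
```

In this toy data the first metric ranks the sentences exactly as the human raters do and the second ranks them in exactly the opposite order, so the two coefficients sit at the extremes of the range; real metric scores would fall somewhere in between.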
Language: English