Automatic Simplification of Lithuanian Administrative Texts DOI Creative Commons
Justina Mandravickaitė, Eglė Rimkienė, Danguolė Kotryna Kapkan, et al.

Algorithms, Journal Year: 2024, Volume and Issue: 17(11), P. 533 - 533

Published: Nov. 20, 2024

Text simplification reduces the complexity of text while preserving essential information, thus making it more accessible to a broad range of readers, including individuals with cognitive disorders, non-native speakers, children, and the general public. In this paper, we present experiments on text simplification for the Lithuanian language, aiming to simplify administrative texts to a Plain Language level. We fine-tuned mT5 and mBART models for this task and evaluated the effectiveness of ChatGPT as well. We assessed the simplification results via both quantitative metrics and qualitative evaluation. Our findings indicated that mBART performed best, as it achieved the best scores across all evaluation metrics. The qualitative analysis further supported these findings. ChatGPT experiments showed that it responded quite well to a short and simple prompt to simplify the given text; however, it ignored most of the rules given in a more elaborate prompt. Finally, our analysis revealed that BERTScore and ROUGE aligned moderately well with human evaluations, while BLEU and readability scores indicated lower or even negative correlations.

Language: English

Automatic Text Simplification for Lithuanian: Transforming Administrative Texts into Plain Language DOI Creative Commons
Justina Mandravickaitė, Eglė Rimkienė, Danguolė Kotryna Kapkan, et al.

Mathematics, Journal Year: 2025, Volume and Issue: 13(3), P. 465 - 465

Published: Jan. 30, 2025

In this study, we present the results of experiments on text simplification for the Lithuanian language, where we aim to simplify administrative-style texts to a Plain Language level. We selected mT5, mBART, and LT-Llama-2 as the foundational models and fine-tuned them for the text simplification task. Additionally, we evaluated ChatGPT for this purpose. We also conducted a comprehensive assessment of the simplification results provided by these models, both quantitatively and qualitatively. The results demonstrated that mBART was the most effective model for simplifying Lithuanian administrative text, achieving the highest scores across all evaluation metrics. A qualitative assessment of the simplified sentences complemented our quantitative findings. Attention analysis provided insights into model decisions, highlighting strengths in lexical and syntactic simplifications but revealing challenges with longer, complex sentences. Our findings contribute to advancing text simplification for lesser-resourced languages, with practical applications for more effective communication between institutions and the general public, which is the goal of Plain Language.

Language: English

Citations

0

Large corpora and large language models: a replicable method for automating grammatical annotation DOI
Cameron Morin, Matti Marttinen Larsson

Linguistics Vanguard, Journal Year: 2025, Volume and Issue: unknown

Published: April 9, 2025

Much linguistic research relies on annotated datasets of features extracted from text corpora, but the rapid quantitative growth of these corpora has created practical difficulties for linguists to manually clean and annotate large data samples. In this paper, we present a method that leverages large language models for assisting the linguist in grammatical annotation through prompt engineering, training, and evaluation. We apply this methodological pipeline to a case study of formal variation in the English evaluative verb construction “consider X (as) (to be) Y”, based on the large language model Claude 3.5 Sonnet and on data from Davies’s NOW and Sketch Engine’s EnTenTen21 corpora. Overall, we reach an accuracy of over 90 % on our held-out test samples with only a small amount of training data, validating the method for the annotation of very large quantities of tokens in the future. We discuss the generalizability of our results to a wider range of case studies of grammatical constructions and of grammatical variation and change, underlining the value of AI copilots as tools for future linguistic research, notwithstanding some important caveats.

Language: English

Citations

0
