Native language identification from text using a fine-tuned GPT-2 model
Yufeng Nie

PeerJ Computer Science, Journal Year: 2025, Volume and Issue: 11, P. e2909 - e2909

Published: May 28, 2025

Native language identification (NLI) is a critical task in computational linguistics, supporting applications such as personalized learning, forensic analysis, and machine translation. This study investigates the use of a fine-tuned GPT-2 model to enhance NLI accuracy. Using the NLI-PT dataset, we preprocess the data and fine-tune the model to classify the native languages of learners based on their Portuguese-written texts. Our approach leverages deep learning techniques, including tokenization, embedding extraction, and multi-layer transformer-based classification. Experimental results show that our model significantly outperforms traditional methods (e.g., SVM, Random Forest) and other pre-trained models (e.g., BERT, RoBERTa, BioBERT), achieving a weighted F1 score of 0.9419 and an accuracy of 94.65%. These results show that large transformer models work well for NLI and can help guide future research and tools in artificial intelligence (AI)-based education.

Language: English

Approximation of Algebraic Curves in Function Spaces of Topological Sequences Connected to Specific Simple Graph Families
Mohammad Mazyad Hazzazi, Muhammad Imran, Muhammad Asgher Nadeem, et al.

Journal of Function Spaces, Journal Year: 2025, Volume and Issue: 2025(1)

Published: Jan. 1, 2025

Algebraic curves and topological sequences play a crucial role in mathematics and graph theory, serving as a bridge between geometry, algebra, and number theory. They facilitate structural analysis in various applications, including chemistry, network analysis, and computer science. In this research, we introduce the concept of estimated algebraic curves S T, and explore the development of linear and exponential curves that emerge from the sequences associated with collections of simple graphs. By analyzing invariants and their corresponding sequences, we aim to estimate and model these sequences through curves, thereby shedding light on their dynamics and growth trends. Our study centers on some families of graphs, such as snake and pan graphs, for which we obtain closed-form expressions and asymptotic approximations for the sequences. By constructing these curves, we characterise the mathematical interactions regulating their evolution and explain how the qualities of the graphs affect their properties. These results provide new approaches to comprehend complicated networks and help address graph-theoretic problems in combinatorics and computational geometry.

Language: English

Citations

0
