Large language models help facilitate the automated synthesis of information on potential pest controllers (Creative Commons)
Daan Scheepens, Joseph Millard, Maxwell J. Farrell et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal year: 2024, No. unknown

Published: Jan. 15, 2024

The body of ecological literature, which informs much of our knowledge of the global loss of biodiversity, has been experiencing rapid growth in recent decades. The increasing difficulty of synthesising this literature manually has simultaneously resulted in a growing demand for automated text mining methods. Within the domain of deep learning, large language models (LLMs) have been the subject of considerable attention in recent years by virtue of great leaps in progress and a wide range of potential applications; however, quantitative investigation into their potential in ecology has so far been lacking. In this work, we analyse the ability of GPT-4 to extract information about invertebrate pests and pest controllers from abstracts of a body of literature on biological pest control, using a bespoke, zero-shot prompt. Our results show that the performance of GPT-4 is highly competitive with other state-of-the-art tools used for taxonomic named entity recognition and geographic location extraction tasks. On a held-out test set, we show that species and geographic locations are extracted with F1-scores of 99.8% and 95.3%, respectively, and highlight that the model is able to distinguish very effectively between the primary roles of interest (predators, parasitoids and pests). Moreover, we demonstrate the ability of the model to effectively extract and predict taxonomic information across various taxonomic ranks, and to automatically correct spelling mistakes. However, we do report a small number of cases of fabricated information (hallucinations). As a result of the current lack of specialised, pre-trained ecological language models, general-purpose LLMs may provide a promising way forward in ecology. Combined with tailored prompt engineering, such models can be employed for a wide range of text mining tasks in ecology, with the potential to greatly reduce time spent on manual screening and labelling of the literature.

Language: English

Deep transfer learning for automatic speech recognition: Towards better generalization
Hamza Kheddar, Yassine Himeur, Somaya Al‐Maadeed et al.

Knowledge-Based Systems, Journal year: 2023, No. 277, p. 110851

Published: July 29, 2023

Language: English

Cited by: 63

GPTAraEval: A Comprehensive Evaluation of ChatGPT on Arabic NLP (Creative Commons)
Md Tawkat Islam Khondaker, Abdul Waheed, El Moatez Billah Nagoudi et al.

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Journal year: 2023, No. unknown

Published: Jan. 1, 2023

ChatGPT’s emergence heralds a transformative phase in NLP, particularly demonstrated through its excellent performance on many English benchmarks. However, the model’s efficacy across diverse linguistic contexts remains largely uncharted territory. This work aims to bridge this knowledge gap, with a primary focus on assessing ChatGPT’s capabilities on Arabic languages and dialectal varieties. Our comprehensive study conducts a large-scale automated and human evaluation of ChatGPT, encompassing 44 distinct language understanding and generation tasks on over 60 different datasets. To our knowledge, this marks the first extensive performance analysis of ChatGPT’s deployment in Arabic NLP. Our findings indicate that, despite its remarkable performance in English, ChatGPT is consistently surpassed by smaller models that have undergone finetuning on Arabic. We further undertake a meticulous comparison of ChatGPT and GPT-4’s Modern Standard Arabic (MSA) and Dialectal Arabic (DA), unveiling the relative shortcomings of both models in handling Arabic dialects compared to MSA. Although we further explore and confirm the utility of employing GPT-4 as a potential alternative for human evaluation, our work adds to a growing body of research underscoring the limitations of ChatGPT.

Language: English

Cited by: 27

Benchmarking, ethical alignment, and evaluation framework for conversational AI: Advancing responsible development of ChatGPT (Creative Commons)
Partha Pratim Ray

BenchCouncil Transactions on Benchmarks Standards and Evaluations, Journal year: 2023, No. 3(3), p. 100136

Published: Aug. 9, 2023

Conversational AI systems like ChatGPT have seen remarkable advancements in recent years, revolutionizing human–computer interactions. However, evaluating the performance and ethical implications of these systems remains a challenge. This paper delves into the creation of rigorous benchmarks, adaptable standards, and an intelligent evaluation methodology tailored specifically for ChatGPT. We meticulously analyze several prominent benchmarks, including GLUE, SuperGLUE, SQuAD, CoQA, Persona-Chat, DSTC, BIG-Bench, HELM and MMLU, illuminating their strengths and limitations. This paper also scrutinizes the existing ethical standards set by OpenAI, IEEE's Ethically Aligned Design, the Montreal Declaration, and Partnership on AI's Tenets, investigating their relevance to ChatGPT. Further, we propose adaptive standards that encapsulate ethical considerations, context adaptability, and community involvement. In terms of evaluation, we explore traditional methods like BLEU, ROUGE, METEOR, precision–recall, F1 score, perplexity, and user feedback, while also proposing a novel evaluation approach that harnesses the power of reinforcement learning. Our proposed evaluation framework is multidimensional, incorporating task-specific, real-world application, and multi-turn dialogue benchmarks. We perform a feasibility analysis, a SWOT analysis and an adaptability analysis of the proposed framework. The framework highlights the significance of user feedback, integrating it as a core component of evaluation alongside subjective assessments and interactive evaluation sessions. By amalgamating these elements, this paper contributes to the development of a comprehensive evaluation framework that fosters responsible and impactful advancement in the field of conversational AI.

Language: English

Cited by: 26

Utilizing BERT for Information Retrieval: Survey, Applications, Resources, and Challenges (Open Access)
Jiajia Wang, Jimmy Xiangji Huang, Xinhui Tu et al.

ACM Computing Surveys, Journal year: 2024, No. 56(7), pp. 1-33

Published: Feb. 15, 2024

Recent years have witnessed a substantial increase in the use of deep learning to solve various natural language processing (NLP) problems. Early deep learning models were constrained by their sequential or unidirectional nature, such that they struggled to capture the contextual relationships across text inputs. The introduction of bidirectional encoder representations from transformers (BERT) leads to a robust encoder for the transformer model that can understand the broader context and deliver state-of-the-art performance across various NLP tasks. This has inspired researchers and practitioners to apply BERT to practical problems, such as information retrieval (IR). A survey that focuses on a comprehensive analysis of prevalent approaches that apply pretrained transformer encoders like BERT to IR can thus be useful for academia and the industry. In light of this, we revisit a variety of BERT-based methods in this survey, cover a wide range of techniques of IR, and group them into six high-level categories: (i) handling long documents, (ii) integrating semantic information, (iii) balancing effectiveness and efficiency, (iv) predicting the weights of terms, (v) query expansion, and (vi) document expansion. We also provide links to resources, including datasets and toolkits, for BERT-based IR systems. Additionally, we highlight the advantages of employing encoder-based BERT models in contrast to recent large language models like ChatGPT, which are decoder-based and demand extensive computational resources. Finally, we summarize the comprehensive outcomes of the survey and suggest directions for future research in the area.

Language: English

Cited by: 15

Studying and improving reasoning in humans and machines (Creative Commons)
Nicolas Yax, Hernán Anlló, Stefano Palminteri et al.

Communications Psychology, Journal year: 2024, No. 2(1)

Published: June 3, 2024

In the present study, we investigate and compare reasoning in large language models (LLMs) and humans, using a selection of cognitive psychology tools traditionally dedicated to the study of (bounded) rationality. We presented to human participants and an array of pretrained LLMs new variants of classical cognitive experiments, and cross-compared their performances. Our results showed that most of the included models presented reasoning errors akin to those frequently ascribed to error-prone, heuristic-based human reasoning. Notwithstanding this superficial similarity, an in-depth comparison between humans and LLMs indicated important differences with human-like reasoning, with models' limitations disappearing almost entirely in more recent LLMs' releases. Moreover, we show that while it is possible to devise strategies to induce better performance, humans and machines are not equally responsive to the same prompting schemes. We conclude by discussing the epistemological implications and challenges of comparing human and machine behavior for both artificial intelligence and cognitive psychology.

Language: English

Cited by: 10

Transfer learning for language model adaptation (Creative Commons)
Mehwish Bari

Published: Jan. 1, 2023

Language is the pathway to democratize the boundary of land and culture. Bridging the gap between languages is one of the biggest challenges of Artificial Intelligence (AI) systems. The current success of AI systems is dominated by the supervised learning paradigm, where gradient-based learning algorithms (i.e., SGD, Adam) are designed to optimize complex high-dimensional planes. These algorithms learn from statistical observations that are typically collected with the intention of a specific task (i.e., product review, sentiment analysis). The use

Language: English

Cited by: 22

Evaluation of ChatGPT on Biomedical Tasks: A Zero-Shot Comparison with Fine-Tuned Generative Transformers (Creative Commons)
Israt Jahan, Md Tahmid Rahman Laskar, Chun Peng et al.

Published: Jan. 1, 2023

ChatGPT is a large language model developed by OpenAI. Despite its impressive performance across various tasks, no prior work has investigated its capability in the biomedical domain yet. To this end, this paper aims to evaluate the performance of ChatGPT on various benchmark biomedical tasks, such as relation extraction, document classification, question answering, and summarization. To the best of our knowledge, this is the first work that conducts an extensive evaluation of ChatGPT in the biomedical domain. Interestingly, we find based on our evaluation that in biomedical datasets that have smaller training sets, zero-shot ChatGPT even outperforms the state-of-the-art fine-tuned generative transformer models, such as BioGPT and BioBART. This suggests that ChatGPT’s pre-training on large text corpora makes it quite specialized even in the biomedical domain. Our findings demonstrate that ChatGPT has the potential to be a valuable tool for various tasks in the biomedical domain that lack large annotated data.

Language: English

Cited by: 20

Empirical Study of Zero-Shot NER with ChatGPT (Creative Commons)
Tingyu Xie, Qi Li, Jian Zhang et al.

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Journal year: 2023, No. unknown

Published: Jan. 1, 2023

Large language models (LLMs) exhibited powerful capability in various natural language processing tasks. This work focuses on exploring LLM performance on zero-shot information extraction, with a focus on ChatGPT and the named entity recognition (NER) task. Inspired by the remarkable reasoning capability of LLMs on symbolic and arithmetic reasoning, we adapt the prevalent reasoning methods to NER and propose reasoning strategies tailored for NER. First, we explore a decomposed question-answering paradigm by breaking down the NER task into simpler subproblems by labels. Second, we propose syntactic augmentation to stimulate the model's intermediate thinking in two ways: syntactic prompting, which encourages the model to analyze the syntactic structure itself, and tool augmentation, which provides the model with the syntactic information generated by a parsing tool. Besides, we adapt self-consistency to NER by proposing a two-stage majority voting strategy, which first votes for the most consistent mentions, then the most consistent types. The proposed methods achieve remarkable improvements for zero-shot NER across seven benchmarks, including Chinese and English datasets, and on both domain-specific and general-domain scenarios. In addition, we present a comprehensive analysis of the error types with suggestions for optimization directions. We also verify the effectiveness of our methods on the few-shot setting and other LLMs.

Language: English

Cited by: 17
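
The two-stage majority voting described in the abstract of Xie et al. can be illustrated with a minimal sketch. It assumes each sampled model response has already been parsed into a mapping from mention to entity type; the function name `two_stage_majority_vote`, the strict more-than-half threshold and the tie handling are illustrative assumptions, not details taken from the paper.

```python
from collections import Counter


def two_stage_majority_vote(samples):
    """Combine several sampled NER outputs (hypothetical helper).

    samples: list of dicts, one per sampled response, mapping
             mention string -> predicted entity type.
    Returns a dict mention -> type after two voting stages.
    """
    n = len(samples)

    # Stage 1: keep only mentions that appear in more than half of the
    # sampled responses (assumed threshold).
    mention_votes = Counter(m for sample in samples for m in sample)
    kept = [m for m, votes in mention_votes.items() if votes > n / 2]

    # Stage 2: for each kept mention, pick the type predicted most often.
    # Ties resolve to the type that was encountered first.
    result = {}
    for mention in kept:
        type_votes = Counter(s[mention] for s in samples if mention in s)
        result[mention] = type_votes.most_common(1)[0][0]
    return result


# Three hypothetical sampled responses for the same sentence.
example_samples = [
    {"Paris": "LOC", "Apple": "ORG"},
    {"Paris": "LOC", "Apple": "MISC"},
    {"Paris": "PER", "Apple": "ORG", "today": "MISC"},
]
print(two_stage_majority_vote(example_samples))
```

Voting on mentions before types means a span that only one sample proposed is discarded before its label is ever considered, which keeps spurious mentions from winning a type vote by default.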

A survey on recent advances in named entity recognition (Creative Commons)
Imed Keraghel, Stanislas Morbieu, Athman Bouguettaya et al.

arXiv (Cornell University), Journal year: 2024, No. unknown

Published: Jan. 1, 2024

Named Entity Recognition seeks to extract substrings within a text that name real-world objects and to determine their type (for example, whether they refer to persons or organizations). In this survey, we first present an overview of recent popular approaches, including advancements in Transformer-based methods and Large Language Models (LLMs) that have not had much coverage in other surveys. In addition, we discuss reinforcement learning and graph-based approaches, highlighting their role in enhancing NER performance. Second, we focus on methods designed for datasets with scarce annotations. Third, we evaluate the performance of the main NER implementations on a variety of datasets with differing characteristics (as regards their domain, their size, and their number of classes). We thus provide a deep comparison of algorithms that have never been considered together. Our experiments shed some light on how the characteristics of datasets affect the behavior of the methods we compare.

Language: English

Cited by: 8

Large language models help facilitate the automated synthesis of information on potential pest controllers (Creative Commons)
Daan Scheepens, Joseph Millard, Maxwell J. Farrell et al.

Methods in Ecology and Evolution, Journal year: 2024, No. 15(7), pp. 1261-1273

Published: May 20, 2024

The body of ecological literature, which informs much of our knowledge of the global loss of biodiversity, has been experiencing rapid growth in recent decades. The increasing difficulty of synthesising this literature manually has simultaneously resulted in a growing demand for automated text mining methods. Within the domain of deep learning, large language models (LLMs) have been the subject of considerable attention in recent years due to great leaps in progress and a wide range of potential applications; however, quantitative investigation into their potential in ecology has so far been lacking. In this work, we analyse the ability of GPT‐4 to extract information about invertebrate pests and pest controllers from abstracts of articles on biological pest control, using a bespoke, zero‐shot prompt. Our results show that the performance of GPT‐4 is highly competitive with other state‐of‐the‐art tools used for taxonomic named entity recognition and geographic location extraction tasks. On a held‐out test set, we show that species and geographic locations are extracted with F1‐scores of 99.8% and 95.3%, respectively, and highlight that the model can effectively distinguish between the roles of interest such as predators, parasitoids and pests. Moreover, we demonstrate the model's ability to effectively extract and predict taxonomic information across various taxonomic ranks. However, we do report a small number of cases of fabricated information (confabulations). Due to the lack of specialised, pre‐trained ecological language models, general‐purpose LLMs may provide a promising way forward in ecology. Combined with tailored prompt engineering, such models can be employed for a wide range of text mining tasks in ecology, with the potential to greatly reduce time spent on manual screening and labelling of the literature.

Language: English

Cited by: 7
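
For readers unfamiliar with the F1-scores quoted in the abstract of Scheepens et al., the metric is the harmonic mean of precision and recall. The sketch below computes it for a set of extracted entities against a hand-labelled gold set; the function name `precision_recall_f1`, the exact-match criterion and the species lists are illustrative assumptions, not the authors' evaluation code or data.

```python
def precision_recall_f1(predicted, gold):
    """Score extracted entities against a gold set (hypothetical helper).

    An extracted entity counts as correct only if it matches a gold
    entity exactly (assumed matching rule).
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example: three extracted species, four species in the gold labels.
extracted = ["Aphis gossypii", "Harmonia axyridis", "Coccinella septempunctata"]
labelled = ["Aphis gossypii", "Harmonia axyridis", "Myzus persicae", "Chrysoperla carnea"]
print(precision_recall_f1(extracted, labelled))
```

Because F1 is a harmonic mean, a high score requires both few spurious extractions and few missed entities; neither precision nor recall alone can compensate for the other.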