Current Opinion in Plant Biology, Journal year: 2024, No. 82, p. 102665
Published: Nov. 22, 2024
Language: English
Chemical Society Reviews, Journal year: 2024, No. unknown
Published: Dec. 20, 2024
Large language models (LLMs) allow for the extraction of structured data from unstructured sources, such as scientific papers, with unprecedented accuracy and performance.
Language: English
Cited by
2
bioRxiv (Cold Spring Harbor Laboratory), Journal year: 2024, No. unknown
Published: Nov. 6, 2024
Abstract. Premise: Recently, plant science has seen transformative advances in scalable data collection for sequence and chemical data. These large datasets, combined with machine learning, have revealed that conducting metabolic research on large scales yields remarkable insights. A key next step in increasing scale has been the advent of accessible language models, which, even in their early stages, can distill structured data from the literature. This brings us closer to creating specialized databases that consolidate virtually all published knowledge on a topic. Methods: Here, we first test different prompt engineering technique / model combinations for the identification of validated enzyme-product pairs. Next, we evaluate automated retrieval-augmented generation applied to identifying compound-species associations. Finally, we build and determine the accuracy of a multimodal model-based pipeline that transcribes images of tables into machine-readable formats. Results: When tuned for each specific task, these methods perform with high accuracies (80-90 percent for enzyme-product pair identification and table image transcription), or with modest accuracies (50 percent) but lower false-negative rates than previous methods (down to 40 percent from 55 percent) for compound-species pair identification. Discussion: We enumerate several suggestions for working with language models as researchers, among which is the importance of the user’s domain-specific expertise and knowledge. Significance Statement: Scientific databases have played a major role in advancing research. However, today’s advanced databases are incomplete and/or not built to best suit certain tasks. We explored and evaluated the use of various techniques to expand a subset of existing databases in task-specific ways. Our results illustrate the potential for high-accuracy additions and restructurings using assuming by used task. Our findings are important because they outline a method that could greatly expand databases and rapidly tailor them to research efforts, leading to greater productivity and more effective utilization of past findings. All authors collected data, analyzed data, prepared the manuscript, and approved its final version. The authors declare no competing interests.
Language: English
Cited by
1
Current Opinion in Plant Biology, Journal year: 2024, No. 82, p. 102665
Published: Nov. 22, 2024
Language: English
Cited by
0