PromptBio-Bench: Benchmarking LLM-based Bioinformatics Agents for End-to-End Data Analysis
Guo, W.; Zhang, M.; Han, B.; Ma, Y.; Leng, Y.; Hebbar, S.; Zhou, X.; Gu, W.; Yang, X.; Dhar, S.
Abstract

Large language model (LLM)-based agents hold transformative potential for automating bioinformatics workflows; however, systematic evaluations of their capabilities remain limited, hindering a clear assessment of their readiness for real-world application. We introduce PromptBio-Bench, a comprehensive evaluation suite of 194 expert-curated tasks spanning bioinformatics and data science at varied difficulty levels, together with an evaluation framework for structured file comparison and scoring against expert reference answers. Benchmarking three state-of-the-art agents revealed that Biomni and ToolsGenie achieved comparable performance, while accuracy declined markedly at higher difficulty levels across all agents. As foundation models and agent frameworks continue to evolve, PromptBio-Bench provides benchmark infrastructure for the community to systematically track the progress of agentic bioinformatics.
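The abstract describes scoring agent outputs by structured file comparison against expert reference answers. As a minimal illustrative sketch (not the paper's actual metric), one common approach is a cell-wise comparison of tabular output against a reference table, matching numeric cells within a relative tolerance and other cells exactly; the function name and scoring rule below are hypothetical:

```python
import csv
import io
import math

def score_structured_output(candidate_csv: str, reference_csv: str,
                            rel_tol: float = 1e-6) -> float:
    """Return the fraction of reference cells reproduced by the candidate.

    Numeric cells match within a relative tolerance; all other cells must
    match exactly after whitespace stripping. Missing rows or cells in the
    candidate count as mismatches. Illustrative sketch only.
    """
    cand_rows = list(csv.reader(io.StringIO(candidate_csv)))
    ref_rows = list(csv.reader(io.StringIO(reference_csv)))
    total = sum(len(row) for row in ref_rows)  # every reference cell is scored
    matched = 0
    for i, ref_row in enumerate(ref_rows):
        cand_row = cand_rows[i] if i < len(cand_rows) else []
        for j, ref_cell in enumerate(ref_row):
            if j >= len(cand_row):
                continue  # missing cell: counted in total, not in matched
            cand_cell = cand_row[j]
            try:
                if math.isclose(float(ref_cell), float(cand_cell),
                                rel_tol=rel_tol):
                    matched += 1
            except ValueError:
                # non-numeric cell: exact string comparison
                if ref_cell.strip() == cand_cell.strip():
                    matched += 1
    return matched / total if total else 0.0

# Example: a near-exact numeric value passes the tolerance check,
# while a genuinely different value is penalized.
reference = "gene,logFC\nTP53,1.5\nBRCA1,-0.2"
candidate = "gene,logFC\nTP53,1.5000001\nBRCA1,0.3"
score = score_structured_output(candidate, reference)  # 5 of 6 cells match
```

In practice a benchmark framework would layer format detection, column alignment, and per-task scoring rules on top of a primitive like this; the sketch only shows the core comparison step.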