Benchmarking Retrieval-Augmented Generation for Medicine

Guangzhi Xiong, Qiao Jin, Zhiyong Lu et al.

2024 · 199 citations

While large language models (LLMs) have achieved state-of-the-art performance on a wide range of medical question answering (QA) tasks, they still face challenges with hallucinations and outdated knowledge. Retrieval-augmented generation (RAG) is a promising solution and has been widely adopted. However, a RAG system can involve multiple flexible components, and there is a lack of best practices regarding the optimal RAG setting for various medical purposes. To systematically evaluate such systems, we propose the Medical Information Retrieval-Augmented Generation Evaluation (MIRAGE), a first-of-i…

