End-to-End Radiology Report Generation From Chest X-Rays Using Vision–Language Models

Authors: G. Vennila, Dinesh Kumar K., S. Anitha et al.

Publication: Advances in Computational Intelligence and Robotics, Medical LLMs and AI in Healthcare

Published: Jun 12, 2026

Source: Crossref

Abstract

The generation of diagnostic reports from chest X-ray images is a complex task that requires both accurate medical image interpretation and clear clinical language. Manual reporting is time-consuming, expertise-intensive, and susceptible to fatigue. Although deep learning–based systems have been developed to support this process, most traditional approaches treat image analysis and report generation as separate tasks, leading to inconsistencies and loss of critical details. Recent vision–language models offer a more unified solution by linking visual understanding with natural language generation. This chapter proposes an end-to-end framework that integrates a Swin Transformer image encoder with a Q-Former and a fine-tuned BioMedLM for producing accurate and clinically meaningful reports. Experiments on a curated chest X-ray dataset demonstrate improved pathology coverage, reduced errors, and enhanced report clarity, supporting reliable and efficient clinical decision-making.

Keywords

generation image reports chest xray
