PubMed ID: 41999903
Author(s): Domalpally A, Chew EY, Eydelman MB, Keenan TDL, Keane PA, Lee AY, Lee CS, Lad EM, Lim JI, Lowenstein A, Schmidt-Erfurth U, Abramoff MD; Collaborative Community for Ophthalmic Imaging Executive Committee and the Working Group for Artificial Intelligence in Age-related Macular Degeneration. Reference Standard for Validation of Age-Related Macular Degeneration Screening Algorithms. Ophthalmology. 2026 Apr 16:S0161-6420(26)00282-4. doi: 10.1016/j.ophtha.2026.04.013. Online ahead of print. PMID 41999903
Journal: Ophthalmology, Apr 2026
PURPOSE: Artificial intelligence (AI)-based screening models hold promise for identifying individuals with undiagnosed age-related macular degeneration (AMD) in non-specialist settings. A standardized reference framework for image labeling is needed to enable consistent training, validation, and deployment of AI-based screening algorithms. The goal of the present study is to establish expert consensus on an image-based reference standard for labeling AMD.
DESIGN: Modified Delphi consensus study.
SUBJECTS/PARTICIPANTS: Fellowship-trained retina specialists, ophthalmologists, AI specialists, and imaging specialists.
METHODS: A prespecified Delphi process was conducted using structured surveys. Over two rounds, panelists rated existing reference standards, including the AREDS scale and the Beckman scale, as well as imaging modalities such as color fundus photography, optical coherence tomography (OCT), and autofluorescence. The surveys also covered imaging features of AMD, including drusen, pseudodrusen, and pigment changes, as well as referral criteria. Consensus was defined using a 9-point Likert scale, with predefined statistical thresholds for agreement.
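The abstract does not state the statistical thresholds that were used. As a minimal sketch of how a rule of this kind can be written, the Python below assumes a RAND/UCLA-style convention: the median on the 9-point scale picks a band (1-3, 4-6, 7-9), and an item is downgraded to "uncertain" when too many ratings fall outside that band. The function name `classify_consensus`, the band boundaries, and the 30% dispersion cutoff are illustrative assumptions, not the study's actual criteria.

```python
from statistics import median


def classify_consensus(ratings, max_outside_fraction=0.3):
    """Classify one Delphi item from panelists' 1-9 Likert ratings.

    Assumed rule (NOT taken from the study): the median selects a band,
    and the item counts as consensus only if at most `max_outside_fraction`
    of the ratings fall outside that band.
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < 1 or r > 9 for r in ratings):
        raise ValueError("ratings must lie on the 9-point scale (1-9)")

    med = median(ratings)
    if med >= 7:
        low, high, label = 7, 9, "agreement"
    elif med <= 3:
        low, high, label = 1, 3, "disagreement"
    else:
        # A median in the middle of the scale is treated as uncertain.
        return med, "uncertain"

    # Count ratings that fall outside the band containing the median.
    outside = sum(1 for r in ratings if not low <= r <= high)
    if outside / len(ratings) > max_outside_fraction:
        return med, "uncertain"
    return med, label


if __name__ == "__main__":
    print(classify_consensus([8, 8, 9, 7, 8, 9, 8, 7]))
    print(classify_consensus([8, 8, 9, 2, 3, 9, 8, 4, 8, 5]))
```

Under these assumed thresholds, a high median alone is not enough: a panel with a median of 8 but widely scattered ratings is still classified as uncertain.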
MAIN OUTCOME MEASURES: Agreement on key elements of a reference standard.
RESULTS: Consensus was reached on adopting the Beckman Classification as the level 1 reference standard (median score 8; agreement). The use of OCT to identify key AMD features, including drusen, geographic atrophy (GA), and choroidal neovascularization (CNV), also reached consensus (median scores 8.5-9; agreement). Pigment change detection did not reach consensus (median 7.5; uncertain), and screening age thresholds likewise did not reach consensus (median 8; uncertain). Referral thresholds reached consensus, including urgent referral for neovascular AMD and non-urgent referral for GA and intermediate AMD (median 9; agreement).
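The referral thresholds reported above can be written as a small lookup table, which is roughly how a screening pipeline might consume them. This is a sketch only: the names `AmdLabel`, `Referral`, `referral_for`, and `most_urgent` are hypothetical, and taking the most urgent category when an eye carries several labels is an assumption rather than something the abstract states. Labels the abstract does not address return `None`.

```python
from enum import Enum


class AmdLabel(Enum):
    NEOVASCULAR = "neovascular AMD"
    GEOGRAPHIC_ATROPHY = "geographic atrophy"
    INTERMEDIATE = "intermediate AMD"


class Referral(Enum):
    NON_URGENT = 1
    URGENT = 2


# Referral categories as reported in the abstract's results.
REFERRAL_BY_LABEL = {
    AmdLabel.NEOVASCULAR: Referral.URGENT,
    AmdLabel.GEOGRAPHIC_ATROPHY: Referral.NON_URGENT,
    AmdLabel.INTERMEDIATE: Referral.NON_URGENT,
}


def referral_for(label):
    """Return the referral category for one label, or None if not covered."""
    return REFERRAL_BY_LABEL.get(label)


def most_urgent(labels):
    """Assumption: with several labels, the most urgent referral applies."""
    referrals = [r for r in map(referral_for, labels) if r is not None]
    return max(referrals, key=lambda r: r.value) if referrals else None


if __name__ == "__main__":
    print(referral_for(AmdLabel.NEOVASCULAR))
    print(most_urgent([AmdLabel.INTERMEDIATE, AmdLabel.GEOGRAPHIC_ATROPHY]))
```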
CONCLUSIONS: This study defines a consensus-based reference standard for labeling AMD from images for AI-based screening. These recommendations are intended to support consistent AI model development and evaluation, while remaining distinct from clinical practice guidelines.
Copyright © 2026 American Academy of Ophthalmology, Inc. All rights reserved.