A group of health systems is pitting artificial intelligence models against one another to see which are most effective for clinicians.
On Wednesday, Boston-based health system Mass General Brigham, Atlanta-based Emory Healthcare, Madison-based University of Wisconsin School of Medicine and Public Health, University of Washington School of Medicine’s Department of Radiology and industry group American College of Radiology launched an interactive challenge where clinicians can compare and rank how different AI models perform in clinical settings.
The group is calling the initiative the Healthcare AI Challenge Collaborative. Participating clinicians from those organizations will initially evaluate radiology AI models, but Mass General Brigham Chief Data Science Officer Dr. Keith Dreyer said there are plans to expand to other specialties.
The challenge is the latest example of health systems' desire to better understand the impact of AI. Many providers are seeking evidence on how AI can improve clinical outcomes and reduce burnout while coming to grips with the costs associated with the technology.
The goal is to create an arena environment to evaluate how models perform on real cases seen by clinicians, Dreyer said. He said the group plans an additional three or four competitions by early next year that will focus on pathology and clinical notetaking.
Dreyer said the challenge will give developers the ability to understand potential deficiencies in their models and engage with hard-to-reach subject matter experts.
The virtual challenge will allow clinicians to view outputs from nine commercial AI models, including those from tech vendors OpenAI, Microsoft and Amazon. Dreyer said the goal is to publicly rank the models' performance by the end of the year.
Participating clinicians in the first challenge will view a model's output on a de-identified patient image and then rank each model's draft report, differential diagnosis, findings and care recommendations. Dreyer said the process will give model developers valuable access to clinician feedback while supporting clearer conclusions on AI efficacy. Clinicians will not be able to see which AI model performed the analysis until after each has been ranked.
The announcement comes as the industry attempts to establish guardrails around AI in the absence of comprehensive federal policy. The Coalition for Health AI, a stakeholder group with about 3,000 members including many health systems, is among those seeking to create guidelines on responsible AI use.