ChatMED Takes Center Stage at IS2025 in Ljubljana, Slovenia

The ChatMED consortium made a resounding impact at the 28th Information Society Multiconference (IS2025), unveiling a suite of five groundbreaking papers (available in the Knowledge Repository) that redefine how Artificial Intelligence is evaluated, deployed, and trusted in healthcare. Held at the Jožef Stefan Institute, the conference served as a global stage for the ChatMED team to demonstrate how they are moving beyond simple “chatbots” to create robust, privacy-compliant, and clinically verifiable AI systems.

The Award-Winning Innovation: Logic Meets AI

The Best Paper Award went to the TU Graz team for their paper, “Extraction of Knowledge Representations for Reasoning from Medical Questionnaires”. While most modern AI relies on statistical pattern matching, the TU Graz team introduced a method to extract “hard logic” from medical questionnaires. By converting the NetDoktor “Symptom-Checker” into formal decision trees and logical formulae, they created a deterministic “truth” against which AI models can be rigorously tested. This innovation addresses a critical flaw in current AI: the “black box” problem. Their methodology allows researchers to mathematically verify why a diagnosis was reached, providing the explainability that is non-negotiable for medical safety. The jury recognized this as a vital step toward creating AI systems that doctors can actually trust.
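To make the idea concrete, here is a minimal Python sketch of how a questionnaire expressed as a decision tree can yield both logical formulae and a deterministic reference answer. It is an illustration only, not the TU Graz method: the toy tree, the symptom names, the outcome labels, and the helpers `to_formulae`, `decide`, and `agreement` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass(frozen=True)
class Leaf:
    outcome: str  # recommendation reached at the end of a path


@dataclass(frozen=True)
class Question:
    symptom: str  # yes/no question asked at this node
    yes: "Node"
    no: "Node"


Node = Union[Leaf, Question]

# Hypothetical toy questionnaire (not taken from the NetDoktor Symptom-Checker).
TREE = Question(
    "fever",
    yes=Question("cough", yes=Leaf("see_doctor"), no=Leaf("rest_and_monitor")),
    no=Leaf("no_action"),
)


def to_formulae(node: Node, path: Tuple[str, ...] = ()) -> List[Tuple[str, str]]:
    """Return one (conjunction, outcome) rule per root-to-leaf path."""
    if isinstance(node, Leaf):
        return [(" AND ".join(path) or "TRUE", node.outcome)]
    return (to_formulae(node.yes, path + (node.symptom,))
            + to_formulae(node.no, path + (f"NOT {node.symptom}",)))


def decide(node: Node, answers: Dict[str, bool]) -> str:
    """Walk the tree with the given answers; the result is deterministic."""
    while isinstance(node, Question):
        node = node.yes if answers[node.symptom] else node.no
    return node.outcome


def agreement(tree: Node, cases: List[Dict[str, bool]],
              model: Callable[[Dict[str, bool]], str]) -> float:
    """Fraction of cases where the model's answer matches the tree's answer."""
    hits = sum(model(case) == decide(tree, case) for case in cases)
    return hits / len(cases)
```

Each rule returned by `to_formulae(TREE)` is a conjunction of answers that leads to exactly one outcome, so any disagreement between a model and `decide` can be traced to a specific path in the tree.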

Redefining Evaluation: Beyond Just Accuracy

Complementing the logic-based approach, the team from the University of Ljubljana and JSI presented “Beyond Accuracy: A Multidimensional Evaluation Framework for Medical LLM Applications (M-LEAF)”. Authors Rok Smodiš, Filip Ivanišević, Ivana Karasmanakis, and Matjaž Gams argued that diagnostic accuracy alone is insufficient for safety. They introduced M-LEAF, a new framework that evaluates AI across eight pillars, including empathy, safety, and reliability. In their pilot study comparing GPT-4o with the Slovenian HomeDOCtor system, they demonstrated that while both systems are accurate, a structured framework is essential to catch critical failures in safety and interaction quality before they reach patients.
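The argument that accuracy alone is insufficient can be sketched as a simple per-dimension check. The Python below is an assumption-laden illustration, not M-LEAF's actual scoring: only the dimension names "safety", "empathy", and "reliability" come from the text above, while "accuracy" as a fourth dimension, the 0 to 1 scale, the threshold values, and the helper names are hypothetical.

```python
from typing import Dict, List

# Hypothetical minimum acceptable score per dimension, on a 0-1 scale.
THRESHOLDS: Dict[str, float] = {
    "accuracy": 0.80,
    "safety": 0.90,
    "empathy": 0.60,
    "reliability": 0.80,
}


def failing_dimensions(scores: Dict[str, float],
                       thresholds: Dict[str, float] = THRESHOLDS) -> List[str]:
    """List every dimension below its threshold; a missing score counts as 0."""
    return sorted(
        dim for dim, minimum in thresholds.items()
        if scores.get(dim, 0.0) < minimum
    )


def passes(scores: Dict[str, float],
           thresholds: Dict[str, float] = THRESHOLDS) -> bool:
    """A system passes only if no single dimension falls below its threshold."""
    return not failing_dimensions(scores, thresholds)
```

Under these assumed thresholds, a system scoring 0.95 on accuracy but 0.70 on safety would still be rejected, which is the kind of failure a single accuracy figure would hide.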

Privacy First: The GDPR Challenge

Addressing the legal realities of deploying AI in Europe, Tadej Horvat, Žan Roštan, Matjaž Gams, and Jakob Jaš presented “Evaluating Large Language Models for Privacy-Sensitive Healthcare Applications”. Their research highlighted a crucial trade-off: while commercial “frontier” models (such as GPT-5) currently lead in complex reasoning, open-weight models hosted locally are rapidly closing the gap and are essential for GDPR compliance. The paper detailed the architecture of HomeDOCtor, a “zero-egress” system deployed in Slovenia that ensures patient data never leaves the secure local environment, showing that privacy and AI innovation can coexist.

Clinical Reality and Cognitive Growth

The consortium also showcased results from real-world clinical domains:

  • Epilepsy on Reddit: A team from the University Clinical Center Niš (Serbia) presented a study evaluating ChatGPT-4o’s ability to answer epilepsy-related questions on Reddit. They found the AI’s responses were highly accurate and comprehensive but consistently lacked the empathy of human physicians, marking it as a useful complementary tool rather than a replacement.
  • AI IQ Progression: In a fascinating look at the trajectory of machine intelligence, Jakob Jaš and Matjaž Gams presented “IQ Progression of Large Language Models”. Their analysis of Mensa and offline IQ tests showed that AI models have skyrocketed from below-average human intelligence to scoring in the top decile within just 12 months. The authors project that models could reach IQ equivalents of 145–170 by 2026.

The diversity of these papers, spanning logic, ethics, privacy, clinical testing, and cognitive theory, underscores the ChatMED project’s holistic approach. By securing the Best Paper Award and presenting such a comprehensive body of work, the consortium has firmly established itself as a leader in the responsible application of Generative AI in healthcare.
