
Extraction of Knowledge Representations for Reasoning from Medical Questionnaires

Many people have tried an online symptom checker: it asks a series of questions (“Where does it hurt?”, “Do you have a fever?”) and then suggests a short list of possible diagnoses.

But there is a problem if we want to trust and test such systems, especially modern AI chatbots and large language models, whose behaviour can be unpredictable. We need a curated reference: a knowledge base that is consistent and reproducible. The paper behind this post proposes a practical way to build exactly that – a medical knowledge base extracted from a certified source.

The idea behind the paper is to treat each questionnaire from NetDoktor like a giant decision tree, then convert every path through that tree into a clean “if these facts are true, then these diagnoses apply” rule.

Every questionnaire from NetDoktor follows the same structure:

  • You start at the first question (what we call the root).
  • Each answer sends you to the next question (a branch).
  • Eventually you reach an end screen showing suggested diagnoses (the leaves).

If we systematically explore all possible answer combinations, we can reconstruct the full decision tree hidden inside the questionnaire.
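To make this concrete, here is a minimal sketch in Python of that exploration. The Question/Leaf classes and enumerate_paths are illustrative names of ours, not the paper's:

    from dataclasses import dataclass, field

    @dataclass
    class Leaf:
        """End screen: the diagnoses suggested after one complete answer path."""
        diagnoses: list

    @dataclass
    class Question:
        """Inner node: the question text plus one child per possible answer."""
        text: str
        children: dict = field(default_factory=dict)  # answer -> Question or Leaf

    def enumerate_paths(node, path=()):
        """Depth-first walk yielding every (question, answer) path with its
        diagnoses, i.e. the systematic exploration of all answer combinations."""
        if isinstance(node, Leaf):
            yield list(path), node.diagnoses
        else:
            for answer, child in node.children.items():
                yield from enumerate_paths(child, path + ((node.text, answer),))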

The questions and answers are written for humans, not computers. An important next step is therefore to translate the natural-language text into logical statements, for example:

  • “Is the skin reddened (in places)?” => a logical statement like reddened_skin or not_reddened_skin, depending on the user’s answer.
  • “Select the most affected area” => a single variable; the chosen option becomes the fact (e.g., head, neck, etc.)

The paper distinguishes two common styles of questions:

  • Type 1 (open-ended): the answer itself becomes the fact, and
  • Type 2 (closed-ended): the question defines the fact, and the answer decides whether it is true or false.
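A rough sketch of that translation step follows; the slug() helper and the numeric type labels are our own simplification, and the paper's actual fact names (like reddened_skin) are presumably curated rather than auto-generated:

    import re

    def slug(text):
        """Normalize free text into a variable-like identifier."""
        return re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")

    def to_fact(question, answer, qtype):
        if qtype == 1:             # open-ended: the answer itself becomes the fact
            return slug(answer)    # "Head" -> "head"
        if qtype == 2:             # closed-ended: the question names the fact,
            fact = slug(question)  # the answer fixes its truth value
            return fact if answer.lower() == "yes" else "not_" + fact
        raise ValueError(f"unknown question type: {qtype}")

    print(to_fact("Select the most affected area", "Head", 1))  # head
    print(to_fact("Is the skin reddened (in places)?", "No", 2))
    # not_is_the_skin_reddened_in_places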

Once we have the tree and the extracted logical statements, we can generate rules by walking from the root to every leaf:

  • Everything you answered along the way becomes the premise (the “IF” part).
  • The diagnoses at the leaf become the conclusion (the “THEN” part).

An illustrative example: (i) if someone has nausea and stomach ache and also (fever or diarrhea), then one possible diagnosis is acute gastroenteritis; (ii) if someone has nausea and stomach ache but no fever and no diarrhea, then one possible diagnosis is gastritis.

We depict example (i) as a decision tree.
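For readers who prefer code, the same tree can be written down and walked directly. The fact names and the rules() walker below are our own shorthand, covering both example (i) and example (ii):

    tree = ("nausea", {
        True: ("stomach_ache", {
            True: ("fever", {
                True: ["acute gastroenteritis"],
                False: ("diarrhea", {
                    True: ["acute gastroenteritis"],
                    False: ["gastritis"],
                }),
            }),
            # the questionnaire's other branches are omitted here
        }),
    })

    def rules(node, premise=()):
        """Walk root to leaf, collecting signed facts (the IF part) on the way."""
        if isinstance(node, list):       # reached a leaf: the THEN part
            yield premise, node
            return
        fact, branches = node
        for value, child in branches.items():
            literal = fact if value else "not_" + fact
            yield from rules(child, premise + (literal,))

    for premise, diagnoses in rules(tree):
        print("IF", " and ".join(premise), "THEN", ", ".join(diagnoses))
    # IF nausea and stomach_ache and fever THEN acute gastroenteritis
    # IF nausea and stomach_ache and not_fever and diarrhea THEN acute gastroenteritis
    # IF nausea and stomach_ache and not_fever and not_diarrhea THEN gastritis

Note that the first two rules together are exactly example (i)'s “fever or diarrhea” disjunction: a tree has no “or” nodes, so a disjunction shows up as several paths ending in the same diagnosis.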

Doing this at scale turns the whole questionnaire into a structured knowledge base that a computer can reason with.

A logic-based knowledge base is useful because it is:

  • Explainable: you can trace exactly which answers led to which diagnoses.
  • Deterministic: the same inputs always give the same outputs.
  • Queryable: you can ask targeted questions like “which diagnoses are possible if we only know these symptoms?”
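A naive version of such a query over the three rules from above (the function names are ours): keep every rule whose premise is still satisfiable given what is known, and collect its diagnoses:

    RULES = [  # the three rules extracted in the sketch above
        (("nausea", "stomach_ache", "fever"), {"acute gastroenteritis"}),
        (("nausea", "stomach_ache", "not_fever", "diarrhea"), {"acute gastroenteritis"}),
        (("nausea", "stomach_ache", "not_fever", "not_diarrhea"), {"gastritis"}),
    ]

    def consistent(literal, known_true, known_false):
        """A literal is ruled out only if the observation contradicts it."""
        if literal.startswith("not_"):
            return literal[4:] not in known_true
        return literal not in known_false

    def possible_diagnoses(rule_set, known_true, known_false):
        """Union of diagnoses over all rules whose premise may still hold."""
        hits = set()
        for premise, diagnoses in rule_set:
            if all(consistent(lit, known_true, known_false) for lit in premise):
                hits |= diagnoses
        return hits

    # "Which diagnoses are possible if we only know nausea and stomach ache,
    # and that there is no fever?" -> both diagnoses remain possible.
    print(possible_diagnoses(RULES, {"nausea", "stomach_ache"}, {"fever"}))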

One especially interesting application in the paper is testing modern AI systems: use this rule-based system as an oracle to evaluate black-box models like LLMs on the same diagnostic tasks.
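A hedged sketch of that oracle idea; ask_model is a hypothetical stand-in for whatever interface the model under test exposes, and nothing below reproduces the paper's actual experimental setup:

    def expected_diagnoses(rule_set, case):
        """The oracle: diagnoses of the rule matching a fully specified case
        (a set of literals such as {"nausea", "stomach_ache", "not_fever", ...})."""
        for premise, diagnoses in rule_set:
            if all(lit in case for lit in premise):
                return set(diagnoses)
        return set()

    def oracle_accuracy(rule_set, cases, ask_model):
        """Share of cases where the black-box model agrees with the oracle.
        ask_model(case) -> set of diagnoses is a hypothetical interface."""
        hits = sum(ask_model(case) == expected_diagnoses(rule_set, case)
                   for case in cases)
        return hits / len(cases)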

The paper highlights several challenges and next steps, such as:

  • Handling cases where multiple different paths lead to the same diagnosis.
  • Removing redundancy when similar subtrees repeat.
  • Converting large trees into more efficient internal formats so you can compute things faster, for example counting how many symptom combinations lead to each diagnosis (a brute-force version of that counting is sketched after this list).
  • Going beyond “what diagnosis was suggested” to “what were the sufficient and necessary reasons behind that suggestion” – which can help analyse potential bias in the underlying knowledge base.
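To see why the third point matters, here is the brute-force version of the counting task, run over the toy rules from earlier. It enumerates every truth assignment, which is exponential in the number of symptoms and is exactly what better internal formats would avoid:

    from collections import Counter
    from itertools import product

    VARS = ["nausea", "stomach_ache", "fever", "diarrhea"]
    RULES = [
        (("nausea", "stomach_ache", "fever"), "acute gastroenteritis"),
        (("nausea", "stomach_ache", "not_fever", "diarrhea"), "acute gastroenteritis"),
        (("nausea", "stomach_ache", "not_fever", "not_diarrhea"), "gastritis"),
    ]

    counts = Counter()
    for values in product([True, False], repeat=len(VARS)):
        case = {v if t else "not_" + v for v, t in zip(VARS, values)}
        for premise, diagnosis in RULES:
            if all(lit in case for lit in premise):
                counts[diagnosis] += 1

    print(counts)  # 3 combinations for acute gastroenteritis, 1 for gastritis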

The paper’s contribution is a pipeline for extracting expert-curated medical knowledge from an existing symptom-checker and transforming it into a clean, explainable, machine-reasonable set of rules.

In a world where medical AI is getting more powerful and sometimes less transparent, having a method to build understandable, testable knowledge bases from real expert sources is a big step toward safer and more accountable systems.

The full paper is available in the IS 2025 conference proceedings: https://chatmed-project.eu/knowledge-repository/conference-papers-is2025-ljubljana-slovenia/.
