Notebook 6 — Reasoning and SPARQL over Mary's record
What the previous five notebooks bought us
By this point Mary's clinical record is a working OWL ontology — dist/mie-05.owl — with 48 classes, 66 individuals, and zero locally-declared properties. Every assertion was made through SULO's own vocabulary, with the PRO extension for the role taxonomy, and the reasoner has been consulted along the way to confirm consistency and derive classifications.
This notebook makes no new assertions. Its purpose is to ask five concrete clinical questions of the assembled ontology and watch the modelling decisions of NB1–NB5 pay off — each query reuses a different piece of the machinery, and each piece earns its place by being load-bearing for at least one query.
| Question | What it exercises |
|---|---|
| 1. Reconstruct Mary's clinical timeline in order. | NB1 — instance-level sulo:precedes + SPARQL property path precedes* |
| 2. Which diagnoses about Mary's disease have been confirmed? | NB5 — diagnosis triangle + value restriction defined class |
| 3. Which BP readings crossed the hypertension threshold? | NB4 — constrained-datatype defined class |
| 4. List every clinical feature recorded on Mary's biopsy specimen. | NB4 — qualities and quantities attached via hasFeature |
| 5. Merge the FHIR view of Mary with the MIE view. | NB5 — owl:sameAs cross-system identity |
Learning objectives
- Query the ontology with SPARQL 1.1: SELECT, COUNT(DISTINCT …), property paths (* / +), multi-hop traversal, ORDER BY
- Distinguish asserted answers (what was stated directly) from inferred answers (what classification produced), and recognise that for defined classes, Class.instances() and SPARQL ?x a Class return the inferred set after reasoning
- Read OWA-respecting queries (what the reasoner will derive vs. what it cannot under open-world semantics) and adjust query design accordingly
- See, in one place, that the constraint we held through five notebooks — no new properties added to the MIE ontology — does not prevent expressive clinical querying; SULO's vocabulary is enough
Setting up
We reload the final ontology checkpoint and run the reasoner once. After this, the defined-class memberships are materialised, and queries below can rely on the reasoner-extended graph.
import sys, os
for _p in ['.', '..', '../..']:
    if os.path.isdir(os.path.join(_p, 'lib')):
        os.chdir(_p); sys.path.insert(0, os.getcwd()); break
from lib.helpers import *
onto_path.append("dist")
sulo = get_ontology("dist/sulo.owl").load()
pro = get_ontology("dist/pro.owl").load()
mie = get_ontology("dist/mie-05.owl").load()
result = safe_call_reasoner(mie)
print(f"Reasoner ok: {result['ok']}")
print(f"Inconsistent classes: {result['inconsistent']}")
print(f"MIE classes: {len(list(mie.classes()))}")
print(f"MIE individuals: {len(list(mie.individuals()))}")
print(f"MIE object properties (local): {len(list(mie.object_properties()))} ← still zero")
Reasoner ok: True
Inconsistent classes: []
MIE classes: 53
MIE individuals: 67
MIE object properties (local): 0 ← still zero
Q1 — Mary's clinical timeline, in order
From NB1 we asserted eight sulo:precedes edges chaining Mary's nine clinical events from the Feb 18 visit through to the Sep 30 follow-up. SPARQL 1.1 property paths let us traverse the chain at query time without needing transitive closure on the property itself: precedes* matches a path of zero or more hops, so binding the LHS to the starting event recovers the full forward closure plus the starting node.
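We then pull each event's atTime and the hasValue literal on that time, and ORDER BY it. The result is a chronologically sorted list of every event in Mary's odyssey, recovered from eight asserted edges and one SPARQL query. An event that carries more than one time value yields one row per value, so the row count can exceed the number of events.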
rows = list(default_world.sparql("""
PREFIX sulo: <https://w3id.org/sulo/>
PREFIX mie: <https://w3id.org/ontostart/mie/>
SELECT ?proc ?when WHERE {
mie:mary_visit_feb18 sulo:precedes* ?proc .
?proc sulo:atTime ?t .
?t sulo:hasValue ?when .
} ORDER BY ?when
"""))
print(f"Mary's clinical timeline — {len(rows)} (event, time) pairs\n")
for proc, when in rows:
    print(f" {str(when):20s} {proc.name}")
Mary's clinical timeline — 11 (event, time) pairs

2026-02-18 09:00:00  mary_visit_feb18
2026-02-18 10:30:00  mary_visit_feb18
2026-02-20 11:00:00  mary_ultrasound_feb20
2026-02-22 14:00:00  mary_diag_feb22_preliminary
2026-02-25 09:30:00  mary_biopsy_feb25
2026-03-01 12:00:00  mary_histo_mar01
2026-03-01 16:00:00  mary_diag_mar01_confirmed
2026-03-10 08:00:00  mary_chemo_2026
2026-06-15 17:00:00  mary_chemo_2026
2026-07-01 08:00:00  mary_lumpectomy_jul01
2026-09-30 10:00:00  mary_followup_sep30
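For intuition, the zero-or-more path precedes* behaves like a reachability walk that includes the starting node. The sketch below mimics that walk in plain Python. It is not how owlready2 evaluates the query: the CHAIN, PRECEDES and TIMES structures are hand-copied stand-ins built from the output above, and the strictly linear ordering of the eight edges is an assumption.

```python
from datetime import datetime

# Assumed linear chain behind the eight asserted precedes edges (illustrative copy).
CHAIN = [
    "mary_visit_feb18", "mary_ultrasound_feb20", "mary_diag_feb22_preliminary",
    "mary_biopsy_feb25", "mary_histo_mar01", "mary_diag_mar01_confirmed",
    "mary_chemo_2026", "mary_lumpectomy_jul01", "mary_followup_sep30",
]
PRECEDES = {a: [b] for a, b in zip(CHAIN, CHAIN[1:])}

# Time values per event, copied from the query output; two events carry two values.
TIMES = {
    "mary_visit_feb18": ["2026-02-18 09:00:00", "2026-02-18 10:30:00"],
    "mary_ultrasound_feb20": ["2026-02-20 11:00:00"],
    "mary_diag_feb22_preliminary": ["2026-02-22 14:00:00"],
    "mary_biopsy_feb25": ["2026-02-25 09:30:00"],
    "mary_histo_mar01": ["2026-03-01 12:00:00"],
    "mary_diag_mar01_confirmed": ["2026-03-01 16:00:00"],
    "mary_chemo_2026": ["2026-03-10 08:00:00", "2026-06-15 17:00:00"],
    "mary_lumpectomy_jul01": ["2026-07-01 08:00:00"],
    "mary_followup_sep30": ["2026-09-30 10:00:00"],
}


def reachable(start, edges):
    """Nodes reachable from start via zero or more hops (start included)."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(edges.get(node, []))
    return seen


def timeline(start):
    """One (time, event) row per time value, sorted chronologically."""
    rows = [
        (datetime.fromisoformat(t), event)
        for event in reachable(start, PRECEDES)
        for t in TIMES.get(event, [])
    ]
    return sorted(rows)


rows = timeline("mary_visit_feb18")
print(len(rows), "rows for", len({event for _, event in rows}), "events")
```

Starting the walk from a later event, for example timeline("mary_biopsy_feb25"), returns only that event and its successors, which mirrors what changing the subject of precedes* does in the query.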
Q2 — Confirmed diagnoses about Mary's disease
From NB5 we set up the diagnosis triangle: DiagnosisStatement (InformationObject) refersTo BreastCancer (Process). The class ConfirmedDiagnosis was defined as DiagnosisStatement & hasFeature value confirmed_status. Mary has two diagnosis statements — preliminary (Feb 22) and confirmed (Mar 1) — and only the latter satisfies the value restriction.
After reasoning, mary_dx_statement_mar01 is a member of ConfirmedDiagnosis. The SPARQL query asks for any statement of that type, and recovers the disease individual it refers to:
rows = list(default_world.sparql("""
PREFIX sulo: <https://w3id.org/sulo/>
PREFIX mie: <https://w3id.org/ontostart/mie/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?stmt ?disease WHERE {
?stmt rdf:type mie:ConfirmedDiagnosis .
?stmt sulo:refersTo ?disease .
}
"""))
print(f"Confirmed diagnoses recorded for Mary — {len(rows)} result(s)\n")
for stmt, disease in rows:
    print(f" Statement: {stmt.name}")
    print(f" refersTo disease: {disease.name}")
    print(f" disease type: {[c.name for c in disease.is_a if hasattr(c,'name')]}")
Confirmed diagnoses recorded for Mary — 1 result(s)
Statement: mary_dx_statement_mar01
refersTo disease: mary_breast_cancer
disease type: ['InvasiveCarcinomaOfBreast']
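The membership condition behind ConfirmedDiagnosis can be read as a simple predicate: the individual is a DiagnosisStatement and has confirmed_status among its hasFeature values. The following is only a plain-Python analogy of that condition, not the reasoner's algorithm. The STATEMENTS dictionary is a hand-written stand-in, and the names mary_dx_statement_feb22 and preliminary_status are hypothetical placeholders for the preliminary statement and its status.

```python
# Hand-written stand-in for two individuals. Only mary_dx_statement_mar01 and
# confirmed_status are names taken from the notebook; the rest are hypothetical.
STATEMENTS = {
    "mary_dx_statement_feb22": {  # hypothetical name for the preliminary statement
        "types": {"DiagnosisStatement"},
        "hasFeature": {"preliminary_status"},  # hypothetical feature name
    },
    "mary_dx_statement_mar01": {
        "types": {"DiagnosisStatement"},
        "hasFeature": {"confirmed_status"},
    },
}


def is_confirmed_diagnosis(individual):
    """DiagnosisStatement AND (hasFeature value confirmed_status)."""
    return (
        "DiagnosisStatement" in individual["types"]
        and "confirmed_status" in individual["hasFeature"]
    )


confirmed = [
    name for name, individual in STATEMENTS.items()
    if is_confirmed_diagnosis(individual)
]
print(confirmed)
```

One difference matters for query design: this lookup is closed-world, so a missing feature counts as "not confirmed". Under open-world semantics the reasoner only adds memberships it can prove; it does not conclude that the preliminary statement is not a ConfirmedDiagnosis merely because no confirmed_status is recorded on it.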
Q3 — Which of Mary's BP readings crossed the hypertension threshold?
From NB4 we defined HypertensiveReading as BPMeasurement & hasValue some int[≥ 140]. Mary's three readings on Feb 18 were 118, 142, and 165 mmHg, so two of them should classify.
The query returns the readings together with their numeric values — confirming both which readings classified and why:
# Q3a — readings classified as HypertensiveReading by the reasoner
rows = list(default_world.sparql("""
PREFIX sulo: <https://w3id.org/sulo/>
PREFIX mie: <https://w3id.org/ontostart/mie/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?reading ?value WHERE {
?reading rdf:type mie:HypertensiveReading .
?reading sulo:hasValue ?value .
} ORDER BY ?value
"""))
print(f"Mary's readings classified as HypertensiveReading — {len(rows)} of 3\n")
for reading, value in rows:
    print(f" {reading.name:30s} {value} mmHg")
Mary's readings classified as HypertensiveReading — 2 of 3

mary_bp_reading2_feb18  142 mmHg
mary_bp_reading3_feb18  165 mmHg
# Q3b — for context, show *all* Mary's readings, flagged
rows = list(default_world.sparql("""
PREFIX sulo: <https://w3id.org/sulo/>
PREFIX mie: <https://w3id.org/ontostart/mie/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?reading ?value WHERE {
?reading rdf:type mie:BPMeasurement .
?reading sulo:hasValue ?value .
} ORDER BY ?value
"""))
print("All BPMeasurement readings on Mary:")
ht_set = {r[0] for r in default_world.sparql("""
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX mie: <https://w3id.org/ontostart/mie/>
SELECT ?r WHERE { ?r rdf:type mie:HypertensiveReading }
""")}
for reading, value in rows:
    flag = " HYPERTENSIVE" if reading in ht_set else ""
    print(f" {reading.name:30s} {value} mmHg{flag}")
All BPMeasurement readings on Mary:
mary_bp_reading1_feb18  118 mmHg
mary_bp_reading2_feb18  142 mmHg HYPERTENSIVE
mary_bp_reading3_feb18  165 mmHg HYPERTENSIVE
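The constrained datatype int[≥ 140] uses an inclusive lower bound, so a reading of exactly 140 would classify while 139 would not. A minimal sketch of that boundary behaviour in plain Python follows; READINGS is a hand-copied stand-in for the three individuals, and is_hypertensive is an illustrative helper rather than anything the notebook defines.

```python
# Readings and values copied from the Q3b output above.
READINGS = {
    "mary_bp_reading1_feb18": 118,
    "mary_bp_reading2_feb18": 142,
    "mary_bp_reading3_feb18": 165,
}
THRESHOLD = 140  # inclusive lower bound of the constrained datatype


def is_hypertensive(value, threshold=THRESHOLD):
    """Inclusive bound: a value equal to the threshold qualifies."""
    return value >= threshold


flagged = {name: value for name, value in READINGS.items() if is_hypertensive(value)}
print(flagged)
print(is_hypertensive(140), is_hypertensive(139))
```

Keeping the threshold inside the class definition, as the notebook does, means a query only has to ask for the class; the numeric cut-off is stated once in the ontology rather than repeated in every FILTER.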
Q4 — Every clinical feature recorded on Mary's specimen
Mary's tissue specimen carries four features — tumour grade (Grade 2), and three receptor status qualities (ER+, PR+, HER2−), all attached via hasFeature in NB4. A SPARQL query joining hasFeature and rdf:type recovers them, together with their qualitative classes:
# Q4a — every feature attached to Mary's specimen, with its type
rows = list(default_world.sparql("""
PREFIX sulo: <https://w3id.org/sulo/>
PREFIX mie: <https://w3id.org/ontostart/mie/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?feature ?type WHERE {
mie:mary_tissue_feb25 sulo:hasFeature ?feature .
?feature rdf:type ?type .
FILTER (?type != owl:NamedIndividual)
} ORDER BY ?feature
"""))
print(f"Features of mary_tissue_feb25 — {len(rows)} (feature, type) pairs\n")
for feature, t in rows:
    print(f" {feature.name:25s} type: {t.name}")
Features of mary_tissue_feb25 — 4 (feature, type) pairs

mary_tumour_grade  type: TumourGrade2
mary_er_status     type: ERPositive
mary_pr_status     type: PRPositive
mary_her2_status   type: HER2Negative
# Q4b — which classes does Mary's specimen now belong to (asserted + inferred)?
rows = list(default_world.sparql("""
PREFIX mie: <https://w3id.org/ontostart/mie/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT ?cls WHERE {
mie:mary_tissue_feb25 rdf:type ?cls .
}
"""))
print("Classes Mary's specimen belongs to (asserted + inferred):")
for (cls,) in rows:
    print(f" - {cls.name}")
Classes Mary's specimen belongs to (asserted + inferred):
- NamedIndividual
- Tissue
- HormoneReceptorPositive
- IntermediateOrHighGradeTumour
Q5 — Merging the FHIR view with the MIE view
From NB5, mie:mary and mie:fhir_patient_12345 are linked by owl:sameAs. owl:sameAs is symmetric by OWL semantics — but SPARQL evaluates over the raw graph and does not automatically traverse symmetry. owlready2 also does not eagerly materialise sameAs-induced property assertions, so a plain SPARQL query against either URI returns only what was directly asserted on that URI.
The pragmatic fix is a UNION covering both directions of the link. The query below answers the question "given an external FHIR ID, what does the MIE ontology know about this patient?" by joining through owl:sameAs in either direction:
# Q5a — every role and quality of Mary, looked up via owl:sameAs in either direction
rows = list(default_world.sparql("""
PREFIX sulo: <https://w3id.org/sulo/>
PREFIX mie: <https://w3id.org/ontostart/mie/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
SELECT DISTINCT ?role ?role_type WHERE {
{ mie:fhir_patient_12345 owl:sameAs ?mie_id }
UNION
{ ?mie_id owl:sameAs mie:fhir_patient_12345 }
?role sulo:isFeatureOf ?mie_id .
?role rdf:type ?role_type .
FILTER (?role_type != owl:NamedIndividual)
} ORDER BY ?role
"""))
print(f"Roles & qualities of FHIR Patient/12345 (via owl:sameAs join) — {len(rows)} result(s)\n")
for role, t in rows:
    print(f" {role.name:25s} type: {t.name}")
Roles & qualities of FHIR Patient/12345 (via owl:sameAs join) — 2 result(s)

mary_patient_role  type: SubjectOfCareRole
mary_systolic_bp   type: SystolicBloodPressure
# Q5b — reverse direction: external URIs sameAs Mary
rows = list(default_world.sparql("""
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX mie: <https://w3id.org/ontostart/mie/>
SELECT ?ext WHERE { mie:mary owl:sameAs ?ext }
"""))
print("External-system URIs sameAs Mary:")
for (ext,) in rows:
    print(f" - {ext.iri}")
External-system URIs sameAs Mary:
- https://w3id.org/ontostart/mie/fhir_patient_12345
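The UNION in Q5a handles a link stored in either direction. The same idea can be sketched in plain Python by treating every stored owl:sameAs pair as undirected and collecting all identifiers connected to the one being looked up. This is an illustration only: SAME_AS and IS_FEATURE_OF are hand-copied stand-ins for the graph, and identity_closure and features_of are hypothetical helper names.

```python
from collections import defaultdict

# One link, stored in one direction only (as in the raw graph queried in Q5b).
SAME_AS = [("mary", "fhir_patient_12345")]

# Features asserted on the MIE identifier only; names copied from the Q5a output.
IS_FEATURE_OF = {
    "mary_patient_role": "mary",
    "mary_systolic_bp": "mary",
}


def identity_closure(start, links):
    """All identifiers connected to start, treating each link as symmetric."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(neighbours[node])
    return seen


def features_of(identifier):
    """Features attached to any identifier in the same identity group."""
    ids = identity_closure(identifier, SAME_AS)
    return sorted(f for f, owner in IS_FEATURE_OF.items() if owner in ids)


print(features_of("fhir_patient_12345"))
```

Unlike the single-hop UNION, this walk also follows chains of links (A sameAs B, B sameAs C). If more than two systems were linked, a SPARQL property path such as (owl:sameAs|^owl:sameAs)* would be the closer equivalent.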
Recap — five queries, five OWL/SULO patterns paying off
| Query | OWL/SULO patterns it relied on | Where the pattern came from |
|---|---|---|
| Q1 Mary's timeline | sulo:precedes (instance-level) + SPARQL property path precedes* | NB1 |
| Q2 Confirmed diagnoses | InformationObject refersTo Process, value restriction hasFeature.value(confirmed_status), defined class ConfirmedDiagnosis | NB5 |
| Q3 Hypertensive readings | Quantity with hasValue, ConstrainedDatatype(int, min_inclusive=140), defined class HypertensiveReading | NB4 |
| Q4 Specimen features | Quality subclasses attached via hasFeature, defined classes for IntermediateOrHighGradeTumour and HormoneReceptorPositive | NB4 |
| Q5 Cross-system identity | owl:sameAs via equivalent_to.append() on individuals; 2-hop traversal across the link | NB5 |
Five queries, five distinct OWL/SULO patterns, all expressed using SULO's own vocabulary — no MIE-local properties were declared anywhere in the entire tutorial. The reasoner did the heavy lifting (classification of defined classes); SPARQL did the retrieval. The two roles are complementary: reasoning extends the graph; querying inspects it.
What is not answered here, but should be in a production deployment, is FAIRness: discoverability, persistence, machine-readability of the metadata. That is the subject of NB7, where the ontology is packaged for publication with proper version IRIs, Dublin Core metadata, and a FOOPS! assessment.