Notebook 1 — Mary's clinical odyssey begins

Processes, their parts, when they happen, in what order

Mary is 52. On 2026-02-18 she arrives at the outpatient gynecology clinic. A little over seven months later, on 2026-09-30, she will be in remission. Between those two dates a sequence of clinical processes unfolds: an examination, an ultrasound, a biopsy, a diagnosis, chemotherapy, a surgery, a follow-up. By the end of this notebook we will have asserted the entire shape of her timeline: every process, its internal structure, its timestamps, and the order in which the processes occur.

Learning objectives

  1. Declare clinical events as subclasses of sulo:Process
  2. Express internal structure of a process via sulo:hasDirectPart with cardinality
  3. Anchor processes in time via sulo:atTime + sulo:TimeInstant / StartTime / EndTime
  4. Order processes at the individual level via sulo:precedes
  5. Recover the transitive closure of those orderings at query time with a SPARQL property path

Setting up

We load SULO and create a fresh MIE-tutorial extension ontology that imports it.

In [1]:
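import sys, os, datetime

# Look in the current directory and up to two levels above it for the folder
# that contains lib/, then make it the working directory and add it to sys.path.
for _p in ['.', '..', '../..']:
    if os.path.isdir(os.path.join(_p, 'lib')):
        os.chdir(_p); sys.path.insert(0, os.getcwd()); break

# lib.helpers is the tutorial's own module. It is expected to re-export the owlready2
# names used below (get_ontology, onto_path, locstr, default_world) and safe_call_reasoner.
from lib.helpers import *
onto_path.append("dist")  # let owlready2 resolve imported ontologies from dist/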
In [2]:
sulo = get_ontology("dist/sulo.owl").load()

mie = get_ontology("https://w3id.org/ontostart/mie/")
mie.imported_ontologies.append(sulo)

print(f"SULO loaded: {len(list(sulo.classes()))} classes")
print(f"MIE ontology: {mie.base_iri}")
SULO loaded: 17 classes
MIE ontology: https://w3id.org/ontostart/mie/

Mary's odyssey at a glance

| Date | Event |
|---|---|
| 2026-02-18 | Routine gynecologic visit + manual breast exam |
| 2026-02-20 | Ultrasound of left breast |
| 2026-02-22 | Diagnostic assessment (preliminary) |
| 2026-02-25 | Core needle biopsy of left breast |
| 2026-03-01 | Histopathology + diagnostic assessment (confirmed) |
| 2026-03-10 → 2026-06-15 | Neoadjuvant chemotherapy |
| 2026-07-01 | Lumpectomy of left breast |
| 2026-09-30 | Follow-up visit (remission) |
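
As a quick orientation, the spacing of the timeline can be worked out with plain `datetime.date` arithmetic. This is only a sketch over the dates in the table above; the `timeline` list and its short labels are illustrative names, not part of SULO or the MIE ontology, and the chemotherapy is represented by its start date.

```python
from datetime import date

# Dates copied from the at-a-glance table; labels are shortened for display.
timeline = [
    (date(2026, 2, 18), "routine gynecologic visit"),
    (date(2026, 2, 20), "ultrasound"),
    (date(2026, 2, 22), "preliminary assessment"),
    (date(2026, 2, 25), "biopsy"),
    (date(2026, 3, 1), "histopathology + confirmed assessment"),
    (date(2026, 3, 10), "chemotherapy starts"),
    (date(2026, 7, 1), "lumpectomy"),
    (date(2026, 9, 30), "follow-up"),
]

# Days between each entry and the one before it.
gaps = [(b[0] - a[0]).days for a, b in zip(timeline, timeline[1:])]
total_days = (timeline[-1][0] - timeline[0][0]).days

for (day, label), gap in zip(timeline[1:], gaps):
    print(f"{day}  +{gap:>3} d  {label}")
print(f"Total span: {total_days} days")  # 224 days, i.e. 32 weeks
```

The diagnostic phase is dense (five entries within twelve days), while treatment and follow-up account for most of the 224 days.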

Clinical class names are prefixed SCT_ to signal intended alignment with SNOMED CT.

§1 — Declaring the clinical processes

Every event is a sulo:Process. We declare each as a direct sub-class. We are not committing to internal structure or participants yet — only naming the kinds of things that can happen.

Note: SCT_DiagnosticAssessment will be instantiated twice in Mary's timeline (Feb 22 preliminary, Mar 1 confirmed). One class, two individuals.

In [3]:
# The visit and its component processes
with mie:
    class SCT_RoutineGynecologicExamination(sulo.Process):
        """A planned periodic examination of a patient by a gynecologist."""
        label = [locstr("routine gynecologic examination", "en")]

    class SCT_PhysicalExamination(sulo.Process):
        """A clinical examination of the patient's body by inspection, palpation, percussion, auscultation."""
        label = [locstr("physical examination", "en")]

    class SCT_ManualBreastExamination(sulo.Process):
        """A focused palpation of one or both breasts and the regional lymph nodes."""
        label = [locstr("manual breast examination", "en")]

    class SCT_ClinicalDocumentation(sulo.Process):
        """The act of recording clinical observations into the patient record."""
        label = [locstr("clinical documentation", "en")]
In [4]:
# Imaging + diagnostic processes
with mie:
    class SCT_UltrasonographyOfLeftBreast(sulo.Process):
        """Diagnostic ultrasonography focused on the tissues of the left breast."""
        label = [locstr("ultrasonography of left breast", "en")]

    class SCT_DiagnosticAssessment(sulo.Process):
        """A clinical reasoning process forming a diagnostic judgment from evidence."""
        label = [locstr("diagnostic assessment", "en")]

    class SCT_CoreNeedleBiopsyOfBreast(sulo.Process):
        """A percutaneous procedure that removes a cylindrical tissue sample from the breast."""
        label = [locstr("core needle biopsy of breast", "en")]

    class SCT_HistopathologyTest(sulo.Process):
        """Laboratory examination of a tissue specimen for cellular abnormalities."""
        label = [locstr("histopathology test", "en")]
In [5]:
# Treatment + follow-up
with mie:
    class SCT_NeoadjuvantAntineoplasticChemotherapy(sulo.Process):
        """Cytotoxic drug therapy administered before primary surgical treatment."""
        label = [locstr("neoadjuvant antineoplastic chemotherapy", "en")]

    class SCT_LumpectomyOfLeftBreast(sulo.Process):
        """Surgical excision of a localised lesion from the left breast."""
        label = [locstr("lumpectomy of left breast", "en")]

    class SCT_FollowUpVisit(sulo.Process):
        """A subsequent clinical encounter to assess disease status after a prior intervention."""
        label = [locstr("follow-up visit", "en")]

print(f"Declared {len(list(mie.classes()))} processes for Mary's odyssey")
Declared 11 processes for Mary's odyssey

§2 — Processes have parts

Parthood is not only for spatial objects. SULO's hasPart / hasDirectPart apply to processes too.

Why hasDirectPart, not hasPart? sulo:hasPart is declared transitive in SULO. OWL 2 DL allows cardinality restrictions only on simple properties, and a transitive property is not simple, so a cardinality restriction on sulo:hasPart would take the ontology outside OWL 2 DL. sulo:hasDirectPart is its non-transitive sub-property, designed for cardinality-bearing parthood claims.

We add three restrictions, one at a time.

In [6]:
# Exactly one physical examination per visit
with mie:
    SCT_RoutineGynecologicExamination.is_a.append(
        sulo.hasDirectPart.exactly(1, SCT_PhysicalExamination)
    )
In [7]:
# At most one documentation step per visit
with mie:
    SCT_RoutineGynecologicExamination.is_a.append(
        sulo.hasDirectPart.max(1, SCT_ClinicalDocumentation)
    )
In [8]:
# Within a physical exam, at most one manual breast examination
with mie:
    SCT_PhysicalExamination.is_a.append(
        sulo.hasDirectPart.max(1, SCT_ManualBreastExamination)
    )
In [9]:
print("Restrictions on SCT_RoutineGynecologicExamination:")
for r in SCT_RoutineGynecologicExamination.is_a:
    print("  ", r)
print("\nRestrictions on SCT_PhysicalExamination:")
for r in SCT_PhysicalExamination.is_a:
    print("  ", r)
Restrictions on SCT_RoutineGynecologicExamination:
   sulo.Process
   sulo.hasDirectPart.exactly(1, mie.SCT_PhysicalExamination)
   sulo.hasDirectPart.max(1, mie.SCT_ClinicalDocumentation)

Restrictions on SCT_PhysicalExamination:
   sulo.Process
   sulo.hasDirectPart.max(1, mie.SCT_ManualBreastExamination)

§3 — Anchoring a process in time

Processes occur at times. sulo:atTime has range sulo:Time. SULO ships four temporal sub-classes:

| Class | When to use |
|---|---|
| sulo:TimeInstant | A point in time |
| sulo:StartTime | The first instant of an extended process |
| sulo:EndTime | The last instant of an extended process |
| sulo:Duration | A length of time (introduced in NB4) |

Each time entity carries a value via sulo:hasValue — a functional data property.

We build Mary's Feb 18 visit one piece at a time.

In [10]:
# The visit individual itself
with mie:
    mary_visit_feb18 = SCT_RoutineGynecologicExamination("mary_visit_feb18")
    mary_visit_feb18.label = [locstr("Mary's routine gynecologic visit, 2026-02-18", "en")]
In [11]:
# Start time — 09:00 on 2026-02-18
with mie:
    visit_start = sulo.StartTime("mary_visit_feb18_start")
    visit_start.hasValue = datetime.datetime(2026, 2, 18, 9, 0)
In [12]:
# End time — 10:30 on 2026-02-18
with mie:
    visit_end = sulo.EndTime("mary_visit_feb18_end")
    visit_end.hasValue = datetime.datetime(2026, 2, 18, 10, 30)
In [13]:
# Attach both time anchors to the visit
with mie:
    mary_visit_feb18.atTime = [visit_start, visit_end]

print(f"Visit start: {visit_start.hasValue}")
print(f"Visit end:   {visit_end.hasValue}")
Visit start: 2026-02-18 09:00:00
Visit end:   2026-02-18 10:30:00
In [14]:
# Sub-process individuals — the physical exam (which itself contains a breast exam) plus documentation
with mie:
    mary_phys_exam   = SCT_PhysicalExamination("mary_phys_exam_feb18")
    mary_breast_exam = SCT_ManualBreastExamination("mary_breast_exam_feb18")
    mary_doc_feb18   = SCT_ClinicalDocumentation("mary_doc_feb18")

    mary_phys_exam.hasDirectPart   = [mary_breast_exam]
    mary_visit_feb18.hasDirectPart = [mary_phys_exam, mary_doc_feb18]

print(f"Direct parts of the visit: {[p.name for p in mary_visit_feb18.hasDirectPart]}")
print(f"Direct parts of the physical exam: {[p.name for p in mary_phys_exam.hasDirectPart]}")
Direct parts of the visit: ['mary_phys_exam_feb18', 'mary_doc_feb18']
Direct parts of the physical exam: ['mary_breast_exam_feb18']
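
To see why the cardinality restrictions count only one level of structure, the asserted direct parts can be mirrored as a plain dictionary and walked by hand. This is a sketch in plain Python, not a reasoner call; `direct_parts` and `all_parts` are illustrative names, and the walk only imitates what a transitive parthood relation such as sulo:hasPart would reach.

```python
# Direct parthood as asserted in the cell above (individual names as strings).
direct_parts = {
    "mary_visit_feb18": ["mary_phys_exam_feb18", "mary_doc_feb18"],
    "mary_phys_exam_feb18": ["mary_breast_exam_feb18"],
}

def all_parts(whole, direct):
    """Return every part reachable through one or more direct-part links."""
    found, queue = [], list(direct.get(whole, []))
    while queue:
        part = queue.pop(0)
        if part not in found:
            found.append(part)
            queue.extend(direct.get(part, []))
    return found

visit_direct = direct_parts["mary_visit_feb18"]
visit_all = all_parts("mary_visit_feb18", direct_parts)

print(f"direct parts:     {len(visit_direct)}")  # 2
print(f"parts at any depth: {len(visit_all)}")   # 3
```

The breast examination is a part of the visit only through the physical examination, so it does not count toward any restriction stated on the visit's direct parts.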

§4 — The rest of the odyssey

We now create one individual per remaining event. Most are anchored to a single TimeInstant; among these remaining events only the chemotherapy gets a StartTime + EndTime pair, because it extends over roughly fourteen weeks.

We add one event at a time, with a short helper to keep the time-instant boilerplate compact.

In [15]:
# Helper — create a TimeInstant with a datetime value
def _instant(name, dt):
    with mie:
        t = sulo.TimeInstant(name)
        t.hasValue = dt
        return t
In [16]:
# 2026-02-20 — ultrasound
with mie:
    mary_ultrasound = SCT_UltrasonographyOfLeftBreast("mary_ultrasound_feb20")
    mary_ultrasound.atTime = [_instant("t_2026_02_20", datetime.datetime(2026, 2, 20, 11, 0))]
In [17]:
# 2026-02-22 — preliminary diagnostic assessment
with mie:
    mary_diag_prelim = SCT_DiagnosticAssessment("mary_diag_feb22_preliminary")
    mary_diag_prelim.atTime = [_instant("t_2026_02_22", datetime.datetime(2026, 2, 22, 14, 0))]
In [18]:
# 2026-02-25 — biopsy
with mie:
    mary_biopsy = SCT_CoreNeedleBiopsyOfBreast("mary_biopsy_feb25")
    mary_biopsy.atTime = [_instant("t_2026_02_25", datetime.datetime(2026, 2, 25, 9, 30))]
In [19]:
# 2026-03-01 — histopathology test and the *confirmed* diagnostic assessment
with mie:
    mary_histo = SCT_HistopathologyTest("mary_histo_mar01")
    mary_histo.atTime = [_instant("t_2026_03_01_lab", datetime.datetime(2026, 3, 1, 12, 0))]

    mary_diag_confirm = SCT_DiagnosticAssessment("mary_diag_mar01_confirmed")
    mary_diag_confirm.atTime = [_instant("t_2026_03_01", datetime.datetime(2026, 3, 1, 16, 0))]
In [20]:
# 2026-03-10 → 2026-06-15 — chemotherapy: extends over time, so StartTime + EndTime
with mie:
    mary_chemo  = SCT_NeoadjuvantAntineoplasticChemotherapy("mary_chemo_2026")
    chemo_start = sulo.StartTime("mary_chemo_start"); chemo_start.hasValue = datetime.datetime(2026, 3, 10, 8, 0)
    chemo_end   = sulo.EndTime("mary_chemo_end");     chemo_end.hasValue   = datetime.datetime(2026, 6, 15, 17, 0)
    mary_chemo.atTime = [chemo_start, chemo_end]
In [21]:
# 2026-07-01 — lumpectomy
with mie:
    mary_lumpectomy = SCT_LumpectomyOfLeftBreast("mary_lumpectomy_jul01")
    mary_lumpectomy.atTime = [_instant("t_2026_07_01", datetime.datetime(2026, 7, 1, 8, 0))]
In [22]:
# 2026-09-30 — follow-up
with mie:
    mary_followup = SCT_FollowUpVisit("mary_followup_sep30")
    mary_followup.atTime = [_instant("t_2026_09_30", datetime.datetime(2026, 9, 30, 10, 0))]

odyssey = [mary_visit_feb18, mary_ultrasound, mary_diag_prelim,
           mary_biopsy, mary_histo, mary_diag_confirm,
           mary_chemo, mary_lumpectomy, mary_followup]

print(f"Mary's odyssey now has {len(odyssey)} event individuals.")
Mary's odyssey now has 9 event individuals.

§5 — Ordering processes with sulo:precedes

sulo:precedes relates two processes when the first ends no later than the second begins. SULO ships it as a plain ObjectProperty — not declared transitive. Whether transitive closure is appropriate depends on the question.

Following the tutorial's discipline — use only SULO's relations — we will not introduce a local transitive sub-property. We assert only Mary's eight immediate hops, and we recover the closure at query time using a SPARQL property path (§7).

In [23]:
# Eight immediate-successor assertions — no new property, no transitivity declaration
mary_visit_feb18.precedes  = [mary_ultrasound]
mary_ultrasound.precedes   = [mary_diag_prelim]
mary_diag_prelim.precedes  = [mary_biopsy]
mary_biopsy.precedes       = [mary_histo]
mary_histo.precedes        = [mary_diag_confirm]
mary_diag_confirm.precedes = [mary_chemo]
mary_chemo.precedes        = [mary_lumpectomy]
mary_lumpectomy.precedes   = [mary_followup]

asserted = sum(len(e.precedes) for e in odyssey)
print(f"Total asserted sulo:precedes triples: {asserted}")
Total asserted sulo:precedes triples: 8
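
The definition at the top of this section (the first process ends no later than the second begins) can be checked against the timestamps from §3 and §4. The sketch below uses plain Python with values copied from the cells above; `intervals`, `at`, and `violations` are illustrative names, and an instantaneous event is treated as an interval whose start and end coincide.

```python
from datetime import datetime

def at(*args):
    """An instantaneous event: start and end are the same datetime."""
    t = datetime(*args)
    return (t, t)

# (start, end) per event, in the order of the asserted chain.
intervals = {
    "mary_visit_feb18": (datetime(2026, 2, 18, 9, 0), datetime(2026, 2, 18, 10, 30)),
    "mary_ultrasound_feb20": at(2026, 2, 20, 11, 0),
    "mary_diag_feb22_preliminary": at(2026, 2, 22, 14, 0),
    "mary_biopsy_feb25": at(2026, 2, 25, 9, 30),
    "mary_histo_mar01": at(2026, 3, 1, 12, 0),
    "mary_diag_mar01_confirmed": at(2026, 3, 1, 16, 0),
    "mary_chemo_2026": (datetime(2026, 3, 10, 8, 0), datetime(2026, 6, 15, 17, 0)),
    "mary_lumpectomy_jul01": at(2026, 7, 1, 8, 0),
    "mary_followup_sep30": at(2026, 9, 30, 10, 0),
}

# The eight asserted hops link each event to the next one in the chain.
order = list(intervals)  # dicts preserve insertion order
precedes = list(zip(order, order[1:]))

# A hop violates the definition if the first event ends after the second begins.
violations = [(a, b) for a, b in precedes if intervals[a][1] > intervals[b][0]]

print(f"hops checked: {len(precedes)}, violations: {len(violations)}")
```

Nothing in the ontology enforces this agreement between sulo:precedes and the time anchors, so a check of this kind is a data-quality step rather than an inference.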

§6 — Run the reasoner: consistency

Before querying, we run HermiT to check that the assertions about mary_visit_feb18 are consistent with the cardinality restrictions on SCT_RoutineGynecologicExamination. The reasoner is not needed for the transitive chain, which SPARQL computes in §7, but it remains the right tool for OWL-axiom validation.

In [24]:
result = safe_call_reasoner(mie)
print("Reasoner ok:          ", result["ok"])
print("Inconsistent classes: ", result["inconsistent"])
Reasoner ok:           True
Inconsistent classes:  []
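
HermiT works under OWL's open-world semantics, so the result above says that the assertions do not contradict the restrictions; it does not count what was actually recorded. A complementary closed-world count over the asserted direct parts can be sketched in plain Python. The dictionaries below mirror the notebook's assertions by hand, and `types`, `direct_parts`, and `count_parts` are illustrative names rather than anything provided by SULO or owlready2.

```python
# Asserted class of each sub-process individual.
types = {
    "mary_phys_exam_feb18": "SCT_PhysicalExamination",
    "mary_breast_exam_feb18": "SCT_ManualBreastExamination",
    "mary_doc_feb18": "SCT_ClinicalDocumentation",
}

# Asserted sulo:hasDirectPart links.
direct_parts = {
    "mary_visit_feb18": ["mary_phys_exam_feb18", "mary_doc_feb18"],
    "mary_phys_exam_feb18": ["mary_breast_exam_feb18"],
}

def count_parts(whole, part_class):
    """Count the asserted direct parts of `whole` that are typed as `part_class`."""
    return sum(1 for p in direct_parts.get(whole, []) if types[p] == part_class)

# One entry per restriction declared in §2.
checks = {
    "visit: exactly 1 physical examination":
        count_parts("mary_visit_feb18", "SCT_PhysicalExamination") == 1,
    "visit: at most 1 documentation step":
        count_parts("mary_visit_feb18", "SCT_ClinicalDocumentation") <= 1,
    "physical exam: at most 1 manual breast exam":
        count_parts("mary_phys_exam_feb18", "SCT_ManualBreastExamination") <= 1,
}

for label, ok in checks.items():
    print(f"{'ok  ' if ok else 'FAIL'} {label}")
```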

§7 — Reasoning about chemotherapy in context (SPARQL property paths)

We zoom in on mary_chemo_2026, the roughly 14-week chemotherapy process, and ask two questions of the asserted graph, using SPARQL 1.1 property paths to traverse the sulo:precedes chain in both directions. No reasoner is involved at this step.

  • ^sulo:precedes+ — what events preceded chemo, one or more hops back? The ^ reverses the path; + is the one-or-more-hops operator. The asserted predecessor link is mary_diag_mar01_confirmed precedes mary_chemo; the closure follows the chain back to the very first visit.
  • sulo:precedes+ — what events followed chemo, one or more hops forward?

A single SPARQL + expands one asserted predecessor link into the full reachable chain — without ever declaring sulo:precedes as a TransitiveProperty and without persisting any inferred triples in the graph.

In [25]:
# Predecessors of chemotherapy — events reachable via ^sulo:precedes+
rows = list(default_world.sparql("""
    PREFIX sulo: <https://w3id.org/sulo/>
    SELECT ?before WHERE { ??1 ^sulo:precedes+ ?before }
""", [mary_chemo]))

print(f"Events before {mary_chemo.name}: {len(rows)} reachable")
for (e,) in rows:
    print(f"  - {e.name}")
Events before mary_chemo_2026: 6 reachable
  - mary_diag_mar01_confirmed
  - mary_histo_mar01
  - mary_biopsy_feb25
  - mary_diag_feb22_preliminary
  - mary_ultrasound_feb20
  - mary_visit_feb18
In [26]:
# Successors of chemotherapy — events reachable via sulo:precedes+
rows = list(default_world.sparql("""
    PREFIX sulo: <https://w3id.org/sulo/>
    SELECT ?after WHERE { ??1 sulo:precedes+ ?after }
""", [mary_chemo]))

print(f"Events after {mary_chemo.name}: {len(rows)} reachable")
for (e,) in rows:
    print(f"  - {e.name}")
Events after mary_chemo_2026: 2 reachable
  - mary_lumpectomy_jul01
  - mary_followup_sep30
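
What the `+` operator computes can be reproduced by hand as a breadth-first walk over the eight asserted links. The following is a plain-Python sketch of that idea, not how the SPARQL engine is implemented; `edges` and `reachable` are illustrative names.

```python
from collections import deque

# The eight asserted sulo:precedes links, as (earlier, later) pairs.
edges = [
    ("mary_visit_feb18", "mary_ultrasound_feb20"),
    ("mary_ultrasound_feb20", "mary_diag_feb22_preliminary"),
    ("mary_diag_feb22_preliminary", "mary_biopsy_feb25"),
    ("mary_biopsy_feb25", "mary_histo_mar01"),
    ("mary_histo_mar01", "mary_diag_mar01_confirmed"),
    ("mary_diag_mar01_confirmed", "mary_chemo_2026"),
    ("mary_chemo_2026", "mary_lumpectomy_jul01"),
    ("mary_lumpectomy_jul01", "mary_followup_sep30"),
]

def reachable(start, edges, reverse=False):
    """Nodes reachable from `start` in one or more hops (like p+, or ^p+ if reverse)."""
    step = {}
    for a, b in edges:
        src, dst = (b, a) if reverse else (a, b)
        step.setdefault(src, []).append(dst)
    seen, queue = [], deque(step.get(start, []))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.append(node)
            queue.extend(step.get(node, []))
    return seen

before = reachable("mary_chemo_2026", edges, reverse=True)  # like ^sulo:precedes+
after = reachable("mary_chemo_2026", edges)                 # like sulo:precedes+

print(f"before: {len(before)}, after: {len(after)}")  # before: 6, after: 2
```

As with the property path, the walk leaves the edge list untouched: the closure is computed on demand and never stored.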

§8 — Computing the duration of chemotherapy

sulo:StartTime and sulo:EndTime are sub-classes of sulo:TimeInstant, each carrying a sulo:hasValue data property whose range is xsd:dateTime. Mary's chemo runs from 2026-03-10 08:00 to 2026-06-15 17:00; both timestamps are attached to mary_chemo_2026 via two sulo:atTime links.

A single SPARQL query retrieves both timestamps; Python's datetime arithmetic then gives the duration in whatever unit the use case wants.

In [27]:
# One SPARQL query returns Mary's chemo start + end datetimes; Python does the arithmetic
rows = list(default_world.sparql("""
    PREFIX sulo: <https://w3id.org/sulo/>
    PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    SELECT ?start ?end WHERE {
        ??1 sulo:atTime ?ts , ?te .
        ?ts rdf:type sulo:StartTime ; sulo:hasValue ?start .
        ?te rdf:type sulo:EndTime   ; sulo:hasValue ?end .
    }
""", [mary_chemo]))

start, end = rows[0]
duration  = end - start

print(f"Chemotherapy:  {start}  →  {end}")
print(f"Duration:      {duration}")
print(f"               = {duration.days} days")
print(f"               = {duration.days / 7:.1f} weeks")
print(f"               = {duration.total_seconds() / 3600:.0f} hours")
Chemotherapy:  2026-03-10 08:00:00  →  2026-06-15 17:00:00
Duration:      97 days, 9:00:00
               = 97 days
               = 13.9 weeks
               = 2337 hours
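
Note that `duration.days` drops the trailing nine hours before the division by 7. When the fractional part matters, a `timedelta` can be divided by another `timedelta` directly. A small sketch with the same two timestamps, using only the standard library (`weeks_exact`, `whole_weeks`, and `remainder` are illustrative names):

```python
from datetime import datetime, timedelta

start = datetime(2026, 3, 10, 8, 0)
end = datetime(2026, 6, 15, 17, 0)
duration = end - start

one_week = timedelta(weeks=1)

# timedelta / timedelta gives a float that includes the hours.
weeks_exact = duration / one_week

# divmod on timedeltas gives whole weeks plus the leftover timedelta.
whole_weeks, remainder = divmod(duration, one_week)

print(f"{weeks_exact:.2f} weeks")               # 13.91 weeks
print(f"{whole_weeks} weeks + {remainder}")     # 13 weeks + 6 days, 9:00:00
```

At one decimal place both approaches round to 13.9, so the output above is unaffected; the difference only shows at finer precision.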

§9 — Save the checkpoint

We save the ontology as dist/mie-01.owl. NB2 will reload this file and add the participating people, devices, and roles using the SULO Process-Role-Object pattern.

In [30]:
os.makedirs("dist", exist_ok=True)
mie.save(file="dist/mie-01.owl", format="rdfxml")
print("Saved dist/mie-01.owl")
print(f"  classes:                   {len(list(mie.classes()))}")
print(f"  individuals:               {len(list(mie.individuals()))}")
print(f"  object properties:         {len(list(mie.object_properties()))}")
print(f"  data properties:           {len(list(mie.data_properties()))}")
Saved dist/mie-01.owl
  classes:                   11
  individuals:               23
  object properties:         0
  data properties:           0
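
The individual count can be cross-checked by hand from the cells above. The tally below is plain arithmetic over what this notebook created; the variable names are illustrative. The two property counts of 0 are presumably because the MIE ontology declares no properties of its own and only reuses SULO's.

```python
# Top-level events collected in the `odyssey` list (§4).
events = 9

# Sub-process individuals of the Feb 18 visit (§3):
# physical exam, manual breast exam, documentation.
sub_processes = 3

# Time entities: the visit and the chemotherapy each have a start and an end;
# the seven other events each have one TimeInstant.
time_entities = {"StartTime": 2, "EndTime": 2, "TimeInstant": 7}

total = events + sub_processes + sum(time_entities.values())
print(f"individuals: {total}")  # 23
```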