Skip to content

Ontology Enforcement Modes

When SEOCHO indexes a document it extracts entities and relationships and checks them against your ontology. Enforcement decides what happens to material that does not fit the ontology’s declared types: is it admitted, guided toward the schema, rejected, or admitted-but-flagged?

You set it in one line:

ontology:
path: ./schema.yaml
enforcement: guided # guided | strict | open

In a sweep it becomes the axis you compare:

variants:
- name: guided
vars: { enforcement: guided }
- name: strict
vars: { enforcement: strict }
- name: open
vars: { enforcement: open }

enforcement is an admission policy for extracted data — not a reasoning regime. It does not change query-time inference; it changes which extracted nodes and relationships are allowed into the graph and how out-of-vocabulary material is handled. (It is the SHACL closed-shape idea, not the OWA/CWA distinction.)

guided (default)strictopen
Extraction promptontology guides+ closed-vocabulary instructionontology guides
Empty-result retryyesnoyes
Generic Entity fallbackyesnoyes
Heuristic fallback (LLM fails)yesnoyes
On validation errorwarn & writereject chunkwarn & write
Closed validation¹noyesno
Out-of-vocabulary taggingnono_out_of_ontology

¹ Closed validation drops the Entity exemption and additionally checks that every relationship endpoint references an extracted node and conforms to the relationship’s declared source/target type (following the ontology’s broader subclass chain).

The ontology guides extraction but does not gate it. The model is told which entity and relationship types to prefer; anything that does not fit cleanly still gets written, and validation problems are recorded as warnings rather than blocking the write. If extraction comes back empty, SEOCHO retries once in a relaxed mode and may fall back to a generic Entity node.

This is the mode SEOCHO’s retrieval quality was tuned against. Use it when you want maximum recall and you treat the ontology as a strong hint, not a contract — most question-answering workloads.

Only what the ontology declares is admitted. Concretely:

  • a constant “use only the listed types; never invent an Entity” instruction is appended to the extraction prompt;
  • the relaxed-empty retry and the generic-Entity / heuristic fallbacks are disabled — an empty extraction is a legitimate outcome for text that contains nothing in your vocabulary;
  • validation runs closed: unknown labels, dangling relationship endpoints, and endpoints that violate a relationship’s declared source/target type are all errors;
  • a chunk that produces any validation error is rejected (not written), and the rejection is counted in the report.

Use it when the graph must be trustworthy — audit, compliance, or regulated domains where a single off-ontology fact is worse than a missing one. Expect fewer nodes than guided, and accept that some chunks contribute nothing.

A strict run that indexes fewer nodes than guided is working as intended — it admitted only schema-conformant facts.

Everything extracted is written, exactly like guided — plus every node or relationship whose type is not in the ontology is stamped with a _out_of_ontology: "true" property. Nothing is rejected.

That tag is the point: it lets you keep full recall while still being able to separate sanctioned facts from unsanctioned ones, and it is the signal an offline governance pass uses to find schema-induction candidates — types the data wants that your ontology does not yet have.

// What is the data asking me to add to the ontology?
MATCH (n) WHERE n._out_of_ontology = 'true'
RETURN labels(n) AS undeclared_type, count(*) AS hits
ORDER BY hits DESC

Use it while you are still designing the ontology, or to audit how much of a corpus your current schema actually covers.

  • guided — most QA workloads; ontology as a strong hint. (default)
  • strict — trust matters more than recall; reject anything off-schema.
  • open — discovery / ontology design; keep everything, label the gaps.

A quick way to see the difference on your own corpus is to sweep all three and read the comparison table:

Terminal window
seocho sweep examples/run/sweep-enforcement/sweep.yaml

The bundled example indexes the same documents and questions under all three modes and prints node/relationship counts side by side. See Getting Started for the full seocho run / seocho sweep interface and the Ontology Guide for designing the schema these modes enforce.