Skip to content

Files and Artifacts

Source mirrored from seocho/docs/FILES_AND_ARTIFACTS.md

This page answers the practical question: after I run SEOCHO, where do the important files live, and how do I inspect them directly?

Typical paths:

  • schema.jsonld
    • default ontology file used by CLI examples
  • any custom path you pass with --schema
  • sample ontology files in examples/datasets/*.jsonld

How to inspect:

Terminal window
cat schema.jsonld | jq .
seocho ontology check --schema schema.jsonld
seocho ontology export --schema schema.jsonld --format shacl --output shacl.json
seocho ontology diff --left schema_v1.jsonld --right schema_v2.jsonld

What they are for:

  • define node types, relationships, properties, and cardinality
  • drive extraction prompts, validation, and query generation
  • act as the portable schema contract you can version in git

How the same ontology reaches both local SDK and runtime paths:

from seocho import Ontology, Seocho
ontology = Ontology.from_jsonld("schema.jsonld")
client = Seocho(ontology=ontology)
artifacts = client.approved_artifacts_from_ontology()
prompt_context = client.prompt_context_from_ontology()
draft = client.artifact_draft_from_ontology(name="finance_core_v1")

Use this when:

  • local indexing/querying should use the ontology directly
  • runtime ingest/query should use the same ontology through approved semantic artifacts
  • you want one schema contract instead of a separate hand-maintained runtime payload

Typical path:

  • data/neo4j/

Important subdirectories:

  • data/neo4j/data
  • data/neo4j/logs
  • data/neo4j/import
  • data/neo4j/plugins

How to inspect:

  • DozerDB browser: http://localhost:7474
  • Bolt endpoint: bolt://localhost:7687
  • Python local mode:
from seocho.store import Neo4jGraphStore
store = Neo4jGraphStore("bolt://localhost:7687", "neo4j", "password")
print(store.get_schema(database="neo4j"))

Typical path:

  • outputs/semantic_artifacts/

What lives there:

  • draft and approved ontology candidates
  • SHACL-like shape candidates
  • vocabulary candidates used for governed semantic retrieval

How to create them from the same local ontology:

from seocho import Ontology, Seocho
ontology = Ontology.from_jsonld("schema.jsonld")
client = Seocho(ontology=ontology)
draft = client.artifact_draft_from_ontology(name="finance_core_v1")

How to inspect:

Terminal window
seocho artifacts list --status approved
curl -sS "http://localhost:8001/semantic/artifacts?workspace_id=default" | jq .
curl -sS "http://localhost:8001/semantic/artifacts/<ARTIFACT_ID>?workspace_id=default" | jq .

Typical path:

  • outputs/rule_profiles/rule_profiles.db

What lives there:

  • workspace-scoped rule profiles derived from runtime or governance flows

How to inspect:

Terminal window
curl -sS "http://localhost:8001/rules/profiles?workspace_id=default" | jq .

If you want to inspect the SQLite file directly:

Terminal window
sqlite3 outputs/rule_profiles/rule_profiles.db '.tables'

Typical path:

  • outputs/semantic_metadata/

What lives there:

  • semantic run summaries
  • route / support / strategy metadata
  • response previews used for debugging and regression review

How to inspect:

Terminal window
curl -sS "http://localhost:8001/semantic/runs?workspace_id=default&limit=5" | jq .

Typical path:

  • traces/seocho-runtime.jsonl
  • or whatever SEOCHO_TRACE_JSONL_PATH points to

How to inspect:

Terminal window
tail -n 20 traces/seocho-runtime.jsonl

Recommended interpretation:

  • jsonl is the neutral, portable trace artifact
  • Opik is optional and can be hosted or self-hosted

Useful starting points:

  • examples/quickstart.ipynb
  • examples/bring_your_data.ipynb
  • examples/datasets/tutorial_filings_sample.json
  • examples/datasets/fibo_base.jsonld
  • examples/datasets/fibo_plus.jsonld
  • examples/datasets/fibo_minus.jsonld

Default make up or docker compose up -d starts:

  • neo4j
  • extraction-service
  • evaluation-interface

Legacy service kept behind an explicit profile:

Terminal window
docker compose --profile legacy-semantic up -d semantic-service

That means the current first-run product path is:

  1. store graph state in data/neo4j/
  2. talk to the runtime through extraction-service
  3. optionally use the thin local UI in evaluation-interface

And the ontology-first path is:

  1. author or inspect schema.jsonld
  2. use it directly in local SDK mode
  3. convert it into typed runtime semantic artifacts when the server/runtime path needs the same constraints