Python SDK

Source mirrored from seocho/docs/PYTHON_INTERFACE_QUICKSTART.md

SEOCHO’s Python SDK is designed around a simple rule:

keep agents and graph databases aligned to one ontology contract
default to the semantic layer
turn on bounded repair when retrieval is insufficient
use debate only as an explicit advanced mode

If you already have the runtime running at http://localhost:8001, you can start here immediately.

If your first question is how to load your own records safely, read APPLY_YOUR_DATA.md alongside this guide. If your first question is how to measure quality or latency, read BENCHMARKS.md instead of treating this guide as the benchmark contract.

1. Install

Pick the mode first:

SEOCHO standardizes on uv; the uv pip forms below work in any environment, and plain pip is a drop-in if you are not on uv.

HTTP client mode only:

uv pip install seocho

local SDK engine from the published package:

uv pip install "seocho[local]"

repository development or local CLI authoring from a clone:

uv sync --extra dev      # then prefix commands with `uv run` (no venv to activate)

offline ontology governance helpers:

uv pip install "seocho[ontology]"

Important:

uv pip install seocho is enough for remote HTTP client mode.
uv pip install "seocho[local]" is the simplest published-package path for Seocho.local(...).
Seocho.local(ontology) defaults to embedded LadybugDB, so a Neo4j/DozerDB server is optional for hello world.
pass graph="bolt://..." or Neo4jGraphStore(...) when you want the production DozerDB/Neo4j path.
uv sync --extra dev (then uv run …) is the right path when you are editing the repo itself.

If you are iterating on schema evolution, use the offline governance CLI:

seocho ontology check --schema schema.jsonld
seocho ontology export --schema schema.jsonld --format shacl --output shacl.json
seocho ontology diff --left schema_v1.jsonld --right schema_v2.jsonld
seocho ontology report --schema schema_v2.ttl --output outputs/ontology_report.json

Recommendation:

keep a stable package_id on the ontology and bump version semantically
treat seocho ontology diff as the first migration warning gate before runtime rollout
use seocho ontology report before rollout when you want one bundle with context_hash, semantic artifact draft, SHACL export, and synthetic sample-data validation

2. Configure

You have two primary SDK shapes:

Mode	Constructor	When to use it
HTTP client	`Seocho(base_url="http://localhost:8001", workspace_id="default")`	consume a running SEOCHO runtime
Embedded local engine	`Seocho.local(ontology)`	serverless SDK authoring, experiments, hello world
Explicit local engine	`Seocho(ontology=..., graph_store=..., llm=...)`	direct backend control

Fastest script-style setup:

import seocho

seocho.configure(base_url="http://localhost:8001", workspace_id="default")

Application-style setup:

from seocho import Seocho

client = Seocho(base_url="http://localhost:8001", workspace_id="default")

Important parameters:

base_url: root URL of the running SEOCHO runtime
workspace_id: required logical scope for runtime-facing data and queries
user_id, agent_id, session_id: optional scope fields for memory/runtime tracing

Embedded local engine example:

from seocho import Ontology, Seocho

client = Seocho.local(Ontology(name="demo"))
client.add("Marie Curie worked at the University of Paris.")
print(client.ask("Where did Marie Curie work?"))

Production graph-server example:

from seocho import Seocho, Ontology
from seocho.store import Neo4jGraphStore, OpenAIBackend

client = Seocho(
    ontology=Ontology(name="demo"),
    graph_store=Neo4jGraphStore("bolt://localhost:7687", "neo4j", "password"),
    llm=OpenAIBackend(model="gpt-4o"),
    workspace_id="default",
)

Local-mode constructor parameters you should understand:

graph_store: explicit backend used by the SDK; Seocho.local(...) creates an embedded LadybugDB store by default
llm: OpenAI-compatible extraction/query backend
ontology: schema contract that drives extraction, validation, and query generation

2.1 Use One Ontology Across Local And Runtime Paths

This is the intended product shape:

author one ontology
use it directly in local SDK mode
convert that same ontology into typed runtime artifacts when you need the server/runtime path

from seocho import Ontology, Seocho

ontology = Ontology.load("schema.jsonld")
client = Seocho(ontology=ontology)

artifacts = client.approved_artifacts_from_ontology()
prompt_context = client.prompt_context_from_ontology(
    instructions=["Prefer finance ontology labels and constraints."]
)
draft = client.artifact_draft_from_ontology(name="finance_core_v1")

Use the helpers like this:

approved_artifacts_from_ontology(): build runtime-safe ApprovedArtifacts
prompt_context_from_ontology(): build typed semantic prompt context from the same ontology
artifact_draft_from_ontology(): build a draft payload for the semantic artifact lifecycle APIs

That keeps schema.jsonld, local SDK prompts, SHACL validation, and runtime semantic artifacts on one contract.

Important:

Seocho.local(ontology) is the shortest serverless path.
Explicit local engine mode is Seocho(ontology=..., graph_store=..., llm=...).
Passing only ontology, only graph_store, or only llm does not enable the in-process local engine.

3. Put Your Data In

3.1 Add one memory

Use this when you want the shortest path from text to graph-backed memory.

memory = client.add(
    "Alex manages the Seoul retail account and prefers ontology-aware retrieval.",
    metadata={"source": "sdk_quickstart"},
)

print(memory.memory_id)

If you are indexing through the runtime path but still want the same ontology to govern the request:

result = client.add_with_details(
    "ACME acquired Beta in 2024.",
    prompt_context=client.prompt_context_from_ontology(),
    approved_artifacts=client.approved_artifacts_from_ontology(),
)

print(result.ingest_summary)

add() is intentionally the opinionated default path. Use add_with_details() when you explicitly want the expert surface for prompt_context, approved artifacts, or governance-controlled overrides.

3.2 Ingest a batch of raw records

Use this when you want SEOCHO to extract entities and relationships into a target graph.

result = client.raw_ingest(
    [
        {"id": "r1", "content": "ACME acquired Beta in 2024."},
        {"id": "r2", "content": "Beta provides risk analytics to ACME."},
    ],
    target_database="kgruntime",
)

print(result.status)
print(result.records_processed)

4. Ask the Simple Way

If you only want a string answer:

print(client.ask("What do you know about Alex?"))
print(client.ask("Who manages the Seoul retail account?", graph_ids=["kgnormal"]))

If you also want runtime metadata without dropping to the advanced semantic API:

response = client.ask_response(
    "Who manages the Seoul retail account?",
    graph_ids=["kgnormal"],
)

print(response.response)
print(response.runtime_mode)
print(response.answer_envelope["query_mode"])
print(response.routing_decision)

Use chat() only when you explicitly want the lightweight memory/chat surface. Use ask_response() when you want the public query surface plus metadata.

5. Use `ask()` As The Primary Query Surface

Prefer graph_ids instead of raw database names when you know the intended graph target. ask() routes into semantic graph QA automatically when graph scope or semantic controls are present.

answer = client.ask(
    "Who manages the Seoul retail account?",
    graph_ids=["kgnormal"],
)

print(answer)

receipt = client.ask_response(
    "Who manages the Seoul retail account?",
    graph_ids=["kgnormal"],
)
print(receipt.runtime_mode)
print(receipt.support.status)
print(receipt.evidence.grounded_slots)
print(receipt.evidence.missing_slots)
print(receipt.run_record.run_id)

6. Turn On Repair When the Query Is Hard

If the question is harder, ambiguous, or likely to need relation-path repair, stay on ask() but enable bounded semantic repair.

answer = client.ask(
    "What is Neo4j related to GraphRAG?",
    graph_ids=["kgnormal"],
    reasoning_mode=True,
    repair_budget=2,
)
print(answer)

receipt = client.ask_response(
    "What is Neo4j related to GraphRAG?",
    graph_ids=["kgnormal"],
    reasoning_mode=True,
    repair_budget=2,
)
print(receipt.runtime_mode)
print(receipt.semantic_context["reasoning"])
print(receipt.strategy.executed_mode)
print(receipt.strategy.next_mode_hint)
print(receipt.evidence.missing_slots)

What reasoning_mode does:

keeps semantic graph QA as the first execution path
validates constrained Cypher before execution
retries with a small repair budget when slot fill is insufficient
avoids jumping straight to multi-agent debate

Parameter guidance:

reasoning_mode=False: fastest default path
reasoning_mode=True: enable bounded semantic repair for harder questions
repair_budget=0: no retry
repair_budget=1..2: practical default for hard graph questions
query_mode="graph_cot": explicit advanced override for the Graph-CoT execution lane
cot_mode=True: SDK shorthand for query_mode="graph_cot" on client.ask(...), client.ask_response(...), and client.semantic(...)
graph_ids=[...]: preferred public routing key when you know the graph target
databases=[...]: physical database scope when you are working at DB level

Graph-CoT mode still rides on the same public ask() contract:

receipt = client.ask_response(
    "What is Neo4j related to GraphRAG?",
    graph_ids=["kgnormal"],
    cot_mode=True,
)

print(receipt.answer_envelope["query_mode"])
print(receipt.strategy.executed_mode)
print(receipt.agent_pattern["pattern"])
print(receipt.graph_cot["guardrail_verdict"]["decision"])
print(receipt.graph_cot["final_answer"]["status"])

Use semantic() only when you explicitly want the advanced/debug graph-QA surface and need to control or inspect the semantic lane directly.

7. Use the Builder Surface

For application code, the builder can be easier to read.

result = (
    client.plan("Who manages the Seoul retail account?")
    .on_graph("kgnormal")
    .with_repair_budget(2)
    .run()
)

print(result.route)
print(result.support.status)
print(result.strategy.executed_mode)

Mode selection rules:

plan(...).run() defaults to semantic execution
plan(...).graph_cot().run() keeps direct semantic execution but selects the Graph-CoT sub-mode
plan(...).react() uses graph-scoped single-agent routing
plan(...).advanced() or plan(...).debate() uses explicit debate mode
result.strategy.advanced_debate_recommended tells you when debate is worth trying

8. Use Advanced Mode Only When You Really Need Debate

Debate is not the default retrieval path. It is for cross-graph comparison or cases where you explicitly want multiple graph specialists to answer.

advanced = client.advanced(
    "Compare what the baseline and finance graphs know about Alex.",
    graph_ids=["kgnormal", "kgfibo"],
)

print(advanced.debate_state)
print(advanced.response)
print(advanced.debate_results)

Equivalent explicit call:

advanced = client.debate(
    "Compare what the baseline and finance graphs know about Alex.",
    graph_ids=["kgnormal", "kgfibo"],
)

9. Inspect Graph Targets

for graph in client.graphs():
    print(graph.graph_id, graph.database, graph.ontology_id, graph.vocabulary_profile)

Expected fields:

graph_id
database
ontology_id
vocabulary_profile
description

10. Module-Level Convenience API

If you prefer import seocho style scripting:

import seocho

seocho.configure(base_url="http://localhost:8001", workspace_id="default")

print(seocho.ask("What do you know about Alex?"))

receipt = seocho.ask_response(
    "Who manages the Seoul retail account?",
    graph_ids=["kgnormal"],
    reasoning_mode=True,
    repair_budget=2,
)
print(receipt.response)
print(receipt.runtime_mode)

advanced = seocho.advanced(
    "Compare what the baseline and finance graphs know about Alex.",
    graph_ids=["kgnormal", "kgfibo"],
)
print(advanced.debate_state)

recent_runs = seocho.semantic_runs(limit=5, route="lpg")
print(recent_runs[0].run_id)

11. Async API

import asyncio
from seocho import AsyncSeocho


async def main() -> None:
    client = AsyncSeocho()
    answer = await client.ask("What does SEOCHO do?")
    print(answer)
    await client.aclose()


asyncio.run(main())

12. Publish A Portable SDK Runtime

When one developer builds a local SDK configuration and wants other developers to consume it over HTTP client mode, export a portable runtime bundle and serve it behind the small bundle HTTP adapter.

Author the bundle:

from seocho import DeepSeekBackend, LanceDBVectorStore, Seocho

client = Seocho(
    ontology=ontology,
    graph_store=graph_store,
    llm=DeepSeekBackend(model="deepseek-chat"),
    vector_store=LanceDBVectorStore(uri="./.lancedb", table_name="team_memory"),
    agent_config=agent_config,
)

bundle = client.export_runtime_bundle(
    "portable.bundle.json",
    app_name="team-memory-runtime",
    default_database="neo4j",
)

print(bundle.app_name)

You can also do the same from the CLI:

seocho bundle export \
  --schema schema.jsonld \
  --neo4j-uri bolt://localhost:7687 \
  --neo4j-user neo4j \
  --neo4j-password password \
  --provider deepseek \
  --model deepseek-chat \
  --database neo4j \
  --output portable.bundle.json

Serve the bundle:

seocho serve-http --bundle portable.bundle.json --host 0.0.0.0 --port 8010

Consume it from another process or machine with the standard HTTP client mode:

from seocho import Seocho

remote = Seocho(base_url="http://localhost:8010", workspace_id="default")

print(remote.ask("What do you know about Alex?"))
receipt = remote.ask_response("Who manages Seoul retail?", databases=["neo4j"])
print(receipt.runtime_mode)

Portable bundle limits in the current implementation:

supported backend export: Neo4jGraphStore
supported LLM export: OpenAI-compatible backends
- OpenAIBackend
- DeepSeekBackend
- KimiBackend
- GrokBackend
portable config only; custom Python strategies are rejected
intended HTTP surface: add, ask, chat, search, and basic semantic
not a replacement for the full main server runtime when you need debate, governance, or the complete artifact lifecycle

13. Choose A Provider And Vector Backend

SEOCHO supports any OpenAI-compatible chat-completion provider through a single plug-in surface. The provider names and models shown below are illustrative examples of that plug-in surface, not endorsements, partnerships, or recommendations — substitute the provider and model that match your own policy and cost profile.

from seocho import DeepSeekBackend, GrokBackend, KimiBackend, OpenAIBackend, QwenBackend

# Example instantiations — replace with the provider and model you use.
openai_llm = OpenAIBackend(model="<model-id>")
deepseek_llm = DeepSeekBackend(model="<model-id>")
kimi_llm = KimiBackend(model="<model-id>")
grok_llm = GrokBackend(model="<model-id>")
qwen_llm = QwenBackend(model="<model-id>")

Provider env vars follow the preset names: OPENAI_API_KEY, DEEPSEEK_API_KEY, MOONSHOT_API_KEY, XAI_API_KEY, and DASHSCOPE_API_KEY.

For semantic search, choose an in-memory or persistent vector backend:

from seocho import FAISSVectorStore, LanceDBVectorStore

faiss_vectors = FAISSVectorStore(model="text-embedding-3-small")
lancedb_vectors = LanceDBVectorStore(
    uri="./.lancedb",
    table_name="team_memory",
    model="text-embedding-3-small",
)

14. Inspect Semantic Run History

The runtime now keeps a SQLite-backed semantic run registry outside the graph store. You can inspect it from the SDK without touching internal files.

recent_runs = client.semantic_runs(limit=10, route="lpg")
print(recent_runs[0].run_id)
print(recent_runs[0].support.status)

run_record = client.semantic_run(recent_runs[0].run_id)
print(run_record.strategy.executed_mode)
print(run_record.evidence_summary)

15. Run Manual-Gold Evaluation

Use the SDK evaluation harness when you want a small regression matrix over question_only, reference_only, semantic_direct, and semantic_repair.

from seocho import ManualGoldCase, SemanticEvaluationHarness

harness = SemanticEvaluationHarness(client)
summary = harness.run_matrix(
    [
        ManualGoldCase(
            case_id="neo4j-1",
            question="What is Neo4j connected to?",
            graph_ids=["kgnormal"],
            expected_intent_id="relationship_lookup",
            required_slots={
                "source_entity": "Neo4j",
                "target_entity": "Cypher",
                "relation_paths": "USES",
            },
            preferred_relations=["USES"],
            repair_budget=2,
        )
    ]
)

print(summary.aggregate_metrics["semantic_direct"])
print(summary.aggregate_metrics["semantic_repair"])

These metrics are intentionally narrow:

intent_match_rate
support_rate
required_answer_slot_coverage_manual
preferred_evidence_hit_rate

14. CLI Equivalents

seocho serve
seocho add "Alex manages the Seoul retail account."
seocho search "Who manages the Seoul retail account?"
seocho chat "What do you know about Alex?"
seocho graphs
seocho stop

15. Mental Model

Use this decision rule:

Start with ask() for almost all querying.
Use ask_response() when you want the same surface plus runtime metadata.
Add reasoning_mode=True before reaching for debate.
Use semantic(...) only when you explicitly need the advanced graph-QA surface.
Use advanced() only when you explicitly want multi-agent comparison.

16. Agent-Level Sessions

Sessions maintain context across indexing and querying operations. Three execution modes are available via AgentConfig:

from seocho import Seocho, Ontology, AgentConfig, RoutingPolicy, AGENT_PRESETS
from seocho.store import Neo4jGraphStore, OpenAIBackend

onto = Ontology.from_jsonld("schema.jsonld")
store = Neo4jGraphStore("bolt://localhost:7687", "neo4j", "password")
llm = OpenAIBackend(model="gpt-4o-mini")

# Pipeline mode (default) — deterministic, no LLM reasoning about flow
s = Seocho(
    ontology=onto,
    graph_store=store,
    llm=llm,
    ontology_profile="finance-core",
)

with s.session("analysis") as sess:
    sess.add("ACME acquired Beta in 2024.")
    sess.add("Beta provides risk analytics to ACME.")
    # QueryAgent sees structured context derived from the same ontology
    answer = sess.ask("What does ACME know about Beta?")

ontology_profile is optional. When set, SEOCHO attaches a compact ontology_context descriptor to indexing metadata, query traces, and session agent context. The descriptor also includes the SKOS-style glossary hash derived from vocabulary terms and aliases, so glossary changes invalidate the context identity. This is the lightweight middleware seam that proves indexing and querying used the same shared ontology contract without adding a new storage format or hot-path reasoning dependency.

In local SDK mode, the same context is also persisted as compact _ontology_* properties on graph nodes and relationships. Query paths inspect those hashes and surface ontology_context_mismatch metadata when the active ontology profile differs from data already indexed in the target graph. SEOCHO does not block the answer automatically; the metadata is a re-indexing and audit guardrail for teams that change ontology profiles over time.

HTTP runtime mode exposes the same guardrail through typed SDK responses:

receipt = client.ask_response("Who manages Seoul retail?", databases=["kgnormal"])
print(receipt.ontology_context_mismatch["mismatch"])

chat = client.chat("What do we know about Seoul retail?", databases=["kgnormal"])
print(chat.ontology_context_mismatch.get("warning", ""))

result = client.plan("Compare Seoul retail sources").on_graph("retail_kg").advanced().run()
print(result.ontology_context_mismatch["databases"])

The same top-level field is available on router, debate, execution-plan, and platform chat responses. This keeps the library interface consistent: agent code asks a graph question, while SEOCHO carries ontology/database parity metadata beside the answer.

If the same ontology must also govern runtime ingest, reuse it instead of authoring a second payload:

runtime_artifacts = s.approved_artifacts_from_ontology()
runtime_prompt_context = s.prompt_context_from_ontology()

Agent mode (LLM decides tool execution order)

s = Seocho(
    ontology=onto, graph_store=store, llm=llm,
    agent_config=AgentConfig(execution_mode="agent"),
)

Supervisor with hand-off (explicit opt-in)

s = Seocho(
    ontology=onto, graph_store=store, llm=llm,
    agent_config=AgentConfig(
        execution_mode="supervisor",
        handoff=True,
        routing_policy=RoutingPolicy.thorough(),
    ),
)

with s.session("auto") as sess:
    sess.run("ACME acquired Beta in 2024.")      # → IndexingAgent
    sess.run("What does ACME know about Beta?")  # → QueryAgent

Routing policy presets

Preset	Latency	Tokens	Quality	Use when
`RoutingPolicy.fast()`	70%	20%	10%	Speed matters most
`RoutingPolicy.balanced()`	33%	33%	34%	General use
`RoutingPolicy.thorough()`	10%	10%	80%	Accuracy matters most

YAML agent design specs

If you want a reviewable agent setup checked into git, declare it in YAML and let SEOCHO compile it into AgentConfig plus ontology_profile.

from seocho import Ontology, Seocho

onto = Ontology.from_jsonld("schema.jsonld")

client = Seocho.from_agent_design(
    "examples/agent_designs/planning_multi_agent_finance.yaml",
    ontology=onto,
    llm="openai/gpt-4o-mini",
    workspace_id="finance-prod",
)

The YAML must include an ontology: section. If the section is missing, or it does not declare a binding like profile, ontology_id, package_id, or path, SEOCHO raises a ValueError.

See AGENT_DESIGN_SPECS.md and the examples/agent_designs/ directory for three starter patterns:

planning + multi-agent collaboration
reflection + chain-of-thought
memory + tool use

YAML indexing design specs

If you want graph-model-aware indexing checked into git, declare it in YAML and let SEOCHO materialize the ontology graph model plus local indexing defaults.

from seocho import Ontology, Seocho

onto = Ontology.from_jsonld("schema.jsonld")

client = Seocho.from_indexing_design(
    "examples/indexing_designs/lpg_finance_provenance.yaml",
    ontology=onto,
    llm="openai/gpt-4o-mini",
    workspace_id="finance-prod",
)

The YAML must include an ontology: section and must declare graph_model + storage_target. RDF-targeted designs must also declare a materialization.rdf_mode.

For graph_model: lpg, SEOCHO installs a property-graph-oriented extraction prompt by default so the model can preserve source-grounded scalar properties without collapsing period-specific metrics.

See INDEXING_DESIGN_SPECS.md and the examples/indexing_designs/ directory for starter designs covering:

LPG + provenance-first indexing
RDF + deductive expansion
hybrid + inquiry-cycle repair

17. Ontology Merge

Combine two ontologies when integrating new domains:

finance = Ontology.from_jsonld("finance.jsonld")
legal = Ontology.from_jsonld("legal.jsonld")

# Union: combine properties from both
combined = finance.merge(legal)
# → Company has revenue (finance) + jurisdiction (legal)

# Strict: raise on type conflicts
combined = finance.merge(legal, strategy="strict")

combined.to_jsonld("combined.jsonld")

Strategies: union (default), left_wins, right_wins, strict.

18. Where Ontology And Runtime Files Live

Common locations:

default ontology file in CLI examples: schema.jsonld
local graph state for docker stack: data/neo4j/
semantic artifact store: outputs/semantic_artifacts/
rule profile registry: outputs/rule_profiles/rule_profiles.db
semantic run metadata: outputs/semantic_metadata/
trace file path: SEOCHO_TRACE_JSONL_PATH

Use these commands to inspect them directly:

seocho ontology check --schema schema.jsonld
seocho ontology export --schema schema.jsonld --format shacl --output shacl.json
seocho artifacts list --status approved
curl -sS "http://localhost:8001/semantic/artifacts?workspace_id=default" | jq .

See FILES_AND_ARTIFACTS.md for the full map.

The practical ontology file flow is:

save the ontology as schema.jsonld
inspect it with seocho ontology check --schema schema.jsonld
derive SHACL with seocho ontology export --schema schema.jsonld --format shacl --output shacl.json
use the same ontology to build runtime artifacts with approved_artifacts_from_ontology()

19. Read Next

APPLY_YOUR_DATA.md
../QUICKSTART.md
GRAPH_MEMORY_API.md

That keeps the default path deterministic and inspectable while still leaving a high-power advanced mode available.