Skip to content

Architecture

Source mirrored from seocho/docs/ARCHITECTURE.md

SEOCHO transforms unstructured data into structured Knowledge Graphs. It implements an asynchronous, dynamic Agent Pool architecture leveraging a Parallel Debate pattern to answer complex queries against dynamically provisioned graph targets.

Current baseline:

  • Agent runtime: OpenAI Agents SDK
  • Trace/evaluation contract: vendor-neutral (none|console|jsonl|opik)
  • Preferred team backend: Opik
  • Canonical neutral artifact: JSONL
  • Graph backend: DozerDB (Neo4j protocol compatible)

SEOCHO is an ontology-aligned modular monolith. The product boundary is one Python SDK plus one runtime shell, but the internal system is split into clear planes:

  • Public facade: Seocho.add(), Seocho.ask(), Seocho.local(), and Seocho.remote() stay stable for users.
  • Data plane: seocho/index/* owns document/file ingestion, extraction, linking, rule assessment, and graph writes.
  • Control plane: seocho/query/* owns intent extraction, ontology-aware strategy selection, evidence bundles, semantic routing, debate reuse, and answer synthesis.
  • Ontology plane: seocho/ontology.py and seocho/ontology_context.py provide the schema contract and compact context hash that both indexing and querying carry.
  • Runtime shell: runtime/* owns HTTP routes, request validation, policy, readiness, workspace/database registry, and deployment-specific health.
  • Compatibility shell: extraction/* is being reduced to legacy import compatibility and batch-only helpers while canonical logic moves into seocho/* and runtime/*.

The key invariant is that ontology context must survive the full loop:

ontology -> indexing prompt -> graph write metadata -> query intent/evidence
-> agent/debate trace -> answer support assessment

FinDER-style evaluation is useful because it exercises that invariant end to end. A case is not considered healthy just because the runtime returns support_status=supported; benchmark artifacts also track answer matching, missing slots, semantic evidence reuse, tool calls, reasoning attempts, and token usage so false support claims become visible.

Default local activation is intentionally smaller than the full historical repo:

  • neo4j
  • extraction-service
  • evaluation-interface

This is the supported first-run path for developers. The old semantic-service still exists, but only behind the explicit compose profile:

Terminal window
docker compose --profile legacy-semantic up -d semantic-service

That distinction matters because most current onboarding, API verification, and platform UX flow through extraction-service, not the standalone legacy semantic container.

During the staged runtime rename, extraction-service still starts the legacy flat agent_server module. The default compose path now bakes extraction/, runtime/, and seocho/ into the image so the runtime matches a known build snapshot. The flat extraction/* compatibility aliases still bootstrap canonical runtime/* modules inside that image. Live bind mounts remain available only through docker-compose.dev.yml for explicit development loops.

The main local artifacts are deliberately file-system visible:

  • ontology contract: usually schema.jsonld
  • graph state for the local compose stack: data/neo4j/
  • semantic artifacts: outputs/semantic_artifacts/
  • rule profile registry: outputs/rule_profiles/rule_profiles.db
  • semantic run metadata: outputs/semantic_metadata/
  • traces: path from SEOCHO_TRACE_JSONL_PATH

Operators should inspect these directly during debugging instead of treating the runtime as opaque. See /docs/files_and_artifacts/ for the concrete commands.

This board is the active architecture priority order.

  • isolate OpenAI Agents SDK calls behind one adapter layer
  • enforce contract tests for runner signatures and response payload shape
  • block release when strict runtime e2e smoke fails
  • only create debate agents for databases that actually exist and are reachable
  • expose degraded/partial state in runtime payload when some DB agents are skipped
  • keep routing safe when registry includes stale DB names
  • migrate id(node) usage to elementId(node) contract-safe path
  • validate query templates against DozerDB/Neo4j deprecation warnings
    • status: semantic query flow path migrated to elementId contract
  • separate runtime API readiness from batch/notebook pipeline readiness
  • split health checks and operational alerts by service role
    • status: /health/runtime and /health/batch endpoints added
  • define ready | degraded | blocked for each graph agent
  • make router/supervisor allocate requests using readiness metadata
    • status: debate runtime emits readiness state and platform fallback on blocked
  • enforce /rules/assess as promotion gate for rule profile rollout
  • expand CI integration checks for ontology artifact and rule export paths

Architecture changes are not complete unless users can run this path directly:

  1. ingest raw data via /platform/ingest/raw
  2. ensure semantic index via /indexes/fulltext/ensure
  3. ask semantic/debate questions via /api/chat/send
  4. inspect trace and runtime payload in UI
  5. pass make e2e-smoke

This critical path is the default acceptance gate for user-facing releases.

  • agent orchestration and routing policies
  • runtime authorization policy enforcement
  • deployment quality gates and ADR governance

Canonical SDK control-plane modules:

  • seocho/agent/
  • seocho/query/
  • seocho/http_transport.py
  • seocho/ontology.py as the stable public ontology facade
  • seocho/ontology_context.py for compact shared ontology context descriptors and cache
  • seocho/ontology_serialization.py for JSON-LD persistence
  • seocho/ontology_artifacts.py for runtime artifact and typed prompt generation
  • seocho/ontology_governance.py for offline diff/check/export flows
  • seocho/runtime_contract.py

Primary modules:

  • runtime/agent_server.py
  • runtime/server_runtime.py
  • runtime/policy.py
  • docs/decisions/

extraction/ is a historical package name, not the desired long-term runtime name.

Long-term package shape:

  • seocho/
    • canonical engine modules
  • runtime/
    • deployment shell, route wiring, policy, readiness, registry
  • extraction/
    • extraction-only helpers or compatibility wrappers during migration

We are intentionally choosing runtime/ over server/ because the shell owns more than HTTP route files. The staged migration contract lives in docs/RUNTIME_PACKAGE_MIGRATION.md. Contributor-facing placement guidance lives in docs/MODULE_OWNERSHIP_MAP.md.

ADR-0080 introduced internal seams for the modular monolith. The important question now is where those seams are actually in use.

Current wiring:

  • local SDK ingest path: Seocho.add() -> seocho/local_engine._LocalEngine -> seocho.index.ingestion_facade.IngestionFacade -> seocho.index.pipeline.IndexingPipeline
  • runtime semantic composition root: runtime/server_runtime.py now builds the shared SemanticAgentFlow through the canonical seocho.query.AgentFactory
  • runtime graph reads: runtime/memory_service.py and runtime Cypher tool execution now route read queries through seocho.query.QueryProxy
    • QueryProxy now owns the typed query-result contract so runtime and semantic paths can distinguish empty results from connector/contract failures
  • runtime readiness model: runtime.agent_readiness.summarize_readiness() normalizes debate availability onto runtime.agent_state.AgentStateMachine

Explicit non-goal for this slice:

  • the legacy debate specialist-agent factory in extraction/agent_factory.py is still in place. That path remains compatible, but it is not yet the canonical factory seam.

Ontology Module Boundaries (Active Direction)

Section titled “Ontology Module Boundaries (Active Direction)”

Ontology remains a first-class SDK primitive, but it should no longer behave as one monolithic implementation file.

  • seocho/ontology.py
    • stable public Ontology, NodeDef, RelDef, and P surface
    • schema validation, SHACL derivation, and prompt-facing API entrypoints
  • seocho/ontology_context.py
    • stable ontology_context_hash descriptor shared by indexing, query, traces, and agent sessions
    • small in-process cache for compiled extraction/query/agent context artifacts
    • SKOS-style glossary/vocabulary hash derived from ontology labels, aliases, properties, and relationship terms
    • graph-write metadata helpers that attach compact _ontology_* properties to persisted nodes and relationships
    • query guardrail helpers that compare active ontology context with indexed graph context hashes
    • typed HTTP response metadata for memory search/chat, semantic query, router, debate, execution-plan, and platform chat clients
  • Target middleware contract: docs/ONTOLOGY_RUN_CONTEXT_STRATEGY.md
    • aligns workspace, graph/database scope, ontology profile, glossary identity, policy, tool use, debate, session carryover, and evidence status
  • Target property-graph lens contract: docs/PROPERTY_GRAPH_LENS_STRATEGY.md
    • preserves schemaless graph flexibility while marking only agent-visible anchors, evidence sources, evidence paths, provenance, importance, confidence, and context metadata
  • seocho/ontology_serialization.py
    • canonical JSON-LD load/save helpers
    • no runtime governance side effects
  • seocho/ontology_artifacts.py
    • runtime-facing typed artifact promotion
    • approved artifacts, semantic prompt context, vocabulary shaping
  • seocho/ontology_governance.py
    • offline check/diff/export/OWL inspection path

Canonical direction:

  • local SDK and runtime promotion paths should consume explicit ontology-side contracts instead of hand-built client glue
  • public API compatibility should stay centered on Ontology
  • heavy governance and OWL inspection stays out of the request hot path

Client Facade Boundaries (Active Direction)

Section titled “Client Facade Boundaries (Active Direction)”

Seocho should stay a public facade, not a second home for canonical engine logic.

  • seocho/client.py
    • public SDK facade and orchestration entrypoints
  • seocho/http_transport.py
    • HTTP request/response wrapping and exception mapping
  • seocho/client_artifacts.py
    • ontology-to-runtime-artifact bridge helpers
  • seocho/query/*, seocho/agent/*, seocho/ontology_*
    • canonical engine subdomains used by the facade

Canonical direction:

  • keep the constructor and top-level SDK calls stable
  • move HTTP transport, ontology bridge helpers, and local engine internals out of client.py over incremental slices
  • avoid adding new canonical business logic directly to the facade
  • data ingestion and extraction pipeline
  • SHACL-like rule inference and validation
  • graph loading/query execution on DozerDB

Primary modules:

  • seocho/rules.py — canonical rule inference/validation (shared by SDK + server)
  • seocho/index/pipeline.py — canonical indexing pipeline with rule + embedding support
  • seocho/index/linker.py — canonical embedding-based entity linker
  • extraction/pipeline.py — legacy batch pipeline
  • extraction/rule_constraints.py — re-export shim to seocho.rules
  • extraction/vector_store.py — adapter shim to seocho.store.vector
  • extraction/graph_loader.py

Performance and quality must be measured in two tracks:

  • private finance corpus
    • ingestion, graph construction, finance-domain QA
  • GraphRAG-Bench
    • retrieval, evidence quality, reasoning

Measurement order:

  1. SEOCHO local SDK baseline
  2. SEOCHO runtime baseline
  3. peer baselines

That rule prevents us from confusing deployment overhead with canonical engine quality. See docs/BENCHMARKS.md.

[External Data]
DataSource (CSV/JSON/Parquet/API)
OntologyPromptBridge (ontology → LLM prompt injection)
EntityExtractor (LLM-based entity/relationship extraction)
EntityLinker (LLM-based entity resolution)
EntityDeduplicator (embedding similarity dedup)
RuleConstraints (SHACL-like inference + validation)
DatabaseManager (Neo4j DB creation + schema + load)
AgentFactory (DB별 전용 Agent 생성)
User Question → Router/DebateOrchestrator → AgentPool → SharedMemory → Supervisor → Answer

For graph QA with hard entity disambiguation requirements:

User Question
Semantic Layer
- extract question entities
- fulltext lookup on DozerDB/Neo4j
- semantic dedup/disambiguation with label hints
RouterAgent
├── LPGAgent (property graph neighborhood path)
└── RDFAgent (RDF/ontology-oriented path)
AnswerGenerationAgent (final synthesis)

Why this path exists:

  • query-time entity mapping is the hardest failure point in graph QA
  • fulltext-first lookup improves recall for imperfect user entity strings
  • semantic re-ranking + dedup reduces wrong-node selection before Cypher generation

Canonical direction:

  • seocho/query/ is the canonical query engine surface
  • local SDK and server runtime should share planner, executor, and answer shaping contracts from seocho/query/*
  • semantic query Phase A now also shares intent, support assessment, strategy selection, Cypher validation, and insufficiency contracts from seocho/query/{intent,strategy_chooser,cypher_validator,insufficiency,contracts}.py
  • semantic query Phase B now also shares constraint-slice building and run metadata persistence from seocho/query/{constraints,run_registry}.py
  • semantic query Phase C now also shares entity resolution, route selection, LPG/RDF specialists, and answer framing from seocho/query/semantic_agents.py
  • semantic query Phase D now also shares SemanticAgentFlow orchestration from seocho/query/semantic_flow.py; extraction keeps runtime injection and compatibility aliases only
  • extraction/* should keep transport and runtime orchestration concerns, not grow a second query engine

Server entrypoint direction:

  • runtime/agent_server.py is the FastAPI transport shell
  • runtime/server_runtime.py owns shared runtime service composition
  • routers should prefer lazy service getters over eager singleton boot at import

Extraction Cleanup Classification (Current)

Section titled “Extraction Cleanup Classification (Current)”

The extraction layer is being reduced toward transport, provisioning, and compatibility roles.

  • shim now:
    • extraction/rule_constraints.py
    • extraction/vector_store.py
  • compatibility alias now:
    • extraction/agent_server.py
    • extraction/agent_readiness.py
    • extraction/middleware.py
    • extraction/memory_service.py
    • extraction/public_memory_api.py
    • extraction/server_runtime.py
    • extraction/policy.py
  • keep as transport/composition:
    • runtime/agent_server.py
    • runtime/agent_readiness.py
    • runtime/middleware.py
    • runtime/memory_service.py
    • runtime/public_memory_api.py
    • runtime/server_runtime.py
    • runtime/runtime_ingest.py
  • keep as compatibility caller over canonical seam:
    • extraction/pipeline.py
  • compatibility alias now:
    • extraction/runtime_ingest.py
  • canonical helper seams now shared with runtime ingest:
    • seocho/index/runtime_memory.py
    • seocho/index/runtime_artifacts.py
  • migrate later:
    • remaining runtime ingest orchestration beyond extraction/linking setup

Shared ingestion seam:

  • seocho/index/extraction_engine.py
    • canonical extraction prompt rendering
    • canonical linking prompt rendering
    • canonical graph payload normalization
    • reused by SDK indexing, extraction compatibility pipeline, and runtime ingest prompt-driven extraction/linking setup

Intent-First Graph-RAG Contract (Active Direction)

Section titled “Intent-First Graph-RAG Contract (Active Direction)”

Semantic QA and memory-first answering should move toward an explicit intent-to-evidence contract:

  • intent_id
  • required_relations
  • required_entity_types
  • focus_slots
  • selected_triples
  • slot_fills
  • missing_slots
  • provenance and confidence

This keeps graph retrieval accountable to answerability rather than vague local relevance.

Primary implementation anchors:

  • seocho/query/intent.py
  • seocho/query/strategy_chooser.py
  • seocho/query/cypher_validator.py
  • seocho/query/insufficiency.py
  • extraction/semantic_query_flow.py as the compatibility surface during migration
  • runtime/memory_service.py
  • runtime/public_memory_api.py
  • seocho/types.py

Reference design brief: docs/GRAPH_RAG_AGENT_HANDOFF_SPEC.md

Enterprise Vocabulary Layer (Planned Direction)

Section titled “Enterprise Vocabulary Layer (Planned Direction)”

To reduce keyword brittleness in graph retrieval, SEOCHO adopts a governed vocabulary layer derived from extraction and SHACL-like artifacts.

Control-plane responsibilities:

  • maintain vocabulary lifecycle (draft -> approved -> deprecated)
  • enforce approval policy and promotion gates before runtime exposure
  • manage global baseline vocabulary and workspace-scoped override policy

Data-plane responsibilities:

  • generate vocabulary candidates from entity extraction/linking results
  • enrich candidate terms from SHACL-like rule artifacts (labels, aliases, value constraints)
  • persist artifacts for review and provenance audit

Runtime contract:

  • resolve query terms using approved vocabulary (global baseline + workspace_id override)
  • apply lightweight expansion/disambiguation only in hot path
  • keep heavy ontology reasoning in offline governance path (owlready2)

Offline ontology governance operators should prefer the SDK CLI surface:

  • seocho ontology check --schema schema.jsonld
  • seocho ontology export --schema schema.jsonld --format shacl --output shacl.json
  • seocho ontology diff --left old.jsonld --right new.jsonld
  • seocho ontology inspect-owl --source ontology.owl
ModuleFilePurpose
DataSourceextraction/data_source.pyABC + FileDataSource(CSV/JSON/Parquet) + APIDataSource
DataCollectorextraction/collector.pyLegacy HuggingFace collector (backward compat)

Standard record format: {"id": str, "content": str, "category": str, "source": str, "metadata": dict}

| CanonicalExtractionEngine | seocho/index/extraction_engine.py | Shared extraction/linking prompt + normalization seam for SDK and extraction compatibility paths | | RuntimeMemoryHelpers | seocho/index/runtime_memory.py | Shared deterministic memory-graph shaping helpers for SDK/runtime ingestion paths | | RuntimeArtifactHelpers | seocho/index/runtime_artifacts.py | Shared deterministic runtime semantic-artifact merge, vocabulary, and summary helpers | | OntologyPromptBridge | extraction/ontology_prompt_bridge.py | Backward-compatible ontology → prompt bridge; new code should prefer ontology contracts directly | | EntityExtractor | extraction/extractor.py | Legacy OpenAI extractor wrapper retained for compatibility paths that have not yet moved to the canonical seam | | EntityLinker | extraction/linker.py | Legacy LLM linker wrapper retained for compatibility paths that have not yet moved to the canonical seam | | EntityDeduplicator | extraction/deduplicator.py | Embedding cosine similarity-based semantic deduplication | | RuleConstraints | extraction/rule_constraints.py | SHACL-like rule inference and node constraint validation annotations | | PromptManager | extraction/prompt_manager.py | Jinja2 prompt templating + history logging |

ModuleFilePurpose
DatabaseRegistryextraction/config.pyRuntime-extensible DB name allowlist (singleton: db_registry)
GraphRegistryextraction/config.pyGraph target descriptors (graph_id -> uri/database/ontology/vocabulary)
DatabaseManagerextraction/database_manager.pyDB provisioning + schema + data loading
GraphLoaderextraction/graph_loader.pyNeo4j MERGE operations (label-validated)
SchemaManagerextraction/schema_manager.pyConstraint/index application
ModuleFilePurpose
AgentsRuntimeAdapterextraction/agents_runtime.pySDK compatibility layer for runner/trace contracts (starting_agent vs agent)
AgentFactoryextraction/agent_factory.pyDynamically provisions dedicated Agents per graph target
SharedMemoryextraction/shared_memory.pyRequest-scoped shared memory between agents + query cache
DebateOrchestratorextraction/debate.pyExecutes Parallel Debate pattern (fan-out → collect → synthesize)
Agent Serverruntime/agent_server.pyFastAPI endpoints (/run_agent, /run_debate, /run_agent_semantic, /platform/chat/send, /platform/ingest/raw)
Agent Readinessruntime/agent_readiness.pyRuntime readiness summary helpers for graph-agent orchestration
Request Middlewareruntime/middleware.pyRequest ID correlation middleware for runtime HTTP endpoints
Memory Serviceruntime/memory_service.pyMemory-first runtime facade over ingest and semantic graph search
Platform Agentsextraction/platform_agents.pyBackend/Frontend specialist orchestration + session state management
Runtime Raw Ingestorruntime/runtime_ingest.pyRuntime raw-text extraction/linking/rule-annotation and DB loading
ModuleFilePurpose
Tracingextraction/tracing.pyVendor-neutral trace contract with optional Opik exporter activation: configure_opik(), wrap_openai_client(), @track, update_current_span
Configextraction/config.pySEOCHO_TRACE_BACKEND, SEOCHO_TRACE_JSONL_PATH, SEOCHO_TRACE_OPIK_MODE, OPIK_URL, OPIK_WORKSPACE, OPIK_PROJECT_NAME
ModuleFilePurpose
Platform Serverevaluation/server.pyCustom chat frontend backend (proxy + static hosting)
Platform UIevaluation/static/Interactive chat UI (trace + candidate override + raw ingest controls)
User → Router → {GraphAgent, VectorAgent, WebAgent, TableAgent} → Supervisor → Answer
  • Legacy static 7-agent pipeline.
  • The Router delegates the query to exactly 1 specialist agent.
  • Sequential hand-off chain execution.

2. Parallel Debate Mode (POST /run_debate)

Section titled “2. Parallel Debate Mode (POST /run_debate)”
User → DebateOrchestrator → [Agent_db1 ∥ Agent_db2 ∥ ... ∥ Agent_dbN] → SharedMemory → Supervisor → Answer
  • All database-bound agents execute concurrently via asyncio.gather().
  • Each agent stores its isolated conclusion in Shared Memory.
  • A Supervisor Agent aggregates and synthesizes the final comprehensive answer.
  • Fault Isolated: If one agent fails, the Supervisor synthesizes using the remaining successful agent answers.
  • Runtime payload includes per-database readiness metadata (agent_statuses) and overall degraded flag (degraded) for safe UI/ops handling.

3. Semantic Graph QA Mode (POST /run_agent_semantic)

Section titled “3. Semantic Graph QA Mode (POST /run_agent_semantic)”
User → SemanticLayer → RouterAgent → {LPGAgent | RDFAgent | both} → AnswerGenerationAgent
  • 4-agent model aligned with current query-time disambiguation requirement
  • semantic layer performs:
    • question entity extraction
    • fulltext index candidate search
    • contains-based fallback search
    • optional ontology-hint alias/label boost from offline artifact (output/ontology_hints.json)
    • lightweight semantic dedup/disambiguation
  • frontend can pin candidate mappings and resend as entity_overrides
  • route policy:
    • RDF hints (rdf, owl, shacl, sparql) → RDFAgent
    • default → LPGAgent
    • mixed hints → hybrid path (both agents)
from config import db_registry
db_registry.register("mydb01") # 등록
db_registry.is_valid("mydb01") # 검증 (True)
db_registry.list_databases() # 사용자 DB 목록 (system/neo4j 제외)
  • DB Name Validation: ^[A-Za-z][A-Za-z0-9]*$ (Must start with a letter, alphanumeric only)
  • VALID_DATABASES (legacy fallback lookup) statically references db_registry._databases.
from config import graph_registry
graph_registry.list_graph_ids() # graph IDs used by debate/runtime APIs
graph_registry.get_graph("kgfibo") # uri/database/ontology/vocabulary descriptor
  • Graph targets are loaded from SEOCHO_GRAPH_REGISTRY_FILE.
  • Each graph_id can point to a different Neo4j or DozerDB instance.
  • Debate mode fans out by graph_id, not only by database name.
factory = AgentFactory(neo4j_connector)
agent = factory.create_graph_agent(graph_target, schema_info)
  • Each agent’s query tool is closure-bound directly to a specific graph target context.
  • Automatic SharedMemory cache integration (via RunContextWrapper).
memory = SharedMemory()
memory.cache_query_result("kgnormal", "MATCH (n) RETURN n", "[...]")
memory.get_cached_query("kgnormal", "MATCH (n) RETURN n") # cache hit
memory.put("agent_result:kgnormal", "answer text")
memory.get_all_results() # Supervisor용 전체 결과
  • Request-scoped: Exactly 1 instance instantiated per API request.
  • Implements MD5 hashing on Cypher queries for rapid caching lookups.
from ontology.base import Ontology
from ontology_prompt_bridge import OntologyPromptBridge
ontology = Ontology.from_yaml("conf/schemas/fibo.yaml")
bridge = OntologyPromptBridge(ontology)
context = bridge.render_extraction_context()
# → {"entity_types": "- Organization: ...", "relationship_types": "...", "ontology_name": "FIBO"}
  • Converts NodeDefinition and RelationshipDefinition from the Ontology YAML into native LLM prompt variables.
  • The default.yaml prompt dynamically switches between static mapping vs dynamic logic using standard {% if ontology_name %} templating blocks.

두 시스템의 역할이 명확히 분리되어 있습니다:

  • Purpose: Interactive operational chat UX + semantic disambiguation override loop.
  • Location: evaluation/server.py + evaluation/static/* (listening on port 8501).
  • Tracing: The backend streams trace_steps and ui_payload to the frontend for real-time visualization.
  • Capabilities: Allows the user to select semantic candidates, trigger override re-queries, and maintains strict session-level dialog history.
  • Additional capability: Supports direct raw-text ingestion (Ingest Raw) before query execution.
  • Purpose: Team-grade eval, debugging, LLM scoring, and deep agent visualization.
  • Location: http://localhost:5173 (opt-in via docker compose --profile opik up -d).
  • Tracing: activated only when the trace backend is explicitly set to opik; otherwise runtime traces can stay local in JSONL or console form.
  • Capabilities: Captures parent-child span trees, exact LLM costs/latency, custom datasets & experiments, and prompt scoring metrics.

Opik에서 Debate 패턴은 다음과 같은 span hierarchy로 표현됩니다:

agent_server.run_debate [tags: debate-mode]
└─ debate.run_debate [phase: orchestration, agent_count: N]
├─ debate.run_single_agent [phase: fan-out, db: kgnormal]
│ └─ (OpenAI chat.completions.create) [auto-traced]
├─ debate.run_single_agent [phase: fan-out, db: kgfibo]
│ └─ (OpenAI chat.completions.create) [auto-traced]
└─ debate.supervisor_synthesis [phase: synthesis]
└─ (OpenAI chat.completions.create) [auto-traced]

Each individual recursive span carries heavy metadata (e.g., db_name, agent_name, phase) and diagnostic tags allowing rapid filtering and auditing within the Opik Dashboard UI.

FANOUT (yellow) ─┬─ DEBATE: Agent_kgnormal (blue)
├─ DEBATE: Agent_kgfibo (blue)
└─ DEBATE: Agent_xxx (blue)
COLLECT (orange) ← 모든 DEBATE
SYNTHESIS: Supervisor (green)

Edge routing in the UI is rendered strictly via the metadata.parent (for the fan-out trajectory) and metadata.sources (for the collection synthesis loop) properties on the payload.

OPENAI_API_KEY=sk-...
OPENAI_MODEL=gpt-4o-mini
DEEPSEEK_API_KEY=
MOONSHOT_API_KEY=
XAI_API_KEY=
NEO4J_URI=bolt://localhost:7687
NEO4J_USER=neo4j
NEO4J_PASSWORD=password
# Opik (opt-in)
SEOCHO_TRACE_BACKEND=none
SEOCHO_TRACE_JSONL_PATH=./traces/seocho-runtime.jsonl
SEOCHO_TRACE_OPIK_MODE=self_host
OPIK_VERSION=1.10.18
OPIK_URL=http://opik-backend:8080
OPIK_WORKSPACE=default
OPIK_PROJECT_NAME=seocho
OPIK_API_KEY=
conf/
├── config.yaml # Reference defaults (runtime is env-first)
├── prompts/
│ ├── default.yaml # Extraction prompt (supports ontology variables)
│ ├── linking.yaml # Entity linking prompt
└── ingestion/
├── config.yaml # Ingestion dataset/openai/neo4j + schema selector
└── schema/
├── baseline.yaml
├── fibo.yaml
└── tracing.yaml
EndpointMethodDescription
/run_agentPOSTLegacy router mode
/run_agent_semanticPOSTSemantic entity-resolution flow (Router/LPG/RDF/Answer)
/run_debatePOSTParallel debate mode
/indexes/fulltext/ensurePOSTEnsure fulltext index exists for semantic routing
/platform/chat/sendPOSTCustom interactive platform chat endpoint
/platform/ingest/rawPOSTRuntime raw record ingestion into target graph DB
/platform/chat/session/{session_id}GETRead platform session history
/platform/chat/session/{session_id}DELETEReset platform session
/semantic/artifacts/draftsPOSTSave ontology/SHACL/vocabulary candidates as draft artifact
/semantic/artifactsGETList semantic artifacts (draft/approved/deprecated)
/semantic/artifacts/{artifact_id}GETRead one semantic artifact
/semantic/artifacts/{artifact_id}/approvePOSTPromote draft artifact to approved
/semantic/artifacts/{artifact_id}/deprecatePOSTDeprecate approved artifact for runtime vocabulary retirement
/databasesGETList registered databases
/agentsGETList active DB-bound agents

Offline helper:

  • scripts/ontology/build_ontology_hints.py builds output/ontology_hints.json from OWL via owlready2.