Open Source Playbook

Synced automatically from seocho/docs/OPEN_SOURCE_PLAYBOOK.md

This guide is for contributors who want to adapt SEOCHO to real domain data.

Recommended order for external teams:

  1. load your raw data through runtime ingest
  2. define ontology/schema artifacts
  3. validate semantic routing on your graph
  4. tighten rule governance with /rules/assess
  5. add integration tests before feature expansion

Load raw records through runtime ingest:

curl -sS -X POST http://localhost:8001/platform/ingest/raw \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id":"default",
    "target_database":"mydomain",
    "records":[
      {"id":"r1","content":"Entity A acquired Entity B."},
      {"id":"r2","content":"Entity B supplies analytics to Entity C."}
    ]
  }' | jq .
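
The same ingest request can also be prepared from Python. The following is a minimal sketch, assuming only the endpoint and field names shown in the curl command above. The helpers `build_ingest_payload` and `post_json` are hypothetical and not part of SEOCHO, and the check that every record carries both `id` and `content` is an assumption drawn from the example records.

```python
import json
import urllib.request

# Endpoint taken from the curl example; adjust host and port for your deployment.
INGEST_URL = "http://localhost:8001/platform/ingest/raw"


def build_ingest_payload(records, workspace_id="default", target_database="mydomain"):
    """Build the request body used by the curl example (hypothetical helper)."""
    for record in records:
        # Assumption: every record needs "id" and "content", as in the example.
        missing = {"id", "content"} - record.keys()
        if missing:
            raise ValueError(f"record is missing fields: {sorted(missing)}")
    return {
        "workspace_id": workspace_id,
        "target_database": target_database,
        "records": list(records),
    }


def post_json(url, payload, timeout=30):
    """POST a JSON body and return the decoded JSON response (hypothetical helper)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running locally, `post_json(INGEST_URL, build_ingest_payload(records))` would send the request. Building the payload separately keeps it easy to check in a test without touching the network.
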

Ensure the full-text index exists for the target database:

curl -sS -X POST http://localhost:8001/indexes/fulltext/ensure \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id":"default",
    "databases":["mydomain"],
    "create_if_missing":true
  }' | jq .

Send a semantic-mode chat request and inspect the selected route:

curl -sS -X POST http://localhost:8501/api/chat/send \
  -H "Content-Type: application/json" \
  -d '{
    "session_id":"oss_semantic_1",
    "message":"What entities are linked in mydomain?",
    "mode":"semantic",
    "workspace_id":"default",
    "databases":["mydomain"]
  }' | jq '{assistant_message, route: .runtime_payload.route}'
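
For scripted checks, the jq filter above can be mirrored in Python. This is a minimal sketch: the function name `summarize_chat_response` and the sample response body are hypothetical. Only the keys `assistant_message` and `runtime_payload.route` are taken from the jq expression.

```python
def summarize_chat_response(response):
    """Mirror the jq filter {assistant_message, route: .runtime_payload.route}."""
    # Missing keys map to None, much as jq yields null for absent fields.
    runtime_payload = response.get("runtime_payload") or {}
    return {
        "assistant_message": response.get("assistant_message"),
        "route": runtime_payload.get("route"),
    }


# Hypothetical response body, for illustration only; real values will differ.
sample = {
    "assistant_message": "Entity A is linked to Entity B.",
    "runtime_payload": {"route": "example_route", "other": "ignored"},
}
print(summarize_chat_response(sample))
```

Comparing the extracted `route` against the route you expect for a given question is one way to validate semantic routing on your own graph.
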
  • baseline examples:
    • extraction/conf/schemas/baseline.yaml
    • extraction/conf/schemas/fibo.yaml

Build an ontology hints artifact from an OWL file:

python scripts/ontology/build_ontology_hints.py \
  --ontology ./path/to/domain.owl \
  --output output/ontology_hints.json

Use this artifact to improve semantic candidate reranking without heavy runtime ontology reasoning.

Core extension points:

  • extraction/semantic_query_flow.py: semantic extraction/resolution and route choice
  • extraction/agent_factory.py: per-database agent creation
  • extraction/debate.py: fan-out/collect/synthesis orchestration
  • extraction/platform_agents.py: backend/frontend runtime payload shaping

When adding a new route policy, keep:

  • workspace_id propagation
  • runtime permission checks (extraction/policy.py)
  • trace metadata contract for UI DAG rendering (node_id, parent_id, parent_ids)
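
A quick structural check can catch trace metadata regressions before they reach the UI. The sketch below is illustrative only. The field names `node_id`, `parent_id`, and `parent_ids` come from the list above, but their semantics here are assumptions: `parent_id` is treated as a single optional reference, `parent_ids` as an optional list, and every reference is expected to resolve to a known `node_id`. The function `validate_trace_nodes` is hypothetical and not part of SEOCHO.

```python
def validate_trace_nodes(nodes):
    """Return a list of problems found in a list of trace-node dicts."""
    problems = []
    known_ids = set()

    # First pass: every node needs a unique, non-empty node_id.
    for node in nodes:
        node_id = node.get("node_id")
        if not node_id:
            problems.append(f"node without node_id: {node!r}")
        elif node_id in known_ids:
            problems.append(f"duplicate node_id: {node_id}")
        else:
            known_ids.add(node_id)

    # Second pass: every parent reference must point at a known node.
    for node in nodes:
        parents = list(node.get("parent_ids") or [])
        if node.get("parent_id") is not None:
            parents.append(node["parent_id"])
        # dict.fromkeys removes duplicate references while keeping order.
        for parent in dict.fromkeys(parents):
            if parent not in known_ids:
                problems.append(f"{node.get('node_id')}: unknown parent {parent}")
    return problems
```

An empty result means the nodes are structurally consistent under these assumptions. It does not confirm that the UI renders them as intended.
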

5. Required quality gates for contributors

Run these before opening a PR:

make test
make test-integration
make e2e-smoke
scripts/pm/lint-agent-docs.sh

If you changed runtime API or orchestration behavior, include at least one integration test in extraction/tests/.

  • prefer small, contract-first changes over broad refactors
  • add ADR + decision log entry for architecture-significant changes
  • document user-visible behavior in README.md and docs/QUICKSTART.md
  • avoid adding runtime-only dependencies on heavy ontology reasoning tools

Source-of-truth docs live in this repository.

For stable website publishing, update these together:

  • docs/README.md
  • docs/QUICKSTART.md
  • docs/ARCHITECTURE.md
  • docs/WORKFLOW.md

Then push to main after running local quality gates.