EchoVessel Internal Architecture Deep Dive

v0.0.1-alpha · 2026.04 · main @ 852fb62 · 916 tests green
Web channel
FastAPI + SSE · 127.0.0.1:7777 · React + Vite UI · God-view observer of every other channel
Discord DM channel
discord.py bot · per-user DM · native OGG Opus voice messages via ffmpeg
iMessage channel
spec stub only · will drop into runtime mirror with zero frontend change
WeChat channel
placeholder · deferred to v0.1.0+
Current baseline · v0.0.1-alpha
16 commits from initial snapshot to main · 916 passed / 3 skipped / 0 failed
SHIPPED
echovessel run: daemon · SIGTERM/SIGHUP · pidfile
memory.db: SQLite + FTS5 + sqlite-vec
LLM providers: openai_compat / anthropic / stub
FishAudio TTS: per-persona voice_id · MP3 cache
Web admin page: 6 tabs · zero placeholders
Import pipeline: upload → LLM extract → memory
Proactive scheduler: opt-in · 4 policy gates
Cross-channel SSE: runtime-owned broadcaster
Chat history backfill: GET /api/chat/history
Voice clone wizard: 3-step FishAudio clone
Cost tracking: llm_calls ledger · per-feature
GitHub Actions CI: ruff · lint-imports · pytest

1. Module Layers — the layered architecture

Five modules · strict downward-only imports
Enforced by import-linter · 2 contracts · 0 broken
LAYERED
runtime
daemon loop · turn dispatcher · LLM factory · SIGHUP reload · import facade · cost logger
app.py launcher.py interaction.py consolidate_worker.py
channels  ·  proactive
channels translate external protocols ↔ IncomingTurn · proactive schedules autonomous messages
can import ↓ memory · voice · core
memory  ·  voice
memory = L1-L4 persistence + retrieve + consolidate · voice = TTS/STT/cloning provider layer
can import ↓ core
core
shared types · enums · config paths · utilities · zero external dependencies
no upward imports
Contract enforcement: uv run lint-imports fails CI if any module imports upward. The second contract (proactive must not import runtime or prompts) prevents a subtle cycle where proactive scheduling would transitively pull in the LLM factory.
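
The downward-only rule can be illustrated with a few lines of Python. This is a minimal sketch of the idea, not how import-linter works internally: the layer ranks are read off the diagram above, the edge list is made up for the demo, and treating same-layer imports as allowed is an assumption. It covers only the layering contract, not the second (proactive) contract.

```python
# Layer rank: a higher number means a higher layer in the diagram above.
LAYER = {
    "runtime": 3,
    "channels": 2, "proactive": 2,
    "memory": 1, "voice": 1,
    "core": 0,
}

def upward_imports(edges):
    """Return the (importer, imported) pairs that point at a higher layer."""
    return [(src, dst) for src, dst in edges if LAYER[dst] > LAYER[src]]

# Hypothetical import edges, for illustration only.
edges = [
    ("runtime", "channels"),
    ("channels", "memory"),
    ("memory", "core"),
    ("memory", "runtime"),  # violation: memory sits below runtime
]
violations = upward_imports(edges)
print(violations)
```

A check like this would fail the build as soon as the returned list is non-empty, which is the behaviour the contract paragraph describes for CI.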

2. Memory — the four-layer store

L1 → L4 · raw transcripts to reflected insights
One SQLite file · all layers share it · never sharded by channel_id (iron rule D4)
CORE
L1
Core blocks — the persona's frame
The 5 hand-editable identity documents that every LLM prompt starts with.
table · core_blocks   audit · core_block_appends
  • persona_block — who this persona is (trait, tone, values)
  • self_block — persona's self-narrative (grows by reflection)
  • user_block — identity-level facts about you
  • relationship_block — people in your life as persona sees them
  • mood_block — transient · auto-refreshed each turn
write path
onboarding · admin edit · importer bootstrap · mood observer
read path
every turn's prompt prefix
L2
Recall messages — raw chat log
Every user + persona message, verbatim, tagged with channel + turn + session.
table · recall_messages   fts · recall_messages_fts
  • role: "user" | "persona"
  • channel_id stored, but never read as filter (D4)
  • session_id groups a conversation burst
  • turn_id groups one user→persona exchange
  • FTS5 trigram index for search + LIKE fallback for short queries
write path
memory.ingest_message() · per message · atomic
read path
/api/chat/history · recent-window context · admin memory search
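
The "FTS5 trigram + LIKE fallback" split exists because a trigram index cannot match query terms shorter than three characters. A rough sketch of how a search helper might choose between the two paths follows. The function name, the `id` and `content` columns, and the exact three-character cutoff are assumptions for illustration; only the table names come from the description above.

```python
def build_search_query(term: str):
    """Pick FTS5 MATCH for terms of 3+ characters, plain LIKE otherwise."""
    term = term.strip()
    if len(term) >= 3:
        # Quote the term so FTS5 treats it as one literal phrase.
        phrase = '"' + term.replace('"', '""') + '"'
        sql = ("SELECT rowid FROM recall_messages_fts "
               "WHERE recall_messages_fts MATCH ?")
        return sql, (phrase,)
    # Escape LIKE wildcards so short terms such as "5%" match literally.
    escaped = (term.replace("\\", "\\\\")
                   .replace("%", "\\%")
                   .replace("_", "\\_"))
    sql = "SELECT id FROM recall_messages WHERE content LIKE ? ESCAPE '\\'"
    return sql, (f"%{escaped}%",)

print(build_search_query("tea"))
print(build_search_query("hi"))
```

Returning the SQL and parameters separately keeps user input out of the statement text in both branches.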
L3
Events — episodic memory
"What happened" distilled out of chat · one concrete event per node.
table · concept_nodes WHERE type=EVENT   vec · concept_nodes_vec
  • description — short 1-2 sentence summary
  • emotional_impact — signed integer, drives retrieval weighting
  • emotion_tags / relational_tags — JSON lists
  • sentence-transformers embedding (384-d) written via sqlite-vec
  • linked back to source session/turn (source_session_id, imported_from)
write path
consolidate worker (post session close) · import pipeline
read path
vector search + relational graph walk · every turn's retrieval step
L4
Thoughts — long-term impressions
"What persona believes about you" · reflection over multiple L3 events.
table · concept_nodes WHERE type=THOUGHT   link · concept_node_filling
  • description — one durable insight ("you find quiet afternoons grounding")
  • concept_node_filling · parent=thought, child=event · the evidence chain
  • reflection hard-gate · max 3 new thoughts / 24h per persona (configurable)
  • skipped entirely for trivial sessions (< 3 messages or < 200 tokens)
  • also embedded in sqlite-vec for semantic recall
write path
consolidate worker reflect_fn · LLM call
read path
retrieval · delete preview (shows the source events so the cascade can be confirmed)
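
The evidence chain (parent = thought, child = event) can be walked in either direction with a simple join. The sketch below uses an in-memory SQLite database. The table names come from the text above, but the column names (`parent_id`, `child_id`, `id`, `type`, `description`), the two event rows, and both helper functions are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE concept_nodes (id INTEGER PRIMARY KEY, type TEXT, description TEXT);
CREATE TABLE concept_node_filling (parent_id INTEGER, child_id INTEGER);
INSERT INTO concept_nodes VALUES
  (1, 'EVENT',   'sample event A'),
  (2, 'EVENT',   'sample event B'),
  (3, 'THOUGHT', 'you find quiet afternoons grounding');
INSERT INTO concept_node_filling VALUES (3, 1), (3, 2);
""")

def trace_thought(thought_id):
    """Events that fed a thought (follows parent -> child)."""
    rows = db.execute(
        "SELECT e.id FROM concept_node_filling f "
        "JOIN concept_nodes e ON e.id = f.child_id "
        "WHERE f.parent_id = ? ORDER BY e.id",
        (thought_id,),
    )
    return [row[0] for row in rows]

def dependents_of_event(event_id):
    """Thoughts that cite an event as evidence (follows child -> parent)."""
    rows = db.execute(
        "SELECT parent_id FROM concept_node_filling "
        "WHERE child_id = ? ORDER BY parent_id",
        (event_id,),
    )
    return [row[0] for row in rows]

print(trace_thought(3), dependents_of_event(1))
```

The second query is the kind of lookup a delete preview needs: it lists which thoughts would lose evidence before an event is removed.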
Consolidate pipeline: session closes (idle ≥ 30 min OR max length) → worker picks it up → extracts L3 events from L2 messages (stub LLM for trivial cases, SMALL tier otherwise) → runs L4 reflection if the reflection gate allows → writes to concept_nodes with embeddings via sqlite-vec. Four tuning knobs (trivial_message_count, trivial_token_count, reflection_hard_gate_24h, memory.relational_bonus_weight) are live via config.toml.
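
The two gates in that pipeline (trivial-session skip and the 24-hour reflection cap) reduce to a pair of comparisons. The following is a sketch under stated assumptions: the field names mirror the tuning knobs listed above and the defaults mirror the numbers quoted in the L4 description, but the dataclass and both functions are hypothetical, and the sketch only checks whether any quota remains rather than how many thoughts a single reflection may add.

```python
from dataclasses import dataclass

@dataclass
class ConsolidateConfig:
    trivial_message_count: int = 3      # sessions below this are trivial
    trivial_token_count: int = 200      # sessions below this are trivial
    reflection_hard_gate_24h: int = 3   # max new thoughts per 24 h

def is_trivial(cfg, message_count, token_count):
    """A session is trivial if it is short by either measure."""
    return (message_count < cfg.trivial_message_count
            or token_count < cfg.trivial_token_count)

def may_reflect(cfg, message_count, token_count, thoughts_last_24h):
    """Reflection runs only for non-trivial sessions with quota left."""
    if is_trivial(cfg, message_count, token_count):
        return False
    return thoughts_last_24h < cfg.reflection_hard_gate_24h

cfg = ConsolidateConfig()
print(may_reflect(cfg, 10, 500, 2))
```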

3. Message Flow — one turn end-to-end

Discord DM → Persona reply, with cross-channel mirror to Web
runtime._handle_turn_body orchestrates every step
HOT PATH
1
Receive
discord.py on_message · DiscordChannel.push_user_message
2
Debounce
2 s timer per user · burst → one IncomingTurn
3
Dispatch
TurnDispatcher · serial queue across all channels
4
Assemble
ingest L2 · retrieve L3/L4 · prompt · LLM stream
5
Send
Discord DM · + optional TTS voice message
6
Mirror
SSE broadcast · source_channel_id="discord"
7
on_turn_done
clear in-flight · cost log · consolidate eligible?
Step 4 expanded: runtime.interaction.assemble_turn() performs 5 sub-steps per turn — ingest each message into L2, run retrieve() (no channel_id filter!), assemble prompt from core_blocks + retrieved L3/L4, call llm.complete() with streaming tokens, and ingest the assistant's reply into L2 once streaming finishes.
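
The five sub-steps can be sketched with stand-in objects. Nothing below is the real `assemble_turn()`: `FakeMemory`, `fake_llm_stream`, the argument list, and the prompt format are all hypothetical, and an async generator stands in for the streaming LLM call. The point is only the ordering: ingest, retrieve, build prompt, stream, ingest the reply.

```python
import asyncio

class FakeMemory:
    """Stand-in for the memory module; method names are hypothetical."""
    def __init__(self):
        self.l2 = []

    def ingest_message(self, role, text):
        self.l2.append((role, text))

    def retrieve(self, query):
        # Deliberately takes no channel_id argument (rule D4).
        return ["(retrieved L3/L4 snippet)"]

async def fake_llm_stream(prompt):
    for token in ["Hello", ", ", "there"]:
        yield token

async def assemble_turn(memory, core_blocks, user_messages, on_token):
    for text in user_messages:                      # 1. ingest into L2
        memory.ingest_message("user", text)
    recalled = memory.retrieve(" ".join(user_messages))         # 2. retrieve
    prompt = "\n".join(core_blocks + recalled + user_messages)  # 3. prompt
    parts = []
    async for token in fake_llm_stream(prompt):     # 4. stream tokens
        parts.append(token)
        on_token(token)
    reply = "".join(parts)
    memory.ingest_message("persona", reply)         # 5. ingest the reply
    return reply

memory = FakeMemory()
tokens = []
reply = asyncio.run(
    assemble_turn(memory, ["persona_block"], ["hi", "you there?"], tokens.append)
)
print(reply, memory.l2)
```

Writing the reply to L2 only after the stream completes means a half-streamed answer never appears in the stored transcript.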

4. Cross-Channel SSE — Web as god-view

Runtime-owned broadcaster
Promoted from WebChannel in commit 54f69d2
LIVE SYNC
🌐 SSEBroadcaster owned by Runtime
Every channel's turn events mirror through one shared broadcaster. Every payload carries source_channel_id. Web UI renders a channel pill (📱 Discord / 💬 iMessage). Failure-isolated: if publish raises, the originating channel's send() still succeeds.

Mirrored events

chat.message.user_appended chat.message.token chat.message.done chat.message.voice_ready chat.mood.update chat.session.boundary chat.settings.updated
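
The failure-isolation property described above can be sketched as a thin wrapper around publish. The class below is a toy, not the real SSEBroadcaster: the queue-per-subscriber design, the method names, and `safe_publish` are assumptions. Only the `source_channel_id` field and the event name are taken from this section.

```python
import asyncio

class ToyBroadcaster:
    def __init__(self):
        self.subscribers = []  # one asyncio.Queue per connected browser

    def subscribe(self):
        queue = asyncio.Queue()
        self.subscribers.append(queue)
        return queue

    def publish(self, event, data, source_channel_id):
        payload = {"event": event, "source_channel_id": source_channel_id, **data}
        for queue in self.subscribers:
            queue.put_nowait(payload)

class BrokenBroadcaster:
    def publish(self, event, data, source_channel_id):
        raise RuntimeError("broadcaster down")

def safe_publish(broadcaster, event, data, source_channel_id):
    """Mirror an event; swallow broadcaster errors so delivery is unaffected."""
    try:
        broadcaster.publish(event, data, source_channel_id)
        return True
    except Exception:
        return False

async def demo():
    broadcaster = ToyBroadcaster()
    queue = broadcaster.subscribe()
    ok = safe_publish(broadcaster, "chat.message.done", {"text": "hi"}, "discord")
    failed = safe_publish(BrokenBroadcaster(), "chat.message.done", {}, "discord")
    return ok, failed, queue.get_nowait()

ok, failed, received = asyncio.run(demo())
print(ok, failed, received)
```

Because the channel calls the wrapper after its own send, a broken mirror costs the Web view an event but never costs the user a reply.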
History backfill
GET /api/chat/history
PAST
↑ useChat mount · 50 newest, DESC
On every browser mount, useChat calls getChatHistory(50), reverses to ascending, prepends into the timeline, and only then starts the SSE stream. Cursor paging via before=<turn_id> walks further back. "Load older" button prepends more.

Query params

Param: Semantics
limit: 1–200, default 50 · clamped 422 on overflow
before: turn_id cursor · returns messages older than that turn's first message · 404 if cursor unknown
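
The cursor semantics can be modelled over an in-memory list. This is a sketch, not the route handler: the function name, the message shape, and the exception type are hypothetical, and raising `ValueError` for an out-of-range `limit` is just one reading of the "422 on overflow" note.

```python
class UnknownCursor(LookupError):
    """Stands in for the 404 response on an unknown cursor."""

def chat_history(messages, limit=50, before=None):
    """messages: oldest-first list of dicts with a 'turn_id' key.

    Returns up to `limit` messages, newest first.
    """
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    end = len(messages)
    if before is not None:
        # Index of the first message belonging to the cursor turn.
        idx = next(
            (i for i, m in enumerate(messages) if m["turn_id"] == before), None
        )
        if idx is None:
            raise UnknownCursor(before)
        end = idx
    return messages[max(0, end - limit):end][::-1]

# Six messages in three turns: t0 (0, 1), t1 (2, 3), t2 (4, 5).
log = [{"id": i, "turn_id": f"t{i // 2}"} for i in range(6)]
newest = chat_history(log, limit=2)
older = chat_history(log, limit=2, before="t2")
print([m["id"] for m in newest], [m["id"] for m in older])
```

A client that reverses each page and prepends it, as useChat is described as doing, ends up with a timeline in ascending order.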

5. HTTP Surface — 127.0.0.1:7777

Chat
WEB CHANNEL
POST /api/chat/send: user message · ingests + dispatches
GET /api/chat/events: SSE stream · runtime-owned broadcaster
GET /api/chat/history: backfill · cross-channel
GET /api/chat/voice/{id}.mp3: cached TTS audio
Admin · state & persona
RUNTIME
GET /api/state: daemon + channel readiness + memory counts
GET /api/admin/persona: 5 core blocks
POST /api/admin/persona: partial update · atomic TOML write
POST /api/admin/persona/onboarding: first-run bootstrap
POST /api/admin/persona/voice-toggle: flip persona.voice_enabled
POST /api/admin/persona/bootstrap-from-material: LLM-synthesise blocks from import
Admin · memory
L3/L4
GET /api/admin/memory/events: L3 list · pagination
GET /api/admin/memory/thoughts: L4 list · pagination
GET /api/admin/memory/search: FTS5 + LIKE fallback · highlights
POST /api/admin/memory/preview-delete: cascade preview
DELETE /api/admin/memory/events/{id}: orphan or cascade choice
DELETE /api/admin/memory/thoughts/{id}: soft delete
GET /api/admin/memory/events/{id}/dependents: which thoughts derive from this?
GET /api/admin/memory/thoughts/{id}/trace: which events fed this?
Admin · import & voice
PIPELINES
POST /api/admin/import/upload: multipart file
POST /api/admin/import/upload_text: paste text
POST /api/admin/import/estimate: tokens + USD
POST /api/admin/import/start: spawn pipeline · returns pipeline_id
POST /api/admin/import/cancel: idempotent
GET /api/admin/import/events: SSE per pipeline
POST /api/admin/voice/samples: upload training clip
POST /api/admin/voice/clone: FishAudio interactive clone
POST /api/admin/voice/preview: streaming audio/mpeg
POST /api/admin/voice/activate: write voice_id to config
Admin · config & cost
KNOBS
GET /api/admin/config: safe subset · api_key_present, never value
PATCH /api/admin/config: atomic write + SIGHUP reload
GET /api/admin/cost/summary: today / 7d / 30d · by feature
GET /api/admin/cost/recent: last 50 LLM calls
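
The config endpoint above describes an atomic write followed by a reload. The usual way to make a file write atomic is to write a temporary file in the same directory and rename it over the target. The sketch below shows that general technique. It is not taken from the EchoVessel source, and the helper name is hypothetical.

```python
import os
import tempfile

def atomic_write_text(path, text):
    """Write text so readers see either the old file or the new one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live on the same filesystem as the target.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # atomic rename over the target
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise

demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "config.toml")
atomic_write_text(demo_path, "retrieve_k = 5\n")
atomic_write_text(demo_path, "retrieve_k = 8\n")
print(open(demo_path, encoding="utf-8").read())
```

After a successful write, a process could notify the daemon with `os.kill(pid, signal.SIGHUP)`. How EchoVessel locates the pid is not covered here.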
Admin · forget (cascade)
DESTRUCTIVE
DELETE /api/admin/memory/messages/{id}: mark L3 source_deleted
DELETE /api/admin/memory/sessions/{id}: cascade messages
DELETE /api/admin/memory/core-blocks/{label}/appends/{id}: physical audit-row delete

6. Voice — TTS · STT · clone

🎙️ TTS · FishAudio
per-persona voice_id: set in [persona] config
MP3 cache: {data_dir}/voice_cache/ · content-hashed
Web delivery: <audio> tag via /api/chat/voice/{id}.mp3
Discord delivery: native OGG Opus bubble via ffmpeg transcode
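
A content-hashed cache means the file name is derived from the inputs that determine the audio, so identical requests reuse one MP3. The sketch below assumes the hash covers the voice_id and the text and that SHA-256 is used. Neither detail is confirmed by this document, and the function name is hypothetical.

```python
import hashlib
from pathlib import Path

def voice_cache_path(data_dir, voice_id, text):
    """Map (voice_id, text) to a stable file under {data_dir}/voice_cache/."""
    digest = hashlib.sha256(f"{voice_id}\n{text}".encode("utf-8")).hexdigest()
    return Path(data_dir) / "voice_cache" / f"{digest}.mp3"

first = voice_cache_path("/tmp/ev", "voice-a", "Good morning")
second = voice_cache_path("/tmp/ev", "voice-a", "Good morning")
other = voice_cache_path("/tmp/ev", "voice-b", "Good morning")
print(first == second, first == other)
```

Including the voice_id in the key matters once a persona switches voices: the same sentence must not be served from the old voice's cache entry.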
✨ Clone wizard
minimum samples: 3 audio clips, 10-30 s each
sample store: {data_dir}/voice_samples/{sample_id}/
clone_voice_interactive(): VoiceService · FishAudio SDK
activate: atomic write persona.voice_id to config.toml
📝 STT (scaffold)
WhisperAPIProvider: implemented, no HTTP route yet
StubVoiceProvider: tests · eval harness
LocalWhisperProvider: deferred to v1.0
voice upload path: not wired into chat yet

7. Runtime Internals — turn dispatcher & workers

TurnDispatcher
SERIAL
One queue, one worker coroutine, one turn at a time, across all channels. Web sends and Discord DMs compete for the same slot. This is the contract that lets memory writes and LLM calls assume no concurrent mutation. Turns that arrive close together are ordered by arrival at the dispatcher, not by channel.
Why not one-per-channel? Because the persona has one brain. Parallel LLM calls would produce interleaved memory writes and mood updates. We picked simplicity over throughput.
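
The one-queue, one-worker contract maps naturally onto `asyncio.Queue`. The class below is a minimal sketch that reuses the TurnDispatcher name for readability. Its methods, the tuple-shaped turn, and the demo handler are all hypothetical.

```python
import asyncio

class TurnDispatcher:
    """Single FIFO queue drained by a single worker coroutine."""
    def __init__(self, handle_turn):
        self._queue = asyncio.Queue()
        self._handle_turn = handle_turn

    async def submit(self, channel_id, text):
        await self._queue.put((channel_id, text))

    async def run(self):
        while True:
            channel_id, text = await self._queue.get()
            try:
                await self._handle_turn(channel_id, text)
            finally:
                self._queue.task_done()

    async def drain(self):
        await self._queue.join()

async def demo():
    log = []
    active = 0
    max_active = 0

    async def handle(channel_id, text):
        nonlocal active, max_active
        active += 1
        max_active = max(max_active, active)
        await asyncio.sleep(0.01)  # pretend to call the LLM
        log.append((channel_id, text))
        active -= 1

    dispatcher = TurnDispatcher(handle)
    worker = asyncio.create_task(dispatcher.run())
    await dispatcher.submit("web", "first")
    await dispatcher.submit("discord", "second")
    await dispatcher.submit("web", "third")
    await dispatcher.drain()
    worker.cancel()
    return log, max_active

log, max_active = asyncio.run(demo())
print(log, max_active)
```

The handler never observes more than one active turn, and turns complete in arrival order regardless of which channel submitted them.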
Background workers
IDLE
consolidate_worker: polls closed sessions · runs extract + reflect
idle_scanner: every 60 s · closes sessions idle > 30 min
SSE heartbeat: 30 s · keeps NAT connections alive
proactive scheduler: opt-in · 4 gate policy

8. Config & Secrets

~/.echovessel/config.toml
KNOBS
Authored by echovessel init from resources/config.toml.sample. Sections: [runtime], [persona], [memory], [llm], [consolidate], [idle_scanner], [voice], [proactive], [channels.web], [channels.discord]. Hot-reload set: LLM provider/model/params, persona display_name and voice_id, memory tuning, consolidate thresholds. Restart-required set: data_dir, db_path.

Hot reload via SIGHUP

llm.provider llm.model llm.temperature persona.display_name persona.voice_id memory.retrieve_k consolidate.*
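
One way to apply the hot-reload list on SIGHUP is to diff the old and new config and sort the changed keys into "apply now" and "needs restart". The sketch below assumes exactly that approach. The key patterns are copied from the list above, while the helper names, the flattening scheme, and the placement of data_dir under [runtime] in the demo are assumptions.

```python
from fnmatch import fnmatchcase

HOT_RELOAD = [
    "llm.provider", "llm.model", "llm.temperature",
    "persona.display_name", "persona.voice_id",
    "memory.retrieve_k", "consolidate.*",
]
RESTART_KEYS = {"data_dir", "db_path"}  # matched on the last key segment

def flatten(config, prefix=""):
    """Turn nested TOML-style dicts into {'section.key': value}."""
    flat = {}
    for key, value in config.items():
        dotted = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, dotted + "."))
        else:
            flat[dotted] = value
    return flat

def classify_changes(old, new):
    old_flat, new_flat = flatten(old), flatten(new)
    changed = sorted(
        key for key in old_flat.keys() | new_flat.keys()
        if old_flat.get(key) != new_flat.get(key)
    )
    hot = [k for k in changed if any(fnmatchcase(k, p) for p in HOT_RELOAD)]
    restart = [k for k in changed if k.rsplit(".", 1)[-1] in RESTART_KEYS]
    return hot, restart

old = {"llm": {"model": "model-a", "temperature": 0.7},
       "consolidate": {"trivial_message_count": 3},
       "runtime": {"data_dir": "/old"}}
new = {"llm": {"model": "model-b", "temperature": 0.7},
       "consolidate": {"trivial_message_count": 5},
       "runtime": {"data_dir": "/new"}}
hot, restart = classify_changes(old, new)
print(hot, restart)
```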
./.env (CWD)
SECRETS
Loaded by _load_dotenv() at echovessel run startup from Path.cwd() / ".env". Shell-exported env vars take precedence. echovessel init writes a commented-out template at chmod 0600 and never overwrites an existing .env (not even with --force). The committed template is .env.example.

Expected keys

OPENAI_API_KEY ANTHROPIC_API_KEY FISH_AUDIO_KEY ECHOVESSEL_DISCORD_TOKEN
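
The precedence rule (shell-exported variables win over .env) amounts to only setting keys that are not already present. The sketch below illustrates that rule with a hand-rolled parser. It is not the real `_load_dotenv()`: the function name, the quoting rules, and the return value are assumptions.

```python
import os
import tempfile
from pathlib import Path

def load_dotenv_sketch(path, environ=os.environ):
    """Copy KEY=VALUE lines into environ without overriding existing keys."""
    path = Path(path)
    if not path.is_file():
        return []
    loaded = []
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # blank line, comment, or malformed entry
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")
        if key and key not in environ:
            environ[key] = value
            loaded.append(key)
    return loaded

demo_dir = Path(tempfile.mkdtemp())
env_file = demo_dir / ".env"
env_file.write_text(
    "# OPENAI_API_KEY=\n"
    "FISH_AUDIO_KEY=from-file\n"
    "ANTHROPIC_API_KEY='quoted'\n",
    encoding="utf-8",
)
env = {"FISH_AUDIO_KEY": "from-shell"}  # simulates a shell-exported value
loaded = load_dotenv_sketch(env_file, env)
print(loaded, env)
```

The commented-out line is skipped, which matches the idea of a template that does nothing until the user fills it in.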

9. Iron Rules — the invariants that never bend

D4 Memory is never sharded by channel_id
retrieve(), search, consolidate — none of them take a channel_id filter. The persona is one identity; memory is one store. Discord and Web share everything.
guard · tests/runtime/test_memory_facade.py · tests/integration/test_cross_channel_unified_persona.py
F10 No transport identity in LLM prompts
The prompt that goes into the LLM never mentions "this came from Discord". The persona does not know which surface you're on — keeping it agnostic means memory is portable across channels.
guard · tests/runtime/test_f10_no_channel_in_prompt.py
L0 Layered downward imports only
runtime → channels/proactive → memory/voice → core. Nothing imports upward. Enforced mechanically.
guard · uv run lint-imports · 2 contracts kept
Z0 Zero regression baseline
Every wave must keep all prior tests green. Baseline walks upward: 665 → 678 → 724 → 754 → ... → 916. Tests are never deleted to make CI pass.
guard · pytest in GitHub Actions · blocks merge on red
S1 One turn at a time
TurnDispatcher serializes across every channel. Web and Discord cannot run parallel turns. One persona = one brain = one LLM call at a time.
guard · tests/runtime/test_turn_dispatcher.py
B1 Broadcast failure must not break a channel
Cross-channel SSE mirror is fire-and-forget. If the broadcaster raises, the originating channel's send() still succeeds. Observability never breaks delivery.
guard · tests/runtime/test_cross_channel_sse.py

10. CI & Packaging

GitHub Actions
ENFORCED
ruff check: src/ + tests/ · ubuntu + macos matrix
lint-imports: 2 contracts · layered + proactive-no-runtime
pytest -q: 916 passed · 3 skipped · 0 failed
triggers: push to main + every PR · concurrency group
hatch_build.py
WHEEL
uv build: re-runs npm run build + vite · embeds /static/
wheel contents: src/echovessel/ + resources/ + static bundle
artifact size: ~415 KB wheel · ~365 KB sdist
PyPI: not yet published · run from source via git clone

11. Release Timeline

16 commits · initial snapshot → current main
2026-04-15 → 2026-04-16 · ~24 wall-clock hours
JOURNEY
Commit: Scope
4357250: Initial snapshot · EchoVessel pre-v0.0.1
7325498: Round 1 truth-layer landing · 4 config fields wired through, runtime/proactive.py dead stub removed, SSE pruning
ed7fe4a: Round 2 · Import pipeline wired end-to-end · facade + 5 admin routes + 3-step wizard
1959fe3: Wave A · Admin UI truth-layer · Events/Thoughts/forget/mood/session boundary/Discord status
8006185: Wave B · Cost tracking + Config edit
36fe8b8: Wave C · Memory search / trace / onboarding path 2 / voice clone
85e39f9: Housekeeping · SQLite lock fix · FastAPI 422 rename
d9c6ba3: CI fix · skip eval tests when corpus missing · loosen facade timeout
f67bc07: echovessel init writes .env template
49bf995: Move .env to repo-root (CWD) + add .env.example
3535406: Scrub remaining ~/.echovessel/.env refs in docs
6e1d3af: README/CHANGELOG · drop PyPI-install framing (not released yet)
54f69d2: Cross-channel unified Web timeline (live SSE + history backfill)
6e3a7a1: Fix cost_logger table creation + CHANGELOG truth sweep
a8fc089: Public docs · cross-channel + truth sweep
852fb62: docs/channels: naming note (channel == stateful message gateway)
Where this doc lives: docs/architecture.html in the EchoVessel repo. Linked from docs/README.md and both language landing pages. Re-generate by editing this file directly — it's hand-written HTML, not compiled from markdown.