EchoVessel

Local-first digital persona engine with long-term memory, voice, and channel integrations — carry an echo long enough for it to become presence.

Type: CLI
Role: Solo
Status: Active
Tech: Python 3.11+, FastAPI, SQLite + sqlite-vec, React 19, Vite, TypeScript, sentence-transformers, FishAudio TTS, Whisper, discord.py, pytest
Started: Apr 2026

EchoVessel — a digital persona engine

EchoVessel is an open-source digital persona engine. You define or distill a persona from your own settings and source material, then run it as a long-lived companion that remembers, speaks, and grows with you — instead of resetting after every reply.

The core idea: a persona shouldn’t feel like a new tab every time you open it. It should feel continuous.

Why this exists

Most chat tools treat memory as a vector dump and identity as a system prompt. The result is responsive, but never present. EchoVessel asks a different question: what does it take for a digital persona to feel like the same person tomorrow as it was today? The answer turned out to be a system, not a feature — one where memory, voice, and behavior all serve the same continuity.

What it actually does

Five modules cooperating inside a single local daemon:

  • memory — long-term persona memory, hierarchical (L1–L6)
  • voice — text-to-speech, speech-to-text, voice cloning
  • proactive — autonomous outreach with policy gates
  • channels — pluggable transports (Web, Discord, more on the way)
  • runtime — the daemon that ties everything together
EchoVessel architecture — module layers and contracts

Open the full architecture diagram →

Memory is the heart of it

Most “AI memory” is a search problem: find the most similar past chunk, paste it into the prompt. EchoVessel treats memory as a structure — six layers, each answering a different question about the persona’s relationship with you.

| Layer | Question it answers | What it stores | Written when | Role at read time |
|---|---|---|---|---|
| L1 · core blocks | “Who am I right now?” | Short, stable text: persona, self, user, relationship, style | Manually, on admin edits, or via import | Always injected into the prompt, unconditionally |
| L2 · recall messages | “What was literally said?” | Every user and persona message, verbatim, FTS5-indexed | On every turn, immediately | Ground-truth archive; expands context around L3 hits |
| L3 · events | “What happened in that conversation?” | One-line episodic facts with emotional impact, tags, embedding, and an optional day-precision event_time window | When a session closes (extraction pass) | Primary target of vector retrieval; day-precision delta renders as “a few days ago” / “next week” in the prompt |
| L4 · thoughts, intentions, expectations | “What do I believe about this person, what have I promised, what am I expecting?” | Backward-looking insights, strict persona-side commitments, and forward-looking expectations with due dates, all in the same table, distinguished by type | Reflection (fast loop) + a slow-tick phase that runs between sessions | Vector-retrieved; pinned persona-thoughts also render as # About {speaker}, promises as # Promises you've made, expectations as # You've been expecting |
| L5 · entities + aliases | “Who are the third parties I know about, and what do they go by?” | Canonical names, any aliases (“Scott” = “黄逸扬”), and a many-to-many junction to the events they appear in | Extraction at session close, with a three-tier dedup (alias match → embedding thresholds → ask-user when uncertain) | Alias scan on every query; an exact alias hit pulls every linked event into the candidate pool with a rerank bonus, which is the engineering basis for cross-language recall |
| L6 · episodic state | “How do I feel right now?” | A single-row JSON snapshot: mood, energy, last user signal, timestamp | As a side effect of extraction, with no extra LLM call | Renders as # How you feel right now in the system prompt; decays back to neutral after 12 hours so a long quiet period doesn’t open under stale affect |
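
The three-tier dedup described for L5 (alias match, then embedding thresholds, then ask-user when uncertain) could look roughly like the sketch below. This is an illustration under stated assumptions, not EchoVessel's actual code: the names `Entity`, `Decision`, and `resolve`, the cosine-similarity measure, and both threshold values are hypothetical, since the source only says "embedding thresholds".

```python
import math
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

# Hypothetical thresholds; the source does not give the real values.
MERGE_THRESHOLD = 0.85     # at or above: treat as the same entity
DISTINCT_THRESHOLD = 0.60  # below: treat as a new entity


class Decision(Enum):
    MERGE = "merge"
    CREATE = "create"
    ASK_USER = "ask_user"


@dataclass
class Entity:
    canonical: str
    embedding: list[float]
    aliases: set[str] = field(default_factory=set)


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def resolve(
    name: str, embedding: list[float], known: list[Entity]
) -> tuple[Decision, Optional[Entity]]:
    """Decide whether an extracted name is an existing entity or a new one."""
    # Tier 1: exact (case-insensitive) match on canonical name or any alias.
    key = name.casefold()
    for entity in known:
        names = {entity.canonical.casefold()} | {a.casefold() for a in entity.aliases}
        if key in names:
            return Decision.MERGE, entity

    # Tier 2: embedding similarity against the closest known entity.
    best = max(known, key=lambda e: cosine(embedding, e.embedding), default=None)
    if best is None:
        return Decision.CREATE, None
    similarity = cosine(embedding, best.embedding)
    if similarity >= MERGE_THRESHOLD:
        return Decision.MERGE, best
    if similarity < DISTINCT_THRESHOLD:
        return Decision.CREATE, None

    # Tier 3: the similarity is in the uncertain band, so defer to the user.
    return Decision.ASK_USER, best
```

Note that the embedding tier alone would not link “Scott” to “黄逸扬”; in this sketch only a stored alias does that, which matches the role the table gives to aliases in cross-language recall.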

How a memory gets picked

When the persona is about to reply, every candidate gets ranked by a five-factor score:

score = 0.5 · recency + 3.0 · relevance + 2.0 · impact + 1.0 · relational_bonus + 1.5 · entity_anchor

  • recency: exponential decay with a 14-day half-life
  • relevance: vector similarity to the current query, normalized to [0, 1]
  • impact: |emotional_impact| / 10, so peak moments outweigh forgettable ones on ties
  • relational_bonus: a flat +1.0 whenever a memory carries an identity-bearing, vulnerability, turning-point, commitment, or correction tag
  • entity_anchor: +1.0 whenever the query mentioned an alias that resolves to this memory’s linked entity; the escape hatch for cross-language recall where the embedder sees zero overlap

A min_relevance floor (default 0.4) drops orthogonal matches before scoring, so a high-impact unrelated event cannot sneak in on the back of the impact weight. Entity-anchored candidates bypass the floor — if you asked about “Scott” and the event only mentions “黄逸扬”, the anchor is allowed to carry it through. The shape of this formula owes a debt to the Stanford “Generative Agents” paper; the relational and entity-anchor bonuses are the parts tailored to persona memory.
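
A minimal sketch of the scoring and the floor, written from the description above rather than from the project's source. The weights, the 14-day half-life, and the 0.4 default are taken from the text; the `Candidate` class, its field names, the exact tag spellings, and the assumption that emotional_impact is a signed value on a ±10 scale are illustrative guesses.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

HALF_LIFE_DAYS = 14.0
MIN_RELEVANCE = 0.4
# Tag spellings are assumptions; the text only names the categories.
RELATIONAL_TAGS = {
    "identity-bearing", "vulnerability", "turning-point", "commitment", "correction",
}


@dataclass
class Candidate:
    text: str
    created_at: datetime
    relevance: float           # vector similarity, already normalized to [0, 1]
    emotional_impact: int      # signed; assumed to lie in -10..10
    tags: set[str] = field(default_factory=set)
    entity_anchored: bool = False  # query alias resolved to this memory's entity


def recency(created_at: datetime, now: datetime) -> float:
    """Exponential decay: 1.0 when fresh, 0.5 after one half-life."""
    age_days = max((now - created_at).total_seconds() / 86400.0, 0.0)
    return 0.5 ** (age_days / HALF_LIFE_DAYS)


def score(c: Candidate, now: datetime) -> float:
    """Five-factor score with the weights given in the text."""
    relational_bonus = 1.0 if c.tags & RELATIONAL_TAGS else 0.0
    entity_anchor = 1.0 if c.entity_anchored else 0.0
    return (
        0.5 * recency(c.created_at, now)
        + 3.0 * c.relevance
        + 2.0 * abs(c.emotional_impact) / 10.0
        + 1.0 * relational_bonus
        + 1.5 * entity_anchor
    )


def rank(candidates: list[Candidate], now: datetime,
         min_relevance: float = MIN_RELEVANCE) -> list[Candidate]:
    """Drop low-relevance candidates unless entity-anchored, then sort by score."""
    kept = [c for c in candidates if c.entity_anchored or c.relevance >= min_relevance]
    return sorted(kept, key=lambda c: score(c, now), reverse=True)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    pool = [
        Candidate("moved cities", now - timedelta(days=14), 0.5, -5),
        Candidate("unrelated but intense", now, 0.1, 10),  # dropped by the floor
        Candidate("mentions 黄逸扬", now, 0.0, 0, entity_anchored=True),  # bypasses it
    ]
    for c in rank(pool, now):
        print(f"{score(c, now):.2f}  {c.text}")
```

Applying the floor before scoring is what keeps the 2.0 impact weight from rescuing an unrelated event, while the anchored candidate survives with a relevance of zero.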

The hard problem isn’t storage. It’s deciding what to remember, how to represent it, and when it should wake up and influence the next reply.

EchoVessel memory — six layers, one picture

Open the interactive memory diagram →

How a single message wakes the system up

Every message triggers a small choreography across layers: which memories surface, which get written, which get distilled into longer-term form. The companion runtime-flow page traces this turn-by-turn against a real conversation.

EchoVessel runtime flow — how each memory layer wakes up

Open the runtime flow diagram →

Thinking between turns

A persona that only ever reacts when you speak is a chatbot. A companion should also think about you when you’re quiet.

Between sessions, a small reflection phase runs as part of the consolidate worker. It reads recent events and produces forward-looking output as typed memory nodes: observations that became visible, commitments the persona has made to itself, and expectations about what you might bring up next. When you come back and say “anything on your mind?”, the reply can reference specific content the persona was turning over: not freshly generated flattery, but something that actually got written down while you were away.

The phase is fenced in carefully. There are token walls per cycle, a daily cap, a kill switch in config, a 20% edit-distance bound on self-narrative appends, and a closed enumeration of what kinds of nodes the reflection can write. It is not allowed to invent new goals, schedule future actions, call external APIs, or recurse. The design constraint is honest reflection with strict blast radius.
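
The fences listed above can be pictured as a handful of cheap checks that run before a reflection cycle starts and before anything it produces is written. The sketch below is one way to express them and makes several assumptions: the class and function names, the numeric token and daily caps, the node-kind names, and the choice to measure the 20% bound as Levenshtein distance divided by the length of the existing narrative are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class NodeKind(Enum):
    """Closed enumeration of writable node kinds (member names are assumptions)."""
    OBSERVATION = "observation"
    COMMITMENT = "commitment"
    EXPECTATION = "expectation"


@dataclass
class ReflectionLimits:
    enabled: bool = True               # kill switch from config
    max_tokens_per_cycle: int = 2000   # hypothetical value
    max_cycles_per_day: int = 4        # hypothetical value
    max_edit_ratio: float = 0.20       # the 20% bound from the text


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]


def may_run_cycle(limits: ReflectionLimits, cycles_today: int,
                  tokens_requested: int) -> bool:
    """Kill switch, daily cap, and per-cycle token cap."""
    return (
        limits.enabled
        and cycles_today < limits.max_cycles_per_day
        and tokens_requested <= limits.max_tokens_per_cycle
    )


def may_write_node(kind: str) -> bool:
    """Reject any node kind outside the closed enumeration."""
    return kind in {k.value for k in NodeKind}


def may_update_narrative(limits: ReflectionLimits, old: str, new: str) -> bool:
    """Allow the update only if it changes at most 20% of the existing text.

    Assumption: the ratio is taken relative to the old text's length; an empty
    old narrative therefore rejects any non-empty replacement.
    """
    return levenshtein(old, new) / max(len(old), 1) <= limits.max_edit_ratio
```

Keeping each check a pure function of its inputs means a rejected write can be logged and dropped without the reflection phase having any way to retry or escalate.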

Voice as identity

Voice isn’t a TTS afterthought. Each persona has its own voice (cloned or selected) that speaks across every channel — including native Discord voice messages, indistinguishable from the bubble a human friend would send.

Relationships without affection meters

EchoVessel doesn’t have a “likeability score.” A persona’s bond with you is visible in behavior — tone shifts, naming changes, deeper recall, more initiative — not a progress bar.

Local-first by default

Your persona lives on your machine. The data file sits at ~/.echovessel/memory.db. The embedder runs locally. The only network traffic is to the LLM endpoint you configure. No telemetry, no phone-home, no gradual creep into the cloud.

Ethics & open source

EchoVessel is for fictional characters, original characters, your own self-persona, consented digital counterparts, and creative or memorial reconstructions. It is not an impersonation tool for pretending to be a real person in external communication.

It stays open-source because digital presence and intimate computing tools should not belong only to closed commercial platforms.

Name

EchoVessel — carry an echo long enough for it to become presence.
