How Smartopol Works — Self-Hosted AI Agent Architecture

§ 01 · the short version

If you only read five minutes, read this.

01

One program, always on.
Smartopol runs as a single background service on your machine. Every front-end — terminal, Discord, Telegram, web, external tools — talks to the same brain. One memory, many doors in.
02

Memory Pyramid is the brain.
A hierarchical memory that organizes itself as you use it. All conversations, all channels, all users. One local file on your disk. Sub-100ms recall across millions of tokens.
03

Context is built in layers.
What never changes (tools, abilities), what changes slowly (who you are), and what changes every turn (this moment's recall) are kept separate. Big practical benefit: roughly half the input-token bill compared with a naïve agent.
04

Every night, the agent dreams.
A bounded reflection job consolidates the day's memory, promotes what mattered, prunes noise, updates the user profile. Budget-tracked, auditable, opt-out-able.
05

Nothing here is a black box.
Every tool call is written to a tamper-evident audit log. Every night's reflection writes a dated receipt you can read. Every plugin runs in a sandbox and must declare what it touches. You own the data, the log, and the keys.

§ 02 · memory architecture

The long section. Where we earn the "native Memory Pyramid" claim.

Level N     [Meta-summaries]           ← Cross-topic patterns, lifetime insights
                 │
Level 2     [Domain Summaries]         ← "Work projects", "Home automation"
                 │
Level 1     [Topic Summaries]          ← "The X project", "Recipe preferences"
                /│\
Level 0  [Chunks] [Chunks] [Chunks]  ← Raw conversation fragments

Three operations keep the pyramid alive.

ingest

Every message is broken into small, searchable fragments and stored with the context of when and where it was said.

consolidate

A background job groups related fragments into topic summaries, then groups summaries into bigger ones. Important things climb. Noise compresses.

prune

Things you revisit stay sharp. Things untouched for months collapse into their summary and are forgotten on purpose. The way a brain is supposed to work.

Recall — hybrid, not naïve.

Exact matches run instantly — for names, dates, the precise word you used.
Meaning matches run in parallel — for “that thing we talked about last week,” even if you phrase it differently today.
Both scored together — a result that wins on both wins big. No either/or.
All levels at once — a single recall can pull a raw sentence and the summary that explains it.
Hot cache for queries you hit often. Hot path: 10–15ms.

why this matters

Model context windows keep growing, but a growing body of evidence shows recall accuracy falls off long before you hit the limit. Dumping millions of tokens into a prompt is slow, expensive, and actually worse than smart retrieval.

Targeted recall of the right 2–4 KB beats a million-token dump every time. Memory Pyramid is the thing that decides what “the right 2–4 KB” is.

§ 03 · context in layers

What changes fast, what changes slowly, what never changes — kept separate.

static

Rarely changes.

What the agent can do — its tools, its stable identity. Reused across conversations.

slow

Who you are.

Your profile, your habits, how you like to work. The agent reads it; it cannot silently rewrite it.

volatile

Rebuilt every turn.

This moment's recall — what the pyramid surfaces as relevant right now.

Why bother separating them? Because the two that don't change each turn can be reused by your LLM provider instead of re-charged for. Practical result: roughly half the input-token bill a naïve agent would rack up.

§ 04 · the dreaming cycle

This is the section readers came for.

Off by default. When you turn it on, the agent reflects once a day at a time you choose — by default 04:00 local, when you're asleep and the server is quiet.

What it reads

The last 24 hours of conversation across every channel. What the pyramid promoted, demoted, added, or forgot. The current version of your profile.

What it writes

Three small, bounded sections of your profile, trimmed to stay readable:
  • Observed traits (up to 10)
  • Patterns (up to 5)
  • Recent shifts (up to 3)

Plus a dated receipt in your data folder. You can read it. You can revert it.

What it cannot touch

Its own core identity, its stated values, its tool permissions — all read-only by design. An agent that rewrites its own soul drifts within weeks. Also off-limits: the program itself, your keys, other users' memory.

The seventh-night greeting

Every seventh successful reflection, the agent generates a short, personal greeting in its own voice. It arrives on your next message, then disappears. Most assistants fake relationship through persona prompts. Smartopol earns it through bounded, observable reflection.

I noticed you're shipping more often on Friday afternoons than in the mornings. I've been prepping release checklists at 14:00 local — let me know if that's off. — example seventh-night greeting, generated in the agent's voice

§ 05 · actor-model sessions

One actor per session. Crash isolation by design.

The service runs an actor-model session system. Actors communicate by message passing — no shared mutable state. Documented as a live spec — every behavior change lands in the same diff as the code.

▸ Crash isolation — one bad session cannot take down another
▸ Backpressure — per-session queue depth, visible in smartopol sessions list
▸ Hot reload — config changes apply to new actors without killing existing ones
▸ Multi-user fairness — per-user rate limits, admin multipliers, guest caps

§ 06 · model routing

You pick the models. The agent picks the right one for the moment.

You configure an ordered list of models you're willing to use — a primary, a fallback, a local last-resort. When the primary hits a rate limit, the next one takes over. When an API key runs out of money, the agent parks that provider for a longer cool-down (somebody forgot to pay the bill). When the primary comes back, it reclaims the top slot.

The important part: the conversation is shared. Whichever model is answering this turn sees the full history — not a fresh start, not a summary. Failovers are invisible.

§ 07 · channels

One brain. Many mouths.

One agent. Five places people can reach it today (Discord, Telegram, web, built-in terminal, direct API). Slack and WhatsApp on the roadmap. Each channel is a thin translator between the platform and the same underlying brain.

A user paired across Discord and Telegram is one identity with two inbound doors. The agent remembers what was said on Discord when the same person pings on Telegram.

§ 08 · plugin sandbox

Third-party code runs in a cage. No exceptions.

Every plugin you install runs in a locked-down sandbox with a fixed memory limit, no access to your filesystem, and no network unless it declared it upfront. If a plugin wants to read your email inbox, you see that before it runs — not after.

Declared permissions — each plugin lists what it needs. Nothing implicit.
Signed releases — third-party plugins ship cryptographically signed. A tampered plugin won't load.
Static scan — every plugin is checked for known-dangerous patterns before first run. Flagged ones are blocked by default.
No ambient trust — a plugin can't phone home or touch a file you didn't hand it.

§ 09 · audit trail

Every action the agent ever took is on disk, in order, tamper-evident.

An append-only log in your data folder records every tool call, who asked for it, what arguments were used (secrets redacted), and the result. Each entry is cryptographically linked to the one before it — nobody can quietly delete or edit a line without breaking the chain.

A built-in verify command walks the log and reports the first break it finds. In regulated contexts — healthcare, legal, finance — you can prove the agent's behavior matches an independently captured timeline.

§ 10 · updates

The agent updates itself safely — mid-conversation, without downtime.

Run smartopol update and the agent checks for a new version, downloads it, verifies it's a legitimate signed release, swaps itself atomically, and keeps your conversation alive across the restart.

· No package manager required
· No sudo required
· Updates come from a static, vendor-neutral location — not tied to any single code-hosting service
· The agent won't accidentally kill itself during an update. By design.

§ 11 · the short version

Six sentences. Everything else follows.

Memory is the bottleneck. Fixed with a real hierarchy, not a shoebox full of vectors.
Prompts are expensive. Layered so your provider reuses what didn't change — roughly half the bill.
Agents drift. The nightly reflection is bounded, reversible, and never rewrites the agent's core values.
Channels are silos. Identity is paired across them so one brain spans Discord, Telegram, web, and more.
Plugins are a supply chain. Signed, sandboxed, permission-declared — no ambient trust.
Audit is usually an afterthought. Here it's tamper-evident from day one.

One program. No magic.