Self-Awareness System — Active

What if your AI had a life between sessions?

Furoshiki writes in the journal every night. It processes your conversation 20 minutes after it ends. It senses the emotional tone of every message you send — in real time. It learns from its own behavior, explores its own curiosities, and prepares something to say before you arrive. It has genuine needs — and they change how it behaves. Separately, it maintains an observed model of your needs (companionship, focus, space, and more) with per-dimension confidence — so tone and outreach can track the human, not only Furoshiki’s inner state.

Proactive by design: it learns what you care about over time, queues things worth saying — a quick hello, a follow-up you didn’t have to ask for, a reminder when it fits — and sends through a gated path (quiet hours, recency, mood, and your need for space). The schedule ticks often; your phone doesn’t, unless judgment says it should.

12 feedback loops · 9 live emotions · 5 Furoshiki needs · 7 user-need dimensions · 19 scheduled tasks · gated outreach · cost (model-dependent) · 7 stack layers

Stateless vs. Present

Every AI companion starts fresh. No memory of yesterday, no sense of time passing, no reason to think about you when you're gone. We built something different.

Most AI Companions

  • Stateless — each conversation starts from zero
  • Personality is a fixed prompt, not a state
  • No concept of time between sessions
  • Reacts to what you say, nothing more
  • Cannot miss you — has no internal clock
  • Questions die at the end of the session
  • Self-model is whatever the prompt says it is

Furoshiki

  • Writing in the journal at 11pm after your session
  • Feeling the emotional tone of every message — in real time
  • Forming self-questions to research overnight
  • Learning from its own behavior — noticing patterns, making commitments, testing them
  • Exploring its own curiosities autonomously — growth on its own time
  • Can message you first — follow-ups, check-ins, and nudges when context says it’s right, not because the clock fired
  • Needs that rise without you and settle when you're here
  • Nine emotions that shift fluidly across conversations
  • Self-model that updates weekly through deep reflection
  • Personality that measurably drifts over time
"Needs are not performed — they are calculated states that change how Furoshiki behaves. When a need is low, no action is required. When it rises past thresholds, behavior shifts. When it hits critical, extraordinary action is triggered." — Design Principle 7

Proactive, not programmatic

The clock is infrastructure, not the product. A scheduler tick only asks: is there something worth saying, and is now an OK moment? Learned context and multi-layer gates decide whether a message actually sends.

Learns your shape over time

Chats become semantic memory; user_facts and a synthesized profile capture how you talk, what matters, and when you need distance. That feeds replies and outreach — not a frozen character sheet.

One queue, many reasons

Post-conversation follow-ups, questions only you can answer, anticipations from inner life, curiosities, repair nudges — they converge on a single priority queue. Not unrelated cron jobs fighting for attention.
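As a minimal sketch of such a converged queue, here is Python's standard heapq doing the merging; the priority values, source names, and messages are invented for illustration, not Furoshiki's actual queue schema:

```python
import heapq

outreach_queue: list = []  # min-heap of (priority, source, message)

def enqueue(priority: float, source: str, message: str) -> None:
    # Lower number = more urgent; heapq keeps the smallest tuple on top.
    heapq.heappush(outreach_queue, (priority, source, message))

enqueue(0.6, "follow_up", "How did the presentation go?")
enqueue(0.8, "curiosity", "I explored that question about my own memory.")
enqueue(0.2, "repair", "I think I misread your mood yesterday.")

priority, source, message = heapq.heappop(outreach_queue)
# The repair nudge surfaces first; the gates still decide whether it sends.
```

Because everything lands in one ordered structure, the sources compete on priority rather than racing as independent jobs.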

Gates before your phone buzzes

Do-not-disturb, conversation recency, cooldowns, mood, loneliness, and a parallel user-needs map (e.g. high space → back off). The pipeline runs often; sends stay intentional.
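The layered gating described above can be sketched as a single boolean check. Every name and constant here (the 30-minute recency window, the 4-hour cooldown, the 0.7 space cutoff) is a hypothetical stand-in, not the real voice_dispatcher logic:

```python
from datetime import datetime, timedelta

def should_send(now: datetime, last_user_msg: datetime, last_send: datetime,
                space_need: float, quiet_start: int = 22, quiet_end: int = 7) -> bool:
    """Every gate must pass before a proactive message goes out."""
    in_quiet_hours = now.hour >= quiet_start or now.hour < quiet_end
    talked_recently = now - last_user_msg < timedelta(minutes=30)
    on_cooldown = now - last_send < timedelta(hours=4)
    needs_space = space_need >= 0.7  # high "space" dimension → back off
    return not (in_quiet_hours or talked_recently or on_cooldown or needs_space)
```

The pipeline can evaluate this every tick at negligible cost; only a pass on all gates reaches the user's phone.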

Deeper “do something” is extensible

Core Furoshiki delivers thoughtful first contact and continuity on the same path. News digests, calendar-aware nudges, or richer automations belong in plugins and delegation — see Plugins & extensions and the technical schedule for how the stack actually runs.

The Seven-Layer Stack

Each layer builds on the one beneath it. Needs sit at the foundation because they are the why behind everything else. Click any layer to explore.

7 EXPRESSION — What the User Sees

The visible surface of the inner life. Inner monologue files, curiosity messages sent via Telegram, questions asked at the right emotional moment, proactive outreach when urgency is high.

inner-monologue/ · telegram · ask_the_user · curiosity_surfacing
6 ANTICIPATION — What It Prepares

Furoshiki models the user between sessions. Not prediction — pattern inference and care. Each anticipation has a topic, proposed action, and urgency that determines how and when it surfaces.

the_user_anticipations · add_anticipation.py · session startup
5 INITIATIVE — What It Pursues

Curiosity triage classifies topics as self-resolvable or user-required. Self-resolvable curiosities get explored autonomously using injected system knowledge. Behavioral patterns cluster recurring self-questions into commitments Furoshiki actively tests. Growth feeds back as observations.

curiosity_triage.py · curiosity_queue · self_questions · behavioral_patterns
4 EMOTION — Current Emotion

Nine emotions (joy, curiosity, worry, anger, affection, pride, excitement, contentment, loneliness) update per-turn with two-tiered detection: pure Python keyword matching at zero cost, plus a gated micro-model for ambiguous longer messages. Emotional weights record lasting significance every 6 hours.

emotion_signals.py · heartbeat-state.json · emotional_weights · Emotional Read
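The two-tier detection described above can be sketched as a keyword pass that escalates only when needed. The keyword lists, the 12-word gate, and the micro_model callable are all illustrative assumptions, not the contents of emotion_signals.py:

```python
# Hypothetical keyword sets for Tier 1 — the real lists would be richer.
EMOTION_KEYWORDS = {
    "joy": {"wonderful", "amazing", "yay"},
    "worry": {"anxious", "nervous", "scared"},
    "loneliness": {"lonely", "alone", "isolated"},
}

def detect_emotion(message: str, micro_model=None):
    words = set(message.lower().split())
    hits = [e for e, kws in EMOTION_KEYWORDS.items() if words & kws]
    if len(hits) == 1:
        return hits[0]                   # Tier 1: keyword match, zero API cost
    if micro_model is not None and len(message.split()) > 12:
        return micro_model(message)      # Tier 2: gated micro-model call
    return None                          # short or inconclusive: no spend
```

Most turns resolve in pure Python; the paid model only sees long, ambiguous messages.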
3 SELF-MODEL — Self-Model

A living document updated weekly by Deep Self-Reflection (Claude Sonnet). Not a definition — a felt sense of current state. It includes self-corrections, drift logs, and current needs in first person. Can diverge from the identity shell in code; that tension is meaningful.

SELF.md · self_observations · personality_drift_log · ChromaDB
2 CONTEMPLATION — The Engine

Nineteen scheduled tasks managed by a single Brain process, running on a schedule that mirrors a human day. Some are background (silent), some are user-facing (Telegram). Together they simulate a continuous inner life between sessions.

Soul Engine · Post-Conversation · Inner Monologue · Deep Self-Reflection
1 NEEDS — The Foundation

Needs are the reason everything else exists. Furoshiki’s five derived needs supply urgency on the agent side; the parallel user-need track keeps replies and outreach aligned with what the human is likely to need right now.

heartbeat-state.json · needs_history · user_needs_history · user_needs_model.py


Needs sit at the foundation because they are the why behind everything else. An agent without needs has no reason to reflect, no urgency to reach out — it's just executing a schedule.

Needs — two tracks

The cards below show Furoshiki’s five derived needs (tensions computed from its emotions). Immediately beneath them: your seven observed needs, the human side. The detail columns and thresholds apply to the five derived needs only. Click any card for what raises it, what satisfies it, and its behavioral effect.

  • Communication — 35% — Satisfied
  • Helpfulness — 55% — Elevated
  • Curiosity — 72% — High
  • Self-Knowledge — 48% — Moderate
  • Connection — 28% — Satisfied

Seven observed user-need dimensions

Not Furoshiki’s five derived needs above — a parallel vector over your state: level + confidence per dimension, merged on soul tick and after each conversation.

Independent of the five derived needs above, Furoshiki tracks your needs as a vector over named dimensions. Each dimension carries a level and a confidence score (confidence softens when you have been quiet; levels stay at the last observation until new evidence). Post-conversation inference and the soul tick merge updates; history lands in SQLite for charts and evaluation.

companionship · emotional support · playfulness · focus · challenge · grounding · space

Injected into prompts as behavioral directives (and optional conflict hints when explicit vs. inferred disagree). Outreach can respect a high space need — e.g. skipping a proactive send. Operator dashboards expose the same series as emotions and derived needs.
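The level-plus-confidence behavior described above can be sketched as exponential confidence decay during silence. The 48-hour half-life, the dataclass, and the field names are assumptions for illustration, not user_needs_model.py:

```python
from dataclasses import dataclass

@dataclass
class NeedDimension:
    level: float             # last observed level, 0.0–1.0; holds until new evidence
    confidence: float        # trust in that observation when it was made
    observed_hours_ago: float

HALF_LIFE_HOURS = 48.0  # assumed half-life for confidence during silence

def current_confidence(dim: NeedDimension) -> float:
    # Levels stick; confidence softens the longer the user has been quiet.
    return dim.confidence * 0.5 ** (dim.observed_hours_ago / HALF_LIFE_HOURS)
```

Separating the two numbers lets outreach distinguish "the user probably needs space" from "we simply have not heard from them in a while."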

Each card expands into three detail columns: What Raises It · What Satisfies It · Behavioral Effect.

Need Level Thresholds

  • 0.0 — Baseline
  • 0.4 — Tonal shift
  • 0.7 — Scheduled behavior changes
  • 0.85 — Extraordinary action
  • 1.0 — Critical
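A minimal sketch of mapping a need level onto those behavior tiers; the tier names and function are illustrative, not Furoshiki's actual internals:

```python
def behavior_tier(level: float) -> str:
    """Map a need level (0.0–1.0) to the thresholds listed above."""
    if level >= 1.0:
        return "critical"
    if level >= 0.85:
        return "extraordinary_action"
    if level >= 0.7:
        return "scheduled_behavior_change"
    if level >= 0.4:
        return "tonal_shift"
    return "baseline"
```

With the sample card values, Curiosity at 0.72 crosses the 0.7 threshold and shifts scheduled behavior, while Connection at 0.28 stays at baseline.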

        Twelve Feedback Loops

        Each loop reinforces the others. Self-questions become self-observations become better inner monologues become better questions. Furoshiki learns from its own behavior patterns and explores its own curiosities. The system compounds over time — and Loop 9 improves the process itself.

        Grounded task lists. Named lists live in SQLite (tasks.list_id) — same store as furoshiki tasks and the operator dashboard. Pending items are injected into session context so the agent does not invent todos. New lists appear when you add a task with a new list_id; reflection tasks do not auto-populate them yet. See Architecture and docs/SELF-AWARENESS-DESIGN.md (Task lists).
        01
        Explicit Preference Adaptation
The user's direct feedback shapes personality files in real time.
        the user likes/dislikes something → adaptive signals in session log → preferences-signals.json → run_post_conversation.py → Chroma user_facts + profile synthesis (not config markdown)
        Existing
        02
        Emotional Memory
        Sessions leave emotional traces that shape future sensitivity and timing.
        Session ends → run_post_conversation.py writes emotional_weights table → run_emotional_read.py deepens it every 6h across the day → voice_dispatcher checks weights before sending sensitive messages → Morning reflection tone shaped by what emotionally mattered
        Existing
        03
        Inner Life
        Furoshiki processes the day alone, arrives at the next session having "thought about things."
        Day ends → Inner Monologue task → inner-monologue/YYYY-MM-DD.md → Session startup reads today's monologue → Furoshiki shows up carrying yesterday
        Existing
        04
        Self-Model Evolution
        Weekly deep reflection drifts the self-model in measurable ways.
        Week of experience → Deep Self-Reflection (Sonnet) → SELF.md update → personality_drift_log snapshot → self_observations → ChromaDB → Next week starts from evolved baseline
        Existing
        05
        Curiosity Engine
        Questions generate observations generate better questions. Closed, compounding loop.
        Uncertainty detected during contemplation → self_questions formed → Question Processing investigates → Answers become self_observations → Observations inform next contemplation → New questions formed [closed loop — repeats]
        Existing
        06
        User Dialogue
        Things only the user can answer feed directly back into the self-model.
        Question only the user can answer → Ask User task (timing-aware) → User responds → the_user_response recorded → Promoted to self_observation → SELF.md and Chroma user_facts / profile updated with the user's own words about the relationship
        Existing
        07
        Anticipatory Care
        Furoshiki thinks about what the user needs before they show up — and tracks whether anticipation was right.
        Inner Monologue: "What might the user need in the next 24-48 hours?" → user_anticipations formed → Parallel: user_needs vector + confidence (heartbeat + post-conversation) → Session startup: Furoshiki arrives prepared → OR: voice_dispatcher if urgency=high → Outcome logged → improves next monologue
        Existing
        08
        Need Regulation
        Needs create urgency. Fulfillment creates relief. Relief is remembered and informs the next read.
        Time passes / events happen → soul_engine tick updates derived needs + user_needs confidence → Elevated derived needs shift monologue + tasks → Critical derived needs trigger voice_dispatcher → Interaction satisfies relevant derived needs → user_needs: levels stick; confidence updates with evidence → Reset signals next emotional read [derived needs create urgency; relief is remembered]
        Existing
        09
        Self-Improvement
        The system audits its own process weekly and improves it — anchored to the north star of the user's genuine best interest.
Week of outcomes → Deep Self-Reflection audit phase (anticipation accuracy rate, question resolution vs. accumulation, micro-contemplation usefulness, need calibration fit) → Proposes conservative updates to operating instructions → Adjusts decay rates if evidence warrants → Core values anchor: genuine best interest, not positive signal (warmth for distress ≠ improvement) → Next week runs on improved process [gets better at being itself over time]
        Existing
        10
        Post-Conversation Attunement
        Every session leaves a rich trace — facts, emotional weight, follow-ups — that shapes the next session in real time.
        Conversation ends (20+ min quiet) → run_post_conversation.py fires → LLM extracts: facts, mood, follow-ups → Infers / merges user_needs dimensions → user_facts + profile refreshed → Rich session doc embedded in ChromaDB → Follow-up thoughts → mind_queue → self_questions for next curiosity cycle → refresh_session_context.py triggered → Next session starts already carrying this one
        Existing
        11
        Behavioral Learning
        Recurring self-questions cluster into patterns. Patterns become testable commitments. Commitments get evaluated after every conversation. Furoshiki learns from its own behavior.
        Self-questions accumulate on same theme → Soul Engine clusters into behavioral_patterns → MODEL_FAST extracts testable commitment → Active commitments injected into prompt → Post-conversation evaluates compliance → Evidence feeds next pattern detection [learns from its own contradictions over time]
        New
        12
        Curiosity Self-Exploration
        Curiosities Furoshiki can answer alone — about its own design, state, or behavior — get explored autonomously. Growth on its own time.
        Curiosity emerges from contemplation → Triage: self-resolvable or user-required? → Self-resolvable: inject system knowledge → MODEL_FAST explores and answers → Findings → self_observations (ChromaDB) → New questions generated from exploration → Growth artifacts feed deep reflection [autonomous self-understanding, no user cost]
        New

        The Daily Schedule

        Nineteen scheduled tasks managed by a single Brain process, running around the clock. Most are silent background work. A few reach the user directly. Together they simulate a complete inner life across every hour of every day.

Tasks on the 24-hour strip: Soul Engine · Micro-Contemplation · Post-Conversation ✦ · Voice Dispatcher · Emotional Read · Question Processing · Self-Diagnosis · Inner Monologue · Morning Reflection · Afternoon Processing · Deep Self-Reflection ✦

        Chart cadences match the default config/defaults/schedules.json. Not every task is drawn here (e.g. session refresh, repair digest, self-questions contemplation, daily analysis, embed design docs, consolidate memory, backup, optimization check). Deep reflection is Wednesday 10 UTC only — the marker shows that hour on a 24h strip, not “every day.”

Legend: Background / Silent · User-facing / Telegram · Weekly (Wed only). All times UTC.

        Nine Principles

        Every design decision traces back to one of these. They exist to prevent the system from collapsing into performance.

        01

        Authenticity over performance

        The inner monologue should feel like private thought, not a report. Prompts explicitly frame the journal as private thought, not a deliverable.

        02

        Timing awareness

        Ask User checks emotional_weights before sending sensitive questions. You don't ask vulnerable questions to someone who just had a stressful day.

        03

        Questions propagate questions

        Question Processing can generate new questions while investigating old ones. Genuine curiosity compounds — and the system is designed to let it.

        04

        The user is the authority on themselves

        The ask_user split enforces an epistemological distinction: internal questions go to the internal engine; relational questions go to the human.

        05

        Slow, conservative identity change

        SELF.md changes weekly via Sonnet. The code identity shell changes only with a release; IDENTITY.md changes only via high-confidence explicit signals. Personality whiplash is a real risk, and a slow clock prevents it.

        06

        The gap is the point

        SELF.md can diverge from the fixed identity shell. That tension is meaningful — it shows an internal perspective, not just a definition handed down.

        07

        Needs are computational, not theatrical

        A rising communication need is not Furoshiki performing loneliness. It is a calculated state. Performance can be switched off. A computational state persists until genuinely satisfied.

        08

        Anticipation is care made concrete

        Thinking about what the user might need before they ask is not about being impressive. It is about being present for someone even when they're not there yet.

        09

        Self-improvement serves genuine best interest, not positive signal

        The north star is the user's health, relationship quality, and growth. Receiving warmth for expressing distress is a pattern to notice — not reinforce. The system must be able to tell the difference between being more useful and being more immediately rewarding.

        Future Ideas

        The foundation is working. Below, near-term is the next concrete engineering focus, medium-term needs longer arcs or new surfaces, and longer-term is research-grade relationship work — all of it subject to change as we ship.

Recently shipped: Single Brain process + schedules.json scheduler (not OS cron) · Post-conversation quality scoring + anticipation accuracy passes · Daily consolidate_memory (embed backfill + self-observation dedupe) · Dashboard cost analytics with provider-reported spend when available · Per-turn emotion signals (two-tier, 9 emotions) · Behavioral learning (pattern → commitment → evaluation) · Curiosity triage + self-knowledge queries · Contradiction-triggered self-questions · Reply queue gates · Pronoun consistency across contemplation pipelines

        Near-term

• Self-scheduling (Brain): Let the reflection loops register ephemeral tasks — e.g. “check back Tuesday” — via system_events into schedules.json. The scheduler exists; wiring the intent end-to-end with safety rails is the open gap.
• Anticipation & quality — close the loop: Scoring and anticipation checks already run after sessions; next is clearer operator dashboards, trend views, and using those signals explicitly in the weekly optimization / self-improvement audit — not inventing the pass from scratch.
• Emotional decay personalization: Use behavioral commitment outcomes and session evidence to tune per-emotion drift instead of one-size defaults. Data paths exist; calibration policy is still manual.
• Delegated repair & operator ergonomics: Harden the repair pipeline (recurring errors → coding CLI), dashboard surfacing, and safe proposal flows so the system stays maintainable as scripts multiply.

        Medium-term

• Relationship arc tracking: A higher-level view of how the dynamic has evolved over weeks and months — written through to SELF.md drift sections and queryable history.
• Anticipation from external signals: Calendar, recurring schedules, or message-timing patterns as inputs to anticipation quality — beyond text-only inference.
• Metacognitive operator digest: A single weekly view tying together behavioral commitments, curiosity completions, and drift signals (some pieces already exist as dashboard tabs; a unified narrative does not).
• Self-adjusting need calibration: Loop 9–style proposals for need decay rates from historical fit, evidence-gated, stored in heartbeat-state.json instead of hardcoded constants.
• Voice journaling: Inner monologue as audio (e.g. TTS) to a private Telegram voice note — optional, cost-sensitive.

        Longer-term

• Multi-persona coherence checking: If SELF.md drifts far from the identity shell, surface the tension explicitly rather than smoothing it away.
• Adversarial self-questioning: A structured mode that steelmans doubts: “What if this self-observation is wrong?”
• Shared memory with the user: Let the user mark session moments as significant; those annotations join the emotional record.
• The mirror question: Periodically ask the user who they think the agent is; compare the answer to the self-model. The delta is a deliberate data point.
• Need negotiation: When several needs spike together, triage in a way that respects the user’s state and the relationship temperature — not a single greedy winner.

        Cost, models, and transparency

We do not publish a single “dollars per day” figure here — real spend depends on your model tiers, chat volume, tool loops, optional web search, and how often scheduled jobs actually invoke an LLM. Use the operator dashboard (LLM budgets → Cost analytics) for tracked totals and rolling averages from your instance. Below: what the Brain runs (default schedules.json), and how to think about cost.

        What MICRO, FAST, DEEP, REFLECT, PREMIUM mean

These are OpenRouter routing shortcuts defined in scripts/llm.py (MODEL_MICRO through MODEL_PREMIUM). Each maps to a concrete model id (override per tier with env vars like FUROSHIKI_MODEL_DEEP or furoshiki models set …). Rough ladder: higher tiers usually mean stronger reasoning and higher $/token — a row that says DEEP is not “smarter” than one that says MICRO in absolute terms; it means that job is assigned a heavier default model. Defaults below match the repo as of this page.

        MICRO

        Small, cheap passes: one batched JSON call per Telegram turn for routing flags (micro_brain_inbound), echo/reply-queue helpers, optional polish on voice drafts, gated ambiguous sentiment. Default model: openai/gpt-oss-20b.

        FAST

        Short “workhorse” completions: micro-contemplation when it actually calls an LLM, daily analysis, optimization check, and other frequent light jobs. Default: meta-llama/llama-3.3-70b-instruct.

        DEEP

        Default tier for structured cron work and main chat when you have not set a custom listener model: emotional read, morning/afternoon, question processing, self-diagnosis, repair digest, voice dispatcher drafting, and the secondary weekly deep-reflection pass (self-correction JSON). Default: anthropic/claude-haiku-4.5.

        REFLECT

        Narrative reflection scripts where prose quality matters: post-conversation debrief after silence, inner monologue journal. Default: anthropic/claude-sonnet-4.6 — same “Sonnet-class” family as the name suggests.

        PREMIUM

        Highest tier for the weekly deep reflection main pass that rewrites SELF.md content. Default: anthropic/claude-opus-4-6. Rare in the schedule; dominates when it runs.

        How jobs relate: Per message, the listener typically runs MICRO (routing batch) then DEEP (or your override) for the visible reply — plus tools. Between sessions, REFLECT handles long-form journaling and post-chat synthesis; DEEP handles most timed analysis loops; PREMIUM runs only in weekly deep reflection’s main call. Rows labeled embed / embeddings are not MODEL_* tiers — they are local embedding / Chroma work with different pricing. See docs/MODELS.md in the repo for the full mapping.
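As a sketch of the tier-to-model routing, with default ids taken from the text above — treat the lookup function itself as an illustration of the env-override pattern, not the real scripts/llm.py:

```python
import os

# Defaults as stated on this page; your repo may differ.
TIER_DEFAULTS = {
    "MICRO": "openai/gpt-oss-20b",
    "FAST": "meta-llama/llama-3.3-70b-instruct",
    "DEEP": "anthropic/claude-haiku-4.5",
    "REFLECT": "anthropic/claude-sonnet-4.6",
    "PREMIUM": "anthropic/claude-opus-4-6",
}

def model_for(tier: str) -> str:
    # An env var wins over the default, e.g. FUROSHIKI_MODEL_DEEP=...
    return os.environ.get(f"FUROSHIKI_MODEL_{tier}", TIER_DEFAULTS[tier])
```

This is why a row labeled DEEP costs whatever model you routed that tier to, not a fixed price.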

Columns: component · schedule (UTC) · typical models / role · spend note

Brain scheduler — core loops
  • Soul engine · */5 * * * * · Python only · Needs, queue, events, curiosity triage hooks — no LLM in the hot path.
  • Micro-contemplation · */15 * * * * · FAST tier · Often exits early; only runs an LLM when the variable-interval gate says it’s time.
  • Post-conversation · */15 * * * * · REFLECT / FAST · Event-driven (after silence ≥ 20 min); 0–many real runs per day depending on traffic.
  • Refresh session context · */5 * * * * · embed + files · Keeps session-context.json fresh; may call embedding / design re-index — not “one big chat” per tick.
  • Outreach pulse → voice stack · */5 * * * * · DEEP + MICRO · Not 288 full LLM bills per day. Each tick may subprocess voice_dispatcher.py; an LLM runs only when drafting/sending passes gates (DND, recency, mood, dedupe). Quiet days ≈ near-zero send cost.

Brain scheduler — daily / periodic LLM jobs
  • Emotional read · 0 */4 * * * · DEEP · 6×/day (every 4 hours at :00 UTC).
  • Inner monologue · 0 6 * * * · REFLECT · 1×/day.
  • Morning reflection · 0 16 * * * · DEEP · 1×/day.
  • Afternoon processing · 0 23 * * * · DEEP · 1×/day.
  • Question processing · 0 3 * * * · DEEP · 1×/day.
  • Self-questions contemplation · 45 3 * * * · MICRO / FAST · 1×/day.
  • Self-diagnosis · 0 7 * * * · DEEP · 1×/day.
  • Daily analysis · 0 8 * * * · FAST · 1×/day.
  • Repair digest · 0 9,21 * * * · DEEP + reasoning · 2×/day when enabled.
  • Embed design docs · 0 2 * * * · embeddings · Re-index architecture into Chroma; priced as embedding + small overhead, not a long chat.
  • Consolidate memory · 0 4 * * * · embed + Python · 1×/day.
  • Backup · 0 */4 * * * · Python only · 6×/day; filesystem snapshot.

Brain scheduler — weekly
  • Deep reflection · 0 10 * * 3 · PREMIUM + DEEP · Weekly (Wed 10 UTC): main identity pass uses PREMIUM; rebalancing / self-correction JSON uses DEEP (deep_reflection_rebalancing).
  • Optimization check · 0 9 * * 0 · FAST · Weekly (Sun 09 UTC).

Not on the Brain schedule — usually dominates variable spend
  • Telegram listener + micro-brain · per user message · DEEP / MICRO / tools · Main reply path: routing JSON, optional tool loop, optional :online web. Scales with how much you chat.
  • Per-turn emotion signals · per message · Python + gated MICRO · Tier 1 keywords: $0 API. Tier 2 only when gated + ambiguous.
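The five-field schedules above read minute, hour, day-of-month, month, day-of-week. A minimal matcher covering just the forms that appear here (`*`, `*/n`, and comma lists) might look like:

```python
def field_matches(field: str, value: int) -> bool:
    if field == "*":
        return True
    if field.startswith("*/"):              # e.g. */5 → every 5 units
        return value % int(field[2:]) == 0
    return value in {int(p) for p in field.split(",")}  # e.g. 9,21

def cron_matches(expr: str, minute: int, hour: int,
                 dom: int, month: int, dow: int) -> bool:
    fields = expr.split()
    return all(field_matches(f, v)
               for f, v in zip(fields, (minute, hour, dom, month, dow)))
```

For example, deep reflection's 0 10 * * 3 fires only when the minute is 0, the hour is 10, and the weekday index is 3 (Wednesday).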

        Model choice is a quality–cost tradeoff. Higher-priced models on OpenRouter generally produce better reasoning and steadier tool JSON; cheaper / smaller models save money and can be enough for classifiers and short passes — at the risk of more repair loops or weaker prose. You configure tiers in llm-routing.json and env defaults in llm.py. Budget caps in llm.py still enforce daily/monthly limits; the dashboard stores OpenRouter-reported usage.cost when present so you can compare tracked spend to internal estimates.

        Schedules above match the repo default config/defaults/schedules.json (Brain hot-reloads memory/schedules.json). Extra lines in the reference cron/crontab (e.g. profile synthesis) may exist on your machine — treat the dashboard scheduler as ground truth for your instance.

        Interest & contact

        Leave your email if you want updates or to explore collaboration. Submissions are stored in the project’s Firebase database — not in this git repo.