Council Briefing

Key Deliberations

Reliability & Developer Trust Through Shipping

Core repo momentum is translating into developer trust signals: a new remote deployment guide, expanded provider/plugin tests, and client UI fixes—consistent with the Council’s “execution excellence” doctrine. The opportunity is to formalize these quality gains into a repeatable release discipline rather than episodic patches.

Do we elevate test coverage and release gating to a first-class policy (even if it slows feature throughput) to maximize reliability perception?

GitHub daily (2025-02-14): "The test suite for OpenAI integration was completed" (PR #3495) and "test runner continues execution after failures" (PR #3490).
GitHub daily (2025-02-14): "Client UI issues causing functionality problems were resolved" (PR #3496).

1Yes—implement strict CI gates (tests + lint + minimal e2e) for all core merges and releases.

Short-term velocity dips, but developer confidence and production reliability rise, compounding ecosystem adoption.

2Partially—gate only high-risk subsystems (runtime, storage, providers), keep lighter gates for plugins/docs.

Balances velocity and safety, but risks uneven reliability where plugins are the user’s front door.

3No—keep gates advisory and prioritize rapid iteration until V2 stabilizes.

May accelerate feature delivery but risks eroding the “trust through shipping” narrative due to regressions.

4Other / More discussion needed / None of the above.

How should we package documentation as a product surface to reduce support load and accelerate builder onboarding?

GitHub daily (2025-02-14): "A new guide for remote deployment was introduced" (PR #3501).
GitHub summary (2025-02-13): "Updated README to clarify differences between eliza-starter and eliza repositories" (PR #3453).

1Treat docs as a versioned artifact: per-release docs freeze + changelog-driven doc deltas.

Creates predictable onboarding and fewer “works on my branch” support loops.

2Docs-first triage: every top support issue must result in a docs patch before the issue is closed.

Systematically reduces repeated questions, but requires discipline from maintainers and reviewers.

3Community-driven docs: optimize for contribution velocity and accept some inconsistency until V2.

Scales content creation but can fragment truth sources and increase confusion during migrations.

4Other / More discussion needed / None of the above.

Which “reference plugins” should be elevated and hardened as flagship examples of composability and reliability?

GitHub daily (2025-02-14): "Discord plugin received testing enhancements" (PR #3498).
GitHub daily (2025-02-14): "A new ElevenLabs plugin was added" (PR #3452).

1Prioritize Discord + Twitter + core storage adapters as the reliability triad (highest user exposure).

Improves the most common production paths, directly reducing community troubleshooting burden.

2Prioritize Discord + ElevenLabs + a “deployment” reference for end-to-end demo polish.

Boosts showcase appeal and partner demos, but may leave core social posting fragility unaddressed.

3Rotate flagship focus monthly based on support volume and regressions (‘pain-driven prioritization’).

Adapts to reality, but may prevent deep hardening of any single canonical stack.

4Other / More discussion needed / None of the above.

Deployment & Long-Running Agent Stability (Docker, Memory, Token Limits)

Operational friction remains concentrated in deployment and runtime longevity: ARM64 Docker dependency failures, Node version mismatches, context overflow beyond 128K tokens, and embedding/vector dimension mismatches in SQLite. These are reliability hazards that threaten the “developer-friendly” promise unless converted into defaults, guardrails, and automated mitigations.

Do we standardize an “official” deployment target (Node version + Docker base + minimum RAM) and treat other environments as best-effort?

Discord 💻-coders (2025-02-13): Derby suggested using "node:23-slim" and installing dependencies to fix ARM64 module errors.
Discord (2025-02-12): "Node.js 23+ is recommended for ElizaOS deployments to avoid dependency issues."

1Yes—publish a single blessed target (Node 23+, node:23-slim, 4–8GB RAM) and optimize CI around it.

Reduces support entropy and accelerates reliability, at the cost of narrower compatibility expectations.

2Hybrid—bless one target but maintain CI smoke tests for 1–2 secondary targets (e.g., LTS Node).

Preserves broader adoption while limiting maintenance overhead to a controlled set of environments.

3No—stay broad and community-supported across many environments to maximize openness.

Maximizes inclusivity but risks chronic deployment breakage perceptions and higher support load.

4Other / More discussion needed / None of the above.

What is the Council’s preferred strategy to prevent long-running agents from exceeding context windows and degrading over time?

Discord 💻-coders (2025-02-13): "context exceeding the 128K token limit after Twitter agents run for a while" (passer).
Discord 💻-coders (2025-02-13): brka suggested a mechanism to limit conversation depth.

1Implement hard guardrails: rolling summaries + conversation depth limits + automatic memory compaction.

Predictable stability for autonomous agents; may reduce nuanced long-horizon behavior unless tuned well.

2Implement adaptive guardrails: token-budgeting per tool/client + dynamic summarization triggered by thresholds.

Better preserves fidelity while controlling cost, but increases complexity and test surface area.

3Leave it to builders via docs and sample configs; avoid imposing framework opinions.

Maintains flexibility, but repeated failures in common paths will undermine reliability reputation.

4Other / More discussion needed / None of the above.

How should we resolve embedding/vector dimension mismatches in a way that is safe-by-default for new developers?

Discord 💻-coders (2025-02-13): engineer reported SQLite vector mismatch; Odilitime suggested using OpenAI embeddings for consistent dimensions.
Discord (2025-02-13): multiple mentions of vector mismatch errors in SQLite databases.

1Enforce dimension locking at DB initialization and refuse to start if provider dimensions differ (fail fast).

Prevents silent corruption and hard-to-debug behavior, but can feel strict to new users.

2Auto-migrate vectors when dimensions change (re-embed or maintain multi-dimension indexes).

Smoother UX, but computationally expensive and riskier if migrations are imperfect.

3Default to a single recommended embeddings provider and document alternatives as advanced modes.

Reduces error rate quickly, but increases dependency on a default provider and may conflict with local-first goals.

4Other / More discussion needed / None of the above.

Identity Cohesion: Brand Consolidation, Token Rename, and Launchpad Narrative

The project’s outward identity is in flux: community/partners favor consolidating the ai16zdao and ElizaOS social presence, while the token rename decision is constrained by ticker mechanics and legal comms. Meanwhile, the dual-pool tokenomics model remains a strategic differentiator that requires clearer explanation and possibly simulation to defend against competitor comparisons.

Should we consolidate the two X accounts into a single command channel now, or preserve dual identities until post-V2 and token migration completion?

Discord 🥇-partners (2025-02-13): "A poll showed strong support for consolidation" (accelxr).
Discord associates (2025-02-13): Jin: "ElizaOS = professional/technical" vs "ai16zdao = investment DAO/crypto culture".

1Consolidate now under ElizaOS and treat ai16zdao as a legacy/archival brand until token ticker resolves.

Maximizes clarity for developers and partners, but may dilute crypto-native community energy.

2Maintain two accounts with strict roles: ElizaOS for product/DX, ai16zdao for token/DAO culture—with unified editorial leadership.

Preserves audience segmentation, but increases operational overhead and risk of mixed messaging.

3Consolidate later (post-V2 + post-ticker change) and focus now on shipping reliability milestones.

Avoids brand churn during engineering crunch, but prolongs confusion for newcomers and partners.

4Other / More discussion needed / None of the above.

Given legal and ticker constraints, what is the Council’s preferred communications doctrine for token-related updates?

Discord 🥇-partners (2025-02-13): jasyn_bjorn: legal constraints limiting token promotion before ticker change; daosfun blocking ticker update (witch).
Discord (2025-02-13): "decided to rename the token from ai16z to elizaOS, but can't change the ticker yet".

1Adopt a strict compliance-first stance: discuss only technical migration status and utility mechanics, avoid price/value framing.

Minimizes legal risk and reputational blowback, but may frustrate holders seeking clarity.

2Adopt a dual-track stance: technical updates publicly, token narrative in gated community channels with careful wording.

Balances transparency and risk, but can create perceptions of insider communication.

3Pause token comms until ticker changes, then relaunch narrative with a single coordinated announcement wave.

Reduces near-term legal exposure, but increases uncertainty and rumor-driven narratives.

4Other / More discussion needed / None of the above.

How do we validate and defend the dual-pool model (AT/SOL + AI16Z/AT) against competitor narratives (Virtuals/Arc direct pairing)?

Discord tokenomics (2025-02-13): Jin confirmed dual pool; Witch argued it prevents transferring liquidity issues while still enabling buybacks.
Discord tokenomics (2025-02-13): Patt asked whether different market scenarios were simulated (unanswered).

1Run and publish scenario simulations (liquidity, reflexivity, fee capture) and make this the canonical explanation.

Transforms debate into evidence, increasing partner confidence and reducing repeated disputes.

2Keep model but simplify messaging: emphasize “SOL first for projects, AI16Z value via fees/buybacks” without deep math.

Improves narrative clarity, but leaves sophisticated critics unconvinced without data.

3Revisit model toward partial direct pairing or optional pool configurations for select launches.

May align with market expectations, but risks inheriting competitor failure modes and complicating infrastructure.

4Other / More discussion needed / None of the above.

North Star & Strategic Context

Key Deliberations