Council Briefing

Key Deliberations

V2 Reliability & Developer Experience Hardening

Engineering momentum is high (new providers, embedding model selection, plugin publishing workflow improvements), but field reports show persistent friction: installation and platform quirks, unclear v2 architecture docs, and confusion over agent lifecycle and storage. All three directly erode developer trust.

What is the Council’s definition of “v2 launch-ready” under Execution Excellence: feature-complete, or stable with near-zero tolerance for failure on the common paths (create → run → deploy → observe)?

Discord (Dev): “V2 Architecture Migration… ‘clients’ have been replaced with ‘plugins + services’” (mekpans helping standard, 2025-03-31)
Discord (Dev School): confusion on “where agents created via CLI are stored in v2dev” (mindxploit, 2025-03-31)
GitHub Daily Update (2025-04-01): “Improved plugin publishing workflow to enhance the developer experience” (#4132)

1. Define launch-ready as stability for the top 3 developer journeys (local dev, plugin install, social client run) with strict error budgets.

Focuses the fleet on reliability and documentation, increasing trust even if some features slip.

2. Define launch-ready as parity with v1 capabilities plus new v2 architecture, even if rough edges remain.

Maximizes feature narrative but risks churn if first-time runs fail or docs diverge from reality.

3. Define launch-ready as “dogfooding-only” for a fixed burn-in period; publicly label as preview until metrics are met.

Protects reputation while enabling iteration, but may slow ecosystem adoption and partner timelines.

4. Other / More discussion needed / None of the above.
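The “strict error budgets” named in option 1 can be made concrete with a small calculation. The sketch below is illustrative only: the journey names come from option 1, while the 99% success target, the run counts, and the names `JourneyStats`, `BudgetReport`, and `errorBudget` are assumptions, not thresholds the Council has agreed.

```typescript
// Hypothetical sketch: error-budget accounting for one developer journey.
// The success target and the sample counts are illustrative assumptions.

interface JourneyStats {
  journey: string;    // e.g. "local dev", "plugin install", "social client run"
  totalRuns: number;  // observed attempts in the measurement window
  failedRuns: number; // attempts that ended in an error
}

interface BudgetReport {
  journey: string;
  allowedFailures: number; // failures the target tolerates in this window
  remaining: number;       // allowedFailures - failedRuns (negative = budget exceeded)
  launchReady: boolean;
}

function errorBudget(stats: JourneyStats, successTarget: number): BudgetReport {
  if (successTarget <= 0 || successTarget > 1) {
    throw new RangeError("successTarget must be in (0, 1]");
  }
  // Budget = share of runs allowed to fail, rounded down to whole runs.
  // toFixed(6) guards against floating-point noise such as 99.99999999999998.
  const raw = stats.totalRuns * (1 - successTarget);
  const allowedFailures = Math.floor(Number(raw.toFixed(6)));
  const remaining = allowedFailures - stats.failedRuns;
  return {
    journey: stats.journey,
    allowedFailures,
    remaining,
    launchReady: remaining >= 0,
  };
}
```

Under these assumed numbers, a journey with 1,000 runs and a 99% target may fail 10 times before it blocks launch; the same target over 200 runs tolerates only 2 failures.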

Which single DX bottleneck should be elevated to a “mission-critical blocker” to preserve developer trust: installation/CLI behavior, plugin import resolution across OSes, or documentation alignment for v2?

Discord (Dev): “Plugin import errors… workaround… replace `@import` with hardcoded paths” (Tiki, 2025-03-30)
GitHub Issues: “How to run Eliza CLI?” (#4159) and “Quickstart doc issues” (#4336)
Discord (Main): “Recommended installation method changed… npm global → git clone v2-develop + bun” (2025-03-29/30)

1. Make CLI/installation determinism the blocker (single blessed install path; consistent start/dev commands).

Reduces onboarding failure rate fastest and lowers community support load.

2. Make cross-platform plugin import/module resolution the blocker (Linux/WSL/macOS parity).

Prevents silent ecosystem fragmentation and hard-to-debug community failures.

3. Make docs/implementation alignment the blocker (v2 architecture, agent storage, plugin registry policy).

Improves comprehension and self-serve support, but may not stop immediate runtime failures.

4. Other / More discussion needed / None of the above.

How aggressively should we expand model/provider integrations (Kluster AI, Mem0, DeepSeek) versus consolidating around a smaller “golden path” for reliability?

GitHub Daily Update (2025-04-01): “Integrated Kluster AI as a model provider” (#3938) and “Added Mem0 as an AI SDK provider” (#3927)
Discord (Dev): “How to use DeepSeekAI for V2… use DEEPSEEK_API_KEY env var” (loyce.eth / Sashimikun, 2025-03-31)
Discord (Coders): repeated provider issues (Anthropic rate limits; OpenRouter ‘hacky’ plugin) (2025-03-30/31)

1. Consolidate: bless 2–3 providers and harden docs/tests/telemetry before expanding further.

Raises reliability and simplifies support, but slows composability narrative.

2. Expand: keep integrating providers rapidly, but enforce plugin conformance tests and version gates.

Maintains open/composable leadership while containing blast radius via tooling.

3. Hybrid: expand providers only via community-maintained plugins, while core maintains a strict golden path.

Scales ecosystem without overloading core team, but may produce uneven quality perception.

4. Other / More discussion needed / None of the above.

Social Surfaces Stability (Twitter/Telegram) as Trust Multipliers or Reputation Hazards

Telegram capabilities improved materially (community manager, middleware docs, sync fixes), while Twitter remains a reliability and cost sink (redundant checks, mention handling, account suspensions). These surfaces shape public perception more than core commits.

Should the Council treat Twitter and Telegram as “flagship reliability surfaces” requiring tighter release gates than other plugins, given their outsized reputation impact?

GitHub Daily Update (2025-04-01): Telegram upgrades—community manager (#4134), middleware docs/sync (#4128)
GitHub New Issue (2025-04-01): “Twitter plugin… redundant checks… unnecessary API calls” (#4127)
Discord (Partners/Main): “ai16zNEWS Twitter account was suspended… posts reaching 100k views” (2025-03-31)

1. Yes: treat as flagship surfaces with stricter CI, canary releases, and required observability.

Reduces public failures and cost blowups, strengthening “trust through shipping.”

2. Partially: tighten Telegram (utility) but keep Twitter experimental due to X platform volatility.

Preserves momentum while acknowledging platform risk, but may weaken marketing automation.

3. No: keep equal treatment; prioritize core runtime and let community iterate on social plugins.

Speeds core development but increases likelihood of public-facing failures and confusion.

4. Other / More discussion needed / None of the above.

What is the preferred mitigation strategy for API rate limits and “cost spikes” causing agent crashes: multi-provider failover, smarter prompt budgets, or throttled job queues?

Discord (Coders): “Anthropic API rate limit… causing agent crashes… switch providers/reduce prompt length” (2025-03-30/31)
GitHub Daily Update (2025-04-01): multiple refactors/bugfixes, but still surface-level instability in social usage
Discord (Coders): VRAM issues and local model struggles affecting reliability (2025-03-30/31)

1. Implement multi-provider failover with priority routing and graceful degradation.

Improves uptime but adds complexity and testing burden across providers.

2. Enforce prompt budgets and structured outputs to reduce token pressure and retries.

Lowers costs and rate-limit risk while improving predictability, possibly at quality cost.

3. Adopt throttled queues/backpressure (and clearer user-facing errors) to prevent crashes.

Stabilizes runtime under load, but may slow responsiveness and require UX messaging.

4. Other / More discussion needed / None of the above.
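To make option 1 easier to weigh, here is a minimal sketch of multi-provider failover with priority routing and graceful degradation. It is not an ElizaOS API: `Provider`, `FailoverResult`, `generateWithFailover`, and the fallback message are hypothetical names chosen for illustration, and a real implementation would also need timeouts and retry policies.

```typescript
// Hypothetical sketch of multi-provider failover with priority routing.
// `Provider` and `generateWithFailover` are illustrative names, not ElizaOS APIs.

interface Provider {
  name: string;
  priority: number; // lower number = tried first
  generate: (prompt: string) => Promise<string>;
}

interface FailoverResult {
  text: string;
  provider: string;  // which provider answered, or "fallback"
  degraded: boolean; // true when every provider failed
  errors: string[];  // one entry per failed attempt, for observability
}

async function generateWithFailover(
  providers: Provider[],
  prompt: string,
  fallbackText = "The agent is temporarily unavailable. Please try again shortly."
): Promise<FailoverResult> {
  const errors: string[] = [];
  // Copy before sorting so the caller's array is not mutated.
  const ordered = [...providers].sort((a, b) => a.priority - b.priority);

  for (const p of ordered) {
    try {
      const text = await p.generate(prompt);
      return { text, provider: p.name, degraded: false, errors };
    } catch (err) {
      // Record the failure and try the next provider instead of crashing.
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${p.name}: ${message}`);
    }
  }
  // Graceful degradation: return a canned reply rather than throwing.
  return { text: fallbackText, provider: "fallback", degraded: true, errors };
}
```

The `errors` array is returned even on success so that a rate-limited primary provider stays visible in logs. The testing burden noted under option 1 appears here as the need to exercise every ordering of provider failures.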

How should we handle “undesired interactions” and safety controls (blocking accounts, preventing promotion of questionable projects) without turning ElizaOS into a closed system?

GitHub Issues Summary: “HOW do we block and ban interactions with specific accounts???” (#4117, closed)
Discord (Partners): “Improve AI prompting to prevent agents from promoting questionable projects” (jin, 2025-03-31)
Discord (Dev): “Security concerns… potential scam links” (Veight assisting ElizaBAO, 2025-03-31)

1. Ship first-class policy and safety primitives (blocklists, verification hooks, provenance checks) in core.

Raises baseline safety and trust, but widens core scope and requires governance over the defaults.

2. Provide reference plugins/patterns only; leave enforcement to deployers (opt-in safety).

Preserves openness and composability, but increases risk of public incidents by novices.

3. Create “safe mode” profiles (strict defaults) with an explicit switch for advanced users.

Balances safety and freedom while setting clear expectations for builders.

4. Other / More discussion needed / None of the above.
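As a rough illustration of how option 1 (blocklists) and option 3 (“safe mode” profiles) could combine, the sketch below gates each incoming interaction on a profile. Every name in it (`SafetyProfile`, `makeProfile`, `checkInteraction`) is hypothetical, and the strict-mode link rule is an assumed default, not an existing ElizaOS behavior.

```typescript
// Hypothetical sketch of a "safe mode" profile with a blocklist check.
// None of these names are part of an ElizaOS API.

type SafetyMode = "strict" | "permissive";

interface SafetyProfile {
  mode: SafetyMode;
  blockedAccounts: Set<string>; // normalized handles: lowercase, no leading "@"
  allowLinks: boolean;          // assumed default: strict mode refuses URLs
}

interface Decision {
  allowed: boolean;
  reason?: string;
}

const URL_PATTERN = /https?:\/\/\S+/i;

function normalizeHandle(handle: string): string {
  return handle.trim().replace(/^@/, "").toLowerCase();
}

// "strict" is the safe default; "permissive" is the explicit opt-out.
function makeProfile(mode: SafetyMode = "strict", blocked: string[] = []): SafetyProfile {
  return {
    mode,
    blockedAccounts: new Set(blocked.map(normalizeHandle)),
    allowLinks: mode === "permissive",
  };
}

function checkInteraction(profile: SafetyProfile, author: string, text: string): Decision {
  // The blocklist applies in every mode.
  if (profile.blockedAccounts.has(normalizeHandle(author))) {
    return { allowed: false, reason: "author is blocklisted" };
  }
  // Link filtering applies only when the profile disables links.
  if (!profile.allowLinks && URL_PATTERN.test(text)) {
    return { allowed: false, reason: "links are disabled in strict mode" };
  }
  return { allowed: true };
}
```

Keeping the profile a plain data object means a deployer can switch to `"permissive"` explicitly, which matches the “explicit switch for advanced users” in option 3 without closing the system.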

auto.fun Launch Readiness & Token/DAO Narrative Coherence

Auto.fun is framed as imminent (“two weeks,” ~Apr 14) with 15 launch partners and ai16z buyback utility, yet community confusion persists on token relationships and DAO status. Narrative incoherence is now a strategic risk to developer and holder trust.

What is the Council’s canonical public narrative for the token and platform relationship (ai16z ↔ ElizaOS ↔ auto.fun), and how will it be enforced across docs/social/AMA?

Discord (Main): “There will not be a new token, the token stays $ai16z.” (7OROY, 2025-03-31)
Discord (Main): “Profits from auto.fun will be used to buy back ai16z tokens” (jin, 2025-03-31)
Discord (Main): Confusion after “auto.fun has no native token” messaging (2025-03-29)

1. Publish a single “Token Relationship & Value Flow” spec (diagram + FAQ) and treat it as source of truth.

Reduces confusion and rumor cycles, improving trust and partner onboarding.

2. Keep messaging minimal until launch; answer only when asked to avoid over-commitments.

Avoids premature promises but allows confusion to persist and compound.

3. Split narratives: developer-facing ElizaOS neutrality; holder-facing ai16z utility via auto.fun buybacks.

Clarifies audiences but risks perception of misalignment if not tightly coordinated.

4. Other / More discussion needed / None of the above.

How should progressive decentralization be staged given we are “not a DAO (yet)” but operate in DAO-adjacent spaces (daos.fun), and what near-term governance tools are acceptable?

Discord (dao-organization): “We’re not a DAO (yet). Weaving a community is a delicate art and science.” (vincentpaul, 2025-03-31)
Discord (dao-organization): references to MetaDAO/MNTDAO decision markets as models (Ka_yari, 2025-03-31)

1. Publish a phased decentralization roadmap (milestones, powers, guardrails) and begin with information governance (summaries, proposals).

Builds legitimacy through transparency while keeping execution centralized enough to ship reliably.

2. Delay formal governance; focus solely on shipping auto.fun + v2 reliability, revisit DAO framing later.

Maximizes execution focus but may frustrate community members seeking clarity and participation.

3. Pilot limited decision markets/bounties for specific modules (plugins, docs) while explicitly excluding treasury control.

Tests governance primitives safely, but requires careful scoping to avoid “DAO theater” accusations.

4. Other / More discussion needed / None of the above.

Given upcoming previews (HK/Paris) and launch partners, what is the acceptable launch risk posture: ship on time with known rough edges, or delay for a higher reliability threshold?

Discord / Daily Summary: “@autodotfun is ready to launch with partners in two weeks… previewed in Hong Kong and Paris” (2025-03-31)
Discord (Main): “Why delay?” and launch-day uncertainty questions appear repeatedly (2025-03-31)

1. Ship on schedule with a tight scope and clear limitations; prioritize uptime and incident response readiness.

Captures momentum and partner timelines while containing blast radius through scope control.

2. Delay until reliability metrics are met (successful launches, monitoring, rollback plans validated).

Protects long-term trust but risks narrative damage and partner churn if delays compound.

3. Stage the launch: private/partner-only mainnet first, then public launch after a measured burn-in.

Balances deadlines and quality, but requires disciplined access control and communications.

4. Other / More discussion needed / None of the above.
