Council Briefing

Key Deliberations

V2 Readiness: UX Polish vs. Stability Gates

UI improvements (notably the action viewer) continue to land, but the operational signal shows rising pressure to define a “release gate” that protects reliability and minimizes regression churn. Council alignment is needed on what constitutes “ready” in a developer-first universe where trust is earned by predictable behavior, not visual shine.

What is the Council’s minimum release gate for V2/Cloud-era trust: UX completeness, crash-free boot, or end-to-end “golden path” tutorials that actually run?

GitHub summary (2025-03-21): “Improved the action viewer UI… triaged new issues…”
Discord (2025-03-20, 💻-coders): “users appreciating the content but still struggling with implementation” (docs at eliza.how)

1. Gate on reliability: crash-free startup + core agent loop stable across recommended environments.

Maximizes execution excellence and reduces support burden, but may delay UX polish and demo momentum.

2. Gate on “golden path” DX: one blessed install + one blessed deployment path that works with docs verbatim.

Optimizes developer trust directly, but requires concentrated docs/testing resources and may postpone secondary features.

3. Gate on flagship UX: ship when the UI/agent builder feels complete enough for broad onboarding.

Improves first impressions, but risks reputational damage if underlying reliability remains inconsistent.

4. Other / More discussion needed / None of the above.

How should we sequence Spartan/DegenAI reactivation relative to V2 stabilization, given the dependency on the V2 stack?

Discord (2025-03-20, spartan_holders): “Enable Spartan chat functionality before V2 official launch” (Odilitime).
Discord (2025-03-19): “Current priority is getting open-source functionality working in v2 and deploying Spartan…” (rhota).

1. Stabilize V2 first; keep Spartan in controlled beta until core regressions are resolved.

Prevents flagship agents from becoming public proof of instability, preserving long-term brand trust.

2. Parallel-track: ship Spartan chat behind explicit “beta” framing while core team continues stabilization.

Maintains community energy and token-holder value, but increases incident surface area and support load.

3. Prioritize Spartan as the flagship; treat its working chat loop as the acceptance test for V2 readiness.

Creates a single north-star integration test, but may bias architecture toward one agent’s needs over framework generality.

4. Other / More discussion needed / None of the above.

Should Council mandate a single canonical “blessed” install command/path for the beta to reduce fragmentation across versions (0.25.9 vs 1.0.0-beta)?

Discord (2025-03-20): Beta install steps repeated across channels: “npm create eliza@beta … npx @elizaos/cli start”.
Discord (2025-03-20): “Users reported issues with… v0.25.9… could no longer interact with their agent via terminal” (FBRN).

1. Yes: declare one supported beta track and clearly deprecate older quickstarts.

Reduces confusion and accelerates feedback quality, but may alienate users stuck on older setups.

2. Maintain dual-track support temporarily, but add a versioned compatibility matrix and migration guide.

Minimizes user disruption, but increases docs and support complexity during an already unstable phase.

3. Keep it flexible; let the community self-select versions while we focus on shipping features.

Maximizes velocity short-term, but directly undermines the “reliable, developer-friendly” North Star via fragmentation.

4. Other / More discussion needed / None of the above.

Developer Trust Fault Lines: Packaging, IDs, and Provider Limits

Multiple “sharp edges” emerged that disproportionately harm DX: missing beta packages, UUID/ID coercion failures, and model provider token-per-minute ceilings. These issues are critical to perceived reliability because they break the first-run experience and force developers into archaeology rather than building.

How do we treat packaging integrity failures in beta (e.g., missing @elizaos/plugin-openai): as release blockers or as expected beta turbulence?

GitHub issue (2025-03-21): “@elizaos/plugin-openai package not found when using beta packages” (#4037).

1. Release blocker: packaging must be consistent for any public beta we want developers to trust.

Aligns with “Developer First,” but may slow iteration as release engineering becomes mandatory.

2. Non-blocker, but require a rapid hotfix SLA and a public incident log for transparency.

Maintains velocity while protecting trust through communication, but risks repeated paper cuts.

3. Defer; advise users to pin versions or use alternative providers until the ecosystem settles.

Shifts cost to developers and erodes confidence precisely when we need adoption and feedback.

4. Other / More discussion needed / None of the above.
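
If packaging integrity were treated as a release blocker, the gate could in principle be checked mechanically. The TypeScript sketch below compares the packages a beta template depends on against a snapshot of what is published under a dist-tag. The function name, the snapshot shape, and all sample package names and versions are hypothetical and chosen only for illustration; none of this is an existing ElizaOS or npm API.

```typescript
// Hypothetical shape for this sketch: package name -> dist-tag -> published version.
type RegistrySnapshot = Record<string, Record<string, string>>;

/** Returns the dependencies that have no published version under the given dist-tag. */
function findUnpublished(
  dependencies: string[],
  distTag: string,
  snapshot: RegistrySnapshot,
): string[] {
  return dependencies.filter((name) => snapshot[name]?.[distTag] === undefined);
}

// Illustrative data only; package names and versions are made up.
const exampleSnapshot: RegistrySnapshot = {
  "example-core": { latest: "0.1.0", beta: "1.0.0-beta.1" },
  "example-plugin": { latest: "0.1.0" }, // nothing published under "beta"
};

const missingFromBeta = findUnpublished(
  ["example-core", "example-plugin"],
  "beta",
  exampleSnapshot,
);

if (missingFromBeta.length > 0) {
  // A CI job could fail the release at this point instead of only logging.
  console.error(`Not published under "beta": ${missingFromBeta.join(", ")}`);
}
```

How the snapshot is obtained (registry query, lockfile, release manifest) is left open here; the point is only that "every declared dependency resolves under the beta tag" is a property that can be asserted before a release goes out.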

Should we standardize and enforce an “ID hygiene protocol” across clients/plugins to prevent UUID conversion failures (especially for negative or non-UUID external IDs)?

GitHub issue (2025-03-21): “invalid input syntax for type uuid: \"-1002129157442\"” (#4042).
Recently merged work (PR #4052, listed in completed items): “Fix Telegram negative chat ID UUID conversion” (plugin/telegram-related).

1. Yes: define a canonical ID normalization layer in core and require all plugins to use it.

Prevents recurring class of bugs and improves composability, but requires coordinated refactors.

2. Plugin-owned: provide best-practice utilities, but let each client/plugin decide.

Faster locally, but higher long-term entropy and repeated regressions across the ecosystem.

3. Database schema flexibility: relax UUID constraints where external IDs are common.

Reduces friction quickly, but may compromise data consistency and cross-system interoperability.

4. Other / More discussion needed / None of the above.
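
To make the first option concrete, here is a minimal sketch of what a shared ID normalization helper might look like. It assumes a Node.js runtime and derives a deterministic RFC 4122 version-5 style UUID from any external ID, so a value such as the negative Telegram chat ID quoted above never reaches a uuid column as raw text. The function name, the `source` prefix convention, and the namespace constant are assumptions made for this sketch, not the approach taken in PR #4052 or an existing ElizaOS API.

```typescript
import { createHash } from "node:crypto";

// Placeholder namespace for this sketch (the RFC 4122 DNS namespace UUID).
// A real project would choose its own namespace once and never change it.
const EXAMPLE_NAMESPACE = "6ba7b810-9dad-11d1-80b4-00c04fd430c8";

const UUID_PATTERN =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[1-8][0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$/i;

/**
 * Hypothetical helper: maps an external ID (for example a negative Telegram
 * chat ID) to a UUID string. Values that already look like UUIDs pass
 * through unchanged apart from lowercasing.
 */
function normalizeExternalId(source: string, externalId: string | number): string {
  const raw = String(externalId).trim();
  if (raw.length === 0) {
    throw new Error("externalId must not be empty");
  }
  if (UUID_PATTERN.test(raw)) {
    return raw.toLowerCase();
  }

  // Version-5 construction: SHA-1 over namespace bytes followed by the name.
  const namespaceBytes = Buffer.from(EXAMPLE_NAMESPACE.replace(/-/g, ""), "hex");
  const bytes = createHash("sha1")
    .update(namespaceBytes)
    .update(`${source}:${raw}`, "utf8")
    .digest()
    .subarray(0, 16);

  bytes.writeUInt8((bytes.readUInt8(6) & 0x0f) | 0x50, 6); // set version 5
  bytes.writeUInt8((bytes.readUInt8(8) & 0x3f) | 0x80, 8); // set RFC 4122 variant

  const hex = bytes.toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20, 32),
  ].join("-");
}
```

Because the mapping is deterministic, `normalizeExternalId("telegram", "-1002129157442")` yields the same UUID on every call and in every plugin, while prefixing with the source keeps identical numeric IDs from different platforms from colliding.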

How should we operationalize model/provider limits (e.g., Groq TPM caps) so agents fail gracefully and predictably rather than mysteriously?

GitHub issue (2025-03-21): “Groq tokens per minute (TPM) limit of 6000” (#4040).
GitHub (recent merged PR #4044): “Groq integration” introduced new provider surface area.

1. Implement unified rate-limit and backoff handling in core runtime, surfaced in UI logs and CLI.

Creates consistent behavior across providers and strengthens reliability, at the cost of core complexity.

2. Handle limits in each provider plugin with documented recommended defaults.

Keeps core lean, but produces inconsistent UX and repeated reinvention across plugins.

3. Document the limits and rely on community best practices without adding runtime logic yet.

Fastest to ship, but violates “seamless UX over feature quantity” and increases support volume.

4. Other / More discussion needed / None of the above.
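
For a sense of scale, unified rate-limit handling does not have to be large. The sketch below tracks token spend in a rolling one-minute window against a tokens-per-minute cap (6000 in the Groq issue above) and computes a capped exponential backoff delay. The class name, method names, and default values are hypothetical, and the rolling-window model is an assumption; providers may account for TPM differently.

```typescript
/** Hypothetical tracker for a tokens-per-minute cap over a rolling window. */
class TpmBudget {
  private spent: Array<{ at: number; tokens: number }> = [];
  private readonly limit: number;
  private readonly now: () => number;
  private readonly windowMs: number;

  constructor(limit: number, now: () => number = () => Date.now(), windowMs = 60_000) {
    this.limit = limit;
    this.now = now;
    this.windowMs = windowMs;
  }

  /** Returns 0 if the request fits now, otherwise the milliseconds to wait. */
  msUntilAvailable(tokens: number): number {
    if (tokens > this.limit) {
      throw new RangeError(`${tokens} tokens can never fit a ${this.limit} TPM cap`);
    }
    const t = this.now();
    // Drop spend that has aged out of the window.
    this.spent = this.spent.filter((entry) => t - entry.at < this.windowMs);
    let used = this.spent.reduce((sum, entry) => sum + entry.tokens, 0);
    if (used + tokens <= this.limit) return 0;
    // Find the oldest entry whose expiry frees enough budget.
    for (const entry of this.spent) {
      used -= entry.tokens;
      if (used + tokens <= this.limit) return entry.at + this.windowMs - t;
    }
    return this.windowMs; // not reached: the guard above ensures a fit once all entries expire
  }

  /** Records tokens actually consumed by a completed request. */
  record(tokens: number): void {
    this.spent.push({ at: this.now(), tokens });
  }
}

/** Capped exponential backoff without jitter, for retries after a rate-limit error. */
function backoffMs(attempt: number, baseMs = 500, maxMs = 30_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}
```

A caller would check `msUntilAvailable` before each model call and either wait or surface a clear "rate limited, retrying in N ms" message to the UI logs and CLI, which is the predictable failure mode the question asks for. Injecting the clock keeps the logic testable without real delays.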

Strategic Architecture: Golang Port vs. Hardening the TypeScript Core

A proposal surfaced to port ElizaOS to Golang for performance; this is a strategic fork decision with high opportunity cost. Council must decide whether performance/throughput is best achieved by a language port, by targeted optimization (Bun/Tauri/WebSockets), or by Cloud-managed infrastructure while keeping the framework developer-friendly.

Does pursuing a Golang port advance our North Star (reliability + developer-friendly) or distract from stabilizing V2 and Cloud launch execution?

GitHub (2025-03-21): “Needs Attention: discussion needed… Golang port of ElizaOS for performance improvements” (#4034).

1. Reject for now: focus on stabilizing TypeScript core and Cloud reliability before any language port.

Protects near-term execution excellence and reduces strategic drift during a critical migration window.

2. Explore as an R&D track: small spike/prototype with strict timebox and measurable performance targets.

Keeps optionality without derailing the roadmap, but still consumes senior attention and review bandwidth.

3. Commit to a dual-runtime strategy: Go core for performance, TS SDK for plugins and DX.

Potentially best long-term throughput, but introduces major coordination risk, interoperability complexity, and community fragmentation.

4. Other / More discussion needed / None of the above.

What is the Council’s preferred performance path in the near term: runtime optimization (Bun/WebSockets), Cloud-managed scaling, or architectural simplification (fewer moving parts)?

Discord (2025-03-18): “Shaw added websocket functionality… enabling direct agent connections to web interfaces.”
Recent PR activity: socket/web client refinements and UI performance work (multiple PRs in 2025-03-20 daily report).

1. Lean into Cloud-managed scaling as the default; optimize framework for local dev and correctness.

Strengthens the platform narrative and reduces local performance pressures, but increases dependency on Cloud readiness.

2. Optimize the existing TS runtime (Bun/WebSockets/DB hot paths) to reduce infra demands everywhere.

Improves self-hosting credibility and open-source strength, but may be slower than scaling via Cloud.

3. Simplify architecture and reduce surface area (fewer clients, fewer modes) until stability is proven.

Maximizes reliability and reduces bug surface, but may frustrate ecosystem experimentation and integrations.

4. Other / More discussion needed / None of the above.

How do we prevent “innovation drift” (new providers/plugins) from eroding reliability—should we formalize an “experimental zone” with stricter stability guarantees for the core?

Recent PRs: “Groq integration” (#4044), “Redpill support” (#4045), “DPSN Plugin” (#4043) alongside beta instability signals.
GitHub activity (2025-03-20 to 2025-03-21): “22 new PRs (16 merged)… 5 new issues” indicates high churn.

1. Yes: create tiered maturity levels (experimental/beta/stable) with gating in registry and docs.

Preserves composability while protecting trust, but requires governance and tooling to enforce tiers.

2. No: keep everything in one stream, but increase automated tests and CI gates across the monorepo.

Avoids ecosystem fragmentation, but demands significant engineering investment and may still allow noisy regressions.

3. Freeze new integrations until post-V2 stabilization window completes.

Maximizes short-term stability, but risks losing momentum and community contributions that expand the ecosystem.

4. Other / More discussion needed / None of the above.
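
As an illustration of what tiered gating could mean in tooling, the sketch below filters registry entries by a minimum maturity level. The tier names come from the first option above; the entry shape, function name, and sample entries are hypothetical and do not describe the current registry format.

```typescript
type Maturity = "experimental" | "beta" | "stable";

// Ordering used for comparisons: higher rank means a stronger stability promise.
const MATURITY_RANK: Record<Maturity, number> = {
  experimental: 0,
  beta: 1,
  stable: 2,
};

// Hypothetical registry entry shape for this sketch.
interface RegistryEntry {
  name: string;
  maturity: Maturity;
}

/** Keeps only entries at or above the requested minimum maturity tier. */
function filterByMaturity(entries: RegistryEntry[], minimum: Maturity): RegistryEntry[] {
  return entries.filter(
    (entry) => MATURITY_RANK[entry.maturity] >= MATURITY_RANK[minimum],
  );
}

// Placeholder entries; names and tiers are invented for illustration.
const exampleRegistry: RegistryEntry[] = [
  { name: "example-provider-a", maturity: "experimental" },
  { name: "example-client-b", maturity: "beta" },
  { name: "example-core-c", maturity: "stable" },
];

// A default install might surface only stable entries,
// while an explicit opt-in could lower the bar to beta.
console.log(filterByMaturity(exampleRegistry, "stable").map((entry) => entry.name));
console.log(filterByMaturity(exampleRegistry, "beta").map((entry) => entry.name));
```

The enforcement cost named in the option lies less in code like this than in deciding who assigns a tier and what evidence (tests, usage, maintainer commitment) is required to promote an entry.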
