Council Briefing

Key Deliberations

Execution Excellence: CLI + Plugin Reliability

Engineering throughput is strong (high merge velocity) and targeted stability work landed (port availability fix, improved plugin installation strategy, clearer plugin command docs), yet field reports show friction in setup paths (Windows, branch confusion, plugin/version mismatches).

Do we freeze v2 feature surface temporarily to prioritize a "green path" installation and first-run experience across platforms?

ElizaOS Daily Update (Apr 6, 2025): "Enhanced the plugin installation strategy" and "Resolved the elizaos port availability issue" (#4202, #4199).
GitHub issue #4191: "Issue when running elizaos start on Windows (Node/NVM v23.3)" (Windows install errors, module import failures).

1. Yes: declare a 2-week stabilization window with strict acceptance criteria (install/start/test) before adding features.

Accelerates trust-building and reduces support load, but may delay v2 feature promises.

2. Partial: continue critical features, but gate merges behind cross-platform CI + reproducible quickstart validation.

Balances momentum with quality, but requires immediate investment in test infrastructure and release discipline.

3. No: maintain current pace; rely on community troubleshooting and incremental patches.

Maximizes shipping velocity, but risks eroding developer confidence and worsening churn from setup failures.

4. Other / More discussion needed / None of the above.
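
For illustration only, the install/start/test acceptance criteria could be checked mechanically as a platform-by-step matrix. The sketch below is an assumption about how such a gate might look, not an existing ElizaOS tool; the type names, `evaluateGate`, and the platform labels are all hypothetical.

```typescript
// Hypothetical release gate: every platform must pass every required step.
type Step = "install" | "start" | "test";

interface SmokeResult {
  platform: string; // e.g. "linux", "windows" (labels are placeholders)
  step: Step;
  passed: boolean;
}

const REQUIRED_STEPS: Step[] = ["install", "start", "test"];

interface GateDecision {
  green: boolean;
  missingOrFailed: string[]; // "platform/step" pairs with no passing result
}

function evaluateGate(platforms: string[], results: SmokeResult[]): GateDecision {
  const problems: string[] = [];
  for (const platform of platforms) {
    for (const step of REQUIRED_STEPS) {
      // A step counts as green only if at least one passing result was recorded;
      // a missing result is treated the same as a failure.
      const passed = results.some(
        (r) => r.platform === platform && r.step === step && r.passed,
      );
      if (!passed) problems.push(`${platform}/${step}`);
    }
  }
  return { green: problems.length === 0, missingOrFailed: problems };
}
```

Treating a missing result as a failure is a deliberate choice here: a platform that was never exercised should not count toward a "green path".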

What is the Council’s desired source-of-truth for v2 compatibility (docs site, monorepo packages, plugin registry), and how visibly should incompatibilities be labeled?

GitHub issue #4164: "only plugins in the /packages directory of the v2-develop branch are fully compatible with v2"; suggestion to remove/mark incompatible plugins.
Discord (2025-04-05): users requested "documentation for plugin registration" and faced confusion finding/using plugins (brownie, 0xCryptoCooker).

1. Make the plugin registry the canonical truth; docs auto-render compatibility badges from registry metadata.

Creates scalable clarity and supports ecosystem composability, but requires registry completeness and governance.

2. Make the v2 monorepo /packages list canonical until the registry is fully reliable; docs mirror it exactly.

Reduces ambiguity immediately, but slows third-party plugin visibility and decentralization goals.

3. Keep docs broad but add prominent v1/v2 labels and a hard warning banner on incompatible pages.

Fastest to implement, but ongoing confusion persists if labels drift from reality.

4. Other / More discussion needed / None of the above.
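
As a rough sketch of what "badges rendered from registry metadata" could mean in practice, the example below derives a label per plugin from a metadata record. The `RegistryEntry` shape, the `supports` field, and the function names are assumptions made for illustration; the real registry schema may look quite different.

```typescript
// Hypothetical registry metadata shape; the real registry schema may differ.
type FrameworkVersion = "v1" | "v2";

interface RegistryEntry {
  name: string;
  // Versions the plugin has been verified against; empty means unverified.
  supports: FrameworkVersion[];
}

type Badge = "compatible" | "incompatible" | "unverified";

function badgeFor(entry: RegistryEntry, target: FrameworkVersion): Badge {
  if (entry.supports.length === 0) return "unverified";
  return entry.supports.some((v) => v === target) ? "compatible" : "incompatible";
}

// Produce one docs line per plugin directly from metadata, so the label
// shown to users is never maintained by hand.
function renderCompatibilityList(
  entries: RegistryEntry[],
  target: FrameworkVersion,
): string[] {
  return entries.map((e) => `${e.name} [${target}: ${badgeFor(e, target)}]`);
}
```

Keeping "unverified" distinct from "incompatible" avoids claiming more than the metadata actually says about a plugin.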

Should we enforce a default plugin baseline (e.g., SQL + OpenAI) at agent creation to prevent common runtime failures, even if it reduces minimalism?

Discord (2025-04-03): px: "getTasks() is part of the sqlplugin, which is required but not installed by default"; errors: "Cannot read properties of undefined (reading 'init')".
Recent work: plugin install management and CLI improvements (e.g., #4202, #4185, #4196).

1. Yes: ship a safe default baseline and warn on removal; optimize DX and reliability first.

Shrinks support burden and increases successful first runs, but constrains ultra-minimal deployments.

2. Hybrid: baseline defaults in templates/GUI, but keep CLI advanced mode fully explicit.

Serves both newcomers and power users, but increases surface area for documentation and testing.

3. No: keep everything explicit; fix errors via better messaging and docs rather than defaults.

Preserves composability philosophy, but prolongs early-user failure rates.

4. Other / More discussion needed / None of the above.
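
The hybrid option can be pictured as a single resolution step with two modes. The sketch below is a minimal illustration under stated assumptions: the plugin ids `"sql"` and `"openai"` are placeholders rather than real package names, and `resolvePlugins` is a hypothetical helper, not part of the ElizaOS CLI.

```typescript
// Placeholder plugin ids for illustration; real package names may differ.
const BASELINE_PLUGINS: string[] = ["sql", "openai"];

interface PluginResolution {
  plugins: string[];
  warnings: string[];
}

function resolvePlugins(requested: string[], explicitMode: boolean): PluginResolution {
  const missing = BASELINE_PLUGINS.filter((id) => requested.indexOf(id) === -1);
  if (explicitMode) {
    // Advanced mode: change nothing, but say up front what is likely to break.
    return {
      plugins: requested.slice(),
      warnings: missing.map((id) => `baseline plugin "${id}" is not installed`),
    };
  }
  // Default mode: add whatever is missing ahead of the user's own plugins.
  return { plugins: missing.concat(requested), warnings: [] };
}
```

In default mode a request for a single platform plugin comes back with the baseline added; in explicit mode the same request is returned untouched, with one warning per missing baseline plugin, in place of a later "Cannot read properties of undefined" failure at runtime.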

Cross-Platform Social Integrations: Twitter/Telegram Readiness

Twitter remains the highest-friction integration for v2 (client non-functional while plugin works), creating reputational risk for flagship agents and launch readiness; Telegram is improving (buttons support) but needs coherent documentation and consistent behavior across platforms.

What is the Council’s threshold for declaring v2 “social-ready” given Twitter client instability—do we block releases on Twitter parity or ship with explicit constraints?

Discord (2025-04-05): jin/SpartanDev: "Is client Twitter working with v2 right now? No, only the plugin is working currently."
PRs: #4167 "Failed to create Twitter client"; #4192 "fix: twitter interaction" (stability improvements but not full client parity).

1. Block: no “social-ready” claim until the Twitter client works end-to-end with validated templates, intervals, and mentions.

Protects trust-through-shipping, but delays public demos that drive ecosystem growth.

2. Ship with constraints: label Twitter client as “beta/limited” and publish a known-issues matrix plus workarounds.

Preserves momentum while setting expectations, but requires disciplined comms and rapid iteration.

3. Deprioritize Twitter: shift to other platforms (Discord/Telegram/Farcaster) and treat Twitter as optional.

Reduces dependency on volatile APIs, but may weaken flagship visibility and market narrative.

4. Other / More discussion needed / None of the above.

How should we handle provider fragility (Anthropic rate limits/embedding handler errors) to protect reliability without forcing a single vendor?

Discord (2025-04-05): users hit Anthropic errors; Abderahman: "Switch to OpenAI"; reported: "No handler found for delegate type: TEXT_EMBEDDING".
Operational pattern: users self-mitigate via provider swapping rather than first-class fallback behavior.

1. Implement automatic provider fallback policies (embeddings + chat) with clear observability and circuit breakers.

Improves uptime and DX, but increases complexity and may obscure cost/behavior differences.

2. Offer a recommended “golden path” provider set for production (e.g., OpenAI for embeddings) while keeping others opt-in.

Simplifies reliability guidance and docs, but can be perceived as vendor preference.

3. Keep provider choice fully manual; focus on better error messages and troubleshooting docs.

Lowest engineering overhead, but leaves reliability as an end-user burden.

4. Other / More discussion needed / None of the above.
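
To make the first option concrete, the sketch below shows one possible shape for ordered provider fallback guarded by a simple circuit breaker. It is an illustration under assumptions, not the ElizaOS runtime: every name is hypothetical, provider calls are modeled as synchronous functions to keep the example short (real calls would be asynchronous), and the clock is passed in as `now` so the behavior is deterministic.

```typescript
// Hypothetical circuit breaker: opens after N consecutive failures and
// lets traffic through again once a cooldown has elapsed.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly maxFailures: number;
  private readonly cooldownMs: number;

  constructor(maxFailures: number, cooldownMs: number) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
  }

  allows(now: number): boolean {
    return this.openedAt === null || now - this.openedAt >= this.cooldownMs;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  recordFailure(now: number): void {
    this.failures += 1;
    if (this.failures >= this.maxFailures) this.openedAt = now;
  }
}

interface ProviderSlot<T> {
  name: string;
  call: () => T; // stand-in for a real (asynchronous) provider request
  breaker: CircuitBreaker;
}

interface FallbackResult<T> {
  value: T;
  provider: string; // which provider actually answered
  skipped: string[]; // why earlier providers were passed over
}

function runWithFallback<T>(providers: ProviderSlot<T>[], now: number): FallbackResult<T> {
  const skipped: string[] = [];
  for (const p of providers) {
    if (!p.breaker.allows(now)) {
      skipped.push(`${p.name}: circuit open`);
      continue;
    }
    try {
      const value = p.call();
      p.breaker.recordSuccess();
      return { value, provider: p.name, skipped };
    } catch (err) {
      p.breaker.recordFailure(now);
      skipped.push(`${p.name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All providers failed: ${skipped.join("; ")}`);
}
```

Returning `provider` and `skipped` alongside the value is what keeps the fallback observable: a caller can log which vendor answered and why the others were bypassed, which addresses the concern that automatic fallback hides cost and behavior differences.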

Should Telegram's recent feature work (e.g., buttons) set the reference standard for "agent UX" across platforms via a unified interaction schema?

Dev Discord (2025-04-04): PR #4187 adds Telegram buttons; discussion of generic buttons design (platform-agnostic).
Project principle alignment: "Open & Composable" implies shared interaction primitives across clients.

1. Yes: define platform-agnostic interaction primitives (buttons/forms) in core, with per-client renderers.

Strengthens composability and consistent UX, but requires coordination across client maintainers.

2. Partial: prototype on Telegram first, then formalize in core only after usage proves value.

Reduces premature abstraction risk, but delays cross-platform coherence.

3. No: allow each platform to evolve independently; prioritize fastest local wins.

Speeds short-term delivery, but increases fragmentation and documentation burden.

4. Other / More discussion needed / None of the above.
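
One way to picture "primitives in core, renderers per client" is a single message description that each client translates into its own payload. The sketch below is an assumption about how that could look, not the design from PR #4187. The interface and renderer names are hypothetical; the Telegram payload follows the `inline_keyboard` / `callback_data` shape of the Telegram Bot API, and the plain-text renderer stands in for any client without native buttons.

```typescript
// Hypothetical platform-agnostic primitive that core would own.
interface ButtonSpec {
  label: string; // text shown to the user
  action: string; // opaque id routed back to the agent when pressed
}

interface InteractionMessage {
  text: string;
  buttons: ButtonSpec[];
}

// Each client supplies a renderer that maps the primitive to its own payload.
interface ClientRenderer<TPayload> {
  platform: string;
  render(msg: InteractionMessage): TPayload;
}

interface TelegramPayload {
  text: string;
  reply_markup: { inline_keyboard: { text: string; callback_data: string }[][] };
}

const telegramRenderer: ClientRenderer<TelegramPayload> = {
  platform: "telegram",
  render(msg) {
    return {
      text: msg.text,
      // One button per row keeps the layout predictable on narrow screens.
      reply_markup: {
        inline_keyboard: msg.buttons.map((b) => [{ text: b.label, callback_data: b.action }]),
      },
    };
  },
};

// Fallback for clients with no button support: list the choices as numbered text.
const plainTextRenderer: ClientRenderer<string> = {
  platform: "plain-text",
  render(msg) {
    const options = msg.buttons.map((b, i) => `${i + 1}) ${b.label}`);
    return [msg.text].concat(options).join("\n");
  },
};
```

The agent only ever produces an `InteractionMessage`; adding a platform means adding a renderer, not changing agent logic.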

Trust Through Shipping: Comms, Security, and Ecosystem Confidence

Community sentiment is stressed by token drawdown and launchpad uncertainty, while scams and operational confusion degrade trust; simultaneously, there is genuine excitement around v2 swarm/MCP capabilities and new projects—suggesting a need for tighter narrative discipline and security posture.

What is the Council’s stance on launch communications: do we lead with near-term shipping proofs (stability + docs) or with the longer arc (swarm tech, bazaar, agent commerce) to restore confidence?

Discord (2025-04-05): token down ~50% in a week; debate: "use cases vs marketing"; HoneyBadger: launchpad in ~10 days (Apr 14).
Discord (2025-04-05): jin: v2 includes "swarm tech" and project-manager agents that keep others (and humans) in check.

1. Lead with proof: publish a reliability scorecard, fixed-issues list, and crisp v1→v2 migration guide before visionary narratives.

Reinforces execution excellence and rebuilds builder trust, but may undersell strategic ambition.

2. Dual-track: pair every visionary claim (swarm/bazaar) with a demo repo and a timeline with owners and dates.

Balances inspiration and credibility, but requires disciplined program management and demo maintenance.

3. Lead with vision: market the decentralized agent economy narrative aggressively; let engineering catch up.

May improve attention and liquidity narratives short-term, but risks reputational damage if experience lags.

4. Other / More discussion needed / None of the above.

How aggressively should we harden community security controls (link posting restrictions, verification flows) at the cost of openness and virality?

Discord (2025-04-04): "Multiple scam attempts"; suggestion: "disable posting links except for team and moderators" (Osint).
Discord (2025-04-05): jin warned a user not to share a 2FA QR code publicly (operational security incident).

1. High lockdown: restrict links, enforce verified roles for sensitive channels, and add automated scam detection.

Reduces exploit surface and protects newcomers, but may slow community growth and peer support.

2. Balanced: restrict links only in high-risk channels and implement clear verification + education playbooks.

Maintains openness where safe while reducing common scam vectors, but requires moderator coordination.

3. Minimal: keep policies light; rely on community warnings and reactive moderation.

Preserves frictionless engagement, but increases the probability of high-impact incidents.

4. Other / More discussion needed / None of the above.

Given the Spartan leadership transition and pending launch presence, should we prioritize “flagship agent uptime + posting reliability” as a governance KPI for ecosystem trust?

Discord (2025-04-05): Odilitime: interim PM after Rhota departure; Spartan v2 underway; new X account: https://x.com/SpartanVersus; goal: "Get v2 tweeting" before release.
Discord (2025-04-05): widespread Twitter client dysfunction and deployment issues threaten public-facing reliability.

1. Yes: adopt flagship reliability KPIs (uptime, posting success rate, response latency) and publish them.

Transforms trust into measurable governance, but exposes failures publicly and pressures teams.

2. Internal-only: track KPIs privately to drive engineering priorities without public commitments yet.

Improves operations while limiting reputational risk, but provides less external reassurance.

3. No: flagships are showcases, not guarantees; focus KPIs on framework stability and DX instead.

Keeps focus on the platform, but may miss the narrative value of reliable public agents.

4. Other / More discussion needed / None of the above.
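
Whether published or tracked internally, the three KPIs named in the first option need a precise definition before they can be compared week to week. The sketch below shows one possible set of definitions. The record shapes and function names are hypothetical, uptime is assumed to be the fraction of passing health checks, and latency is summarized as a nearest-rank 95th percentile over all posting attempts.

```typescript
// Hypothetical record of one posting attempt by a flagship agent.
interface PostAttempt {
  succeeded: boolean;
  latencyMs: number;
}

interface FlagshipKpis {
  uptime: number | null; // fraction of health checks that passed
  postingSuccessRate: number | null; // fraction of attempts that succeeded
  p95LatencyMs: number | null; // nearest-rank 95th percentile
}

// Fraction of true values; null when there is nothing to measure.
function fraction(flags: boolean[]): number | null {
  if (flags.length === 0) return null;
  return flags.filter((f) => f).length / flags.length;
}

// Nearest-rank percentile: the smallest sample with at least p% of all
// samples at or below it.
function percentile(values: number[], p: number): number | null {
  if (values.length === 0) return null;
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

function computeKpis(healthChecks: boolean[], attempts: PostAttempt[]): FlagshipKpis {
  return {
    uptime: fraction(healthChecks),
    postingSuccessRate: fraction(attempts.map((a) => a.succeeded)),
    p95LatencyMs: percentile(attempts.map((a) => a.latencyMs), 95),
  };
}
```

Returning `null` for an empty window, instead of 0 or 100%, keeps a period with no data from reading as either an outage or a perfect score.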
