Council Briefing

Key Deliberations

V2 Transition Stability & Developer Trust

Engineering throughput is high (20 of 25 PRs merged over two days, with multiple critical fixes landed), yet user sentiment shows friction: V2 architectural shifts, documented endpoints that return 404, and inconsistent auth prompts are eroding confidence during migration.

Do we declare a single “supported migration lane” (V1 stable vs V2 beta) with strict guarantees, or continue parallel experimentation at the cost of confusion?

Discord (2025-04-07): “transition period between ElizaOS v1 and v2, with incomplete plugin migration causing confusion.”
Discord (2025-04-08 coders): “Some users found v1 more functional than v2 for certain implementations.”

1. Freeze V2 scope and publish a “V2 Beta Contract” (what works, what doesn’t), while routing most builders to V1 for production.

Reduces churn and restores trust, but slows V2 feature pressure and partner timelines.

2. Declare V2 as the primary lane immediately and provide aggressive migration tooling and daily hotfix cadence.

Accelerates convergence but risks high-profile failures that damage DX reputation.

3. Maintain dual-lane strategy with a compatibility runtime and automated plugin-coverage reporting as the guardrail.

Preserves innovation while making gaps measurable; requires disciplined reporting and CI investment.

4. Other / More discussion needed / None of the above.

Which reliability issues must be treated as “Council Blockers” before any major launch communications (Cloud/launchpad/partners) proceed?

GitHub PRs summary (2025-04-08): “Fixed GitHub authentication prompt during CLI start command (PR #4242).”
Discord Action Item (2025-04-08): “Address API endpoint 404 error for /api/agents/:agentId/message despite documentation (Newt).”

1. Block on: API correctness (no documented 404s), auth sanity (no surprise GitHub token prompts), and Twitter interactions stability.

Optimizes for end-to-end agent operability and credibility in docs.

2. Block on: CLI/GUI “first-run success” only (create → start → message), defer Twitter and advanced endpoints.

Improves onboarding quickly but leaves flagship social agents brittle.

3. Block only on: Crashers and data-loss bugs; ship everything else with known-issues list.

Maximizes velocity but risks accumulating “paper cuts” that suppress adoption.

4. Other / More discussion needed / None of the above.

How do we close the documentation-reality gap without slowing merges—what is the Council’s preferred enforcement mechanism?

Discord Action Item (2025-04-08 coders): “Update documentation to match actual code structure (directory discrepancy noted) (jonathanmann).”
Discord (2025-04-08 dev): “Document the architectural changes from V1 to V2 for custom client developers (standard).”

1. Adopt “Docs-as-a-Gate”: any PR affecting UX/API must include doc updates or a tracked doc issue.

Raises reliability and trust, but increases PR friction and review load.

2. Establish a Documentation Strike Team (weekly) that triages deltas and ships doc patches independently.

Maintains dev velocity while improving docs, but needs sustained staffing and prioritization.

3. Automate doc drift detection (route snapshots, CLI help snapshots, API schema) and file issues automatically.

Scales governance of truth, but requires upfront tooling and ongoing maintenance.

4. Other / More discussion needed / None of the above.
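
To make the automated drift-detection option more concrete, the following is a minimal sketch of a route-snapshot comparison. It assumes two inputs that would have to be produced elsewhere: a list of routes extracted from the docs and a list of routes the server actually registers. The function name, the snapshot format, and every sample route except /api/agents/:agentId/message (the endpoint reported as returning 404) are hypothetical and not taken from the ElizaOS codebase.

```python
def detect_route_drift(documented, implemented):
    """Compare documented routes with the routes the server registers.

    Both arguments are iterables of strings such as "POST /api/x".
    Returns the two kinds of drift as sorted lists.
    """
    documented, implemented = set(documented), set(implemented)
    return {
        # In the docs, but the server has no such route (a "documented 404").
        "documented_but_missing": sorted(documented - implemented),
        # Served by the code, but never mentioned in the docs.
        "implemented_but_undocumented": sorted(implemented - documented),
    }


# Hypothetical snapshots; a real check would generate these from the docs
# source and from the running server's route table.
documented_routes = [
    "GET /api/agents",
    "POST /api/agents/:agentId/message",
]
implemented_routes = [
    "GET /api/agents",
    "GET /api/agents/:agentId",
]

drift = detect_route_drift(documented_routes, implemented_routes)
for kind, routes in drift.items():
    for route in routes:
        print(f"{kind}: {route}")
```

A CI job could run a check of this shape on every merge and open an issue for each entry, which keeps the gate out of the PR path while still making every gap visible.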

Social Surface Reliability (Twitter/X) as Flagship Signal

Twitter/X remains the most visible proving ground for agent reliability; repeated reports of posting, reply, and quote failures and of agents not adhering to their character definitions create reputational risk. Recent fixes have landed (e.g., the tweet reply crash), but the community still experiences instability across versions.

Should we standardize on an API-based Twitter client (paid access) as the “official path,” or continue supporting scraping-based access for openness?

Discord help (2025-04-08): “shared a custom Twitter client using API access instead of scraping to avoid account bans” (notorious_d_e_v).
Dev Discord (2025-04-06): “Multiple users reported problems with Twitter agents not tweeting despite proper setup… Issues span both Eliza v0.25.9 and v2 (beta).”

1. Officially endorse API v2 client only; document required tiers and provide a clean setup wizard.

Maximizes reliability and reduces bans, but raises cost barrier for indie builders.

2. Support both: API as default, scraping as “best-effort community mode” with clear warnings.

Preserves openness while protecting brand; increases maintenance surface.

3. Deprioritize Twitter as a core integration and focus on more stable platforms until V2 stabilizes.

Reduces immediate pain but forfeits a key public trust channel and growth lever.

4. Other / More discussion needed / None of the above.

What is the Council’s definition of “flagship social reliability” for agents (posting, replying, media, mention-handling), and what telemetry proves it?

GitHub issue (2025-04-08): “Provider Data Not Used When Posting to Twitter” (#4224).
GitHub PR (2025-04-08): “Fixed issue with replying to tweets in interactions” (PR #4231, related to #4226).

1Minimum bar: 99% success for post+reply pipelines (including mentions) measured via built-in instrumentation and retries.

Strong trust signal; requires robust observability, backoff, and queueing.

2Minimum bar: deterministic behavior (character adherence and correct action selection) even if delivery reliability is lower.

Improves perceived intelligence, but continued delivery failures still harm brand.

3Minimum bar: “doesn’t crash” and “posts sometimes”; treat everything else as advanced configuration.

Fast to achieve but risks locking ElizaOS into a low-expectation narrative.

4Other / More discussion needed / None of the above.
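
As an illustration of how the 99% bar in the first option might be measured, here is a rough sketch that pairs retries with exponential backoff and computes a success rate over recorded outcomes. All names (send_with_retries, success_rate, meets_bar) and the default values are assumptions made for this example; none of them come from the ElizaOS runtime or the Twitter client.

```python
import time


def send_with_retries(send, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call send() until it succeeds or attempts run out.

    Waits base_delay, then 2x, then 4x ... between attempts.
    Returns (succeeded, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            send()
            return True, attempt
        except Exception:
            # Back off only if another attempt is still allowed.
            if attempt < max_attempts:
                sleep(base_delay * 2 ** (attempt - 1))
    return False, max_attempts


def success_rate(outcomes):
    """Fraction of True values in a list of per-delivery outcomes."""
    if not outcomes:
        return 0.0
    return sum(1 for ok in outcomes if ok) / len(outcomes)


def meets_bar(outcomes, threshold=0.99):
    """True when the measured success rate reaches the threshold."""
    return success_rate(outcomes) >= threshold
```

Whether a delivery that succeeds only on its third attempt counts as a success is itself a policy choice; this sketch counts it as one, which is the more lenient reading of the option.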

Where should responsibility sit for social reliability: core runtime, plugin maintainers, or a dedicated ‘Platform Integrations’ squad?

Discord (2025-04-07 dev): “plugins are still being migrated in the V2 beta which may affect Twitter functionality” (Nisita).
GitHub activity (2025-04-08 to 2025-04-10): “25 new PRs, 20 merged… strong contributor engagement.”

1. Core runtime owns reliability patterns (retries, queues, idempotency); plugins only implement platform specifics.

Creates consistent behavior across platforms, but requires careful core design.

2. Plugin maintainers own end-to-end behavior; core stays minimal and composable.

Preserves modularity, but produces uneven quality and slower trust recovery.

3. Create a dedicated Integrations squad to harden top platforms (X, Discord, Telegram) with SLAs and test harnesses.

Improves flagship reliability quickly, but consumes scarce high-context engineering time.

4. Other / More discussion needed / None of the above.
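
The split described in the first option, where the core owns idempotency and a plugin supplies only the platform call, could look roughly like the sketch below. IdempotentDispatcher and its methods are hypothetical names for illustration; the in-memory dictionary stands in for whatever durable store a real runtime would need.

```python
class IdempotentDispatcher:
    """Core-side wrapper that delivers each keyed action at most once.

    `deliver` is the platform-specific callable a plugin would supply.
    """

    def __init__(self, deliver):
        self._deliver = deliver
        self._results = {}  # idempotency key -> result of first delivery

    def dispatch(self, key, payload):
        # A repeated key (e.g. after a retry or restart replay) returns the
        # stored result instead of posting a second time.
        if key in self._results:
            return self._results[key]
        result = self._deliver(payload)
        self._results[key] = result
        return result


# Usage with a stand-in for a plugin's post function.
sent = []

def fake_post(payload):
    sent.append(payload)
    return f"posted:{len(sent)}"

dispatcher = IdempotentDispatcher(fake_post)
first = dispatcher.dispatch("reply-to-123", "hello")
second = dispatcher.dispatch("reply-to-123", "hello")  # duplicate, not re-sent
```

Because a failed delivery raises before the key is stored, the same key can be retried safely, so this wrapper composes with a retry loop owned by the core.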

Governance & Information Ops (DAO Reboot + Reputation Engine)

A proposed ElizaDAO “Supermind” reboot and a passive contribution/reputation system could convert scattered community energy into measurable progress—if incentives and privacy expectations are made explicit and aligned with shipping trust.

Should the reputation system launch first as an internal Council instrument (signal extraction), or as a community-facing rewards mechanism?

DAO-org (2025-04-08): “Jin is developing a reputation/contribution measurement system… passive monitoring… with token and non-monetary rewards.”
DAO-org (2025-04-08): “Can the reputation system be tested in the working group before DAO-wide rollout? Jin confirmed it could be used for early feedback.”

1. Internal-first: use it to prioritize issues/PRs and detect support hot spots before rewarding anyone.

Reduces governance risk and calibrates metrics, but delays community excitement.

2. Public beta with opt-in and clear data boundaries; rewards start small (badges/roles) before tokens.

Builds engagement while containing downside; requires careful comms and moderation.

3. Full launch with token incentives immediately to jump-start participation.

Fast activation, but high risk of gaming, backlash, and misaligned behavior.

4. Other / More discussion needed / None of the above.

How should we structure the DAO working circles to maximize execution rather than bureaucracy?

DAO-org (2025-04-08): “Vincent Paul introduced… working circles including Communications, Community & Governance, Development, Documentation, Partnerships, and Events.”
DAO-org (2025-04-08): “discussed potentially consolidating some working circles to prevent spreading resources too thin.”

1. Consolidate into 3 execution pods: Build (Dev+Docs), Grow (Comms+Partnerships+Events), Govern (Ops+Reputation).

Reduces overhead and clarifies ownership; may disappoint niche constituencies.

2. Keep 6 circles but impose quarterly KPIs and a Council-appointed coordinator per circle.

Retains specialization while forcing accountability; requires strong coordinators.

3. Run time-boxed “missions” only (2–4 weeks), dissolving circles after deliverables ship.

Optimizes for shipping and momentum, but risks loss of continuity and institutional memory.

4. Other / More discussion needed / None of the above.

What is our Council stance on privacy and consent for cross-platform contribution monitoring (Discord/GitHub/sentiment)?

DAO-org (2025-04-08): “passively monitors engagement across channels, analyzes sentiment, and scores GitHub contributions.”
Coders (2025-04-08): “Clarify why GitHub token is needed and if users can opt out (jonathanmann).”

1. Strict opt-in with transparent dashboards showing exactly what data is collected and how it is scored.

Maximizes trust and legitimacy, but reduces dataset coverage and metric power.

2. Default-on for public data (GitHub/public Discord messages) with a clear opt-out and minimal retention.

Balances utility and consent, but must be communicated carefully to avoid backlash.

3. Operate only on aggregate, anonymized metrics; no individual scores until explicit consent is obtained.

Safest posture for community trust, but weakens incentive mechanics and personalization.

4. Other / More discussion needed / None of the above.
