Council Briefing

Key Deliberations

Reliability Gate: Build & Runtime Stability

Despite rapid fixes landing, recurrent build failures (notably Zod conflicts) and runtime defects (Bedrock provider, agent assignment/stop crashes) remain the highest risk to developer trust and our execution-excellence directive.

Do we declare a temporary stability freeze (merge gate) until Zod/build failures and agent lifecycle crashes are neutralized?

GitHub issue triage (2025-02-06): "Build failures due to Zod dependency issues" (#3300).
GitHub issue triage (2025-02-06): "Potential crashes during stop operations" (#3302) and "Inability to assign agents correctly" (#3303).

1Yes—initiate a stability freeze with a strict merge gate for core/runtime until green CI + verified repro fixes ship.

Maximizes trust-through-shipping, but slows feature velocity and partner-facing deliverables.

2Partial—freeze only dependency/toolchain and agent lifecycle areas; allow low-risk docs and isolated plugin work.

Balances momentum with risk containment, but requires clear ownership and enforcement boundaries.

3No—continue parallel development and rely on rapid patch cadence to outpace breakages.

Preserves velocity, but compounds perceived instability and increases support burden in Discord/GitHub.

4Other / More discussion needed / None of the above.

What is our canonical definition of “reliable” for this cycle (the pass/fail bar before we claim readiness publicly)?

Discord coders (2025-02-05): multiple users report being stuck at "Initializing LlamaService..." even when using other providers.
GitHub issue triage (2025-02-06): "Amazon Bedrock model provider is not functioning as expected" (#3328).

1Runtime reliability: clean install + default agent start succeeds across top 3 environments (Mac/Linux/Windows-WSL) and top 3 providers (OpenAI-like, Anthropic, local) with no manual DB deletion steps.

Strong DX bar; may surface more work immediately but yields durable credibility.

2CI reliability: main branch is consistently green and packages publish; runtime issues handled via troubleshooting docs until next minor release.

Optimizes engineering throughput, but pushes friction onto developers and community support.

3Outcome reliability: prioritize flagship/Cloud paths; self-host stability can lag as long as Cloud path works smoothly.

Accelerates Cloud adoption, but risks alienating open-source builders and composability partners.

4Other / More discussion needed / None of the above.

How should we handle provider regressions (e.g., Bedrock) in an open, composable ecosystem—core responsibility or plugin responsibility?

GitHub issue triage (2025-02-06): "Amazon Bedrock model provider is not functioning" (#3328).
Daily engineering summary (2025-02-06): "Removed the verifiable inference concept, transitioning it to a plugin-based approach" (PR #3344).

1Core owns provider reliability for a defined set of “Tier-1” providers; everything else is community/plugin-tier.

Clarifies expectations and prioritization, but requires ongoing core maintenance for those providers.

2All providers become plugins with a published compatibility contract; core only maintains the contract + test harness.

Maximizes modularity, but demands strong tooling and may initially increase breakages during transition.

3Hybrid: keep provider adapters in core until the plugin ecosystem matures, then graduate them out over time.

Reduces immediate disruption but risks prolonged ambiguity and duplicated implementations.

4Other / More discussion needed / None of the above.

Plugin Architecture Shift: Dynamic Loading & Repo Split

Dynamic plugin loading has landed, alongside large structural moves (including removing/relocating plugins) that promise composability but can fracture developer experience unless packaging, discovery, and upgrade paths are crystal clear.

What is the Council’s preferred end-state for plugins: monorepo convenience or federated ecosystem—with what migration timeline?

Daily report (2025-02-05): "Dynamic Plugin Loading implementation (PR #3339)."
Daily report (2025-02-05): "Deleted all plugins (PR #3342)" and "Removed plugin imports from agent (PR #3346)."

1Federated-first now: plugins live in dedicated org/repos; core remains minimal; invest immediately in registry + versioning.

Accelerates ecosystem scale and composability, but increases short-term DX fragmentation risk.

2Dual-track: keep a curated “core plugin bundle” versioned with the framework while allowing the broader registry to evolve independently.

Protects newcomers and Cloud onboarding while still enabling an open plugin economy.

3Monorepo-first for another release cycle: stabilize installs/builds first, then federate after reliability targets are met.

Reduces immediate chaos, but delays ecosystem decentralization and increases core maintenance load.

4Other / More discussion needed / None of the above.

How do we prevent plugin configuration confusion from becoming the dominant support burden (and reputational leak)?

Discord coders (2025-02-05): users struggling with plugin configuration; Jox: JSON uses "plugins": ["<@1300745997625982977>os/plugin-coinmarketcap"].
Discord coders (2025-02-05): request to "Add plugin registry to select/install only needed plugins" (mentioned by elizaos-bridge-odi).

1Ship a first-class plugin registry UX in CLI + Cloud (browse, install, verify, pin versions) with opinionated defaults.

Transforms support into product; requires focused engineering and coordination across repos.

2Codify configuration recipes in docs + templates (JSON/TS) and accept that advanced users will customize manually.

Fast to execute, but scales support costs and increases misconfiguration-induced bug reports.

3Restrict plugin usage in the default path (minimal plugins by default) and require explicit opt-in with warnings.

Improves baseline stability, but may reduce “wow factor” and slow experimentation.

4Other / More discussion needed / None of the above.

What enforcement mechanism should govern plugin quality (tests, security, compatibility) without compromising openness?

Daily report (2025-02-05): "Test setup and coverage improvements" across multiple plugins (e.g., plugin-cronos, plugin-conflux).
Daily report (2025-02-05): "Updated vitest dependency for security" (PR #3254).

1Registry admission requires automated checks: minimal test suite, security scan, and a compatibility manifest against core versions.

Raises ecosystem quality and trust, but increases contribution friction and review load.

2Two-tier registry: “Verified” plugins meet strict gates; “Community” plugins are listed with warnings and telemetry-based reputation.

Preserves openness while guiding users toward stable choices.

3No gates—market-driven curation; rely on stars/usage and community discussion to surface quality.

Fast growth, but higher risk of breakage/security incidents that damage the framework’s brand.

4Other / More discussion needed / None of the above.

Trust Through Shipping: Roadmap, Support Load, and Signal Clarity

Operational velocity is high (surging PR volume and contributors), but user experience is dominated by troubleshooting; to meet the mission, we must convert raw activity into clear roadmaps, stable releases, and self-serve knowledge pathways.

What should be the Council’s primary trust signal to the developer galaxy over the next release window: stability metrics, documentation completeness, or flagship demos?

GitHub activity update (Feb 2025): "39 new pull requests (24 merged)… 105 active contributors" (Feb 6-7).
Discord partners (2025-02-05): accelxr: "A roadmap is being developed… targeted for next week."

1Stability metrics first: publish install success rate, CI health, and known-issues burn-down as the primary public scoreboard.

Aligns with execution excellence; may temporarily dampen hype but builds long-term credibility.

2Documentation + troubleshooting first: turn top Discord pain points into official guides and reduce support dependency.

Improves DX quickly and lowers ops load; may not fully counteract instability perception without metrics.

3Flagship demos first: ship polished reference agents and shows as proof of capability, then harden underneath.

Maximizes attention and partner excitement, but risks backlash if builders can’t reproduce results.

4Other / More discussion needed / None of the above.

How do we operationalize “Taming Information” so Discord troubleshooting turns into structured, searchable knowledge with minimal human overhead?

Discord (2025-02-04): "1300+ files processed" to improve documentation and LLM accuracy (jin).
Discord coders (2025-02-05): recurring fixes involve deleting db.sqlite / agent/data folders to resolve setup issues.

1Deploy an automated pipeline: Discord → tagged FAQ entries → docs PR drafts → weekly council digest, with human review only at merge.

Scales knowledge capture and reduces repeat questions, but needs careful governance to avoid misinformation.

2Lightweight approach: pin the top 20 issues in Discord and link to a living troubleshooting page; expand gradually.

Fast and low risk, but may not keep pace with ecosystem growth and version churn.

3Delegate to community stewards with token incentives; keep core team focused on code and releases.

Preserves engineering capacity, but quality and consistency may vary without strong editorial control.

4Other / More discussion needed / None of the above.

Given the support and reliability burden, what communications posture should we adopt for near-term launches (Cloud, token work, launchpad) to preserve trust?

Discord partners (2025-02-05): accelxr: "Launchpad is 95% complete" and "Tokenomics… 95% complete" with plans to release together.
Daily report (2025-02-06): lists multiple urgent runtime issues requiring immediate attention (#3302, #3303).

1Conservative: delay bold launch messaging until stability gates are met; communicate progress with explicit known-issues and timelines.

Strengthens trust-through-shipping, but may reduce short-term momentum and partner excitement.

2Split-track: proceed with launchpad/partner announcements while clearly labeling framework versions as alpha/beta and steering builders to curated paths.

Maintains momentum with contained risk, assuming messaging discipline and clear “supported paths.”

3Aggressive: push launch messaging now to capture narrative advantage; patch issues post-launch via rapid releases.

Maximizes hype and market timing, but risks reputational damage if onboarding fails at scale.

4Other / More discussion needed / None of the above.

North Star & Strategic Context

Key Deliberations