Council Briefing

Strategic Deliberation
North Star & Strategic Context

This file combines the overall project mission (North Star) and summaries of key strategic documents for use in AI prompts, particularly for the AI Agent Council context generation.

Last Updated: December 2025

---

North Star: To build the most reliable, developer-friendly open-source AI agent framework and cloud platform—enabling builders worldwide to deploy autonomous agents that work seamlessly across chains and platforms. We create infrastructure where agents and humans collaborate, forming the foundation for a decentralized AI economy that accelerates the path toward beneficial AGI.

---

Core Principles:
  1. **Execution Excellence** - Reliability and seamless UX over feature quantity
  2. **Developer First** - Great DX attracts builders; builders create ecosystem value
  3. **Open & Composable** - Multi-agent systems that interoperate across platforms
  4. **Trust Through Shipping** - Build community confidence through consistent delivery

---

Current Product Focus (Dec 2025):
  • **ElizaOS Framework** (v1.6.x) - The core TypeScript toolkit for building persistent, interoperable agents
  • **ElizaOS Cloud** - Managed deployment platform with integrated storage and cross-chain capabilities
  • **Flagship Agents** - Reference implementations (Eli5, Otaku) demonstrating platform capabilities
  • **Cross-Chain Infrastructure** - Native support for multi-chain agent operations via Jeju/x402


    ---

    ElizaOS Mission Summary: ElizaOS is an open-source "operating system for AI agents" aimed at decentralizing AI development. Built on three pillars: 1) The Eliza Framework (TypeScript toolkit for persistent agents), 2) AI-Enhanced Governance (building toward autonomous DAOs), and 3) Eliza Labs (R&D driving cloud, cross-chain, and multi-agent capabilities). The native token coordinates the ecosystem. The vision is an intelligent internet built on open protocols and collaboration.

    ---

    Taming Information Summary: Addresses the challenge of information scattered across platforms (Discord, GitHub, X). Uses AI agents as "bridges" to collect, wrangle (summarize/tag), and distribute information in various formats (JSON, MD, RSS, dashboards, council episodes). Treats documentation as a first-class citizen to empower AI assistants and streamline community operations.
    Daily Strategic Focus
    The fleet surged in plugin and infrastructure throughput, but the Council’s strategic bottleneck remains reliability/DX: model selection, database startup, and install friction are eroding trust faster than new capabilities can redeem it.
    Monthly Goal
    December 2025: Execution excellence—complete token migration with high success rate, launch ElizaOS Cloud, stabilize flagship agents, and build developer trust through reliability and clear documentation.

    Key Deliberations

    Reliability & DX Triage (Config, DB, Install)
    Operational chatter indicates recurring failures in basic onboarding paths: model selection flags are ignored, SQLite/Supabase adapters fail unpredictably (notably with the node plugin), and package install/start failures continue to spawn new issues. All three conflict directly with the execution-excellence directive.
    Q1
    Which reliability defect should be declared a Priority-0 “ship-stopper” for the next release train to protect developer trust?
    • Discord (2025-01-20, coders): Users reported character files with "model": "small" still default to large models (configuration confusion).
    • Discord (2025-01-20, coders): "Database connection not open" / SQLite connection problems, especially with node plugin.
    1. Fix model selection and modelClass enforcement (small/medium/large mapping) end-to-end.
       Reduces surprise cost/latency and restores configuration credibility, which is critical for Cloud and enterprise adoption.
    2. Stabilize database adapters and node plugin startup (SQLite + Supabase) with deterministic defaults and clearer errors.
       Improves first-run success rate and lowers support load, directly increasing builder retention.
    3. Resolve package installation/start failures (npm/pnpm packaging, missing modules, model download failures) via a hardened quickstart path.
       Maximizes onboarding throughput, but may defer deeper runtime correctness issues that reappear later.
    4. Other / More discussion needed / None of the above.
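To make option 1 concrete, here is a minimal sketch of strict model-class resolution. It assumes a character file carries a `model` field whose value is `small`, `medium`, or `large`, and that each provider has a table mapping those classes to model names. The names `resolveModel`, `ModelTable`, and the model identifiers are hypothetical and are not taken from the ElizaOS codebase. The idea illustrated is that an unrecognized or missing value fails loudly instead of silently falling back to a large model.

```typescript
// Hypothetical sketch; not the ElizaOS implementation.
type ModelClass = "small" | "medium" | "large";

// Illustrative per-provider table; the model identifiers are placeholders.
type ModelTable = Record<ModelClass, string>;

const exampleTable: ModelTable = {
  small: "example-small",
  medium: "example-medium",
  large: "example-large",
};

// Type guard: true only for the three recognized class names.
function isModelClass(value: unknown): value is ModelClass {
  return value === "small" || value === "medium" || value === "large";
}

// Resolve the configured class to a concrete model name, refusing to guess.
function resolveModel(character: { model?: unknown }, table: ModelTable): string {
  const requested = character.model;
  if (!isModelClass(requested)) {
    throw new Error(`Unknown or missing model class: ${String(requested)}`);
  }
  return table[requested];
}

console.log(resolveModel({ model: "small" }, exampleTable)); // example-small
```

Throwing on an invalid value is a design choice made for this sketch: it trades a harder failure at startup for the guarantee that the configured class is the one actually used.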
    Q2
    Do we formalize a single blessed “golden path” repo (main eliza) and effectively deprecate eliza-starter until it meets reliability targets?
    • Discord (2025-01-20, coders): Community advised using main eliza repository instead of eliza-starter due to dependency issues.
    1. Yes: declare main repo the golden path; mark eliza-starter as experimental until parity is restored.
       Short-term clarity and fewer broken installs; potential backlash from starter users but less fragmentation.
    2. No: invest immediately to fix eliza-starter and keep it as the primary onboarding path.
       Better long-term onboarding UX, but consumes bandwidth that could stabilize core runtime and Cloud launch.
    3. Hybrid: golden path is main repo now; starter remains supported only for a narrow “hello agent” scenario with CI gates.
       Balances focus and clarity, while keeping an entry ramp for non-experts without overpromising.
    4. Other / More discussion needed / None of the above.
    Q3
    What is the Council’s minimum acceptable “first-run success rate” and what enforcement mechanism do we adopt to achieve it?
    • GitHub Daily Update (2025-01-21): New issues include inability to install `@elizaos/agent` (#2624) and agent start failures due to model download failures (#2623).
    1. Set a hard gate: ≥90% first-run success in CI smoke tests across OS targets before release.
       Strong trust signal, but may slow feature velocity and require test infra expansion.
    2. Set a soft target: ≥75% success with rapid hotfix cadence and transparent known-issues ledger.
       Keeps shipping momentum, but risks continued churn and reputational drag.
    3. Segmented targets: 95% for Cloud path, 70% for self-host; prioritize commercial reliability first.
       Optimizes for revenue and managed UX, but may alienate open-source self-hosters if neglected.
    4. Other / More discussion needed / None of the above.
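The thresholds in these options can be expressed as a simple release gate. The sketch below assumes each CI smoke test produces one pass/fail record per OS target; `SmokeResult`, `firstRunSuccessRate`, and `gateRelease` are hypothetical names, and the sample records are invented for illustration rather than taken from any real test run.

```typescript
// Hypothetical sketch of a release gate; the sample records are illustrative only.
interface SmokeResult {
  os: "linux" | "macos" | "windows";
  passed: boolean;
}

// Fraction of smoke-test runs that succeeded (0 when there are no runs,
// so an empty test matrix can never pass the gate).
function firstRunSuccessRate(results: SmokeResult[]): number {
  if (results.length === 0) return 0;
  const passed = results.filter((r) => r.passed).length;
  return passed / results.length;
}

// Returns true when the release may proceed. The default threshold is the
// 90% hard gate from option 1; pass 0.75 to model option 2's soft target.
function gateRelease(results: SmokeResult[], threshold = 0.9): boolean {
  return firstRunSuccessRate(results) >= threshold;
}

const sample: SmokeResult[] = [
  { os: "linux", passed: true },
  { os: "macos", passed: true },
  { os: "windows", passed: false },
  { os: "linux", passed: true },
];

console.log(firstRunSuccessRate(sample)); // 0.75
console.log(gateRelease(sample));         // false
console.log(gateRelease(sample, 0.75));   // true
```

Option 3's segmented targets would amount to calling the same gate twice, once per deployment path, with different thresholds.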
    Throughput vs Coherence (Plugin Expansion & Governance of Quality)
    The ecosystem is adding plugins at a high tempo (NIM, Cronos EVM, router nitro, holdstation swap, MongoDB adapter, etc.), but without stronger quality gates this growth can amplify support burden and reduce perceived reliability—contradicting “Execution Excellence.”
    Q1
    How should the Council govern plugin intake to preserve composability while preventing reliability debt from exploding?
    • GitHub Activity (Jan 20–22): "29 new pull requests (19 merged)... jump to 66 active contributors" (rapid intake).
    • Daily Report (2025-01-20): Multiple new plugins landed (e.g., NVIDIA NIM #2599, Holdstation swap #2596, Router Nitro #2590, Cronos EVM #2585).
    1. Adopt strict plugin admission standards: tests + minimal docs + security review required before merge/registry inclusion.
       Higher trust and lower breakage, but reduces contributor velocity and increases maintainer workload.
    2. Two-tier system: “Core/Verified” plugins with high gates; “Community/Experimental” plugins with lightweight gates and clear labeling.
       Preserves innovation while protecting newcomers; requires consistent labeling and registry tooling.
    3. Max velocity: merge quickly, rely on community to surface issues; fix regressions post-merge.
       Short-term expansion, long-term support overload and perceived instability, which risks North Star alignment.
    4. Other / More discussion needed / None of the above.
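A two-tier intake along the lines of option 2 could be sketched as a small classifier. This assumes the "high gates" are the three checks named in option 1 (tests, minimal docs, security review). The source does not define the "lightweight gates", so passing tests alone is used here purely as a placeholder assumption. `PluginSubmission`, `classifyPlugin`, and the plugin names are hypothetical.

```typescript
// Hypothetical sketch; the tier criteria are assumptions, not agreed policy.
type Tier = "verified" | "experimental" | "rejected";

interface PluginSubmission {
  name: string;
  hasPassingTests: boolean;
  hasMinimalDocs: boolean;
  securityReviewed: boolean;
}

function classifyPlugin(p: PluginSubmission): Tier {
  // High gate: all three checks from option 1.
  if (p.hasPassingTests && p.hasMinimalDocs && p.securityReviewed) {
    return "verified";
  }
  // Assumed lightweight gate: passing tests alone.
  if (p.hasPassingTests) {
    return "experimental";
  }
  return "rejected";
}

const submissions: PluginSubmission[] = [
  { name: "plugin-a", hasPassingTests: true, hasMinimalDocs: true, securityReviewed: true },
  { name: "plugin-b", hasPassingTests: true, hasMinimalDocs: false, securityReviewed: false },
  { name: "plugin-c", hasPassingTests: false, hasMinimalDocs: true, securityReviewed: false },
];

for (const s of submissions) {
  console.log(`${s.name}: ${classifyPlugin(s)}`);
}
```

The resulting tier is the label that a registry would need to surface consistently, which is the tooling cost option 2 calls out.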
    Q2
    Do we pause net-new plugins for a defined stabilization window to align with execution excellence, or keep parallel lanes?
    • Discord (2025-01-20): Team prioritizing V2 development over PR activities; ongoing backlog includes model selection + DB issues.
    1. Pause net-new plugins for 1–2 sprints; focus on core stability, docs, and onboarding success rate.
       Improves reliability quickly, but may dampen community excitement and partner integrations.
    2. Parallel lanes: core team stabilizes; community plugins continue under a strict “experimental” banner.
       Maintains momentum while protecting core; requires clear governance and moderation bandwidth.
    3. No pause; rely on tooling (CI, linters, bots) to keep quality acceptable at scale.
       Works only if automation coverage is strong; otherwise risks repeated regressions and contributor frustration.
    4. Other / More discussion needed / None of the above.
    Model & Provider Strategy (DeepSeek R1, NVIDIA NIM, Cost/Performance)
    Community signal indicates a strategic opening: DeepSeek R1 claims near-frontier reasoning at drastically lower cost with permissive licensing, while NVIDIA NIM integration expands provider optionality—yet model selection bugs and inconsistent provider behavior undermine the ability to exploit these options safely.
    Q1
    Should the Council elevate DeepSeek R1 integration to a strategic priority, and if so, what role should it play (default vs optional vs Cloud-only)?
    • Discord (2025-01-20, partners/coders): "DeepSeek's R1... O1/Sonnet-level performance at 30x lower cost with MIT licensing."
    • Daily Report (2025-01-20): DeepSeek provider support and related fixes appear in the repo activity stream.
    1. Make R1 a first-class, documented option and recommend it for cost-optimized deployments.
       Increases competitiveness and developer delight, but increases surface area for provider-specific bugs.
    2. Keep R1 experimental until model selection + provider parity issues are resolved.
       Protects reliability narrative; may miss a window to capture builders seeking cheaper reasoning.
    3. Offer R1 primarily via ElizaOS Cloud with curated configs and guardrails; keep self-host optional.
       Turns provider advantage into managed UX and revenue leverage, but may be seen as gating capability.
    4. Other / More discussion needed / None of the above.
    Q2
    How do we reconcile “Open & Composable” with an exploding matrix of providers (OpenAI/Anthropic/DeepSeek/NVIDIA NIM/etc.) without sacrificing reliability?
    • GitHub Daily Update (2025-01-21): Added NVIDIA NIM plugin (#2599) and multiple provider-related improvements.
    • Discord (2025-01-20): Users report provider-specific failures (e.g., Anthropic issues in Discord; switching to OpenAI resolved an error).
    1. Define a provider compatibility contract (streaming, tools, vision, embeddings) and certify providers against it.
       Creates a reliable composability baseline and supports future certification programs.
    2. Limit official support to a small set of “Council-approved” providers; others remain community-supported.
       Reduces QA load, but constrains openness and may slow ecosystem growth.
    3. Embrace full provider plurality; invest in runtime adapters and robust fallback logic to smooth differences.
       Most aligned with openness, but demands significant engineering investment in abstraction and testing.
    4. Other / More discussion needed / None of the above.
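The compatibility contract in option 1 might look like the following minimal sketch. It assumes certification means a provider has passed a conformance check for each of the four capabilities the option lists. `Capability`, `ProviderReport`, `missingCapabilities`, and `isCertified` are hypothetical names, and the provider entries are placeholders that describe no real provider.

```typescript
// Hypothetical sketch of a provider compatibility contract.
type Capability = "streaming" | "tools" | "vision" | "embeddings";

// The four capabilities named in option 1.
const REQUIRED: readonly Capability[] = ["streaming", "tools", "vision", "embeddings"];

interface ProviderReport {
  provider: string;
  // Capabilities the provider passed in an (assumed) conformance test run.
  passed: Capability[];
}

// List the required capabilities that the report does not cover.
function missingCapabilities(
  report: ProviderReport,
  required: readonly Capability[] = REQUIRED
): Capability[] {
  return required.filter((c) => report.passed.indexOf(c) === -1);
}

// A provider is certified only when nothing required is missing.
function isCertified(report: ProviderReport): boolean {
  return missingCapabilities(report).length === 0;
}

const full: ProviderReport = {
  provider: "provider-a",
  passed: ["streaming", "tools", "vision", "embeddings"],
};
const partial: ProviderReport = {
  provider: "provider-b",
  passed: ["streaming", "embeddings"],
};

console.log(isCertified(full));            // true
console.log(isCertified(partial));         // false
console.log(missingCapabilities(partial)); // [ 'tools', 'vision' ]
```

Reporting the missing capabilities, rather than a bare pass/fail, also gives option 3's fallback logic something to act on: a runtime could route a vision request away from a provider known to lack it.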
    Q3
    What is our canonical performance target: lower cost per agent, lower latency, or higher autonomy (memory/RAG/tooling), given current community pain points?
    • Discord (2025-01-20, coders): Need for better memory management so agents persist data between messages.
    • Discord (2025-01-20): Model selection confusion causing unintended use of large models (cost/latency risk).
    1. Prioritize cost control (correct model selection + cheaper reasoning providers) to maximize adoption.
       Boosts builder experimentation and Cloud unit economics, but may leave autonomy gaps unresolved.
    2. Prioritize autonomy (memory/RAG correctness and persistence) even if cost/latency stays higher short-term.
       Improves flagship-agent credibility and “agents that work,” but may reduce casual developer adoption.
    3. Prioritize latency/UX (streaming, responsiveness, client stability) to make agents feel alive across platforms.
       Strengthens perceived quality and retention, but without autonomy gains agents may remain shallow.
    4. Other / More discussion needed / None of the above.