unbug/monday

Monday - Browser AI Chat

Run open-source AI models directly in your browser. No server, no install, 100% private.

License: MIT

Live Demo · Changelog


Features

  • Zero Install — Pure browser experience, no downloads needed
  • Browser-Native Inference — Models run locally via WebGPU + WASM using Web-LLM
  • 23 Pre-configured Models — Qwen 2.5/3/3.5, SmolLM2, Gemma 2/3, Phi 3.5/4, Llama 3.2, DeepSeek R1, and more
  • Streaming Output — Token-by-token real-time response
  • Chat History — Persistent multi-session conversations via IndexedDB
  • Changelog — In-app version history with expandable release details
  • Usage Statistics — Dashboard with daily charts, per-model breakdown, and provider analytics
  • Model Comparison — Side-by-side generation from two models with real-time token stats
  • Model Benchmark — Built-in benchmark tool to measure tokens/sec and latency
  • Custom Model Import — Load custom MLC-compiled models from any HuggingFace URL
  • Download Resume — Resume interrupted model downloads from where you left off
  • Session Search — Search conversations by title with date filtering
  • Command Palette — Quick navigation with ⌘K
  • Prompt Templates & Personas — 8 built-in personas + custom persona creation
  • Message Actions — Edit and regenerate user messages inline
  • Generation Parameters — Per-session temperature, top-p, max tokens sliders
  • System Prompts — Customizable per-session system prompts
  • Token Counter — Real-time tokens/sec and total token usage
  • Model Cache Manager — View and delete cached models
  • Recent Models — Quick access to recently used models
  • Recommended Models — Top models based on your usage history
  • Storage Quota — Monitor browser storage usage
  • Markdown Rendering — Code highlighting, LaTeX math, GFM tables
  • Chat Export — Export conversations as Markdown
  • BorderBeam UI — Animated border effects with ocean/colorful/mono variants
  • Theme Toggle — Light / Dark / System with auto-detection
  • Mobile Responsive — Sidebar overlay, auto-close, safe-area support
  • PWA Ready — Web app manifest, apple-touch-icon
  • 100% Private — Nothing leaves your browser

Architecture

High-Level System Architecture

graph TB
    subgraph Browser["🌐 Browser (Client-Side Only)"]
        UI["React UI<br/>Vite + TypeScript"]
        Engine["Web-LLM Engine<br/>WebGPU / WASM"]
        IDB["IndexedDB<br/>Chat Persistence"]
        Cache["Browser Cache<br/>Model Weights"]
    end

    subgraph External["☁️ External (Read-Only)"]
        HF["HuggingFace CDN<br/>MLC Model Registry"]
        GHP["GitHub Pages<br/>Static Hosting"]
    end

    User(("👤 User")) --> UI
    UI -->|"streamChat()"| Engine
    Engine -->|"Token Stream"| UI
    UI -->|"saveSessions()"| IDB
    IDB -->|"loadSessions()"| UI
    Engine <-->|"Download Once"| HF
    Engine -->|"Cache Weights"| Cache
    GHP -->|"Serve SPA"| Browser

    style Browser fill:#1a1a2e,stroke:#a78bfa,color:#e5e5e5
    style External fill:#0d1117,stroke:#444,color:#999
    style Engine fill:#7c3aed,stroke:#a78bfa,color:#fff

Routing Architecture

Monday uses a zero-dependency URL routing system built on the HTML5 History API — no React Router, no hash fragments.

All 14 named views and their paths (defined in src/App.tsx):

| View key | URL path |
| --- | --- |
| chat | /monday/ |
| models | /monday/models |
| changelog | /monday/changelog |
| cache | /monday/cache |
| stats | /monday/stats |
| comparison | /monday/comparison |
| benchmark | /monday/benchmark |
| custom-models | /monday/custom-models |
| persona-marketplace | /monday/persona-marketplace |
| knowledge | /monday/knowledge |
| plugins | /monday/plugins |
| mcp-servers | /monday/mcp-servers |
| webdav | /monday/webdav |
| memory | /monday/memory |

How it works:

flowchart LR
    URL["URL\n/monday/…"] -->|popstate| VFP["viewFromPath()\nURL → View enum"]
    VFP --> State["view state\nReact useState"]
    State -->|useEffect| PS["history.pushState"]
    PS --> URL

    subgraph GH["GitHub Pages compat"]
        F["public/404.html\nsaves path → sessionStorage"]
        R["Redirect → /monday/"]
        A["App init reads sessionStorage\nhistory.replaceState"]
        F --> R --> A
    end

    style State fill:#7c3aed,stroke:#a78bfa,color:#fff
    style GH fill:#0d1117,stroke:#444,color:#999

Key behaviours:

  • Calling setView(v) triggers a useEffect that does history.pushState to the mapped URL — the URL bar updates instantly without a page reload.
  • popstate events (browser back / forward) call viewFromPath(pathname) to resolve the URL back into a View and update React state.
  • GitHub Pages 404 compatibility: public/404.html captures the requested path in sessionStorage and redirects to /monday/. On first render App.tsx reads it back and calls history.replaceState to restore the original URL before the SPA mounts.
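The round-trip above can be sketched as plain functions. `VIEW_PATH` and `viewFromPath` are the names from the real src/App.tsx; the exact table shape and fallback behaviour shown here are assumptions for illustration.

```typescript
// Sketch of the URL <-> view mapping (names VIEW_PATH / viewFromPath come from
// src/App.tsx; the exact shape and fallback here are assumptions).
type View =
  | 'chat' | 'models' | 'changelog' | 'cache' | 'stats' | 'comparison'
  | 'benchmark' | 'custom-models' | 'persona-marketplace' | 'knowledge'
  | 'plugins' | 'mcp-servers' | 'webdav' | 'memory';

const VIEW_PATH: Record<View, string> = {
  chat: '/monday/',
  models: '/monday/models',
  changelog: '/monday/changelog',
  cache: '/monday/cache',
  stats: '/monday/stats',
  comparison: '/monday/comparison',
  benchmark: '/monday/benchmark',
  'custom-models': '/monday/custom-models',
  'persona-marketplace': '/monday/persona-marketplace',
  knowledge: '/monday/knowledge',
  plugins: '/monday/plugins',
  'mcp-servers': '/monday/mcp-servers',
  webdav: '/monday/webdav',
  memory: '/monday/memory',
};

// Resolve a pathname back into a View (used on popstate and on first load);
// unknown paths fall back to 'chat'.
function viewFromPath(pathname: string): View {
  const normalized = pathname.replace(/\/+$/, '') || '/monday';
  for (const [view, path] of Object.entries(VIEW_PATH) as [View, string][]) {
    if (path.replace(/\/+$/, '') === normalized) return view;
  }
  return 'chat';
}
```

In the app, `setView(v)` would then call `history.pushState(null, '', VIEW_PATH[v])`, and a `popstate` listener would call `setView(viewFromPath(location.pathname))` to close the loop.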

Rule for adding a new view:

  1. Add the key + path to VIEW_PATH in App.tsx.
  2. Add a view === 'new-view' render branch in the JSX return.
  3. Add a navigation callback to useKeyboardShortcuts and a menu item in Sidebar.

Component Architecture

graph TD
    App["App.tsx<br/>URL Router + Global State"]
    App --> Sidebar["Sidebar<br/>Session List + Nav"]
    App --> Header["Header<br/>Model Badge + Theme"]

    subgraph Views["Routed Views (view state)"]
        V_chat["chat\n ChatLayout"]
        V_models["models\n ModelSelector"]
        V_knowledge["knowledge\n KnowledgePanel"]
        V_plugins["plugins\n PluginManager"]
        V_mcp["mcp-servers\n McpServerManager"]
        V_memory["memory\n MemoryPanel"]
        V_webdav["webdav\n WebDAVSettings"]
        V_stats["stats\n ModelStats"]
        V_cmp["comparison\n ModelComparison"]
        V_bench["benchmark\n ModelBenchmark"]
        V_persona["persona-marketplace\n PersonaMarketplace"]
        V_custom["custom-models\n CustomModelImport"]
        V_cache["cache\n (cache manager)"]
        V_cl["changelog\n Changelog"]
    end

    App --> Views

    Header --> TT["ThemeToggle<br/>Light/Dark/System"]
    Header --> WG["WebGPUCheck"]
    V_chat --> ML["MessageList"]
    V_chat --> CI["ChatInput<br/>BorderBeam textarea"]

    subgraph Hooks["Custom Hooks"]
        useModel["useModel<br/>Load/Unload/Progress"]
        useChat["useChat<br/>Sessions/Messages/Stream"]
        useTheme["useTheme<br/>Light/Dark/System"]
        useKnowledge["useKnowledge / useKnowledgeBases"]
        useVectorStore["useVectorStore<br/>IndexedDB vectors"]
        useEmbedding["useEmbeddingModel<br/>GTE-small MLC"]
        useMcp["useMcpServers"]
    end

    subgraph Lib["Core Library"]
        engine["engine.ts<br/>Web-LLM Singleton"]
        models["models.ts<br/>Model Registry"]
        storage["storage.ts<br/>IndexedDB Ops"]
        changelog["changelog.ts<br/>Version Data"]
    end

    App --> Hooks
    Hooks --> Lib
    engine -->|"CreateMLCEngine"| WEBLLM["@mlc-ai/web-llm"]

    style App fill:#7c3aed,stroke:#a78bfa,color:#fff
    style Views fill:#1a1a2e,stroke:#a78bfa,color:#e5e5e5
    style Hooks fill:#1e3a5f,stroke:#3b82f6,color:#e5e5e5
    style Lib fill:#1a3328,stroke:#22c55e,color:#e5e5e5

Data Flow: Chat Message Lifecycle

sequenceDiagram
    participant User
    participant ChatInput
    participant useChat
    participant engine.ts
    participant WebLLM
    participant IndexedDB

    User->>ChatInput: Type message + Enter
    ChatInput->>useChat: sendMessage(content)
    useChat->>useChat: Create user msg + empty assistant msg
    useChat->>engine.ts: streamChat(history)
    engine.ts->>WebLLM: chat.completions.create(stream:true)

    loop Token Streaming
        WebLLM-->>engine.ts: yield delta token
        engine.ts-->>useChat: yield token
        useChat-->>useChat: Append to assistant msg
        useChat-->>ChatInput: Re-render (streaming)
    end

    useChat->>useChat: Finalize msg, generate title
    useChat->>IndexedDB: saveSessions(updated)
    useChat-->>User: Complete response displayed
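The streaming leg of this lifecycle can be sketched as an async generator. This is a hypothetical sketch: the engine below is a stub standing in for @mlc-ai/web-llm's engine object, and the chunk shape follows the OpenAI-style streaming convention the diagram names (`chat.completions.create(stream:true)`).

```typescript
// Hypothetical sketch of streamChat(): wrap the engine's OpenAI-style
// streaming API as an async generator of delta tokens.
type Msg = { role: 'system' | 'user' | 'assistant'; content: string };
type Chunk = { choices: { delta: { content?: string } }[] };

interface ChatEngine {
  chat: { completions: { create(req: { messages: Msg[]; stream: true }): AsyncIterable<Chunk> } };
}

async function* streamChat(engine: ChatEngine, history: Msg[]): AsyncGenerator<string> {
  const stream = await engine.chat.completions.create({ messages: history, stream: true });
  for await (const chunk of stream) {
    const token = chunk.choices[0]?.delta.content;
    if (token) yield token; // one delta token per chunk
  }
}

// Stub engine that streams a fixed reply token-by-token.
const stub: ChatEngine = {
  chat: { completions: { create: async function* () {
    for (const t of ['Hel', 'lo', '!']) yield { choices: [{ delta: { content: t } }] };
  } } },
};

async function demo(): Promise<string> {
  let assistant = ''; // useChat appends each token to the assistant message
  for await (const token of streamChat(stub, [{ role: 'user', content: 'Hi' }])) {
    assistant += token;
  }
  return assistant;
}
```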

Model Loading Flow

sequenceDiagram
    participant User
    participant ModelSelector
    participant useModel
    participant engine.ts
    participant WebLLM
    participant HuggingFace as HF CDN

    User->>ModelSelector: Click model card
    ModelSelector->>useModel: load(modelId)
    useModel->>useModel: setState(downloading, 0%)

    useModel->>engine.ts: loadModel(modelId, onProgress)
    engine.ts->>WebLLM: CreateMLCEngine(modelId)
    WebLLM->>HuggingFace: Fetch model weights (WASM/WebGPU)

    loop Download Progress
        HuggingFace-->>WebLLM: Chunk data
        WebLLM-->>engine.ts: InitProgressReport
        engine.ts-->>useModel: progress callback
        useModel-->>ModelSelector: Update progress bar
    end

    WebLLM-->>engine.ts: Engine ready
    engine.ts-->>useModel: resolve
    useModel->>useModel: setState(ready, 100%)
    useModel-->>User: Model badge shown ✓
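Under stated assumptions, the load-with-progress flow above can be sketched as follows. `createEngine` is a stand-in for Web-LLM's `CreateMLCEngine`, and the report shape is an assumption; only the singleton-plus-callback pattern is the point.

```typescript
// Sketch of engine.ts's loadModel(): a singleton wrapper that forwards
// init progress reports from the engine factory to the UI.
type InitProgressReport = { progress: number; text: string };
type Engine = { modelId: string };

let engine: Engine | null = null; // module-level singleton

async function loadModel(
  modelId: string,
  onProgress: (report: InitProgressReport) => void,
  createEngine: (id: string, cb: (r: InitProgressReport) => void) => Promise<Engine>,
): Promise<Engine> {
  if (engine && engine.modelId === modelId) return engine; // already loaded
  engine = await createEngine(modelId, onProgress);
  return engine;
}

// Fake engine factory that emits three progress reports, like a download.
async function fakeCreate(id: string, cb: (r: InitProgressReport) => void): Promise<Engine> {
  for (const p of [0, 0.5, 1]) cb({ progress: p, text: `Fetching ${id}: ${p * 100}%` });
  return { modelId: id };
}
```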

Tech Stack

| Layer | Technology |
| --- | --- |
| Framework | Vite 8 + React 19 + TypeScript 6 |
| AI Runtime | @mlc-ai/web-llm (WebGPU + WASM) |
| UI Effects | border-beam |
| Persistence | IndexedDB (sessions, messages) |
| Deployment | GitHub Pages via GitHub Actions |
| Build | Vite, ESNext target |

Supported Models

| Model | Parameters | Size | Provider |
| --- | --- | --- | --- |
| Qwen 3 0.6B | 0.6B | ~400 MB | Alibaba |
| Qwen 3 1.7B | 1.7B | ~1 GB | Alibaba |
| Qwen 3 4B | 4B | ~2.5 GB | Alibaba |
| Qwen 3.5 0.8B | 0.8B | ~500 MB | Alibaba |
| Qwen 3.5 2B | 2B | ~1.2 GB | Alibaba |
| Qwen 2.5 0.5B | 0.5B | ~350 MB | Alibaba |
| Qwen 2.5 1.5B | 1.5B | ~900 MB | Alibaba |
| Qwen 2.5 3B | 3B | ~1.8 GB | Alibaba |
| Qwen 2.5 Coder 1.5B | 1.5B | ~900 MB | Alibaba |
| SmolLM2 360M | 360M | ~200 MB | HuggingFace |
| SmolLM2 1.7B | 1.7B | ~1 GB | HuggingFace |
| Gemma 2 2B | 2B | ~1.3 GB | Google |
| Gemma 3 4B | 4B | ~2.5 GB | Google |
| Gemma 3 1B | 1B | ~700 MB | Google |
| Phi 3.5 Mini | 3.8B | ~2 GB | Microsoft |
| Phi 4 Mini | 3.8B | ~2.2 GB | Microsoft |
| DeepSeek R1 Distill Qwen 1.5B | 1.5B | ~1 GB | DeepSeek |
| Llama 3.2 1B | 1B | ~700 MB | Meta |
| Llama 3.2 3B | 3B | ~1.8 GB | Meta |
| TinyLlama 1.1B | 1.1B | ~600 MB | Community |
| StableLM 2 Zephyr 1.6B | 1.6B | ~950 MB | Stability AI |
| InternLM 2.5 1.8B | 1.8B | ~1.1 GB | Shanghai AI Lab |
| OLMo 1B | 1B | ~600 MB | Allen Institute |

⭐ = Recommended for most users


Competitive Analysis

The roadmap is informed by deep analysis of these leading AI chat platforms:

| Product | Stars | Key Differentiator | Monday Relevance |
| --- | --- | --- | --- |
| OpenClaw | 364k | Personal always-on AI assistant: multi-channel (WhatsApp/Telegram/Slack/Discord/…), SOUL.md identity, AgentSkills/SKILL.md ecosystem, ClawHub registry (52.7k skills, 12M installs), Skill Workshop AI, per-agent allowlists | Skills system (SKILL.md spec), skill marketplace, SOUL.md persona persistence, self-improving agent memory |
| Open WebUI | 132k | Full-featured self-hosted AI platform: RAG, pipelines, MCP, RBAC, voice/video, image gen | Feature-complete reference for chat UX, RAG, tools |
| NextChat | 88k | Lightweight cross-platform AI client: Vercel deploy, MCP, masks, artifacts, Tauri desktop | Lightweight UX, prompt templates, artifacts rendering |
| LobeHub | 75k | Agent-as-unit-of-work platform: 10k+ plugins, agent groups, personal memory, TTS/STT | Agent system, plugin ecosystem, memory architecture |
| Jan | 42k | Offline desktop ChatGPT: local LLMs via llama.cpp, custom assistants, OpenAI-compatible API | Offline-first philosophy, model management, MCP integration |
| GPT-Runner | 379 | AI presets for code: conversations with code files, IDE integration, version-controlled prompts | Preset system, project-scoped AI configuration |
| Claude Code | 118k | Terminal coding agent: deep codebase understanding, multi-step agentic task execution, bash/git/test tools, CLAUDE.md project config, plugins directory, @claude GitHub tagging, computer_use tool (screenshot→observe→action loop in sandboxed VM) | Agent mode (v0.30), plugin system (v0.27), task brief (CLAUDE.md equivalent), computer-use loop (v1.3) |
| browser-use | 90.4k | LLM-controlled browser automation: Playwright-backed agent, click/type/scroll/navigate/screenshot/extract-text primitives, skills directory, AGENTS.md + CLAUDE.md conventions, CLI, cloud hosting, 100-task real-world benchmark | Browser-use agent (v1.3): action primitives, sandboxed iframe execution loop, DOM-state context, screenshot observation |
| Playwright MCP | 31.4k | MCP server for browser automation: navigate/click/fill/screenshot/DOM tools via accessibility tree (no vision model needed); vision + coordinate-based modes opt-in; used by VS Code, Cursor, Claude Desktop, Codex, Copilot | Playwright MCP bridge (v1.3): connect Monday's v0.27 MCP client to @playwright/mcp for full external-browser control |

Roadmap

North Star (immutable): A local-first, browser-native AI workstation. WebGPU inference + optional remote providers, with first-class memory, tools and offline capability — all running entirely in the user's browser.

Three non-negotiable axes every release must satisfy:

  1. Local-first — every feature works with WebGPU + IndexedDB only; cloud providers are an option, never a requirement.
  2. Phase progression — releases ship the earliest unreleased version in the Versioned task breakdown end-to-end. No skipping versions, no scope outside the listed checkboxes.
  3. Release gate — a version is "done" only when its release gate is green. Trivial built-ins or polish do not unlock the next version.

Versioned task breakdown

The autonomous Cron picks the first unchecked, unblocked item in the earliest unreleased version and ships it end-to-end (code + build green + visible in UI + entry in CHANGELOG). It never invents scope outside this list, and never skips a version. Past versions remain documented as a historical record below.

v0.25 — Knowledge & RAG (storage layer)

Phase 5 of the legacy plan, split for scope safety. RAG is the highest-value unmet feature in the product.

  • Document upload — Upload PDFs / TXT / MD files into a "Knowledge" panel (PDF parsing via pdfjs-dist)
  • Client-side chunking — Split documents into ~500-token chunks in-browser (no server)
  • Browser vector store — IndexedDB-backed vector store with cosine similarity, schema migration registered in storage.ts
  • Knowledge bases — Organize documents into named collections; attach a collection to a session

Release gate: a user uploads a 5-page PDF, sees chunks indexed, and a search box returns the top-K matching chunks (no LLM yet — that's v0.26).

Released: 2026-04-25
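The chunking and similarity-search core of this storage layer can be sketched in a few functions. This is a minimal sketch: the real store lives in IndexedDB, the word-per-token approximation and chunk size are assumptions, and real embeddings only arrive in v0.26.

```typescript
// Sketch of v0.25: ~N-token chunking plus cosine-similarity top-K search
// over in-memory vectors (the real store is IndexedDB-backed).
function chunkText(text: string, maxTokens = 500): string[] {
  const words = text.split(/\s+/).filter(Boolean); // crude: 1 word ≈ 1 token
  const chunks: string[] = [];
  for (let i = 0; i < words.length; i += maxTokens) {
    chunks.push(words.slice(i, i + maxTokens).join(' '));
  }
  return chunks;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Rank stored chunk vectors against a query vector and keep the best k.
function topK(query: number[], store: { id: string; vec: number[] }[], k: number) {
  return [...store]
    .map((e) => ({ id: e.id, score: cosine(query, e.vec) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```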

v0.26 — RAG (retrieval + citation)

  • Embedding model — Run a small embedding model via Web-LLM (e.g. gte-small MLC build) and persist embeddings
  • Semantic search — On send, query the active knowledge base and inject top-K chunks into the system prompt
  • Citation display — Show which chunks were used per assistant message, with click-to-open
  • Citation persistence — Citations survive page reload (stored alongside message in IndexedDB)

Release gate: a question answered using a chunk shows a citation that opens to the exact span of the source document; reload preserves it.

Released: 2026-04-26

v0.27 — Tools, Function calling, MCP

Phase 6 advanced — the only sanctioned tools work. Net-new built-in mini-tools (calculator / clock / unit converter / JSON formatter / one-shot web-search button / standalone formatter) are out of scope: they distract from the function-calling / plugin / MCP work that actually lets users plug in any tool. Mini-tools, if at all, ship later as plugins through the system below.

  • Function calling — Parse model tool-call outputs (OpenAI-style tool_calls JSON) and dispatch to in-browser functions
  • Plugin system — Load third-party tool plugins from URL (JSON manifest declaring name / description / inputSchema / handlerUrl)
  • MCP client — Connect to an MCP server (WebSocket transport) and expose its tools to the model
  • Tool call inspector — A panel that shows the request / response / latency of every tool call in a session

Release gate: a user installs one external plugin from URL or connects to one MCP server, the model invokes a tool from it, and the inspector shows the full request / response.

Released: 2026-04-26
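The function-calling bullet above can be sketched as a small dispatcher. The `tool_calls` shape follows the OpenAI convention the checklist names; the registry, handler signature, and the `add` tool are illustrative assumptions, not the shipped API.

```typescript
// Sketch of v0.27 function calling: parse OpenAI-style tool_calls and
// dispatch each to a registered in-browser handler.
type ToolCall = { id: string; function: { name: string; arguments: string } };
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

const registry = new Map<string, ToolHandler>();

async function dispatchToolCalls(calls: ToolCall[]) {
  const results = [];
  for (const call of calls) {
    const handler = registry.get(call.function.name);
    if (!handler) {
      results.push({ id: call.id, error: 'unknown tool' });
      continue;
    }
    const args = JSON.parse(call.function.arguments); // model emits a JSON string
    results.push({ id: call.id, result: await handler(args) });
  }
  return results; // would be fed back to the model as role:"tool" messages
}

// Illustrative handler; real tools come from plugins or MCP servers.
registry.set('add', async (a) => (a.x as number) + (a.y as number));
```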

v0.28 — Collaboration & Sharing

  • Share conversations — Generate a shareable static HTML export (no server)
  • Import/export — Full data import / export (sessions, personas, settings, knowledge bases) as a single .monday zip
  • WebDAV sync — Cross-device sync via user-supplied WebDAV server
  • Shared personas — Publish a persona to a static community registry (curated JSON file in the repo)
  • Conversation forking — Branch a session at any message; branches are siblings, navigable in the sidebar

Release gate: round-trip import → export → re-import preserves every session, persona and knowledge base byte-for-byte.

Released: 2026-04-26

v0.29 — Desktop, PWA polish & shortcuts

  • Update prompt — Banner when a new service worker is installed
  • Offline indicator — Header chip when offline; gracefully disable cloud-only features
  • Background notifications — Notify when a long generation completes while the tab is hidden (uses existing useNotifications)
  • Desktop app — Tauri wrapper that targets macOS / Windows / Linux
  • Keyboard shortcuts overlay — pressing ? opens a list of every shortcut (Cmd+K / Cmd+N / Cmd+⇧S / Cmd+E …); shortcuts are also documented in the README
  • Multi-window — Open a conversation in a separate browser window / Tauri window with shared IndexedDB

Release gate: a Tauri build runs on macOS with full chat + RAG + tools functionality; offline mode degrades gracefully.

v0.30 — Agent mode & analytics

  • Multi-turn memory — Auto-summarize early turns when the context window is exceeded; summaries are visible and editable
  • Agent mode — Multi-step task execution with tool use (an outer planner loop on top of v0.27 function calling)
  • Model chaining — Pipeline: fast model drafts → large model refines, configurable per persona
  • Batch generation — Generate N responses in parallel and pick the best
  • Usage analytics — Local-only dashboard: model usage, tokens consumed, average tps, sessions per day
  • i18n — Multi-language interface (English, 中文, 日本語) with language picker in settings
  • Accessibility — Screen-reader landmarks, keyboard-only navigation, high-contrast theme

Release gate: a documented agent-mode demo solves a 3-step task (search → summarize → save) end-to-end with zero manual intervention.

Released: 2026-04-27
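The agent-mode bullet — an outer planner loop on top of the v0.27 tool layer — can be sketched like this. The planner is stubbed as a plain function; in the real feature that decision comes from the LLM, and the step shape is an assumption.

```typescript
// Sketch of v0.30 agent mode: plan -> act -> observe, with a step budget.
type Step = { tool: string; args: Record<string, unknown> } | { final: string };
type Tool = (args: Record<string, unknown>) => Promise<unknown>;

async function runAgent(
  plan: (observations: unknown[]) => Promise<Step>, // stands in for the LLM
  tools: Map<string, Tool>,
  maxSteps = 8, // budget so the loop always terminates
): Promise<string> {
  const observations: unknown[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const step = await plan(observations);
    if ('final' in step) return step.final; // planner decided it is done
    const tool = tools.get(step.tool);
    observations.push(tool ? await tool(step.args) : `no such tool: ${step.tool}`);
  }
  return 'step budget exhausted';
}
```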

v0.31 — Code Arena / Showdown Mode

A richer evolution of the existing Model Comparison view, inspired by WebDev Arena, Design Arena and the indie "Grass Field challenge" rigs that show up in Twitter dual-pane screenshots: same prompt → two models → live HTML/canvas preview → shareable recording. Net-new vs. v0.2's plain text-only comparison.

  • Dual artifact panes — Side-by-side terminal-style cards with provider badge, model name, status (pending / streaming / done) and generation duration in seconds
  • Sandboxed iframe preview — Each pane mounts the streamed HTML/CSS/JS into a sandbox="allow-scripts" iframe, refreshed on every chunk (debounced) and on a manual ↻ Run button
  • Code ↔ Preview tabs — Per-pane toggle between rendered preview and source view, with a Copy button on each
  • Synchronized scroll — Code view in both panes scrolls in lockstep (line-aligned) to make diffs obvious
  • Challenge prompt library — Curated presets (Grass Field, Solar System, Pelican on a Bicycle, Tetris, Snake, Bouncing Balls, Particle System, CSS Loader Gallery), one-click load into the arena
  • Recording & video export — MediaRecorder captures both iframes as a synchronized timelapse .webm (default 30 fps, configurable), with a small "@username" watermark from settings
  • PNG share card — Export a single PNG with both final previews, model names, durations and watermark — sized for Twitter (16:9)
  • Verdict & local leaderboard — Team A / Tie / Team B voting UI; results persisted in IndexedDB and aggregated into a per-model win/tie/loss table (purely local, no upload)

Release gate: a user picks two models, loads the "Grass Field" preset, hits Send, sees both iframes animate side-by-side, exports a .webm with watermark, votes a winner, and the leaderboard updates.

Released: 2026-04-29

v1.0 — External LLM Providers & Web Search (stable)

The "1.0" promise: anything saved in v1.0 keeps working until v2.0.

  • OpenAI-compatible API — Configure any OpenAI-compatible endpoint (custom base URL + API key, stored encrypted in IndexedDB)
  • Ollama integration — Connect to a local Ollama server (http://localhost:11434) with model auto-discovery
  • LM Studio — Connect to LM Studio's local OpenAI-compatible server
  • llama.cpp server — Connect to llama.cpp --server HTTP mode
  • vLLM — Connect to a vLLM inference endpoint
  • DeepSeek API — First-class DeepSeek cloud provider (chat + reasoner models)
  • Provider switcher — Per-session toggle between WebGPU local inference and external API providers
  • SearXNG integration — Web search via a user-supplied SearXNG URL
  • Stable storage schema v1 — Migration registry frozen; future migrations must add, not break, fields

Release gate: a 24-hour soak test (1 hour with each provider) passes; the storage migration test from v0.25 → v1.0 round-trips without loss.

Released: 2026-04-30
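All the providers listed above (Ollama, LM Studio, llama.cpp --server, vLLM, DeepSeek, any custom endpoint) speak the same OpenAI-compatible shape, so the provider layer reduces to building one request. A hedged sketch of that builder follows; the helper name and return shape are illustrative, and sending it is just `fetch(url, { method: 'POST', headers, body })`.

```typescript
// Sketch of the v1.0 provider layer: one request builder for any
// OpenAI-compatible endpoint (path /v1/chat/completions).
type ChatMsg = { role: 'system' | 'user' | 'assistant'; content: string };

function buildChatRequest(
  baseUrl: string,
  apiKey: string | null, // null for local servers like Ollama
  model: string,
  messages: ChatMsg[],
) {
  const headers: Record<string, string> = { 'Content-Type': 'application/json' };
  if (apiKey) headers['Authorization'] = `Bearer ${apiKey}`;
  return {
    url: `${baseUrl.replace(/\/+$/, '')}/v1/chat/completions`,
    headers,
    body: JSON.stringify({ model, messages, stream: true }),
  };
}
```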

v1.1 — Skills System

Inspired by OpenClaw's AgentSkills/SKILL.md ecosystem and ClawHub (52.7k tools, 12M downloads). Skills sit between personas (identity) and plugins (tools): a skill is a structured capability pack that teaches the model how to behave in a specialized domain — e.g. "Python Debugger", "Technical Writer", "SQL Analyst". Multiple skills can be stacked in one session.

Persona = who the AI is. Plugin = what tools the AI has. Skill = what the AI knows how to do (domain instructions, workflow steps, required-plugin declarations).

  • Skill format — Skill spec stored in IndexedDB: name, description, instructions (markdown injected into system prompt), requiredPlugins (list of plugin URLs/IDs), version, tags, icon
  • Skill composer — Per-session skill panel: attach 1–N skills alongside a persona; active skills shown as chips in the session header; skill instructions appended to the system prompt before each turn
  • Skill registry — Community skill registry (curated JSON file in the repo, like persona registry) with 20+ launch skills across categories: Coding, Writing, Research, Data, Language, Creative
  • Skill builder UI — In-app skill editor: name, description, tag picker, markdown instructions with live token-count estimate, required-plugin picker, export as .monday-skill JSON
  • Skill + plugin binding — A skill can declare required plugins by URL/ID; installing a skill from the registry auto-prompts to install any missing plugins (same flow as v0.27 plugin install)
  • SOUL.md equivalent — "Soul" tab in the persona editor: a persistent cross-session identity prompt that survives /new and session resets; stored in IndexedDB alongside the persona; separate from the per-session system prompt
  • Skill marketplace UI — Browse/search/install from the community registry; show tags, install count (local-only counter), author; one-click install
  • Skill hot-reload — Changes to an active skill take effect on the next message send (no session restart required)

Release gate: a user installs a "Python Debugger" skill from the registry, attaches it to a new session alongside a persona, sends a debugging question, and the model follows the skill's specialized workflow; the skill persists on page reload; the session header shows the active skill chip.

Released: 2026-05-03
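The composition rule above (soul → persona → stacked skills, appended to the system prompt before each turn) can be sketched as a pure function. This is a hypothetical sketch: the field names mirror the skill spec in this section, but the joining format is an assumption.

```typescript
// Sketch of the v1.1 skill composer: build the effective system prompt from
// a persona's soul + system prompt plus each attached skill's instructions.
type Skill = { name: string; instructions: string };
type Persona = { systemPrompt: string; soul?: string };

function composeSystemPrompt(persona: Persona, skills: Skill[]): string {
  const parts: string[] = [];
  if (persona.soul) parts.push(persona.soul); // cross-session identity, survives /new
  parts.push(persona.systemPrompt);
  for (const s of skills) parts.push(`## Skill: ${s.name}\n${s.instructions}`);
  return parts.join('\n\n');
}
```

Because this runs before every send, editing an active skill naturally hot-reloads on the next message, as the checklist requires.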

v1.2 — Self-Improving Agent & Persistent Memory

Inspired by the top-trending ClawHub skills: self-improving-agent (411k downloads), ontology typed memory graph (171k downloads), and self-improving + proactive agent (174k downloads). All state is local-only — nothing leaves IndexedDB.

  • Persistent memory store — Cross-session key-value memory backed by IndexedDB; the model can read memories at session start and write new ones during the conversation; memories panel shows all entries with edit/delete
  • Memory namespaces — Memories scoped to three levels: global (all sessions), per-persona, per-skill; the active session inherits the union of applicable namespaces
  • Correction capture — When a user edits or regenerates a message, optionally record the correction as a named memory entry ("Prefer concise answers", "Always use TypeScript strict mode"); visible in the memories panel
  • Ontology store — Typed entity graph: Person, Project, Task, Event, Document; entities have properties + relationships; browsable/editable in a side panel; injected as a compact context block when relevant entities are mentioned
  • Session compaction with learning — When compacting long sessions (v0.30 multi-turn memory), extract preference signals and entity mentions into the memory store, not just a plain summary; user reviews before committing
  • Skill Workshop (browser edition) — After a session ends, the model proposes skill refinements based on corrections, regenerations, and user edits; proposals shown in a diff view; user approves → saved to the relevant skill in IndexedDB
  • Memory-aware personas — A persona can declare which memory namespaces it reads on activation (e.g. "global" + "per:this-persona"); persona editor shows a memory preview panel

Release gate: after 3 sessions with a persona, the memory panel shows ≥5 automatically captured preferences; a Skill Workshop proposal is generated, approved, and the next session reflects the updated skill instructions.

Released: 2026-05-03
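The namespace rule above — a session inherits the union of global, per-persona, and per-skill memories — can be sketched as a filter. The key formats ("global", "persona:&lt;id&gt;", "skill:&lt;id&gt;") are assumptions for illustration.

```typescript
// Sketch of v1.2 namespace resolution over the IndexedDB-backed memory store.
type MemoryEntry = { namespace: string; key: string; value: string };

function memoriesForSession(
  all: MemoryEntry[],
  personaId: string | null,
  skillIds: string[],
): MemoryEntry[] {
  const active = new Set<string>(['global']); // global always applies
  if (personaId) active.add(`persona:${personaId}`);
  for (const id of skillIds) active.add(`skill:${id}`);
  return all.filter((m) => active.has(m.namespace));
}
```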

v1.3 — Browser-Use & Computer-Use (In-Browser Agent Loop)

Inspired by Claude Code's computer_use tool (screenshot → observe → action loop in a sandboxed VM), browser-use (90.4k ⭐ — LLM-controlled Playwright agent with skills directory and action primitives), Playwright MCP (31.4k ⭐ — accessibility-tree MCP server used by VS Code / Cursor / Codex), and Codex CLI's sandboxed bash/file/edit execution model. Extends the v0.30 agent loop and v0.27 MCP client into a full browser-use and in-browser computer-use system.

Three execution tiers of increasing capability:

  • Tier 1 — Sandboxed iframe agent: model generates HTML/CSS/JS → renders in a sandbox="allow-scripts" iframe → html2canvas screenshot → model observes → next action. 100% in-browser, zero external dependencies.

  • Tier 2 — DOM-state computer-use: serialize the active iframe's accessibility tree to compact JSON and inject as a context block; model issues action commands (click, type, scroll, navigate); dispatcher translates to DOM events. Inspired by Playwright MCP's accessibility-tree approach — no vision model required.

  • Tier 3 — Playwright MCP bridge: connect Monday's v0.27 MCP client to a locally-running @playwright/mcp server; full external-browser control (navigate real URLs, fill forms, run tests) with model in the loop; every action logged in the existing tool-call inspector.

  • Agent action primitives — navigate, click, type, scroll, extract-text, take-screenshot, read-dom; each is a named MCP-style tool callable by the model via the v0.27 function-calling layer

  • Sandboxed iframe execution loop (Tier 1) — generate → render in sandbox="allow-scripts" iframe → html2canvas screenshot → attach as image to next LLM call → iterate; debounced auto-refresh + manual ↻ Run; reuses the iframe infra planned for v0.31 Code Arena

  • DOM-state capture (Tier 2) — serialize active iframe's accessibility tree (ARIA roles, labels, input states) to compact JSON injected into context before each model turn; depth + node-count budget to stay token-safe

  • Vision mode (Tier 1/2) — OffscreenCanvas / html2canvas screenshot attached as base64 image in the next LLM call; requires a multimodal model (e.g. Qwen-VL); falls back to DOM-state mode for non-vision models automatically

  • Playwright MCP bridge (Tier 3) — one-click connect in the MCP panel; Monday auto-discovers @playwright/mcp if already configured; domain allowlist + blocked-origins enforced per task brief

  • Task brief (AGENTS.md / CLAUDE.md equivalent) — per-task markdown config declaring goal, allowed domains, step budget, and stop criteria; stored in IndexedDB; shown as a collapsible header above the agent thread

  • Agent audit trail — chronological log of every action + observation + screenshot thumbnail; collapsible per step inside the chat thread; inspired by Codex CLI's terminal-log citation model and Codex Web's task-delegation audit view

  • Async task queue — "delegate and come back" UI inspired by Codex Web: submit a browser task, minimize the panel, get notified via the v0.29 background notification system when the agent finishes or needs human input

  • Sandbox security model — Tier 1: sandbox="allow-scripts" only (no allow-same-origin); Tier 3: domain allowlist + --blocked-origins forwarded to Playwright MCP; credentials redacted in audit trail logs (mirrors browser-use's fill() debug-log redaction practice)

Release gate: a user opens the agent panel, gives the task "fill in the sandboxed form and click Submit", the agent executes ≥5 actions (screenshot → click → type → submit → screenshot), the audit trail shows every step with thumbnails, the async task queue marks it done and triggers a notification, and the final iframe state is visible in the panel.

Released: 2026-05-12
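The Tier 2 DOM-state capture can be sketched as a budgeted tree walk. The node shape below is a mock, not the real accessibility API; only the depth and node-count budgeting that keeps the context block token-safe is the point.

```typescript
// Sketch of Tier 2: serialize a (mock) accessibility tree to compact JSON
// under a depth and node-count budget.
type A11yNode = { role: string; name?: string; children?: A11yNode[] };

function serializeTree(root: A11yNode, maxDepth = 4, maxNodes = 50): object {
  let count = 0;
  function walk(node: A11yNode, depth: number): object | null {
    if (depth > maxDepth || count >= maxNodes) return null; // budget exhausted
    count++;
    const out: any = { role: node.role };
    if (node.name) out.name = node.name;
    const kids = (node.children ?? []).map((c) => walk(c, depth + 1)).filter(Boolean);
    if (kids.length) out.children = kids;
    return out;
  }
  return walk(root, 0)!;
}
```

The resulting JSON would be injected as a context block before each model turn, and the model's click/type/scroll commands dispatched back against the matching DOM nodes.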

v1.4 — Persona Marketplace & Image Input

Extending the persona system (v1.1) with community discovery and multimodal input.

  • Persona marketplace browsing — Browse community personas from the curated registry (already exists as PERSONA_REGISTRY); add search/filter by category, sort by install count; one-click install to the local persona store; shows persona preview (system prompt snippet, params, soul) before installing
  • Image input — Paste or drop an image into the chat input; for vision-capable models, the image is attached as a base64 data URL in the next LLM call; non-vision models show a graceful "vision not available" message ✅
  • Full PWA — Service worker with cache-first strategy for app shell + model weights; offline fallback page; install banner on repeat visits from desktop ✅

Release gate: a user browses the persona marketplace, installs a new persona, switches to it in a session, and the persona's soul and system prompt are active; a user pastes an image into chat and the vision model processes it.

Released: 2026-05-13

v1.5 — Voice & TTS — Multimodal I/O

Voice input and text-to-speech output for hands-free interaction, inspired by Open WebUI's voice features and NextChat's voice support.

  • Voice input — Browser Speech Recognition API for voice-to-text in the chat input; real-time transcription with interim results shown as placeholder text; stop button to end recording; automatic send on silence detection (configurable timeout) ✅
  • TTS output — Web Speech API text-to-speech for assistant responses; per-message play/pause/stop controls; voice selector (if available); auto-play toggle in settings; graceful fallback message when TTS is not supported ✅

Release gate: a user speaks into the microphone, sees real-time transcription, and the text is sent as a message; a user plays TTS on an assistant message and the browser speaks the response.

Released: 2026-05-13

v1.6 — Context Injection

Allow users to attach reusable text and code snippets to any session. Snippets are injected into the system prompt before each turn, giving the model persistent context without requiring full RAG.

  • Context library — Create, name, and organize text/code snippets; each snippet has a title, content (markdown), and optional category tag
  • Session context attachment — Attach one or more snippets to a session; attached snippets appear as a collapsible context block in the chat header ✅
  • Context injection — Attached snippets are prepended to the system prompt before each turn; context is visible in a "Context" panel alongside the message thread
  • Quick context — One-click context templates (e.g. "Project README", "API Reference", "Coding Standards") loaded from a built-in catalog ✅
  • Context search — Search the snippet library by title and content; filter by category

Release gate: a user creates a snippet, attaches it to a session, and the model's response reflects knowledge of the snippet content.

Released: 2026-05-13

Cross-cutting standing rules

These apply to every version and are enforced by the cron:

  1. No "miscellaneous mini-tool" releases. A built-in tool that takes <1 day to implement (calculator, clock, formatter, converter, one-shot web-search button) does not count as a version and must not be added directly. Such utilities ship later as first-class plugins via the v0.27 plugin / MCP system, not as bespoke React components.
  2. HEARTBEAT.md cites the current target version + the exact checkbox(es) in flight. The Next Steps list is taken verbatim from this file, not invented.
  3. Local-first invariant — every new feature must work with the default WebGPU + IndexedDB stack; remote providers are additive.
  4. Storage schema is versioned — any IndexedDB schema change ships with a forward migration in src/lib/storage.ts.
  5. No skipping versions — if v0.25 is unfinished, work on v0.25 only. If every checkbox in the current version is blocked, the cron must spend the slot on tests, docs, refactors or accessibility for that version, not on a later version or on net-new mini-tools.
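Rule 4's forward-migration requirement typically follows an ordered-migration pattern. A schematic sketch in plain TypeScript (real IndexedDB migrations run inside an `onupgradeneeded` handler; the names here are illustrative, not the contents of src/lib/storage.ts):

```typescript
interface Migration {
  toVersion: number;
  migrate: (db: Record<string, unknown>) => void; // mutates schema state
}

// Apply every migration newer than the stored version, in ascending
// order, and return the resulting schema version. This mirrors how an
// IndexedDB onupgradeneeded handler walks oldVersion up to newVersion.
export function runMigrations(
  db: Record<string, unknown>,
  fromVersion: number,
  migrations: Migration[],
): number {
  const pending = migrations
    .filter((m) => m.toVersion > fromVersion)
    .sort((a, b) => a.toVersion - b.toVersion);
  for (const m of pending) m.migrate(db);
  return pending.length ? pending[pending.length - 1].toVersion : fromVersion;
}
```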

Historical phases (for reference)

Phases 1–3, Phase 0.8 and parts of Phases 4 / 6 shipped in v0.2 → v0.21. They remain documented below as a record but are not authoritative for future work — the versioned task breakdown above is.

Note: v0.22–v0.24 (Calculator / Web Search / Unit Converter / JSON Formatter / Current Time) were rolled back on 2026-04-25 because they bypassed this Roadmap. Those features will return only as plugins via v0.27.

Phase 1 — Core Chat Enhancement (v0.2.x)

Bring chat to feature parity with basic ChatGPT UX

  • Markdown rendering — Render assistant responses with proper Markdown, code blocks, syntax highlighting
  • Code copy button — One-click copy for code blocks
  • LaTeX support — Math equation rendering with KaTeX
  • System prompt — Customizable system prompt per session
  • Generation params — Temperature, top_p, max_tokens sliders
  • Auto-scroll control — Pause auto-scroll when user scrolls up
  • Chat export — Export conversations as Markdown/JSON
  • Token counter — Display tokens/sec and total token usage
  • Message actions — Copy, regenerate, edit user messages

Phase 2 — Model Management (v0.3.x)

Rich model lifecycle and expanded model support

  • Model cache manager — View/delete cached models, show disk usage
  • More models — Add Llama 3.2 1B/3B, DeepSeek-R1-Distill, Mistral 7B, Stable Code 3B
  • Model benchmarks — Auto-run speed benchmark on load, show tokens/sec
  • Custom model import — Load custom MLC-compiled models from URL
  • Model comparison — Side-by-side generation from two models
  • Download resume — Resume interrupted model downloads with progress persistence
  • Storage quota — Show browser storage used vs available

Phase 3 — Prompt Templates & Personas (v0.7.x)

Inspired by NextChat masks, GPT-Runner presets, LobeHub agents

  • Prompt templates — Pre-built conversation starters (coding assistant, translator, tutor, etc.)
  • Custom personas — Create/save/share AI personas with system prompts + params
  • Persona marketplace — Browse community-shared personas (static JSON registry)
  • Quick prompts — Slash commands (/translate, /code, /explain) in chat input
  • Context injection — Attach text/code snippets as context before sending

Phase 4 — Multimodal & Rich Input (v0.5.x)

Add vision and file capabilities as models support them

  • Image input — Paste/upload images for vision models (when WebGPU vision models available)
  • File upload — Attach text files as conversation context
  • Drag & drop — Drag files directly into chat
  • Clipboard paste — Intelligent paste handling (images, code, rich text)
  • Voice input — Browser Speech Recognition API for voice-to-text
  • TTS output — Read assistant responses aloud via Web Speech API

Phase 5 — Knowledge & RAG (v0.6.x)

Local-first retrieval augmented generation, inspired by Open WebUI RAG

  • Document upload — Upload PDFs, TXT, MD files
  • Client-side chunking — Split documents into chunks in-browser
  • Browser vector store — IndexedDB-based vector storage
  • Embedding model — Run small embedding model via Web-LLM
  • Semantic search — Query uploaded documents before sending to LLM
  • Citation display — Show which document chunks were used in response
  • Knowledge bases — Organize documents into named collections
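The client-side chunking bullet above usually means fixed-size windows with overlap, so a sentence that straddles a boundary is still retrievable from either side. A sketch under that assumption (parameters and function name are illustrative):

```typescript
// Split a document into overlapping chunks for embedding. `size` is the
// chunk length in characters; `overlap` characters are repeated between
// consecutive chunks so boundary-spanning text stays retrievable.
export function chunkDocument(text: string, size = 512, overlap = 64): string[] {
  if (size <= overlap) throw new Error("size must exceed overlap");
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

Each chunk would then be embedded in-browser and stored alongside its source offsets in IndexedDB, which is what makes the citation-display bullet possible.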

Phase 6 — Tools & Plugins (v0.7.x)

Function calling and tool use, inspired by LobeHub plugins and Open WebUI tools

  • Function calling — Parse model tool-call outputs and execute browser-side functions
  • Built-in tools — Calculator, current time, unit converter, JSON formatter
  • Web search — Browser-side web search integration (via public APIs)
  • Code execution — Sandboxed JavaScript execution in iframe
  • Artifacts — Render generated HTML/SVG/Mermaid in preview panel (like NextChat artifacts)
  • Plugin system — Load third-party tool plugins from URL (JSON manifest)
  • MCP client — Model Context Protocol support for external tool servers
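The function-calling bullet above generally means scanning model output for a structured tool-call block and dispatching it to a browser-side registry. A hedged sketch where the `<tool_call>` delimiter and registry shape are assumptions, not Monday's actual protocol:

```typescript
type Tool = (args: Record<string, unknown>) => unknown;

// Extract a tool call like <tool_call>{"tool":"...","args":{...}}</tool_call>
// from model output and run it against a registry of browser-side functions.
// Returns the tool's result, or undefined if no valid call was found.
export function runToolCall(
  output: string,
  registry: Record<string, Tool>,
): unknown {
  const match = output.match(/<tool_call>([\s\S]*?)<\/tool_call>/);
  if (!match) return undefined;
  let call: { tool?: string; args?: Record<string, unknown> };
  try {
    call = JSON.parse(match[1]);
  } catch {
    return undefined; // malformed JSON: ignore rather than crash the UI
  }
  const fn = call.tool ? registry[call.tool] : undefined;
  return fn ? fn(call.args ?? {}) : undefined;
}
```

The result would then be appended to the conversation as a tool message so the model can continue with it.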

Phase 0.8 — Personalization & Discovery (v0.8.x)

Personalized experience and easier conversation discovery

  • Model usage tracking — Automatically track which models you use most
  • Recommended models — Top 3 most-used models displayed in Model Selector
  • Reset recommendations — Clear usage history to reset model recommendations
  • Session search — Search conversations by title in the sidebar
  • Date filtering — Filter sessions by Today, Yesterday, This Week, This Month
  • Model usage stats — Visual chart of model usage frequency
  • Recent models — Quick access to recently used models

Phase 7 — Collaboration & Sharing (v0.9.x)

Social features inspired by LobeHub channels and Open WebUI community

  • Share conversations — Generate shareable link (static HTML export)
  • Import/export — Full data import/export (sessions, personas, settings)
  • WebDAV sync — Sync data across devices via WebDAV (like NextChat)
  • Shared personas — Publish personas to community registry
  • Conversation forking — Branch a conversation at any message

Phase 8 — Desktop & PWA (v0.9.x)

Expand beyond browser tab, inspired by Jan desktop and NextChat Tauri

  • Full PWA — Offline-capable progressive web app with service worker
  • Install prompt — Smart install banner for mobile and desktop
  • Notifications — Background generation completion notifications
  • Desktop app — Tauri wrapper for native macOS/Windows/Linux
  • Keyboard shortcuts — Full keyboard navigation (Cmd+K, Cmd+N, etc.)
  • Multi-window — Open conversations in separate windows/tabs

Phase 9 — Advanced AI Features (v1.0.x)

Towards a complete local AI workstation

  • Multi-turn memory — Compress long conversations for extended context
  • Agent mode — Multi-step task execution with tool use
  • Model chaining — Pipeline: fast model drafts → large model refines
  • Batch generation — Generate multiple responses and pick best
  • A/B testing — Compare model outputs with user ratings
  • Usage analytics — Local analytics dashboard (model usage, tokens, sessions)
  • i18n — Multi-language interface (English, 中文, 日本語, etc.)
  • Accessibility — Screen reader support, keyboard navigation, high contrast

Phase 10 — External LLM Providers & Web Search (v1.1.x)

Connect to cloud and local AI servers alongside native WebGPU inference

  • OpenAI-compatible API — Configure any OpenAI-compatible endpoint (custom base URL + API key)
  • Ollama integration — Connect to a local Ollama server (http://localhost:11434)
  • LM Studio — Connect to LM Studio's built-in OpenAI-compatible local server
  • llama.cpp server — Connect to llama.cpp's built-in OpenAI-compatible HTTP server (llama-server)
  • vLLM — Connect to a vLLM inference server endpoint
  • DeepSeek API — First-class DeepSeek cloud API provider (chat + reasoner models)
  • Provider switcher — Toggle between WebGPU local inference and external API providers in-session
  • SearXNG integration — Web search via a self-hosted SearXNG instance URL
  • Web search tool — Inject search results as context before sending to the model
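The server integrations above all speak the same OpenAI-compatible `/v1/chat/completions` shape, which is what makes one provider switcher possible. A sketch of building such a request (field names follow the OpenAI chat API; the helper itself is illustrative):

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build a fetch-ready request for any OpenAI-compatible endpoint
// (Ollama, LM Studio, llama.cpp server, vLLM, or a cloud API).
export function buildChatRequest(
  baseUrl: string,
  model: string,
  messages: ChatMessage[],
  apiKey?: string,
): { url: string; init: { method: string; headers: Record<string, string>; body: string } } {
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  if (apiKey) headers["Authorization"] = `Bearer ${apiKey}`; // local servers often need no key
  return {
    url: `${baseUrl.replace(/\/$/, "")}/v1/chat/completions`,
    init: {
      method: "POST",
      headers,
      body: JSON.stringify({ model, messages, stream: true }),
    },
  };
}
```

With `stream: true` the response body arrives as server-sent events, which slots into the same token-by-token rendering path as local WebGPU inference.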

Keyboard Shortcuts

Shortcut   Action
⌘K         Toggle Command Palette
⌘N         New Chat
⌘⇧S        Stop Generation
⌘1         Models
⌘2         Model Cache
⌘3         Usage Statistics
⌘4         Persona Marketplace
⌘5         Knowledge
⌘6         Model Comparison
⌘7         Model Benchmark
⌘8         Custom Model Import
⌘9         Plugins
⌘0         MCP Servers
⌘⇧E        Export All Data
⌘⇧I        Import Data
?          Keyboard Shortcuts Overlay

On Windows/Linux, replace ⌘ with Ctrl.


Development

npm install          # Install dependencies
npm run dev          # Start dev server (http://localhost:5173)
npm run build        # Production build to dist/
npm run preview      # Preview production build

Requirements

  • Chrome 113+ or Edge 113+ (WebGPU support required)
  • GPU with 2GB+ VRAM recommended
  • ~200MB–2GB storage per model (cached in browser)

Project Structure

monday/
├── public/
│   ├── favicon.svg            # App icon (purple gradient smiley)
│   ├── apple-touch-icon.svg   # iOS home screen icon
│   └── manifest.json          # PWA manifest
├── src/
│   ├── App.tsx                # Root: view router, state orchestration
│   ├── App.css                # All component styles
│   ├── components/
│   │   ├── Sidebar.tsx        # Session list, brand, version link
│   │   ├── ModelSelector.tsx  # Model cards with BorderBeam
│   │   ├── MessageList.tsx    # Chat message rendering
│   │   ├── ChatInput.tsx      # Input textarea with send/stop
│   │   ├── Changelog.tsx      # Expandable release history
│   │   ├── ThemeToggle.tsx    # Light/Dark/System switcher
│   │   └── WebGPUCheck.tsx    # WebGPU compatibility warning
│   ├── hooks/
│   │   ├── useChat.ts         # Session/message/streaming state
│   │   ├── useModel.ts        # Model load/unload/progress
│   │   └── useTheme.ts        # Theme persistence + system detection
│   ├── lib/
│   │   ├── engine.ts          # Web-LLM singleton, streamChat()
│   │   ├── models.ts          # Pre-configured model registry
│   │   ├── storage.ts         # IndexedDB CRUD
│   │   └── changelog.ts       # Version history data
│   └── types/
│       └── index.ts           # TypeScript interfaces
├── index.html                 # Entry HTML with mobile meta tags
├── vite.config.ts             # Vite config (base: '/monday/')
└── package.json               # v0.1.0

License

MIT
