Grepedia
inbrowser

A collection of libraries for resumable, grounded AI inference, agent runtimes, and on-device model execution directly within the browser.

About

The inbrowser stack is a collection of TypeScript libraries for running AI inference, managing agents, and executing code in the browser. Organized as a monorepo, it provides modular packages for a durable job engine, resumable LLM inference, and on-device model execution. The project aims to let developers build browser-native AI applications that stay performant and recover from network disruptions or browser reloads. Its shared set of primitives supports rapid prototyping and agentic workflows that run in the user's browser or across distributed environments.

The stack centers on decoupling AI orchestration logic from specific model providers. Its agent runtime supports multi-turn conversations, tool calling, and structured output through a pluggable architecture. The relay component keeps inference resumable by persisting event streams, while the model package covers both cloud APIs and lightweight on-device execution using WebGPU and WebAssembly. Agents built this way can perform tasks, manage files, and interact with external services through standardized interfaces such as the Model Context Protocol (MCP).
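The decoupling described above can be illustrated with a minimal sketch. Every name in it (ModelClient, ModelStep, ToolRegistry, runAgent) is hypothetical and chosen for illustration; the actual inbrowser contracts may look quite different. The sketch is kept synchronous for brevity, whereas a real model client would presumably return a Promise or an async stream.

```typescript
// All names here are hypothetical; they illustrate the pattern,
// not the inbrowser API.

type Message = { role: "user" | "assistant" | "tool"; content: string };

type ModelStep =
  | { kind: "tool_call"; tool: string; args: Record<string, unknown> }
  | { kind: "final"; text: string };

// Assumed shape of a provider-agnostic contract. Kept synchronous for
// brevity; a real client would return a Promise or an async stream.
interface ModelClient {
  next(messages: readonly Message[]): ModelStep;
}

type ToolHandler = (args: Record<string, unknown>) => string;
type ToolRegistry = Map<string, ToolHandler>;

// The orchestration loop depends only on the ModelClient interface,
// so any provider that implements it can be swapped in.
function runAgent(
  model: ModelClient,
  tools: ToolRegistry,
  prompt: string,
  maxTurns = 8,
): string {
  const messages: Message[] = [{ role: "user", content: prompt }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const step = model.next(messages);
    if (step.kind === "final") return step.text;
    const handler = tools.get(step.tool);
    const result = handler ? handler(step.args) : `unknown tool: ${step.tool}`;
    messages.push({ role: "assistant", content: `call ${step.tool}` });
    messages.push({ role: "tool", content: result });
  }
  throw new Error("agent did not finish within maxTurns");
}

// A scripted stand-in for a cloud or on-device model: it requests the
// "add" tool once, then answers using the tool result.
const scriptedModel: ModelClient = {
  next(messages) {
    const last = messages[messages.length - 1];
    if (last !== undefined && last.role === "tool") {
      return { kind: "final", text: `The sum is ${last.content}.` };
    }
    return { kind: "tool_call", tool: "add", args: { a: 2, b: 3 } };
  },
};

const tools: ToolRegistry = new Map<string, ToolHandler>([
  ["add", (args) => String(Number(args.a) + Number(args.b))],
]);

const answer = runAgent(scriptedModel, tools, "What is 2 + 3?");
```

Because runAgent only sees the interface, replacing scriptedModel with a client backed by a different provider would not require changes to the loop or to the tool handlers.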

Some of the key features are:

  • Resumable Inference: Keeps LLM generation streams recoverable across network drops, backgrounded tabs, and page reloads.
  • Agent Runtime: Provides a standardized TypeScript-first runtime for building agents with custom strategies, tool registries, and state management.
  • On-Device Models: Supports loading and running small language models directly in the browser using ONNX Runtime Web, for privacy and offline capability.
  • Pluggable Architecture: Allows easy integration of various cloud model providers (such as Gemini, Anthropic, or OpenAI-compatible servers) through a shared ModelClient contract.
  • Durable Job Engine: Features an append-only, replayable log system that enables job tracking and persistence via memory or Firebase RTDB.
  • Browser-Native Workspace: Offers a sandboxed environment with file system, shell, and git capabilities suitable for app-builder agents.
  • MCP Support: Includes built-in support for the Model Context Protocol to expose agent tools to external hosts like Claude Desktop.
  • CLI Tooling: Provides a command-line interface for running, testing, auditing, and undoing agent sessions.

The system operates by composing independent packages that handle distinct layers of the AI application. Developers initialize a session or engine, register their tool handlers and model clients, and subscribe to typed event streams. The event-sourced architecture, managed by a centralized JobEngine, keeps a record of all interactions, allowing the state to be reconstructed or resumed from any sequence number. By abstracting the transport layer and the inference protocol, the stack allows developers to swap between different models and storage backends without rewriting their core application logic.
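As a rough illustration of resuming from a sequence number, the sketch below models an append-only log in memory. The names JobPayload, JobEvent, and InMemoryJobLog are assumptions made for this example and are not the JobEngine API; a persistent backend would store the same events somewhere durable instead of in an array.

```typescript
// Hypothetical names: JobPayload, JobEvent, and InMemoryJobLog are
// illustrative only and are not the inbrowser JobEngine API.

type JobPayload =
  | { type: "token"; text: string }
  | { type: "done" };

// Every stored event carries a monotonically increasing sequence number.
type JobEvent = JobPayload & { seq: number };

class InMemoryJobLog {
  private readonly events: JobEvent[] = [];
  private readonly listeners = new Set<(event: JobEvent) => void>();

  // Append-only: events are added at the end and never edited or removed.
  append(payload: JobPayload): JobEvent {
    const event: JobEvent = { ...payload, seq: this.events.length + 1 };
    this.events.push(event);
    this.listeners.forEach((listener) => listener(event));
    return event;
  }

  // Replay every event with seq > afterSeq, then deliver live events.
  subscribe(afterSeq: number, listener: (event: JobEvent) => void): () => void {
    for (const event of this.events) {
      if (event.seq > afterSeq) listener(event);
    }
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }
}

// Client-side state rebuilt purely from events.
const jobLog = new InMemoryJobLog();
let received = "";
let lastSeq = 0;
const onEvent = (event: JobEvent): void => {
  if (event.type === "token") received += event.text;
  lastSeq = event.seq;
};

let unsubscribe = jobLog.subscribe(0, onEvent);
jobLog.append({ type: "token", text: "Hello" });

unsubscribe(); // simulate a page reload or network drop
jobLog.append({ type: "token", text: ", world" }); // generated while disconnected
jobLog.append({ type: "done" });

// Resume from the last sequence number the client saw.
unsubscribe = jobLog.subscribe(lastSeq, onEvent);
```

The client only has to remember the last sequence number it processed; the events it missed while disconnected are replayed in order on resubscription, so the text it has accumulated ends up complete.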

Some common use cases include:

  • Browser-Native App Builders: Creating agents that write, compile, and preview React applications directly inside a sandbox environment.
  • Resilient AI Assistants: Building chat interfaces that continue generating responses even if the user experiences temporary network connectivity issues.
  • On-Device Documentation Search: Implementing intelligent docs assistants that run models locally without requiring an API key or exposing user data to the cloud.
  • Multi-Agent Systems: Orchestrating complex workflows where multiple specialized agents interact through a unified MCP-based tool registry.