LocAI is a fully client-side AI chat application that lets you download, run, and switch between multiple large language models directly in your browser—no signup or remote server required. It caches model weights and conversation history in IndexedDB, uses a Service Worker for offline support, and leverages WebGPU for accelerated inference. Every chat stays private on your device, and your last-used model, open conversation, and slider settings persist across reloads.
- Client-Side Inference: Download and run large language models entirely in your browser via WebGPU—no remote servers or API keys required. A loading-and-streaming sketch follows this list.
- Multi-Model Support: Browse, select, and switch between a growing catalog of MLC-AI models (e.g. Qwen2, Llama-3, Phi-3, Gemma-2) from the https://mlc.ai/models directory.
- Offline-First: A Service Worker caches assets and model weights for offline use; once downloaded, models load instantly without a network connection.
- Persistent Chats: Conversations and messages are stored in IndexedDB; your chat history, open conversation, and slider settings persist across page reloads.
- Real-Time Streaming: Partial responses stream in with live rendering and auto-formatted code blocks; dynamic sliders control temperature, top-p, penalties, max tokens, and choice count.
- WebGPU Enforcement: Detects and blocks unsupported browsers (Firefox, Safari) and guides users to Chrome/Chromium with WebGPU enabled. See the detection sketch after this list.
- Responsive UI: Modernized design with custom components (no external UI library), an adaptive layout for desktop and mobile, and accessible modals for Info, Model Selection, Advanced Settings, and WebGPU errors.
- Session Restore: The last-used model and last open chat are automatically reloaded from localStorage for seamless restarts. See the persistence sketch after this list.
- Privacy & Security: All data remains on-device—clearing browser storage removes everything, ensuring zero-trust, zero-signup privacy.
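For concreteness, the sketch below shows roughly how loading and streaming work with @mlc-ai/web-llm, the inference library LocAI is built on. The model ID, generation parameters, and surrounding wiring are illustrative assumptions, not LocAI's actual code.

```typescript
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Illustrative sketch: model ID and parameters are examples; LocAI's wiring
// may differ. The first call downloads and caches the weights; later loads
// come from browser storage, which is why models start instantly once cached.
const engine = await CreateMLCEngine("Llama-3-8B-Instruct-q4f32_1-MLC", {
  initProgressCallback: (report) => console.log(report.text), // download/compile progress
});

// OpenAI-style streaming chat completion: render each delta as it arrives.
const stream = await engine.chat.completions.create({
  messages: [{ role: "user", content: "Hello!" }],
  stream: true,
  temperature: 0.7, // the kind of knob the Advanced Settings sliders expose
  top_p: 0.9,
});

for await (const chunk of stream) {
  const delta = chunk.choices[0]?.delta?.content ?? "";
  // append `delta` to the message currently being rendered
}
```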
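The WebGPU enforcement implies a capability probe along these lines; this is a minimal sketch using only the standard navigator.gpu API, and the function name and handling are assumptions.

```typescript
// Hypothetical capability check (illustrative, not LocAI's actual guard).
// `navigator.gpu` is the standard WebGPU entry point; it is undefined in
// browsers without support, and requestAdapter() resolves to null when no
// usable GPU adapter exists.
type GPULike = { requestAdapter(): Promise<unknown | null> };

async function hasWebGPU(): Promise<boolean> {
  const gpu = (navigator as Navigator & { gpu?: GPULike }).gpu;
  if (!gpu) return false; // e.g. Firefox/Safari without WebGPU
  return (await gpu.requestAdapter()) !== null;
}

// Usage: gate the app and show the WebGPU error modal when the check fails.
hasWebGPU().then((ok) => {
  if (!ok) console.warn("WebGPU unavailable; please use Chrome/Chromium.");
});
```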
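And a minimal sketch of the session-restore behavior; the storage keys are invented for illustration, and LocAI's actual keys and shape will differ.

```typescript
// Hypothetical persistence helpers (key names are invented for illustration).
const LAST_MODEL_KEY = "locai:last-model";
const LAST_CHAT_KEY = "locai:last-chat";

export function saveSession(modelId: string, chatId: string): void {
  localStorage.setItem(LAST_MODEL_KEY, modelId);
  localStorage.setItem(LAST_CHAT_KEY, chatId);
}

// On startup, restore the last-used model and open conversation, if any.
export function restoreSession(): { modelId: string | null; chatId: string | null } {
  return {
    modelId: localStorage.getItem(LAST_MODEL_KEY),
    chatId: localStorage.getItem(LAST_CHAT_KEY),
  };
}
```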
locai.mp4
A more extensive showcase is available in my portfolio!
- Astro 5: Static-site generator powering the modern, lightning-fast frontend.
- React 19: UI library for building interactive chat components.
- TypeScript: Adds static typing across components, hooks, and data models.
- IndexedDB: Client-side storage of model weights and conversation history.
- Service Worker (PWA): Offline caching of assets and model files. See the caching sketch after this list.
- LocalStorage: Persists the last-used model, open chat, and slider settings.
- @mlc-ai/web-llm: Library for loading and running LLMs in the browser.
- Framer Motion: Animations and transitions for a polished UX.
- react-markdown + remark-gfm: Rendering of Markdown-formatted responses and code blocks. See the rendering sketch after this list.
- react-syntax-highlighter: Code formatting and highlighting during streaming.
- GitHub Actions: CI/CD workflows for builds, tests, and deploys to GitHub Pages.
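As a rough illustration of the Service Worker strategy, here is a cache-first fetch handler; the cache name and matching rules are assumptions, not LocAI's actual worker.

```typescript
/// <reference lib="webworker" />
// sw.ts: minimal cache-first sketch of the offline strategy described above.
// The cache name and caching rules are illustrative; LocAI's actual service
// worker may handle assets and model shards differently.
declare const self: ServiceWorkerGlobalScope;
export {};

const CACHE = "locai-cache-v1"; // hypothetical cache name

self.addEventListener("fetch", (event: FetchEvent) => {
  event.respondWith(
    caches.open(CACHE).then(async (cache) => {
      const cached = await cache.match(event.request);
      if (cached) return cached; // serve from cache when offline or on repeat visits
      const response = await fetch(event.request);
      if (event.request.method === "GET" && response.ok) {
        cache.put(event.request, response.clone()); // store for next time
      }
      return response;
    }),
  );
});
```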
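And a sketch of how react-markdown, remark-gfm, and react-syntax-highlighter typically compose; prop shapes vary across react-markdown versions, so the component below shows the general pattern rather than LocAI's implementation.

```tsx
import ReactMarkdown from "react-markdown";
import remarkGfm from "remark-gfm";
import { Prism as SyntaxHighlighter } from "react-syntax-highlighter";

// Illustrative message renderer (an assumption about the usual pattern,
// not LocAI's actual component).
export function AssistantMessage({ text }: { text: string }) {
  return (
    <ReactMarkdown
      remarkPlugins={[remarkGfm]} // tables, strikethrough, task lists, etc.
      components={{
        code({ className, children }) {
          const lang = /language-(\w+)/.exec(className ?? "")?.[1];
          return lang ? (
            // Fenced block: hand off to the syntax highlighter.
            <SyntaxHighlighter language={lang}>
              {String(children).replace(/\n$/, "")}
            </SyntaxHighlighter>
          ) : (
            <code className={className}>{children}</code>
          );
        },
      }}
    >
      {text}
    </ReactMarkdown>
  );
}
```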
- Docker
- Clone the repository:

  ```bash
  git clone https://github.com/vladimircuriel/locai-chat
  ```

- Navigate to the project directory:

  ```bash
  cd locai-chat
  ```

- Build and run the Docker image:

  ```bash
  docker build -t locai:latest .
  docker run -p 4321:4321 locai:latest
  ```

- Access the application: open your browser and visit http://localhost:4321 to reach the user interface.
- Models must be re-downloaded when switching quantization or major versions, leading to extra wait and disk use.
- Very large models can exhaust device RAM/VRAM and may crash or hang browsers on lower-end hardware.
- Chat history in IndexedDB is unencrypted and tied to the browser; no built-in export or sync across devices.
- Mobile browsers with experimental or missing WebGPU support are blocked entirely—no lightweight fallback.
- No built-in search or filtering within long conversation histories.
- Context window is limited by model token capacity; very long chats may lose early context when truncated (see the sketch after this list).
- No support for custom prompts or system-level instruction templates beyond a single “system” message.
- No collaborative or multi-user features—every session is isolated to the local device.
- Lack of model fine-tuning or personalization options; you’re limited to public pre-trained checkpoints.
- Clearing browser storage (IndexedDB/localStorage) deletes all chats and model caches without warning.
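To make the truncation limitation concrete, here is a naive sketch of budget-based context trimming; the token heuristic and function names are invented for illustration and are not LocAI's actual logic.

```typescript
// Illustrative only: keep the most recent messages that fit a token budget,
// dropping the oldest first. Token counting here is a crude approximation.
type Message = { role: "system" | "user" | "assistant"; content: string };

function approxTokens(text: string): number {
  return Math.ceil(text.length / 4); // rough heuristic: ~4 characters per token
}

function fitToContext(messages: Message[], maxTokens: number): Message[] {
  const kept: Message[] = [];
  let used = 0;
  // Walk from newest to oldest so early context is what gets dropped.
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = approxTokens(messages[i].content);
    if (used + cost > maxTokens) break;
    kept.unshift(messages[i]);
    used += cost;
  }
  return kept;
}
```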