v1.5.1 · Production Ready · Open Source

Your AI forgets.
GLIA remembers.

Persistent memory for every AI coding agent and browser chat. One local database. Zero cloud. Zero subscriptions.

$ npx glia-ai-setup


Works with

Claude

ChatGPT

Gemini

DeepSeek

Grok

Copilot

Mistral

Cursor

Claude Code

Windsurf

VS Code

Production Benchmarks — v1.5.1

Numbers that matter

Audited against 1,000-chunk noise haystacks. No cherry-picked queries.

Recall Accuracy: 90.0% (Web & MCP both)

Context Compression: 81.3% (MCP) to 95.0% (Web) vs raw chunk injection

Project Isolation: 100% (zero cross-tenant leaks)

Graph Ingestion: 4,056 T/s (1,087 triples stress test)

Web Context Engine: PASS
Scale: 1,000-chunk haystack
Recall: 90.0%
Compression: 95.0%

MCP Context Engine: PASS
Scale: 30 queries × 3 phrasings
Recall: 90.0%
Compression: 81.3%

MCP Project Isolation: ELITE
Scale: 10 concurrent projects
Recall: 100%
Compression: n/a

Knowledge Graph Stress: ELITE
Scale: 1,087 triples @ 4,056 T/s
Recall: n/a
Compression: n/a

View full reports in repository

Two modes. One memory.

GLIA runs as a browser extension and an MCP server simultaneously, sharing the same database. Use either or both.

🌐

Web Extension

Claude · ChatGPT · Gemini · DeepSeek + 3 more

For quick, everyday chats. The browser extension silently injects relevant context from your codebase into your prompts on supported sites such as Claude.ai and ChatGPT, so you can chat without copy-pasting.

Auto-intercepts prompts before sending
Prepends relevant project context silently
Save full conversations with one click
Works across 7 AI platforms

claude.ai

[GLIA] Session: AuthService

[GLIA] Injecting 3 context chunks...

[GLIA] Context injected (81% compression)
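A compression figure like the 81% above can be read as the share of retrieved text that was trimmed away before injection. The sketch below is an illustration only: it assumes size is measured in characters, and `compression_ratio` is a hypothetical helper, not part of GLIA.

```python
def compression_ratio(raw_chunks, injected_text):
    """Fraction of the raw retrieved text that was NOT injected.

    Assumption: size is counted in characters; a real system may count tokens.
    """
    raw_size = sum(len(chunk) for chunk in raw_chunks)
    if raw_size == 0:
        return 0.0
    return 1.0 - len(injected_text) / raw_size


# Hypothetical sizes: 1,000 characters retrieved, 190 characters injected.
raw_chunks = ["a" * 600, "b" * 400]
injected_text = "c" * 190

print(f"{compression_ratio(raw_chunks, injected_text):.0%}")  # 81%
```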

⌨️

MCP Server

Claude Desktop · Cursor · Windsurf · VS Code

For local development. The MCP Server hooks directly into your code editor, allowing the AI to recall memories automatically based on your current project path.

Native tools: recall_context, store_memory
search_memory searches across all projects
Auto-identifies project from working directory
Zero-Docker — single SQLite file
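One way a tool could identify a project from the working directory is to walk upward until it finds a project marker. This is a minimal sketch under that assumption. The marker list and the `guess_project_name` helper are hypothetical and do not describe GLIA's actual logic.

```python
from pathlib import Path

# Assumed markers of a project root; purely illustrative.
PROJECT_MARKERS = (".git", "package.json", "pyproject.toml")


def guess_project_name(start):
    """Return the name of the nearest ancestor directory that holds a marker."""
    start = Path(start).resolve()
    for directory in (start, *start.parents):
        if any((directory / marker).exists() for marker in PROJECT_MARKERS):
            return directory.name
    # No marker found anywhere: fall back to the starting directory's name.
    return start.name


print(guess_project_name(Path.cwd()))
```

Resolving the project from the path means the editor never has to pass a project name explicitly. Opening a different folder is enough to switch memory scope.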

mcp_servers config: claude_desktop_config.json

{
  "mcpServers": {
    "glia": {
      "command": "node",
      "args": ["/path/to/Glia-AI/backend/dist/mcp/server.js"],
      "env": {
        "GLIA_STORAGE_MODE": "sqlite",
        "SQLITE_DB_PATH": "/path/to/Glia-AI/backend/glia.db"
      }
    }
  }
}

Browser Chat (claude.ai, chatgpt.com...) <-> glia.db (Shared SQLite) <-> Coding Tool (Cursor, Claude Code...)

Both interfaces read and write the same database. Save in ChatGPT, recall in Cursor. Instantly.

Everything your AI needs to actually remember

Not a wrapper. Not a cloud service. A local memory infrastructure that plugs into every tool you already use.

Core

Hybrid RAG Engine

Three search layers fused: Sentence Vector + Chunk Vector + FTS5 keyword. Surgical trimming returns only the matching sentences.
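The page does not say how the three layers are fused. A common choice for merging ranked lists is reciprocal rank fusion (RRF), sketched here as an assumption. The chunk IDs and the `reciprocal_rank_fusion` helper are hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several best-first lists of chunk IDs into one ranking.

    Each chunk earns 1 / (k + rank) from every list it appears in, so
    chunks found by several layers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical results from the three layers, best match first.
sentence_hits = ["c7", "c2", "c9"]
chunk_hits = ["c2", "c7", "c4"]
keyword_hits = ["c2", "c5"]

fused = reciprocal_rank_fusion([sentence_hits, chunk_hits, keyword_hits])
print(fused)  # ['c2', 'c7', 'c5', 'c4', 'c9']
```

Rank-based fusion avoids having to normalize vector distances against keyword scores, which live on different scales.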

Extension

7 AI Platforms

Auto-intercepts prompts on Claude, ChatGPT, Gemini, DeepSeek, Grok, Copilot, and Mistral. No copy-paste required.

MCP

Native MCP Tools

recall_context, store_memory, search_memory, list_projects, identify_project and more — native tool calls in every coding agent.

Architecture

Shared Memory Bridge

Memory saved in a browser chat is instantly available in your coding tool. One SQLite database. Two interfaces.

Security

100% Project Isolation

recall_context is SQL-scoped to the project. Project A's data never leaks into Project B, even with semantically similar queries.
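SQL scoping can be pictured as a `WHERE project_id = ?` clause applied before any matching happens. The sketch below uses a hypothetical `memories` table and a hypothetical `recall` helper with a plain `LIKE` match. It is not GLIA's schema or its `recall_context` implementation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: every memory row carries the project it belongs to.
conn.execute(
    "CREATE TABLE memories (project_id TEXT NOT NULL, content TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO memories (project_id, content) VALUES (?, ?)",
    [
        ("project-a", "AuthService issues JWT tokens"),
        ("project-b", "AuthService uses session cookies"),
    ],
)


def recall(conn, project_id, term):
    """Return matching memories, restricted to a single project."""
    rows = conn.execute(
        "SELECT content FROM memories "
        "WHERE project_id = ? AND content LIKE ? ORDER BY rowid",
        (project_id, f"%{term}%"),
    )
    return [content for (content,) in rows]


# Both projects mention AuthService, but only project-a's row comes back.
print(recall(conn, "project-a", "AuthService"))
```

Because the filter lives in the query itself, a similar memory in another project is never a candidate, no matter how well it matches.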

Sync & Share

Portable JSON Sessions

Share context across machines or with teammates instantly. Download any project session as a clean JSON file and import it on another PC.

Infra

Zero-Docker Mode

Set GLIA_STORAGE_MODE=sqlite and eliminate all containers. SQLite + sqlite-vec delivers full RAG on any machine.

Graph

Knowledge Graph

Conversations are extracted into a D3 force-directed graph of entities and relationships. Browse your project's architecture visually.
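A D3 force layout typically consumes a list of nodes and a list of links whose `source` and `target` refer to node IDs. The sketch below converts (subject, relation, object) triples into that shape. The triples and the `triples_to_graph` helper are hypothetical examples, not GLIA's extraction output.

```python
import json


def triples_to_graph(triples):
    """Convert (subject, relation, object) triples into nodes and links."""
    nodes = {}
    links = []
    for subject, relation, obj in triples:
        # Register each entity once, keeping first-seen order.
        for entity in (subject, obj):
            nodes.setdefault(entity, {"id": entity})
        links.append({"source": subject, "target": obj, "label": relation})
    return {"nodes": list(nodes.values()), "links": links}


# Hypothetical triples extracted from a conversation.
triples = [
    ("AuthService", "uses", "JWT"),
    ("AuthService", "calls", "UserRepository"),
]

graph = triples_to_graph(triples)
print(json.dumps(graph, indent=2))
```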

RAG

HyDE Retrieval

Hypothetical Document Embeddings generate a synthetic answer to your query, then search by that embedding — improving recall on rephrased queries.
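The HyDE idea can be sketched in a few lines: draft a hypothetical answer, embed it, and rank documents against that embedding instead of the raw query. Everything here is a toy stand-in. The bag-of-words `embed` replaces a neural encoder and `generate_hypothetical_answer` replaces an LLM call. Both are assumptions for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding; a real system would use a neural encoder."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def generate_hypothetical_answer(query):
    """Stand-in for an LLM that drafts a plausible answer to the query."""
    canned = {
        "how do users log in": "users authenticate with jwt tokens issued by authservice"
    }
    return canned.get(query.lower().rstrip("?"), query)


def hyde_search(query, documents):
    # Search with the embedding of the hypothetical answer, not the query.
    query_vec = embed(generate_hypothetical_answer(query))
    return max(documents, key=lambda doc: cosine(query_vec, embed(doc)))


documents = [
    "AuthService issues JWT tokens after password checks",
    "The billing module exports invoices as PDF",
]

best = hyde_search("How do users log in?", documents)
print(best)
```

In this toy setup the raw question shares no words with either document, while the hypothetical answer overlaps with the first one. That is the gap HyDE is meant to close for rephrased queries.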

Stop re-explaining yourself.

One command. Persistent memory across every AI tool you use. Runs entirely on your machine.