Cognitive Memory
Production MCP Server — Persistent AI Memory with Graph, Search, and Validation
| Metric | Value | Significance |
|---|---|---|
| MCP Tools | 36 | Memory storage, insight management, graph operations, validation, admin |
| Test Cases | ~2,938 | 180 test files covering unit, integration, and validation |
| Migrations | 49 | Database schema evolution with RLS enforcement |
Challenge
LLMs forget everything between sessions. Building AI systems that maintain context, track relationships between people and concepts, and evolve their understanding over time requires more than a vector database — it requires structured, multi-layered memory with search, validation, and security isolation.
Solution
A production MCP server on PostgreSQL + pgvector exposing 36 tools and 5 resources. Five memory layers: raw transcripts, compressed insights, working memory, episodic memory, and archive. Hybrid-RAG combining semantic search (60%), keyword matching (20%), and graph traversal (20%) via RRF fusion. GraphRAG for relational knowledge. Dual-Judge evaluation (GPT-4o + Haiku) with Cohen's Kappa calculation for quality validation. Row-Level Security for multi-project isolation.
Built an MCP server using FastMCP 2.14 with PostgreSQL + pgvector on Neon Cloud. Designed five memory layers with different persistence and consent semantics. Implemented GraphRAG with nodes, edges, and traversal for relational queries. Added Dual-Judge evaluation for automated quality validation. Enforced Row-Level Security so each project (tethr, I/O, njord, semantic-memory) sees only its own data. Deployed as systemd service with watchdog heartbeat and auto-restart.
Cognitive Memory: Production MCP Server — Persistent AI Memory
The Infrastructure Behind Everything Else
Cognitive Memory is a production MCP server that provides persistent memory to every AI system I build. Running on PostgreSQL + pgvector with 36 MCP tools and 5 MCP resources, it serves as the shared memory layer for tethr, I/O System, njord, and semantic-memory.
The Problem: AI Systems Need Persistent, Structured Memory
LLMs forget everything between sessions. Building AI systems that maintain context, track relationships, and evolve over time requires infrastructure — not just a vector database, but a multi-layered memory system with search, graph traversal, validation, and security isolation.
The Solution: Multi-Layer Memory with Graph and Validation
Cognitive Memory implements five memory layers: L0 (raw transcripts), L2 (compressed insights), working memory (session context), episodic memory (behavioral learning), and stale memory (archive). Hybrid-RAG combines 60% semantic search, 20% keyword matching, and 20% graph traversal with Reciprocal Rank Fusion.
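The weighted fusion step can be sketched in a few lines. This is a minimal illustration of Reciprocal Rank Fusion with the 60/20/20 weights described above; the `weighted_rrf` function, channel names, and document IDs are illustrative stand-ins, not the server's actual code:

```python
def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked result lists with weighted Reciprocal Rank Fusion.

    rankings: dict of channel name -> ordered list of doc IDs (best first)
    weights:  dict of channel name -> fusion weight (here 0.6 / 0.2 / 0.2)
    k:        RRF smoothing constant (60 is the common default)
    """
    scores = {}
    for channel, ranked in rankings.items():
        w = weights[channel]
        for rank, doc_id in enumerate(ranked, start=1):
            # Each channel contributes w / (k + rank) for every doc it returns.
            scores[doc_id] = scores.get(doc_id, 0.0) + w / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fused = weighted_rrf(
    {
        "semantic": ["m1", "m2", "m3"],
        "keyword":  ["m3", "m1"],
        "graph":    ["m4", "m1"],
    },
    weights={"semantic": 0.6, "keyword": 0.2, "graph": 0.2},
)
print(fused[0])  # "m1" wins: it appears in all three channels
```

Because each channel only contributes a rank-discounted score, a document that shows up in all three lists outranks one that tops a single list, which is the behavior the learnings below describe.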
Key Features
- 36 MCP Tools: Memory storage, insight management, graph operations, validation, admin
- 5 MCP Resources: Read-only state exposure via RFC 6570 URIs
- GraphRAG: Nodes and edges with UUID primary keys, JSONB properties, graph traversal
- Dual-Judge Evaluation: GPT-4o + Haiku for quality validation with Cohen's Kappa calculation
- Row-Level Security: Multi-project isolation — each project sees only its own data
- SMF (Safety Mechanism Framework): Proposal-based memory changes with approval workflow
- systemd Service: Watchdog heartbeat, auto-restart, production deployment on Neon Cloud
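The GraphRAG feature above stores nodes with UUID primary keys and JSONB properties, and answers relational queries by walking edges. The idea can be sketched with an in-memory stand-in for the Postgres tables; the `add_node` and `traverse` helpers and the sample data are hypothetical, for illustration only:

```python
import uuid
from collections import deque

# In-memory stand-in for the graph tables: nodes keyed by UUID with a
# properties dict (JSONB in Postgres), edges as (source, relation, target).
nodes, edges = {}, []

def add_node(props):
    node_id = str(uuid.uuid4())
    nodes[node_id] = props
    return node_id

alice = add_node({"type": "person", "name": "Alice"})
acme = add_node({"type": "org", "name": "Acme"})
proj = add_node({"type": "project", "name": "tethr"})
edges += [(alice, "works_at", acme), (acme, "runs", proj)]

def traverse(start, max_depth=2):
    """Breadth-first walk collecting nodes reachable within max_depth hops."""
    seen, queue, found = {start}, deque([(start, 0)]), []
    while queue:
        node_id, depth = queue.popleft()
        if depth == max_depth:
            continue
        for src, rel, dst in edges:
            if src == node_id and dst not in seen:
                seen.add(dst)
                found.append((rel, nodes[dst]["name"]))
                queue.append((dst, depth + 1))
    return found

print(traverse(alice))  # [('works_at', 'Acme'), ('runs', 'tethr')]
```

A two-hop traversal like this answers questions a flat vector search cannot, such as "which projects is Alice connected to through her organization".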
Technical Stack
- Python, FastMCP 2.14
- PostgreSQL + pgvector (Neon Cloud)
- OpenAI embeddings (text-embedding-3-small, 1536 dimensions)
- Anthropic Haiku + GPT-4o (Dual-Judge)
- 180 test files, ~2,938 test cases
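The Dual-Judge setup in the stack above reports inter-rater agreement as Cohen's Kappa: observed agreement corrected for the agreement the two judges would reach by chance. A self-contained sketch of that calculation, with made-up pass/fail verdicts for illustration:

```python
from collections import Counter

def cohens_kappa(judge_a, judge_b):
    """Cohen's Kappa for two raters over the same items: (p_o - p_e) / (1 - p_e)."""
    assert len(judge_a) == len(judge_b)
    n = len(judge_a)
    # Observed agreement: fraction of items both judges labelled the same.
    p_o = sum(a == b for a, b in zip(judge_a, judge_b)) / n
    # Expected chance agreement from each judge's marginal label frequencies.
    freq_a, freq_b = Counter(judge_a), Counter(judge_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical GPT-4o vs Haiku verdicts on ten memory-quality checks.
gpt4o = ["pass", "pass", "fail", "pass", "pass", "fail", "pass", "pass", "fail", "pass"]
haiku = ["pass", "pass", "fail", "pass", "fail", "fail", "pass", "pass", "pass", "pass"]
print(round(cohens_kappa(gpt4o, haiku), 3))  # 0.524
```

Here the judges agree on 8 of 10 items (p_o = 0.8), but chance alone would produce 0.58 agreement given their label frequencies, so Kappa lands at roughly 0.52: moderate agreement rather than the 0.8 the raw number suggests.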
Technologies & Skills Demonstrated: MCP Server Development, PostgreSQL, pgvector, RAG Architecture, GraphRAG, Row-Level Security, Multi-Tenant Design, Python, Production Deployment
Timeline: 2025 — ongoing | Role: Architect & Developer
Screenshots
- Backend
- Tools & Services
- Database
- AI Stack Connections
Impact
Production memory layer serving four active projects. 180 test files with ~2,938 test cases. 49 database migrations. Running as systemd service on Neon Cloud. The system that tethr, I/O System, njord, and semantic-memory all depend on for persistent memory.
Key Learnings
- Multi-project isolation is non-negotiable — without Row-Level Security, one project's data leaks into another's context. RLS enforcement caught real bugs during migration.
- Five memory layers solved different problems: raw transcripts for audit, compressed insights for retrieval, working memory for session state, episodic for behavioral learning, stale for cleanup.
- Hybrid-RAG outperforms any single retrieval method — semantic search alone misses keyword-specific queries, keyword alone misses meaning, graph alone misses content. RRF fusion combines all three.