Knowledge Pipeline — Curated External Knowledge for AI Agents
| Metric | Value | Significance |
|---|---|---|
| Chunking Strategy | 3-Level | Document → Section → Paragraph hierarchy preserves semantic meaning |
| Ingestion Sources | arXiv, PDF, manual | Multiple ingestion pipelines for different knowledge sources |
| CLI Commands | ~15 | Full CLI for ingestion, search, graph operations, and exploration |
Challenge
AI agents need curated external knowledge, not raw documents. Feeding PDFs to LLMs with fixed-size chunking destroys semantic boundaries, fragments meaning, and produces retrieval results that lack context.
Solution
3-Level Chunking Pipeline (Document → Section → Paragraph) with semantic boundary detection. Multiple ingestion sources (arXiv, PDF, manual upload). Hybrid search combining semantic, keyword, and graph traversal. MCP server exposing knowledge to any connected AI agent.
Built an MCP server on PostgreSQL + pgvector with FastMCP. Designed hierarchical chunking that preserves document structure. Implemented ingestion pipelines for arXiv (direct API) and PDF (marker-pdf). Built a CLI with ~15 subcommands for ingestion, search, and graph operations. Integrated graph analysis for knowledge visualization.
Semantic Memory: Knowledge Pipeline for AI Agents
Curated External Knowledge, Served to AI Agents via MCP
Semantic Memory is the knowledge module in a multi-agent ecosystem. Where Cognitive Memory handles relational and episodic memory, Semantic Memory ingests and serves curated external knowledge — research papers, PDFs, domain-specific sources. Think NotebookLM, but built to serve AI agents directly via MCP.
The Problem: LLMs Need Curated Knowledge, Not Raw Documents
Feeding raw PDFs to LLMs destroys semantic boundaries. Fixed-size chunking breaks paragraphs mid-sentence, fragments meaning, and produces retrieval results that lack context. AI agents need knowledge that preserves the structure of the original source.
The Solution: 3-Level Chunking with Hybrid Search
Semantic Memory implements hierarchical chunking (Document → Section → Paragraph) with semantic boundary detection. Each chunk preserves its position in the document hierarchy. Hybrid search combines semantic, keyword, and graph traversal for optimal retrieval.
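The hierarchy-preserving idea can be sketched in a few lines. This is a minimal illustration, not the project's actual chunker: it assumes markdown input, `## ` headings as section boundaries, and blank lines as paragraph boundaries, and simply tags every paragraph chunk with its document and section so retrieval results keep their context.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    """A paragraph-level chunk that remembers its place in the hierarchy."""
    document: str
    section: str
    paragraph: int
    text: str

def chunk_document(title: str, markdown: str) -> list[Chunk]:
    """Split markdown into Document → Section → Paragraph chunks.

    Sections are delimited by '## ' headings, paragraphs by blank lines.
    """
    chunks: list[Chunk] = []
    section = "Preamble"
    buffer: list[str] = []
    para_idx = 0

    def flush() -> None:
        nonlocal para_idx
        text = " ".join(buffer).strip()
        if text:
            chunks.append(Chunk(title, section, para_idx, text))
            para_idx += 1
        buffer.clear()

    for line in markdown.splitlines():
        if line.startswith("## "):      # section boundary
            flush()
            section = line[3:].strip()
            para_idx = 0                # paragraph numbering restarts per section
        elif not line.strip():          # paragraph boundary
            flush()
        else:
            buffer.append(line.strip())
    flush()
    return chunks
```

A real pipeline would detect boundaries semantically (embeddings, layout cues from marker-pdf) rather than by blank lines, but the key property is the same: each chunk carries its `(document, section, paragraph)` coordinates instead of being an anonymous window of tokens.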
Key Features
- 3-Level Chunking: Document → Section → Paragraph hierarchy preserves semantic meaning
- Multiple Ingestion Sources: arXiv (direct API), PDF (marker-pdf), manual upload with metadata editing
- Hybrid Search: Semantic + keyword + graph traversal with RRF ranking
- MCP Server: Exposes knowledge to any connected AI agent
- CLI: ~15 subcommands for ingestion, search, graph operations, category management
- Graph Analysis: Knowledge visualization via graphviz
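The RRF ranking mentioned above merges the result lists from the different retrievers. A minimal sketch of standard Reciprocal Rank Fusion (the function name and the three-list setup are illustrative, not the project's API):

```python
from collections import defaultdict

def rrf_merge(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: merge ranked ID lists from several retrievers.

    Each document scores sum(1 / (k + rank)) over every ranking it appears in.
    k = 60 is the constant from the original RRF paper; documents ranked well
    by multiple retrievers rise above documents ranked well by only one.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# e.g. semantic, keyword, and graph-traversal results merged into one ranking:
merged = rrf_merge([["a", "b", "c"], ["b", "c", "d"], ["b", "e"]])
```

Because RRF works on ranks rather than raw scores, it needs no calibration between the cosine distances of the semantic retriever and the relevance scores of the keyword retriever.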
Technical Stack
- Python, FastMCP
- PostgreSQL + pgvector
- OpenAI embeddings, sentence-transformers
- FastAPI, Typer (CLI)
- arXiv API, marker-pdf
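For the pgvector piece of the stack, nearest-neighbour retrieval is a plain SQL query using pgvector's cosine-distance operator `<=>`. The helper below only builds the query string; the table and column names (`chunks`, `embedding`, `section`) are illustrative, not the project's schema.

```python
def ann_search_sql(table: str = "chunks", limit: int = 5) -> str:
    """Build a pgvector cosine-distance query against a chunks table.

    '<=>' is pgvector's cosine-distance operator; the query vector is
    passed as a bind parameter at execution time.
    """
    return (
        f"SELECT id, section, text, embedding <=> %(query_vec)s AS distance "
        f"FROM {table} "
        f"ORDER BY embedding <=> %(query_vec)s "
        f"LIMIT {limit}"
    )
```

With an index such as HNSW or IVFFlat on the `embedding` column, the same query runs as an approximate nearest-neighbour search instead of a sequential scan.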
Technologies & Skills Demonstrated: Knowledge Pipeline, MCP Server, Document Processing, Semantic Chunking, PostgreSQL, pgvector, Hybrid Search, CLI Design
Timeline: 2025 — ongoing | Role: Architect & Developer
Impact
Knowledge module complementing Cognitive Memory in a multi-agent ecosystem. Semantic Memory (SM) provides curated external knowledge; Cognitive Memory (CM) provides relational and episodic memory. Both serve the same agents (tethr, I/O, njord) via MCP.
Key Learnings
- 3-level chunking preserves meaning that fixed-size approaches destroy — Document → Section → Paragraph hierarchy respects how knowledge is structured, not just how tokens are counted.
- The distinction between knowledge and memory matters architecturally — external facts (SM) and personal experience (CM) need different storage, retrieval, and update semantics.
- CLI-first design accelerated development — ingesting, searching, and debugging via terminal before building the MCP layer meant the core logic was solid before any agent connected.