Memory & Context

Intelligent memory management for AI applications. Conversation history, semantic search, temporal decay, and automatic compression.

npm install @rana/memory

Conversation Memory

Persistent memory that tracks conversation history across sessions

import { ConversationMemory } from '@rana/memory';

const memory = new ConversationMemory({
  maxMessages: 100,
  storage: 'redis',  // or 'postgresql', 'memory'
  ttl: '7d'
});

// Add messages
await memory.add({
  role: 'user',
  content: 'What is machine learning?'
});

await memory.add({
  role: 'assistant',
  content: 'Machine learning is a subset of AI...'
});

// Get conversation history for a session (sessionId is your own
// conversation identifier, e.g. a chat or user id)
const history = await memory.getHistory(sessionId);

// Clear memory for that session
await memory.clear(sessionId);
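
The history returned by getHistory is an array of role/content messages in the same shape as the ones added above, so it can be passed directly to a chat model. A minimal sketch, assuming the OpenAI SDK as the model client (the client is not part of @rana/memory, and the exact history typing is an assumption):

import OpenAI from 'openai';
import { ConversationMemory } from '@rana/memory';

const openai = new OpenAI();
const memory = new ConversationMemory({ maxMessages: 100, storage: 'memory' });

async function reply(sessionId: string, userMessage: string) {
  // Persist the user turn, then load the full history for the prompt
  await memory.add({ role: 'user', content: userMessage });
  const history = await memory.getHistory(sessionId);

  const completion = await openai.chat.completions.create({
    model: 'gpt-4o-mini',
    messages: history   // [{ role, content }, ...]
  });

  const answer = completion.choices[0].message.content ?? '';
  await memory.add({ role: 'assistant', content: answer });
  return answer;
}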

Semantic Memory

Long-term memory with semantic search and retrieval

import { SemanticMemory } from '@rana/memory';

const memory = new SemanticMemory({
  vectorStore: 'pinecone',
  embeddingModel: 'text-embedding-3-small',
  namespace: 'user-memories'
});

// Store a memory
await memory.store({
  content: 'User prefers dark mode and concise responses',
  metadata: { userId: '123', type: 'preference' }
});

// Search memories by meaning
const relevant = await memory.search(
  'What display settings does the user prefer?',
  { limit: 5, threshold: 0.7 }
);

// Get memories by metadata
const preferences = await memory.findByMetadata({
  userId: '123',
  type: 'preference'
});
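
The threshold option is a minimum similarity score: matches whose embedding similarity to the query falls below it are dropped. As an illustration of what that score means (the use of cosine similarity here is an assumption, not the library's internals):

// Illustrative only: cosine similarity between two embedding vectors.
// With threshold: 0.7, only matches scoring at least 0.7 are returned.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}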

Working Memory

Short-term context for the current conversation

import { WorkingMemory } from '@rana/memory';

const working = new WorkingMemory({
  maxTokens: 4000,
  compressionStrategy: 'summarize'
});

// Add context
working.add({
  type: 'system',
  content: 'You are a helpful assistant'
});

working.add({
  type: 'context',
  content: 'User is asking about their recent order'
});

// Get optimized context for prompt
const context = await working.getContext();
// Automatically summarizes/compresses if too long

// Clear working memory
working.clear();

Temporal Memory

Time-aware memory with decay and importance scoring

import { TemporalMemory } from '@rana/memory';

const temporal = new TemporalMemory({
  decayRate: 0.1,      // Memories fade over time
  boostOnAccess: true, // Accessing memory strengthens it
  importanceThreshold: 0.3
});

// Store with importance
await temporal.store({
  content: 'User birthday is March 15',
  importance: 1.0  // High importance, decays slower
});

await temporal.store({
  content: 'User mentioned they like coffee',
  importance: 0.5  // Medium importance
});

// Get current memories (filtered by decay)
const active = await temporal.getActive();

// Manually boost a memory by its id
await temporal.boost(memoryId, 0.5);
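
A common way to combine decayRate, importance, and importanceThreshold is an exponential-decay score, with boostOnAccess resetting the clock. The sketch below illustrates the idea only; it is not the library's exact formula:

// Illustrative scoring only -- the library's actual formula may differ.
// A memory stays "active" while its decayed score is above the threshold.
function effectiveScore(
  importance: number,   // 0..1, set when the memory is stored
  ageInDays: number,    // time since the memory was stored or last boosted
  decayRate: number     // e.g. 0.1 from the config above
): number {
  return importance * Math.exp(-decayRate * ageInDays);
}

// With decayRate 0.1 and importanceThreshold 0.3:
// importance 1.0 drops below 0.3 after roughly 12 days,
// importance 0.5 after roughly 5 days (unless boosted).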

Memory Search

Unified search across all memory types

import { MemoryManager } from '@rana/memory';

// Wraps the ConversationMemory, SemanticMemory, and TemporalMemory
// instances created in the sections above
const manager = new MemoryManager({
  memories: {
    conversation: conversationMemory,
    semantic: semanticMemory,
    temporal: temporalMemory
  }
});

// Search across all memories
const results = await manager.search('user preferences', {
  types: ['semantic', 'temporal'],
  limit: 10,
  deduplicate: true
});

// Get unified context for a prompt
const context = await manager.getContext({
  query: 'What does the user prefer?',
  maxTokens: 2000,
  includeRecent: true
});
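
The returned context is typically injected into the prompt before calling a model. A sketch, assuming getContext resolves to a plain string and using the OpenAI SDK as the client (both are assumptions, not part of this reference):

import OpenAI from 'openai';

const openai = new OpenAI();

// 'context' comes from manager.getContext(...) above;
// assumed here to be a plain string of relevant memories.
const completion = await openai.chat.completions.create({
  model: 'gpt-4o-mini',
  messages: [
    { role: 'system', content: `Relevant memories about this user:\n${context}` },
    { role: 'user', content: 'What does the user prefer?' }
  ]
});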

Memory Compression

Intelligent compression and summarization of memories

import { MemoryCompressor } from '@rana/memory';

const compressor = new MemoryCompressor({
  model: 'gpt-4o-mini',
  targetTokens: 1000,
  preserveRecent: 5  // Keep last 5 messages intact
});

// Compress conversation history
const compressed = await compressor.compress(history);

// Summarize long conversations
const summary = await compressor.summarize(history, {
  style: 'bullet-points',
  maxLength: 500
});

// Extract key facts
const facts = await compressor.extractFacts(history);
// ['User is a software developer', 'Prefers Python', ...]

Supported Storage Backends

Redis
PostgreSQL
MongoDB
SQLite
Pinecone
Qdrant
Weaviate
In-Memory
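
The conversation backends (Redis, PostgreSQL, MongoDB, SQLite, In-Memory) are selected via the storage option shown above, and the vector backends (Pinecone, Qdrant, Weaviate) via vectorStore; that mapping, and the per-environment split below, are illustrative. Connection details are assumed to come from environment variables or backend-specific options not shown on this page.

import { ConversationMemory, SemanticMemory } from '@rana/memory';

// Illustrative only: switch documented backends per environment.
const isProd = process.env.NODE_ENV === 'production';

const conversation = new ConversationMemory({
  storage: isProd ? 'postgresql' : 'memory',
  maxMessages: 100,
  ttl: '7d'
});

const semantic = new SemanticMemory({
  vectorStore: isProd ? 'pinecone' : 'qdrant',
  embeddingModel: 'text-embedding-3-small',
  namespace: 'user-memories'
});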