Cost Optimization

Optimize GraphMem costs for production.

Cost Breakdown

Typical Monthly Costs

Component	Cloud Service	Self-Hosted
Neo4j	$500-2,500 (Aura)	$300-500 (EC2)
Redis	$200-500 (ElastiCache)	$100-200 (EC2)
LLM API	$500-5,000	$500-5,000
Embeddings API	$100-1,000	$0-100 (local)
Compute	$300-1,000	$200-500
Total	$1,600-10,000	$1,100-6,300
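As a quick check on the totals, the ranges in the table can be summed programmatically. This is a minimal sketch: the figures are copied from the table above, and the helper names `total_range` and `largest_component` are illustrative, not part of GraphMem.

```python
# Monthly cost ranges in USD, copied from the table above: (low, high).
CLOUD = {
    "Neo4j": (500, 2500),
    "Redis": (200, 500),
    "LLM API": (500, 5000),
    "Embeddings API": (100, 1000),
    "Compute": (300, 1000),
}
SELF_HOSTED = {
    "Neo4j": (300, 500),
    "Redis": (100, 200),
    "LLM API": (500, 5000),
    "Embeddings API": (0, 100),
    "Compute": (200, 500),
}


def total_range(costs):
    """Sum the low and high ends of every component range."""
    low = sum(lo for lo, _ in costs.values())
    high = sum(hi for _, hi in costs.values())
    return low, high


def largest_component(costs):
    """Return the component with the highest upper-bound cost."""
    return max(costs, key=lambda name: costs[name][1])


print(total_range(CLOUD))
print(total_range(SELF_HOSTED))
print(largest_component(CLOUD))
```

In both columns the LLM API has the largest upper bound, which is why the model choice and caching strategies tend to matter most.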

Optimization Strategies

1. Use Turso Instead of Neo4j

For smaller deployments, Turso offers:

- Free tier available
- No server costs
- Native vector search

# Option A: local file, zero cost
config = MemoryConfig(turso_db_path="memory.db")

# Option B: Turso cloud, low cost
config = MemoryConfig(turso_url="https://your-db.turso.io")

Savings: $200-2,500/month

2. Use Local Embeddings

Replace API-based embeddings with local models:

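# Install first (shell): pip install sentence-transformers

from sentence_transformers import SentenceTransformer

# Load a local embedding model (weights are downloaded on first use)
model = SentenceTransformer('BAAI/bge-large-en-v1.5')

# Embed text locally, with no per-request API charge
embeddings = model.encode(["GraphMem stores facts as a graph."])

# Up to ~100x cheaper than API embeddings, and faster at high volume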

Savings: $100-1,000/month (at scale)

3. Use GPT-4o-mini Instead of GPT-4

config = MemoryConfig(
    llm_model="gpt-4o-mini",  # at least 10x cheaper per token than GPT-4-class models
)

Savings: roughly 90% or more on LLM costs, depending on the model being replaced
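To estimate the saving for a specific workload, per-token prices can be plugged into a small calculator. The sketch below assumes list prices in USD per 1M tokens (input, output) that were correct at one point in time; they are assumptions, so check the provider's pricing page. The default token counts per query are also illustrative.

```python
# Assumed list prices in USD per 1M tokens: (input, output).
# Verify against the provider's current pricing before relying on them.
PRICES_PER_1M = {
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (2.50, 10.00),
}


def llm_cost(model, input_tokens, output_tokens):
    """Cost in USD of one call with the given token counts."""
    in_price, out_price = PRICES_PER_1M[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def monthly_cost(model, queries, input_tokens=2000, output_tokens=300):
    """Cost in USD of `queries` calls that all have the same token counts."""
    return queries * llm_cost(model, input_tokens, output_tokens)


for name in PRICES_PER_1M:
    print(name, round(monthly_cost(name, queries=1_000_000), 2))
```

The ratio between models depends on the input/output mix, so it is worth running this with token counts measured from real traffic.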

4. Redis Caching

Caching query results and embeddings in Redis avoids repeated API calls for inputs that have already been seen:

config = MemoryConfig(
    redis_url="redis://...",  # Cache queries and embeddings
)

Impact: 50-80% reduction in API calls, depending on how often inputs repeat
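The mechanism behind that reduction can be sketched with a plain dictionary standing in for Redis. This is not GraphMem's internal cache: `CachedEmbedder` and `fake_embed` are hypothetical names used only to show how keying results by a hash of the input avoids repeated paid calls.

```python
import hashlib


class CachedEmbedder:
    """Wrap an embedding function with a cache keyed by a hash of the text."""

    def __init__(self, embed_fn, cache=None):
        self.embed_fn = embed_fn  # the expensive call, e.g. an API request
        self.cache = {} if cache is None else cache  # dict here, Redis in production
        self.api_calls = 0

    def embed(self, text):
        key = "emb:" + hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key in self.cache:
            return self.cache[key]  # cache hit: no API call
        self.api_calls += 1
        vector = self.embed_fn(text)
        self.cache[key] = vector
        return vector


def fake_embed(text):
    """Stand-in for a paid embedding API call."""
    return [float(len(text))]


embedder = CachedEmbedder(fake_embed)
queries = ["what is graphmem?", "pricing", "what is graphmem?", "pricing", "setup"]
vectors = [embedder.embed(q) for q in queries]
print(f"{embedder.api_calls} API calls for {len(queries)} queries")
```

The actual hit rate depends entirely on how repetitive the workload is, so it is worth measuring before counting on a particular saving.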

5. Batch Ingestion

Batch ingestion reduces per-document overhead:

# ❌ Expensive: Sequential ingestion
for doc in documents:
    memory.ingest(doc)

# ✅ Cheaper: Batch ingestion
memory.ingest_batch(documents, max_workers=20)

Impact: 3-5x faster, lower API costs
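How `ingest_batch` works internally is not shown here. As a rough illustration of where the gains could come from, the sketch below assumes two common techniques: running per-document work on a thread pool, and grouping inputs so that one request covers many documents. `ingest_concurrently` and `chunked` are hypothetical helpers, not GraphMem functions.

```python
from concurrent.futures import ThreadPoolExecutor


def ingest_concurrently(ingest_fn, documents, max_workers=20):
    """Run ingest_fn over documents on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(ingest_fn, documents))


def chunked(items, size):
    """Split items into lists of at most `size`, e.g. one API request per chunk."""
    return [items[i:i + size] for i in range(0, len(items), size)]


docs = [f"document {i}" for i in range(1000)]

# Concurrency hides network latency: many requests are in flight at once.
lengths = ingest_concurrently(len, docs, max_workers=20)

# Grouping cuts request count: 1000 documents become 10 requests of 100.
requests = chunked(docs, 100)
print(len(lengths), len(requests))
```

Concurrency mainly improves throughput, while grouping inputs per request is what reduces request overhead.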

Cost Calculator

Per 1M Documents

Step	OpenAI	Azure	Local
Embeddings	$130	$130	$10
Extraction	$2,000	$2,000	$2,000
Storage	$50	$50	$20
Total	$2,180	$2,180	$2,030
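The embeddings figure can be reproduced with simple arithmetic. The sketch below assumes an average of 1,000 tokens per document and an embedding price of $0.13 per 1M tokens; both values are assumptions chosen to illustrate the calculation, not figures stated in the table. It also shows how much of the OpenAI column's total comes from extraction.

```python
def embedding_cost(num_docs, tokens_per_doc, price_per_1m_tokens):
    """Total embedding cost in USD for a corpus."""
    total_tokens = num_docs * tokens_per_doc
    return total_tokens / 1_000_000 * price_per_1m_tokens


def share(costs, step):
    """Fraction of the total cost contributed by one step."""
    return costs[step] / sum(costs.values())


# OpenAI column from the table above (USD per 1M documents).
PER_1M_DOCS = {"Embeddings": 130, "Extraction": 2000, "Storage": 50}

# Assumed inputs: 1,000 tokens per document at $0.13 per 1M tokens.
print(round(embedding_cost(1_000_000, 1000, 0.13), 2))
print(f"Extraction share: {share(PER_1M_DOCS, 'Extraction'):.0%}")
```

Because extraction dominates the total, switching to local embeddings alone changes the per-document cost only slightly. The LLM used for extraction is the larger lever.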

Per 1M Queries

Component	Cached	Uncached
Retrieval	$0	$50
LLM Answer	$200	$200
Total	$200	$250
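Real traffic sits between the two columns, so a blended estimate is often more useful. The sketch below assumes, as the table does, that a cache hit removes only the retrieval cost and that the LLM answer is always paid for. `cost_per_million_queries` is an illustrative helper, not a GraphMem function.

```python
def cost_per_million_queries(cache_hit_rate, retrieval_uncached=50.0, llm_answer=200.0):
    """Blend the cached and uncached columns by cache hit rate (0.0 to 1.0)."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    retrieval = (1.0 - cache_hit_rate) * retrieval_uncached
    return retrieval + llm_answer


for rate in (0.0, 0.5, 0.8, 1.0):
    print(rate, cost_per_million_queries(rate))
```

Under these assumptions the LLM answer sets a floor of $200 per 1M queries, so retrieval caching can save at most $50 of the $250.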

Budget Tiers

Startup (< $500/month)

config = MemoryConfig(
    llm_model="gpt-4o-mini",
    turso_db_path="memory.db",
    # No Redis (use in-memory)
)

Growth ($500-2,000/month)

config = MemoryConfig(
    llm_model="gpt-4o-mini",
    neo4j_uri="neo4j+s://...",  # Aura Free/Basic
    redis_url="redis://...",    # Small Redis
)

Enterprise ($2,000+/month)

config = MemoryConfig(
    llm_model="gpt-4o",         # Best quality
    neo4j_uri="neo4j+s://...",  # Aura Professional
    redis_url="redis://...",    # Redis Enterprise
)
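The three tiers can also be expressed as data, which makes it easy to select settings from a budget figure. This is a sketch only: `pick_tier` and `TIER_SETTINGS` are hypothetical names, the keyword arguments mirror the configurations above, and the elided connection strings are left as placeholders.

```python
def pick_tier(monthly_budget_usd):
    """Map a monthly budget in USD to one of the tiers described above."""
    if monthly_budget_usd < 500:
        return "startup"
    if monthly_budget_usd < 2000:
        return "growth"
    return "enterprise"


# Keyword arguments per tier, mirroring the MemoryConfig examples above.
TIER_SETTINGS = {
    "startup": {"llm_model": "gpt-4o-mini", "turso_db_path": "memory.db"},
    "growth": {
        "llm_model": "gpt-4o-mini",
        "neo4j_uri": "neo4j+s://...",
        "redis_url": "redis://...",
    },
    "enterprise": {
        "llm_model": "gpt-4o",
        "neo4j_uri": "neo4j+s://...",
        "redis_url": "redis://...",
    },
}

settings = TIER_SETTINGS[pick_tier(1200)]
print(settings["llm_model"])
```

The selected dictionary could then be unpacked into the configuration, for example `MemoryConfig(**settings)`.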

Monitoring Costs

Track API Usage

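import logging

import tiktoken

logger = logging.getLogger(__name__)

def count_tokens(text: str, model: str = "gpt-4o-mini") -> int:
    """Count tokens in text using the tokenizer for the given model."""
    enc = tiktoken.encoding_for_model(model)
    return len(enc.encode(text))

# Track per query (memory and question are defined elsewhere)
response = memory.query(question)
tokens_used = count_tokens(response.context)
# GPT-4o-mini input pricing: $0.00015 per 1K tokens ($0.15 per 1M).
# This covers only the retrieved context, not the generated answer.
cost = tokens_used * 0.00015 / 1000

# Log for monitoring
logger.info(f"Query cost: ${cost:.6f}")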
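Per-query estimates are more useful when accumulated, and output tokens are usually priced higher than input tokens. The sketch below is a hypothetical `UsageTracker` that keeps running totals. The default prices (USD per 1M tokens) are assumptions for GPT-4o-mini and should be checked against current pricing.

```python
from dataclasses import dataclass


@dataclass
class UsageTracker:
    """Accumulate token usage and estimate spend across many queries."""

    input_price_per_1m: float = 0.15   # assumed USD per 1M input tokens
    output_price_per_1m: float = 0.60  # assumed USD per 1M output tokens
    input_tokens: int = 0
    output_tokens: int = 0
    queries: int = 0

    def record(self, input_tokens, output_tokens):
        """Add one query's token counts to the running totals."""
        self.input_tokens += input_tokens
        self.output_tokens += output_tokens
        self.queries += 1

    @property
    def total_cost(self):
        """Estimated spend in USD so far."""
        return (
            self.input_tokens * self.input_price_per_1m
            + self.output_tokens * self.output_price_per_1m
        ) / 1_000_000

    @property
    def average_cost(self):
        """Estimated spend per query in USD (0.0 before any query)."""
        return self.total_cost / self.queries if self.queries else 0.0


tracker = UsageTracker()
tracker.record(input_tokens=1000, output_tokens=200)
tracker.record(input_tokens=3000, output_tokens=400)
print(f"total=${tracker.total_cost:.6f} avg=${tracker.average_cost:.6f}")
```

Token counts for `record` could come from a helper like `count_tokens` above, or from the usage fields the LLM provider returns with each response.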

Set Spending Limits

from openai import OpenAI

client = OpenAI(
    api_key="sk-...",
    # Spending limits are not a client option; set them in the OpenAI dashboard
)
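Dashboard limits can be complemented by a guard in application code that stops issuing requests once an estimated budget is used up. The sketch below is one way to do that. `BudgetGuard` and `BudgetExceeded` are hypothetical names, and the guard relies on the caller's own cost estimates rather than on billing data.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a charge would push estimated spend over the limit."""


class BudgetGuard:
    """Track estimated spend and refuse charges beyond a monthly limit."""

    def __init__(self, monthly_limit_usd):
        self.limit = monthly_limit_usd
        self.spent = 0.0

    def charge(self, cost_usd):
        """Record a cost, or raise BudgetExceeded without recording it."""
        if self.spent + cost_usd > self.limit:
            raise BudgetExceeded(
                f"charge of ${cost_usd:.4f} exceeds remaining ${self.remaining:.4f}"
            )
        self.spent += cost_usd

    @property
    def remaining(self):
        """Budget left in USD."""
        return self.limit - self.spent


guard = BudgetGuard(monthly_limit_usd=1.00)
guard.charge(0.40)
guard.charge(0.50)
try:
    guard.charge(0.20)  # would bring the total to 1.10
except BudgetExceeded as exc:
    print("blocked:", exc)
```

Calling `charge` before each LLM request with the estimated cost lets the application degrade gracefully, for example by returning cached answers only.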

Free Tier Options

Service	Free Tier
Turso	9GB storage, 500M rows read
Neo4j Aura	Free tier available
Redis Cloud	30MB free
OpenAI	$5 credit for new users
Azure OpenAI	Pay-as-you-go

Summary

1. Start with Turso - Free/cheap, works offline
2. Use GPT-4o-mini - 10x cheaper than GPT-4
3. Enable Redis caching - 50-80% fewer API calls
4. Batch operations - Lower overhead
5. Monitor usage - Track tokens and costs