Overview
The caching module provides LLM response caching to reduce API costs and improve latency. It supports in-memory and Redis-based caching with TTL expiration.
Classes
CacheEntry
Data structure for cached values.
Attributes
| Name | Type | Description |
|---|---|---|
| key | str | Cache key |
| value | Any | Cached value |
| created_at | float | Creation timestamp (Unix time) |
| ttl | float \| None | Time-to-live in seconds (None = no expiration) |
| metadata | dict[str, Any] \| None | Optional metadata |
Methods
is_expired
Check if entry has expired.
bool - True if expired, False otherwise
BaseCache
Abstract base class for cache implementations.
Abstract Methods
All cache implementations must implement these methods:
| Method | Description | Parameters | Returns |
|---|---|---|---|
| get(key) | Retrieve value from cache | key: str | Any \| None |
| set(key, value, ttl) | Store value in cache | key: str, value: Any, ttl: float \| None | None |
| delete(key) | Remove value from cache | key: str | None |
| clear() | Clear all cache entries | None | None |
| exists(key) | Check if key exists | key: str | bool |
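As a sketch, a minimal custom backend might look like the following. The import path and the exact abstract-method signatures are assumptions based on the table above:

```python
import time
from typing import Any

from caching import BaseCache  # import path is an assumption


class DictCache(BaseCache):
    """Illustrative BaseCache implementation backed by a plain dict."""

    def __init__(self) -> None:
        # key -> (value, expires_at); expires_at is None for non-expiring entries
        self._data: dict[str, tuple[Any, float | None]] = {}

    def get(self, key: str) -> Any | None:
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if expires_at is not None and time.time() >= expires_at:
            del self._data[key]
            return None
        return value

    def set(self, key: str, value: Any, ttl: float | None = None) -> None:
        expires_at = time.time() + ttl if ttl is not None else None
        self._data[key] = (value, expires_at)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)

    def clear(self) -> None:
        self._data.clear()

    def exists(self, key: str) -> bool:
        return self.get(key) is not None
```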
InMemoryCache
In-memory cache implementation.
Constructor
- default_ttl (float | None) - Default TTL in seconds. If None, entries never expire.
Methods
get
Retrieve value from cache. Automatically removes expired entries.
set
Store value in cache. ttl - Time-to-live in seconds. If None, uses default_ttl.
delete
Remove entry from cache.
clear
Remove all entries from cache.
exists
Check if key exists (and is not expired).
cleanup_expired
Manually remove all expired entries.
ResponseCache
LLM response caching with automatic key generation.
Constructor
- cache - Backend cache implementation (InMemoryCache, RedisCache, etc.)
- namespace - Cache key namespace prefix
- include_model_params - Include model parameters (temperature, etc.) in cache key
Methods
make_key
Generate cache key from prompt and parameters.
- prompt - String prompt or message list
- model - Model name (e.g., "gpt-4")
- **kwargs - Model parameters (temperature, max_tokens, top_p, etc.)
str - Cache key in format: namespace:prompt_hash:model:params_hash
When include_model_params is enabled, the following parameters are included in params_hash: temperature, max_tokens, top_p, frequency_penalty, presence_penalty.
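A short usage sketch. The import path and the keyword form of the ResponseCache constructor are assumptions:

```python
from caching import InMemoryCache, ResponseCache  # import path is an assumption

cache = ResponseCache(cache=InMemoryCache(), namespace="chat")

# The same prompt, model, and parameters always produce the same key;
# changing temperature changes params_hash (and therefore the key).
key_a = cache.make_key("Summarize this article.", model="gpt-4", temperature=0.0)
key_b = cache.make_key("Summarize this article.", model="gpt-4", temperature=0.7)
assert key_a != key_b
print(key_a)  # e.g. "chat:<prompt_hash>:gpt-4:<params_hash>"
```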
get
Retrieve cached response.
set
Store response in cache.
delete
Remove specific cache entry.
clear
Clear all entries in this namespace. With InMemoryCache, only entries matching the namespace are cleared; for other cache backends, everything is cleared.
CachedNode
Mixin class to add caching capability to nodes.
Constructor
- response_cache - ResponseCache instance to use
Methods
enable_cache
Enable caching for this node.
disable_cache
Disable caching for this node.
is_cache_enabled
Check if caching is enabled.
clear_cache
Clear all cache entries for this node.
RedisCache
Redis-based distributed cache (requires the redis package).
Constructor
- host - Redis server hostname
- port - Redis server port
- db - Redis database number (0-15)
- password - Redis password (if required)
- default_ttl - Default TTL in seconds
- prefix - Key prefix for namespacing
Features
- JSON Serialization: Automatically serializes/deserializes JSON
- TTL Support: Automatic expiration with SETEX
- Prefix Support: Namespace isolation with key prefix
- Distributed: Share cache across multiple processes/servers
Methods
Same as BaseCache: get, set, delete, clear, exists
Usage Patterns
Basic LLM Caching
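A minimal sketch of wrapping an LLM call with ResponseCache. The import path, the keyword form of the constructor, the call_llm placeholder, and the assumption that ResponseCache.get/set accept the same prompt/model/parameter arguments as make_key are all assumptions:

```python
from caching import InMemoryCache, ResponseCache  # import path is an assumption

cache = ResponseCache(cache=InMemoryCache(default_ttl=3600), namespace="llm")


def call_llm(prompt: str, model: str, **params) -> str:
    """Placeholder for the real LLM client call."""
    return f"response to {prompt!r} from {model}"


def cached_completion(prompt: str, model: str = "gpt-4", **params) -> str:
    # Check the cache first; on a hit, the API call is skipped entirely.
    cached = cache.get(prompt, model=model, **params)
    if cached is not None:
        return cached
    response = call_llm(prompt, model=model, **params)
    cache.set(prompt, response, model=model, **params)
    return response
```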
Per-Temperature Caching
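With include_model_params enabled, temperature is part of params_hash, so each temperature gets its own cache entry. A sketch under the same assumptions as above:

```python
from caching import InMemoryCache, ResponseCache  # import path is an assumption

cache = ResponseCache(cache=InMemoryCache(), namespace="llm", include_model_params=True)

# Different temperatures hash to different keys and are cached independently.
deterministic_key = cache.make_key("Write a haiku", model="gpt-4", temperature=0.0)
creative_key = cache.make_key("Write a haiku", model="gpt-4", temperature=0.9)
assert deterministic_key != creative_key
```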
Disable Parameter Caching
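Setting include_model_params to False keeps sampling parameters out of the key, so any temperature reuses the same cached response. A sketch under the same assumptions:

```python
from caching import InMemoryCache, ResponseCache  # import path is an assumption

cache = ResponseCache(cache=InMemoryCache(), namespace="llm", include_model_params=False)

# Temperature no longer affects the key, so both calls map to the same entry.
key_a = cache.make_key("Write a haiku", model="gpt-4", temperature=0.0)
key_b = cache.make_key("Write a haiku", model="gpt-4", temperature=0.9)
# key_a == key_b
```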
Distributed Caching with Redis
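A sketch of sharing one cache across processes via RedisCache. The constructor parameter names come from the section above; the keyword form and import path are assumptions:

```python
from caching import RedisCache, ResponseCache  # import path is an assumption

# Every worker pointed at the same Redis instance shares this cache.
redis_cache = RedisCache(
    host="localhost",
    port=6379,
    db=0,
    default_ttl=3600,   # one hour
    prefix="llm_cache",
)
cache = ResponseCache(cache=redis_cache, namespace="prod")
```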
Conditional Caching
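A sketch of caching only deterministic requests; the get/set signature and call_llm placeholder are assumptions as before:

```python
from caching import InMemoryCache, ResponseCache  # import path is an assumption

cache = ResponseCache(cache=InMemoryCache(default_ttl=3600), namespace="llm")


def call_llm(prompt: str, model: str, **params) -> str:
    """Placeholder for the real LLM client call."""
    return f"response to {prompt!r}"


def completion(prompt: str, model: str = "gpt-4", **params) -> str:
    # Only cache deterministic requests; sampled (temperature > 0) responses
    # vary between calls, so caching them may not be desirable.
    cacheable = params.get("temperature", 0.0) == 0.0
    if cacheable:
        cached = cache.get(prompt, model=model, **params)
        if cached is not None:
            return cached
    response = call_llm(prompt, model=model, **params)
    if cacheable:
        cache.set(prompt, response, model=model, **params)
    return response
```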
Cache Warming
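A sketch of pre-populating the cache with answers to known high-traffic prompts so the first real request is already a hit; same assumptions about get/set and call_llm:

```python
from caching import InMemoryCache, ResponseCache  # import path is an assumption

cache = ResponseCache(cache=InMemoryCache(default_ttl=24 * 3600), namespace="faq")


def call_llm(prompt: str, model: str) -> str:
    """Placeholder for the real LLM client call."""
    return f"answer to {prompt!r}"


FREQUENT_PROMPTS = [
    "What is your refund policy?",
    "How do I reset my password?",
]

for prompt in FREQUENT_PROMPTS:
    # Only call the model for prompts that are not already cached.
    if cache.get(prompt, model="gpt-4") is None:
        cache.set(prompt, call_llm(prompt, model="gpt-4"), model="gpt-4")
```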
Cache Statistics
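No statistics API is documented in this section, so this sketch tracks hits and misses in the calling code; the get signature is an assumption as before:

```python
from caching import InMemoryCache, ResponseCache  # import path is an assumption

cache = ResponseCache(cache=InMemoryCache(), namespace="llm")
stats = {"hits": 0, "misses": 0}


def get_with_stats(prompt: str, model: str = "gpt-4", **params):
    # Count every lookup so hit rate can be inspected later.
    cached = cache.get(prompt, model=model, **params)
    stats["hits" if cached is not None else "misses"] += 1
    return cached


def hit_rate() -> float:
    total = stats["hits"] + stats["misses"]
    return stats["hits"] / total if total else 0.0
```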
TTL Strategies
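Illustrative TTL tiers; the right values depend on how quickly the underlying content goes stale, and the import path and constructor keywords are assumptions:

```python
from caching import InMemoryCache, ResponseCache  # import path is an assumption

STATIC_TTL = 7 * 24 * 3600   # reference answers that rarely change: 1 week
DAILY_TTL = 24 * 3600        # content refreshed daily: 1 day
VOLATILE_TTL = 5 * 60        # fast-moving data: 5 minutes

# Separate caches with different default TTLs for different content types.
reference_cache = ResponseCache(cache=InMemoryCache(default_ttl=STATIC_TTL), namespace="reference")
news_cache = ResponseCache(cache=InMemoryCache(default_ttl=VOLATILE_TTL), namespace="news")
```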
Best Practices
Use Appropriate TTL
Set TTL based on content freshness requirements:
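A sketch of per-entry TTL overrides on InMemoryCache (the import path is an assumption; set accepts a ttl argument per the BaseCache table above):

```python
from caching import InMemoryCache  # import path is an assumption

cache = InMemoryCache(default_ttl=3600)  # sensible default: 1 hour

# Override per entry where freshness requirements differ (ttl in seconds).
cache.set("docs:getting-started", "...", ttl=7 * 24 * 3600)  # stable docs: 1 week
cache.set("stock:AAPL", "...", ttl=60)                       # market data: 1 minute
```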
Choose Right Cache Backend
- InMemoryCache: Single process, fast, simple
- RedisCache: Multi-process/server, distributed, persistent
Include Relevant Parameters
Only cache based on parameters that affect output:
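A sketch: pass only output-affecting parameters to make_key/get/set (import path and constructor keywords are assumptions):

```python
from caching import InMemoryCache, ResponseCache  # import path is an assumption

cache = ResponseCache(cache=InMemoryCache(), namespace="llm", include_model_params=True)

# Sampling parameters (temperature, top_p, ...) change the output, so they belong
# in the key. Values such as request timeouts or user IDs do not affect model
# output and should be kept out of the kwargs used for key generation.
key = cache.make_key("Translate to French: hello", model="gpt-4", temperature=0.2)
```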
Monitor Cache Performance
Track hit rates to optimize TTL and size:
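A small sketch that logs the running hit rate so TTL and cache size can be tuned from real traffic (hit/miss tracking lives in the calling code, as in the Cache Statistics pattern above):

```python
import logging

logger = logging.getLogger("llm_cache")
hits = misses = 0


def record_lookup(hit: bool) -> None:
    """Call after each cache lookup; logs the running hit rate every 100 requests."""
    global hits, misses
    if hit:
        hits += 1
    else:
        misses += 1
    total = hits + misses
    if total % 100 == 0:
        logger.info("cache hit rate: %.1f%% over %d requests", 100 * hits / total, total)
```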
Clear Cache Strategically
Clear cache when underlying data changes:
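A sketch of clearing a namespace when its source data changes (import path and constructor keywords are assumptions; the update hook is hypothetical):

```python
from caching import InMemoryCache, ResponseCache  # import path is an assumption

kb_cache = ResponseCache(cache=InMemoryCache(), namespace="knowledge_base")


def on_knowledge_base_update() -> None:
    # Cached answers were generated from the old documents, so drop the whole
    # namespace rather than serving stale responses.
    kb_cache.clear()
```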
Handle Cache Failures Gracefully
Cache should enhance, not break functionality:
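A sketch that falls back to a direct LLM call when the cache backend is unavailable; the get/set signature and call_llm placeholder are assumptions as before:

```python
import logging

from caching import InMemoryCache, ResponseCache  # import path is an assumption

logger = logging.getLogger("llm_cache")
cache = ResponseCache(cache=InMemoryCache(), namespace="llm")


def call_llm(prompt: str, model: str) -> str:
    """Placeholder for the real LLM client call."""
    return "response"


def safe_completion(prompt: str, model: str = "gpt-4") -> str:
    # If the backend is down (e.g. Redis unreachable), serve the request anyway.
    try:
        cached = cache.get(prompt, model=model)
        if cached is not None:
            return cached
    except Exception:
        logger.warning("cache lookup failed; calling LLM directly", exc_info=True)
    response = call_llm(prompt, model=model)
    try:
        cache.set(prompt, response, model=model)
    except Exception:
        logger.warning("cache write failed; returning uncached response", exc_info=True)
    return response
```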
Cleanup Expired Entries
Periodically cleanup to prevent memory bloat:
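A sketch of a background sweep using InMemoryCache.cleanup_expired (documented above); the import path and the threading approach are assumptions:

```python
import threading
import time

from caching import InMemoryCache  # import path is an assumption

cache = InMemoryCache(default_ttl=3600)


def cleanup_loop(interval_seconds: float = 300.0) -> None:
    # Sweep expired entries every few minutes so they don't accumulate in memory.
    while True:
        time.sleep(interval_seconds)
        cache.cleanup_expired()


threading.Thread(target=cleanup_loop, daemon=True).start()
```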
Cache Key Design
Key Format
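Keys follow the namespace:prompt_hash:model:params_hash layout described above, with 16-character truncated SHA-256 hashes, e.g. llm:3f8a9c2d1e4b5a6f:gpt-4:7b2c4d6e8f0a1b3c (hash values here are illustrative only).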
Hash Collision
SHA-256 hashes are truncated to 16 characters. Collision probability is negligible for practical use, but for absolute safety, store the full prompt in metadata.
Custom Namespaces
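A sketch of using namespaces to isolate features that share one backend (import path and constructor keywords are assumptions):

```python
from caching import InMemoryCache, ResponseCache  # import path is an assumption

backend = InMemoryCache()

# Separate namespaces keep keys for different features isolated, so clearing
# one feature's cache does not affect the others.
summaries = ResponseCache(cache=backend, namespace="summaries")
translations = ResponseCache(cache=backend, namespace="translations")

summaries.clear()  # with InMemoryCache, only keys under "summaries" are removed
```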
See Also
- Rate Limiting - Combine caching with rate limiting
- Resilience - Retry patterns
- Callbacks - Monitor cache events