Chapter 9 – The Emergence of Adaptive Storage
From Static Tiers to Dynamic Placement
We’ve spent eight chapters establishing constraints: the physics of distance, the costs of write amplification, the complexity of sharding, the trade-offs between consistency models, and the non-negotiable requirements of compliance. Each chapter revealed a different dimension of the problem space, and each dimension constrains the others.
The traditional approach to distributed systems is to make these trade-offs upfront. Choose your consistency level. Pick your partition strategy. Decide where data lives. Deploy your cluster. Hope you got it right.
But what if access patterns change? What if the data that was cold becomes hot? What if regulatory requirements shift? What if your user base grows in an unexpected geography?
The answer for the past two decades has been: “You redesign, migrate, and hope for minimal downtime.” Manual intervention. Operational toil. Architectural rewrites.
Part III explores a different approach: systems that adapt. That observe actual behavior—query patterns, data temperature, geographic distribution, cost trends—and automatically adjust data placement in response. Systems where data locality is not a static architectural decision but a continuous optimization problem.
This is the beginning of the synthesis. Welcome to adaptive storage.
The Problem with Static Tiers
Most storage systems organize data into tiers:
Traditional three-tier architecture:
Tier 1 (Hot): RAM or high-speed SSD, microsecond-to-millisecond access, expensive ($10-100/GB/month)
Tier 2 (Warm): Standard SSD or HDD, tens of milliseconds, moderate cost ($1-5/GB/month)
Tier 3 (Cold): Object storage (S3, GCS), hundreds of milliseconds, cheap ($0.02-0.10/GB/month)
Data moves between tiers according to age-based policies:
Rule: Age-Based Tiering
IF data.age < 7 days THEN tier = 1 (hot)
IF data.age >= 7 days AND data.age < 30 days THEN tier = 2 (warm)
IF data.age >= 30 days THEN tier = 3 (cold)
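In code, the whole policy is just a handful of comparisons. A minimal Python sketch, mirroring the thresholds above:

from datetime import datetime, timezone

def age_based_tier(created_at, now=None):
    """Return a storage tier (1=hot, 2=warm, 3=cold) based purely on record age."""
    now = now or datetime.now(timezone.utc)
    age_days = (now - created_at).days
    if age_days < 7:
        return 1   # hot: RAM / high-speed SSD
    if age_days < 30:
        return 2   # warm: standard SSD or HDD
    return 3       # cold: object storage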
This is simple, deterministic, and wrong for most workloads.
Why it fails:
Failure 1: Age ≠ Access Frequency
Your application logs data constantly. Most logs are never read again. But some logs—error traces, security events—are accessed frequently days or weeks later during incident investigation.
Age-based tiering puts recent logs in expensive hot storage even though 99% will never be read. Meanwhile, critical error traces from 10 days ago get demoted to slow storage just as engineers need them for debugging.
Failure 2: Access Patterns Are Not Uniform
E-commerce scenario: 80% of queries target 5% of products (bestsellers). Age-based tiering keeps all recent products in hot storage, including the ones nobody views. Meanwhile, perennial bestsellers (your “classics” that sell steadily for years) get demoted to cold storage despite consistent access.
Failure 3: Patterns Change Over Time
Social media post goes viral. Created 6 months ago, it was in cold storage (age-based rule). Suddenly it’s accessed 10,000× per second. By the time your system reacts and promotes it, you’ve served millions of slow queries from cold storage.
The fundamental problem: Static rules make decisions based on metadata (age, size, creation time) rather than actual behavior (access frequency, query latency, geographic distribution).
The Shift: From Rules to Telemetry
Adaptive storage systems replace static rules with telemetry-driven feedback loops.
The core insight: Watch what’s actually happening, not what you predicted would happen.
Telemetry to collect:
Access frequency: How often is this data queried? (queries per hour)
Access recency: When was it last accessed? (minutes since last query)
Access latency: How long do queries take? (P50, P99 latency)
Geographic distribution: Where are queries coming from? (region breakdown)
Query type: Read-only vs. write-heavy? (read/write ratio)
Data size: How much storage does it consume? (bytes)
Cost: What’s it costing in current tier? ($/month)
Example telemetry for a database record:
record_id: 12345
access_frequency: 250 queries/hour
last_accessed: 2 minutes ago
p99_latency: 45ms
geographic_distribution: {US-East: 60%, EU-West: 30%, APAC: 10%}
read_write_ratio: 95% reads, 5% writes
size: 2.3 KB
current_tier: tier-2 (SSD)
current_cost: $0.005/month
tier-1_estimated_cost: $0.12/month
tier-1_estimated_latency: 2ms
With this data, the system can ask: “Should this record be in a different tier?”
Decision logic:
High access frequency (250/hour) suggests hot data
Recent access (2 min ago) confirms it’s active
P99 latency of 45ms is acceptable but not great
Moving to Tier 1 would cut P99 latency from 45ms to roughly 2ms (more than a 20× improvement)
Cost would increase from $0.005 to $0.12/month (24× increase, an extra $0.115/month)
But with 250 queries/hour, each getting roughly 43ms faster, total latency saved: 10,750ms per hour
Latency saved per incremental dollar: 10,750 ÷ 0.115 ≈ 93,500ms of hourly latency per extra dollar of monthly cost
If your application values low latency, this record should be promoted. The telemetry makes the case.
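The arithmetic behind that judgment fits in a few lines. A sketch using the telemetry numbers above (the helper name promotion_value and the exact weighting are illustrative, not any scheduler’s real API):

def promotion_value(queries_per_hour, current_p99_ms, target_p99_ms,
                    current_cost_month, target_cost_month):
    """Latency saved per hour, and per incremental dollar of monthly spend."""
    latency_saved_ms_per_hour = queries_per_hour * (current_p99_ms - target_p99_ms)
    extra_cost_per_month = target_cost_month - current_cost_month
    return latency_saved_ms_per_hour, latency_saved_ms_per_hour / extra_cost_per_month

saved, saved_per_dollar = promotion_value(250, 45, 2, 0.005, 0.12)
print(saved, round(saved_per_dollar))   # 10750 ms/hour, ~93478 ms per extra dollar/month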
Redpanda: Tiered Storage with Cloud Object Stores
Redpanda, a Kafka-compatible event streaming platform, pioneered adaptive tiered storage for streaming workloads[1].
Architecture:
Local SSD: Recent events (configurable retention, e.g., last 24 hours)
Object storage (S3/GCS): Historical events (unlimited retention)
Automatic migration: Events age out from SSD to object storage
The adaptive component: Redpanda monitors query patterns. If older events are accessed frequently, it caches them from object storage to local SSD temporarily.
Example flow:
T=0: Event written to topic “orders”
T=1ms: Event in local SSD (fast access: 1-5ms)
T=25hr: Event ages out to S3 (age-based rule: >24hr)
T=26hr: Query for this event → 150ms (S3 retrieval)
T=27hr: 10 more queries for same event → Redpanda detects pattern
T=28hr: Event cached back to SSD → subsequent queries: 2ms
T=36hr: No queries for 8 hours → cache evicted
Performance impact:
99% of queries hit recent data in SSD: 2ms average latency
1% of queries hit S3: 150ms average latency
Overall P99 latency: 5ms (dominated by SSD access)
Storage cost: 95% of data in S3 (1/50th the cost of SSD)
The key innovation: The system learns from access patterns and adapts cache contents dynamically[1]. Not purely age-based.
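The flow above amounts to a read-through cache keyed on access counts. A simplified sketch of that behavior, not Redpanda’s actual implementation (the thresholds, class name, and dict-like store interfaces are assumptions):

import time

PROMOTE_AFTER_QUERIES = 10        # repeated reads before caching back to local SSD (assumed)
EVICT_AFTER_SECONDS = 8 * 3600    # idle time before the cached copy is dropped (assumed)

class TieredReader:
    def __init__(self, ssd, object_store):
        self.ssd = ssd                     # fast local store (dict-like)
        self.object_store = object_store   # slow remote store (dict-like)
        self.hits = {}                     # offset -> (count, last_access)

    def read(self, offset):
        now = time.time()
        if offset in self.ssd:
            return self.ssd[offset]        # fast path, ~2ms
        value = self.object_store[offset]  # slow path, ~150ms
        count, _ = self.hits.get(offset, (0, now))
        self.hits[offset] = (count + 1, now)
        if count + 1 >= PROMOTE_AFTER_QUERIES:
            self.ssd[offset] = value       # cache back to SSD for subsequent reads
        return value

    def evict_idle(self):
        now = time.time()
        for offset, (_, last_access) in list(self.hits.items()):
            if now - last_access > EVICT_AFTER_SECONDS:
                self.ssd.pop(offset, None)
                del self.hits[offset]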
FaunaDB: Global Data Distribution with Regional Allocation
FaunaDB (now rebranded but the architecture remains instructive) demonstrated adaptive geographic placement[2].
Problem: Global application with users in US, EU, and APAC. Each region queries different subsets of data.
Traditional approach: Replicate everything everywhere (expensive) or partition by region manually (inflexible).
FaunaDB’s approach: Adaptive replication based on query geography.
How it works:
Initial state: All data in primary region (US-East)
Telemetry collection: Track where queries originate
Detect geographic patterns: “Record X is queried 80% from EU”
Adaptive replication: Automatically replicate record X to EU region
Route optimization: Direct EU queries to EU replica (50ms → 5ms)
Continuous monitoring: If pattern changes, adjust replication
Example:
User record 99832 (German user)
Query sources: EU-West: 85%, US-East: 15%, APAC: 0%
Action: Replicate to EU-West, remove from APAC (if present)
Result:
- EU queries: 5ms (local replica)
- US queries: 85ms (cross-region)
- Storage cost: 2× (replicated to 2 regions, not all 6)
The adaptive insight: Don’t replicate based on data type or age—replicate based on where queries actually come from[2].
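A sketch of that placement rule over per-record query-origin counts (the 80% threshold and the function name are illustrative, not Fauna’s API):

REPLICATE_SHARE = 0.80   # replicate into a region serving at least this share of queries (assumed)

def plan_replicas(region_query_counts, home_region):
    """Return the regions that should hold a replica of a single record."""
    total = sum(region_query_counts.values())
    replicas = {home_region}                       # the home region always keeps a copy
    for region, count in region_query_counts.items():
        share = count / total if total else 0.0
        if share >= REPLICATE_SHARE:
            replicas.add(region)
        # regions below the threshold are simply never added, so stale replicas lapse
    return replicas

# User record 99832: EU-West 85%, US-East 15%, APAC 0%
print(plan_replicas({"eu-west-1": 85, "us-east-1": 15, "ap-south-1": 0}, "us-east-1"))
# -> replicas in us-east-1 (home) and eu-west-1; nothing in ap-south-1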
SurrealDB: Multi-Model Co-Location
SurrealDB is a newer entrant exploring adaptive multi-model storage[3].
Concept: Different data types benefit from different storage models. Instead of forcing everything into one model (relational, document, graph), co-locate multiple models and dynamically route queries to the optimal engine.
Example: Social network application
User profiles: Document model (flexible schema)
Friend connections: Graph model (traversal-optimized)
Activity feed: Columnar model (analytical queries)
Real-time state: In-memory model (ultra-low latency)
Adaptive component: SurrealDB observes query patterns and can migrate data between storage models.
Scenario: User’s profile starts as document. Application begins running complex graph queries on friend relationships. System detects pattern, materializes graph index on profile connections. Future queries use graph model for 10× speedup.
The innovation: Storage model is not declared upfront—it’s discovered through query patterns[3].
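One way to picture the mechanism (a hypothetical sketch, not SurrealDB’s engine; the query shapes, threshold, and names are made up for illustration): count query shapes per record and materialize a graph index once traversals dominate.

from collections import Counter

GRAPH_SHARE_THRESHOLD = 0.5   # build a graph index once half the queries are traversals (assumed)

class ModelRouter:
    def __init__(self):
        self.query_shapes = Counter()   # e.g. {"document": 120, "graph": 90}
        self.graph_index_built = False

    def observe(self, shape):
        """Record one query and build the graph index if traversals dominate."""
        self.query_shapes[shape] += 1
        total = sum(self.query_shapes.values())
        if not self.graph_index_built and self.query_shapes["graph"] / total >= GRAPH_SHARE_THRESHOLD:
            self.graph_index_built = True   # real system: build the traversal-optimized index here

    def engine_for(self, shape):
        """Route graph queries to the graph engine only once the index exists."""
        return "graph" if shape == "graph" and self.graph_index_built else "document"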
The Telemetry Loop: Sense, Decide, Act, Measure
Adaptive systems follow a continuous feedback loop.
Step 1: Sense (Collect Telemetry)
Instrument all data operations:
Query log entry:
{
  "timestamp": "2025-01-15T14:32:17Z",
  "query_id": "q_893kd8s",
  "record_id": "user_12345",
  "operation": "read",
  "latency_ms": 45,
  "source_region": "eu-west-1",
  "result_size_bytes": 2048,
  "cache_hit": false
}
Aggregate into access statistics:
Record: user_12345
Time window: Last 1 hour
Metrics:
- query_count: 250
- unique_queries: 180
- avg_latency: 42ms
- p99_latency: 95ms
- region_distribution: {eu-west-1: 180, us-east-1: 50, ap-south-1: 20}
- operation_mix: {read: 240, write: 10}
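Statistics like these fall out of a single pass over the query log. A sketch using only the Python standard library (field names follow the log entry shown above):

from collections import Counter
from statistics import quantiles

def aggregate(log_entries):
    """Summarize one record's query log for a single time window."""
    latencies = [e["latency_ms"] for e in log_entries]
    return {
        "query_count": len(log_entries),
        "unique_queries": len({e["query_id"] for e in log_entries}),
        "avg_latency": sum(latencies) / len(latencies),
        "p99_latency": quantiles(latencies, n=100)[98] if len(latencies) > 1 else latencies[0],
        "region_distribution": Counter(e["source_region"] for e in log_entries),
        "operation_mix": Counter(e["operation"] for e in log_entries),
    }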
Step 2: Decide (Optimize Placement)
Run optimization algorithm:
# Helper functions (get_current_placement, calculate_cost, calculate_latency,
# schedule_migration) are provided by the storage layer.
def optimize_placements(records, possible_placements, query_stats,
                        latency_weight=1.0, cost_weight=1.0,
                        query_count_threshold=10, improvement_threshold=0.0):
    """Evaluate alternative placements and schedule at most one migration per record."""
    for record in records:
        stats = query_stats[record.id]
        if stats.query_count <= query_count_threshold:
            continue
        current_placement = get_current_placement(record)
        current_cost = calculate_cost(record, current_placement)
        current_latency = calculate_latency(record, current_placement)
        best_placement, best_score = None, improvement_threshold
        for alternative_placement in possible_placements:
            alternative_cost = calculate_cost(record, alternative_placement)
            alternative_latency = calculate_latency(record, alternative_placement)
            improvement_score = (
                (current_latency - alternative_latency) * stats.query_frequency * latency_weight
                - (alternative_cost - current_cost) * cost_weight
            )
            if improvement_score > best_score:
                best_placement, best_score = alternative_placement, improvement_score
        if best_placement is not None:
            schedule_migration(record, best_placement)
Optimization constraints:
Don’t migrate during high-traffic periods (schedule during low-traffic windows)
Don’t migrate too frequently (minimum time between migrations: 1 hour)
Don’t migrate if improvement is marginal (must exceed threshold)
Respect compliance boundaries (GDPR data stays in EU)
Step 3: Act (Execute Migration)
Background migration process:
Migration: user_12345 from tier-2 (US-SSD) to tier-1 (EU-Memory)
1. Allocate space in target tier
2. Copy data to target
3. Verify integrity (checksum)
4. Update routing table: “user_12345 → eu-west-1/tier-1”
5. Allow brief propagation delay (100-500ms)
6. Mark source for cleanup
7. Deallocate source after grace period
During migration, queries must handle dual state:
New queries use new location
In-flight queries may use old location
Routing table versioning handles this
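The dual-state window is the tricky part. A sketch of steps 1-7 with a versioned routing table (the store and routing-table interfaces and the grace-period value are assumptions):

import time

def migrate(record_id, source, target, routing_table, grace_period_s=300):
    """Copy, verify, flip routing, then report when the old copy may be deleted."""
    data = source.read(record_id)                       # 1-2: allocate space and copy
    target.write(record_id, data)
    if target.checksum(record_id) != source.checksum(record_id):   # 3: verify integrity
        target.delete(record_id)
        raise RuntimeError("checksum mismatch; migration aborted")
    routing_table.set(record_id, target.location,       # 4: new queries route to target
                      version=routing_table.version + 1)
    time.sleep(0.5)                                      # 5: brief propagation delay
    # 6-7: in-flight queries may still read the old location, so the source
    # copy is only deleted by the caller after this deadline has passed.
    return time.time() + grace_period_s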
Step 4: Measure (Validate Improvement)
After migration, measure actual impact:
Record: user_12345
Post-migration metrics (1 hour):
- query_count: 280 (increased 12%)
- avg_latency: 3ms (was 42ms, improved 93%)
- p99_latency: 8ms (was 95ms, improved 92%)
- cost: $0.15/month (was $0.005/month, increased 30×)
Outcome: Latency massively improved, cost increased but within budget
Decision: Keep in tier-1
If metrics don’t improve as expected, revert migration.
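The keep-or-revert decision can be a plain comparison of the before/after windows against explicit targets (a sketch; the thresholds and field names are illustrative):

def keep_or_revert(before, after, max_cost_per_month, min_latency_gain=0.20):
    """Decide whether a completed migration paid for itself."""
    latency_gain = 1 - after["p99_latency"] / before["p99_latency"]
    if after["cost_per_month"] > max_cost_per_month:
        return "revert"                  # blew the cost budget
    if latency_gain < min_latency_gain:
        return "revert"                  # latency barely moved
    return "keep"

print(keep_or_revert({"p99_latency": 95, "cost_per_month": 0.005},
                     {"p99_latency": 8,  "cost_per_month": 0.15},
                     max_cost_per_month=0.50))   # -> keep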
Critical insight: The loop never stops. Access patterns change continuously. The system adapts continuously.
The Adaptive Pyramid: From RAM to Glacier
Visualizing adaptive storage as a pyramid:
                  /\
                 /  \  RAM Cache
                /    \  (microseconds, $$$$)
               /------\
              /        \  Local SSD
             /          \  (milliseconds, $$$)
            /------------\
           /              \  Regional SSD Cluster
          /                \  (5-10ms, $$)
         /------------------\
        /                    \  Cross-Region Replicas
       /                      \  (50-150ms, $)
      /------------------------\
     /                          \  Object Storage (S3)
    /                            \  (100-500ms, ¢)
   /------------------------------\
  /                                \  Glacier / Archive
 /                                  \  (hours, ¢¢)
/____________________________________\
Data flows up and down based on access patterns
Traditional approach: Data flows down only (ages out from hot to cold).
Adaptive approach: Data flows in both directions:
Promotion: Cold data accessed frequently → moves up pyramid
Demotion: Hot data no longer accessed → moves down pyramid
Lateral movement: Data replicates geographically based on query sources
Example data lifecycle:
T=0: Record created → Tier 1 (RAM) [default for new data]
T=1hr: No access → Demoted to Tier 2 (Local SSD)
T=24hr: Still no access → Demoted to Tier 3 (Regional cluster)
T=7d: Still no access → Demoted to Tier 4 (Object storage)
T=30d: Sudden spike in queries (article goes viral)
→ Promoted to Tier 2 (Local SSD)
T=31d: Query rate decreases → Demoted to Tier 3
T=90d: No access for 60 days → Demoted to Tier 5 (Glacier)
The system responds to actual behavior, not predicted behavior.
Data Temperature: The Key Metric
“Temperature” is a metaphor for access frequency and recency.
Hot data:
Accessed frequently (>10 queries/hour)
Accessed recently (last 5 minutes)
Should be in fast storage (RAM, local SSD)
Warm data:
Accessed occasionally (1-10 queries/hour)
Accessed somewhat recently (last hour)
Should be in moderate storage (regional SSD)
Cold data:
Accessed rarely (<1 query/hour)
Not accessed recently (hours/days ago)
Should be in cheap storage (object store)
Frozen data:
Never accessed (months/years)
Should be in archival storage (Glacier)
Temperature formula (simplified):
temperature = (
    access_frequency * frequency_weight +
    (time_since_last_access)^-1 * recency_weight +
    access_growth_rate * trend_weight
)
Where:
- access_frequency: queries per hour
- time_since_last_access: hours since last query
- access_growth_rate: change in queries per hour over time
- weights: tunable parameters (the examples below use 0.5, 0.3, and 0.2)
Example calculations:
Record A: 50 queries/hour, last accessed 1 minute ago
temperature = 50 * 0.5 + (0.0167)^-1 * 0.3 + 0 * 0.2 = 43
Status: HOT
Record B: 1 query/hour, last accessed 3 hours ago
temperature = 1 * 0.5 + (3)^-1 * 0.3 + 0 * 0.2 = 0.6
Status: COLD
Record C: 5 queries/hour currently, was 1 query/hour yesterday (5× growth)
temperature = 5 * 0.5 + (0.5)^-1 * 0.3 + 4 * 0.2 = 3.9
Status: WARMING (promote proactively)
Temperature guides placement decisions automatically.
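The same arithmetic in code, using the weights from the examples above (a sketch; real systems tune the weights per workload):

def temperature(access_frequency, hours_since_last_access, growth_rate,
                frequency_weight=0.5, recency_weight=0.3, trend_weight=0.2):
    """Higher is hotter: frequent, recent, and growing access all raise the score."""
    return (access_frequency * frequency_weight
            + (1 / hours_since_last_access) * recency_weight
            + growth_rate * trend_weight)

print(round(temperature(50, 1 / 60, 0), 1))   # Record A: 43.0 -> HOT
print(round(temperature(1, 3, 0), 1))         # Record B: 0.6 -> COLD
print(round(temperature(5, 0.5, 4), 1))       # Record C: 3.9 -> WARMING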
Real-World Implementation: Cloudflare R2 with Automatic Tiering
Cloudflare R2 (object storage) introduced automatic storage-class transitions[4].
Concept: Instead of manual lifecycle rules, let the system decide.
How it works:
All objects start in “frequent access” class (fast, expensive)
System monitors access patterns per object
Objects not accessed for 30 days automatically transition to “infrequent access” (slower, cheaper)
Objects accessed again automatically transition back to “frequent access”
Example (illustrative pricing):
Upload: image.jpg → Frequent Access ($0.10/GB/month)
Day 35: Not accessed for 30 days → Infrequent Access ($0.01/GB/month)
Day 40: Image accessed → Back to Frequent Access
Day 70: Not accessed for 30 days → Infrequent Access
Cost impact: If 80% of data is never accessed after 30 days, automatic tiering saves 80% × 90% = 72% of storage costs.
The key: No manual lifecycle rules. The system observes and adapts[4].
The Performance-Cost Frontier
Adaptive systems navigate the performance-cost frontier dynamically.
Static system: Fixed point on the frontier
Either: Fast (expensive) for all data
Or: Cheap (slow) for all data
Or: Manual tiering with lots of operational overhead
Adaptive system: Moves along the frontier based on requirements
Hot data → fast tier (pay for performance)
Cold data → cheap tier (save money)
Adjusts automatically as temperature changes
Optimization goal: Minimize cost subject to latency constraints.
minimize: total_cost
subject to:
p99_latency <= target_latency (e.g., 50ms)
compliance_constraints_satisfied
migration_rate <= max_migrations_per_hour
Real numbers (100TB dataset):
Scenario 1: All data in hot storage (static)
Cost: 100TB × $100/TB/month = $10,000/month
P99 latency: 5ms
Result: Fast but expensive
Scenario 2: All data in cold storage (static)
Cost: 100TB × $2/TB/month = $200/month
P99 latency: 200ms
Result: Cheap but slow
Scenario 3: Adaptive storage
Hot data (5TB): $100/TB = $500/month
Warm data (20TB): $20/TB = $400/month
Cold data (75TB): $2/TB = $150/month
Total cost: $1,050/month (10× cheaper than all-hot)
P99 latency: 8ms (hot data accessed 95% of time)
Result: Fast AND cheap
The adaptive advantage: 10× cost reduction with minimal latency impact.
Challenges: When Adaptation Goes Wrong
Adaptive systems aren’t perfect. They introduce new failure modes.
Challenge 1: Thrashing
Data oscillates between tiers due to access pattern noise.
Scenario:
10:00 AM: Data accessed → Promoted to hot
10:30 AM: No access for 30 min → Demoted to cold
11:00 AM: Data accessed → Promoted to hot
11:30 AM: No access for 30 min → Demoted to cold
Constant migration burns CPU, bandwidth, and increases latency.
Solution: Hysteresis—require a sustained pattern before migrating (a code sketch follows the rules below).
Promote only if accessed >N times in M minutes
Demote only if not accessed for >P minutes
Minimum time between migrations: Q hours
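Those rules combine naturally into a small per-record policy object. A sketch; the concrete values chosen here for N, M, P, and Q are placeholders:

from collections import deque
import time

PROMOTE_HITS = 20                 # N accesses...
PROMOTE_WINDOW_S = 10 * 60        # ...within M = 10 minutes before promoting (assumed)
DEMOTE_IDLE_S = 4 * 3600          # P = 4 hours idle before demoting (assumed)
MIN_MIGRATION_GAP_S = 3600        # Q = at most one move per record per hour (assumed)

class HysteresisPolicy:
    def __init__(self):
        self.recent_hits = deque()    # timestamps of accesses inside the window
        self.last_access = 0.0
        self.last_migration = 0.0

    def record_access(self, now=None):
        now = now or time.time()
        self.last_access = now
        self.recent_hits.append(now)
        while self.recent_hits and now - self.recent_hits[0] > PROMOTE_WINDOW_S:
            self.recent_hits.popleft()

    def _migration_allowed(self, now):
        return now - self.last_migration >= MIN_MIGRATION_GAP_S

    def should_promote(self, now=None):
        now = now or time.time()
        return self._migration_allowed(now) and len(self.recent_hits) >= PROMOTE_HITS

    def should_demote(self, now=None):
        now = now or time.time()
        return self._migration_allowed(now) and now - self.last_access >= DEMOTE_IDLE_S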
Challenge 2: Migration Cost
Moving data isn’t free. Large migrations can saturate networks or storage systems.
Scenario: Viral event causes 1TB of cold data to suddenly become hot. System decides to promote it all. Migration saturates network bandwidth. Application queries suffer.
Solution: Rate limiting and prioritization (sketched after the list below).
Limit concurrent migrations (e.g., max 100GB/hour)
Prioritize migrations by improvement score
Migrate most impactful data first
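A sketch of that scheduling step, enforcing an hourly byte budget and working through candidates in order of improvement score (the budget figure and tuple layout are illustrative):

import heapq

HOURLY_BUDGET_BYTES = 100 * 1024**3   # e.g. at most ~100 GB of migration traffic per hour (assumed)

def plan_migrations(candidates):
    """candidates: iterable of (improvement_score, size_bytes, record_id) tuples."""
    # heapq is a min-heap, so negate the score to pop the most impactful migration first
    heap = [(-score, size, record_id) for score, size, record_id in candidates]
    heapq.heapify(heap)
    planned, spent = [], 0
    while heap:
        _neg_score, size, record_id = heapq.heappop(heap)
        if spent + size > HOURLY_BUDGET_BYTES:
            continue                    # over budget this hour; retry in a later cycle
        planned.append(record_id)
        spent += size
    return planned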
Challenge 3: Compliance Violations
Adaptive system migrates EU user data to US region for performance, violating GDPR.
Scenario: EU user’s data is in EU storage. US office queries it frequently. System detects pattern, considers replicating to US for performance. This would violate data residency requirements.
Solution: Compliance constraints as hard limits (see the filtering sketch after this list).
Tag data with regulatory requirements
Filter possible placements before optimization
Never consider placements that violate compliance
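A sketch of that filter, run before the optimizer ever scores a placement (the tag name, region set, and placement attribute are illustrative):

EU_REGIONS = {"eu-west-1", "eu-central-1"}   # illustrative; real deployments map tags to allowed region sets

def allowed_placements(record_tags, candidate_placements):
    """Drop any candidate placement that would violate the record's residency tags."""
    allowed = []
    for placement in candidate_placements:
        if "gdpr_eu_only" in record_tags and placement.region not in EU_REGIONS:
            continue                    # never even offered to the optimizer
        allowed.append(placement)
    return allowed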
Challenge 4: Cost Runaway
Adaptive system over-optimizes for latency, ignoring cost.
Scenario: System detects slight latency improvements from promoting data. Promotes aggressively. Storage costs explode from $1k/month to $20k/month.
Solution: Multi-objective optimization with cost budget.
Set maximum cost budget
Optimize latency subject to cost constraint
Alert when approaching budget limits
The Path Forward
Adaptive storage is the first component of the Intelligent Data Plane. It demonstrates that:
Telemetry beats prediction: Observing actual behavior outperforms predicting behavior
Continuous optimization beats static decisions: Access patterns change; placement should too
Automation reduces operational burden: Systems that adapt themselves require less manual tuning
But adaptive storage is just the beginning. It optimizes data placement within predefined constraints—tiers, regions, storage classes. It doesn’t fundamentally change the architecture.
In Chapter 10, we’ll introduce data gravity—the concept that data and compute have mutual attraction. Data has “weight” that pulls compute toward it, and compute has “demand” that pulls data toward it. We’ll explore what happens when both data and compute can move freely, and how systems can optimize the placement of both simultaneously.
Then in Chapter 11, we’ll introduce Vector Sharding—a novel approach that models data distribution as multidimensional vectors and uses predictive algorithms to anticipate optimal placement before demand spikes. This moves beyond reactive optimization (respond to patterns) to proactive optimization (predict and prepare).
The synthesis is forming. Static placement is giving way to adaptive placement. But even adaptive placement is reactive. The ultimate goal is predictive placement—systems that anticipate needs and optimize ahead of demand.
References
[1] Redpanda, “Tiered Storage: Unlimited Retention at a Fraction of the Cost,” Redpanda Documentation, 2024. [Online]. Available: https://redpanda.com/blog/tiered-storage-architecture
[2] FaunaDB, “Adaptive Query Distribution in Global Databases,” Fauna Technical Blog, 2022. [Online]. Available: https://fauna.com/blog
[3] SurrealDB, “Multi-Model Database Architecture,” SurrealDB Documentation, 2024. [Online]. Available: https://surrealdb.com/docs
[4] Cloudflare, “Introducing R2 Automatic Storage Class Transitions,” Cloudflare Blog, 2024. [Online]. Available: https://blog.cloudflare.com/r2-automatic-storage-class-transitions/
[5] J. Wilkes, “More Google Cluster Data,” Google Research Blog, 2011. [Online]. Available: https://research.google/blog/
[6] K. Ousterhout et al., “Making Sense of Performance in Data Analytics Frameworks,” Proc. 12th USENIX Symposium on Networked Systems Design and Implementation, pp. 293-307, 2015.
[7] A. Verma et al., “Large-scale Cluster Management at Google with Borg,” Proc. 10th European Conference on Computer Systems, pp. 1-17, 2015.
Next in this series: Chapter 10 - Data Gravity and Motion, where we’ll explore the dynamic relationship between data and compute, and discover why static placement wastes 30-60% of potential efficiency.

