Chapter 2 – The Physics of Distance
Why Perfect Software Still Can't Beat Geography
There’s a humbling moment in every distributed systems architect’s career. You’ve optimized your code, eliminated unnecessary allocations, tuned your thread pools, and squeezed every microsecond out of your hot paths. Your profiler shows beautiful, tight execution. Your benchmarks are phenomenal. Then you deploy across regions and discover that all your optimization bought you 3 milliseconds of improvement on a request that takes 150 milliseconds end-to-end.
The other 147 milliseconds? That’s physics. And physics doesn’t care about your benchmarks.
The Speed of Light Is Not a Suggestion
Let’s start with the fundamental constraint: light in fiber optic cable travels at approximately 200,000 kilometers per second, roughly 67% of the speed of light in vacuum[1]. This isn’t a limitation of current technology; it’s a consequence of the refractive index of glass, which is about 1.5. Short of replacing the entire internet with vacuum tubes (which introduces its own problems), we’re stuck with this number.
What does this mean in practice? Let’s map out the one-way latency for light to travel various distances:
Within a datacenter:
Same rack: ~0.1 meters = 0.0000005 milliseconds (0.5 nanoseconds)
Cross-rack in same row: ~10 meters = 0.00005 milliseconds (50 nanoseconds)
Across datacenter floor: ~100 meters = 0.0005 milliseconds (500 nanoseconds)
Within a region:
Same availability zone: ~1 km = 0.005 milliseconds (5 microseconds)
Cross-AZ in same region: ~10 km = 0.05 milliseconds (50 microseconds)
Metro area (e.g., SF Bay Area): ~50 km = 0.25 milliseconds
Continental:
San Francisco to New York: ~4,100 km = 20.5 milliseconds
London to Moscow: ~2,500 km = 12.5 milliseconds
Sydney to Perth: ~3,300 km = 16.5 milliseconds
Intercontinental:
New York to London: ~5,600 km = 28 milliseconds
San Francisco to Tokyo: ~8,300 km = 41.5 milliseconds
London to Singapore: ~10,800 km = 54 milliseconds
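Every number in that list is the same arithmetic: distance divided by the signal speed in fiber. A minimal sketch that reproduces a few of them (using the rounded 200,000 km/s figure from above):

```python
# One-way propagation delay through fiber: distance / signal speed.
# Uses the rounded 200,000 km/s figure from above (~2/3 of c in vacuum).
FIBER_KM_PER_S = 200_000

def one_way_ms(distance_km: float) -> float:
    """One-way propagation delay in milliseconds over straight-line fiber."""
    return distance_km / FIBER_KM_PER_S * 1_000

for label, km in [("cross-AZ", 10), ("SF to NYC", 4_100), ("NYC to London", 5_600)]:
    print(f"{label}: {one_way_ms(km):.2f} ms one-way, {2 * one_way_ms(km):.2f} ms round trip")
```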
These are one-way times for light itself. Double them for round-trip. Then add everything else.
The “Everything Else” Tax
Those numbers assume a straight line through perfect fiber with zero processing overhead. Reality is messier. Here’s what actually happens to your database query crossing the continent:
Serialization overhead: Your query object must be serialized to bytes (typically 0.01-0.1ms for small queries, but can be milliseconds for large payloads).
TCP handshake: Before any data flows, TCP requires a three-way handshake. The client can piggyback its first bytes on the final ACK, so in practice the handshake costs about one full round trip; on a real SF-to-NYC path, that’s 60-75ms before a single byte of actual data arrives[2].
TLS handshake: If you’re using encryption (and you should be), the TLS handshake adds one more round trip under TLS 1.3, or two under TLS 1.2. Budget roughly another 60-150ms on that same path[3].
Router hops: Your packet doesn’t travel in a straight line. It hops through 10-30 routers between datacenters, each adding queuing and processing delay, and the routed path is longer than the straight-line distance. These effects add up to 5-20ms in aggregate.
Switch backplane latency: Each switch your packet traverses adds 5-50 microseconds. In a large datacenter, your packet might traverse a dozen switches before reaching the destination rack.
Congestion and buffering: When networks get busy, routers queue packets. This is the most variable component—under light load it’s negligible, under heavy load it can add 10-100ms[4].
Protocol overhead: HTTP/2 framing, TCP acknowledgments, retransmissions for lost packets—each adds latency.
Let’s be concrete. Here’s the realistic end-to-end latency for a simple database query at different points on our spectrum:
Same process (embedded database):
Wire time: 0ms (no network)
Processing: 0.01-1ms (depends on query complexity)
Total: 0.01-1ms
Same rack:
Wire time: ~0.0001ms (negligible)
TCP overhead: 0.1ms (connection reuse helps here)
Processing: 0.5ms
Total: ~0.6ms
Same datacenter, different rack:
Wire time: 0.001ms
TCP overhead: 0.15ms
Switch hops: 0.05ms
Processing: 0.5ms
Total: ~0.7ms
Same region, different AZ:
Wire time: 0.1ms
TCP overhead: 0.2ms
Router hops: 1ms
Processing: 0.5ms
Total: ~1.8ms
Cross-continent (SF to NYC):
Wire time: 41ms (round trip)
TCP overhead: 2ms
Router hops: 8ms
Congestion (avg): 5ms
Processing: 0.5ms
Total: ~56ms (if connection is warm)
Total: ~136ms (if connection is cold and needs TCP+TLS handshake)
Transoceanic (SF to London):
Wire time: 94ms (round trip)
TCP overhead: 3ms
Router hops: 12ms
Subsea cable latency variation: 5-15ms
Congestion (avg): 5ms
Processing: 0.5ms
Total: ~120ms (warm connection)
Total: ~200ms (cold connection)
No amount of software optimization touches the wire time. You can squeeze the processing overhead, you can keep connections warm, you can use more efficient protocols—but you cannot make light travel faster.
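The SF-to-NYC budget above is just addition. A toy model, using the illustrative component values from this section (rough estimates, not measurements), makes the warm-versus-cold difference explicit:

```python
# Back-of-the-envelope latency budget for the SF-to-NYC query above.
# All component values are the illustrative figures from this section.
RTT_MS = 41.0  # round-trip wire time, SF <-> NYC

def query_latency_ms(cold_connection: bool) -> float:
    budget = RTT_MS        # request and response on the wire
    budget += 2.0          # TCP overhead on an established connection
    budget += 8.0          # router queuing and processing
    budget += 5.0          # average congestion delay
    budget += 0.5          # server-side processing
    if cold_connection:
        budget += RTT_MS   # TCP three-way handshake (~1 RTT before data flows)
        budget += RTT_MS   # TLS 1.3 handshake (~1 more RTT)
    return budget

print(f"warm connection: ~{query_latency_ms(cold_connection=False):.0f} ms")  # ~56 ms
print(f"cold connection: ~{query_latency_ms(cold_connection=True):.0f} ms")   # within a few ms of the ~136 ms above
```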
Bandwidth: The Economic Dimension of Distance
Latency is what users feel. Bandwidth is what you pay for.
It’s a common misconception that more bandwidth means lower latency. It doesn’t. Latency is how long it takes for a single bit to travel from source to destination. Bandwidth is how many bits per second the link can carry; multiply the two together and you get the bandwidth-delay product, the number of bits in flight at once. You can have high bandwidth and high latency (transcontinental fiber) or low bandwidth and low latency (same-rack copper).
Here’s why this matters for distributed systems: cross-region bandwidth is expensive, even when it’s fast.
Current cloud pricing (as of 2024-2025) for data transfer:
Within the same region:
AWS: $0.01/GB
GCP: $0.01/GB
Azure: Free (within same region)
Cross-region (same provider):
AWS US-East to US-West: $0.02/GB
GCP US to Europe: $0.05-0.08/GB
Azure US to Europe: $0.05/GB
Internet egress:
AWS to internet: $0.09-0.15/GB
GCP to internet: $0.08-0.12/GB
Azure to internet: $0.087-0.12/GB
Let’s model a system handling 1 billion requests per day, where each request involves 10KB of data:
Same-region deployment:
Data transfer: 10 TB/day × $0.01/GB = $100/day = $3,000/month
Multi-region with synchronous replication (3 regions):
Data transfer: 30 TB/day × $0.05/GB = $1,500/day = $45,000/month
That’s 15× more expensive just for the data transfer. And that’s before you factor in the additional compute for serialization, the additional storage for replicas, and the additional network engineering time.
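The arithmetic behind those two figures, as a tiny cost model you can re-run with your own volumes (request count, payload size, and per-GB prices are the assumptions from the scenario above):

```python
# Data-transfer cost model for the 1-billion-requests-per-day scenario above.
REQUESTS_PER_DAY = 1_000_000_000
PAYLOAD_KB = 10

def monthly_transfer_cost(regions: int, price_per_gb: float, days: int = 30) -> float:
    gb_per_day = REQUESTS_PER_DAY * PAYLOAD_KB / 1_000_000  # KB -> GB
    return gb_per_day * regions * price_per_gb * days

same_region = monthly_transfer_cost(regions=1, price_per_gb=0.01)
three_regions = monthly_transfer_cost(regions=3, price_per_gb=0.05)
print(f"same region:   ${same_region:,.0f}/month")           # $3,000
print(f"three regions: ${three_regions:,.0f}/month")          # $45,000
print(f"multiplier:    {three_regions / same_region:.0f}x")   # 15x
```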
The bandwidth constraint creates a different kind of locality pressure than latency does. Latency says “put data close to where it’s queried.” Bandwidth says “don’t move data unless you have to.”
The Tail at Scale
Here’s the cruelest aspect of distributed systems: average latency doesn’t matter. Users experience the tail.
If 99% of your queries complete in 10ms but 1% take 500ms, and your typical web page makes 50 backend calls, what’s the user experience?
The probability that all 50 calls hit the fast path is 0.99^50 ≈ 60%. That means roughly 40% of page loads will include at least one slow query: for a large share of your users, page load time is set by your P99 query latency, not by your averages[5].
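That 40% figure falls straight out of the fan-out math; the quick check below also shows how fast things degrade as pages make more backend calls (50 calls and a 1% slow rate are the assumptions from the example above):

```python
# Probability that a page load with n backend calls includes at least one
# call slower than the P99 threshold (i.e., at least one "slow" call).
def p_any_slow(n_calls: int, p_slow: float = 0.01) -> float:
    return 1 - (1 - p_slow) ** n_calls

for n in (1, 10, 50, 100):
    print(f"{n:3d} calls -> {p_any_slow(n):.0%} of page loads hit the tail")
# 1 -> 1%, 10 -> 10%, 50 -> 39%, 100 -> 63%
```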
This is “the tail at scale” problem, and geography makes it worse. Consider a multi-region database with three replicas:
Local replica: P50 = 5ms, P99 = 15ms
Cross-region replica: P50 = 60ms, P99 = 200ms
If you’re doing quorum reads (must read from 2 of 3 replicas), your latency is determined by the second-fastest response. If one replica is cross-region, you’re paying the geography tax on every quorum read that doesn’t get lucky with replica selection.
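A small Monte Carlo sketch makes the “lucky with replica selection” point concrete. The lognormal latency distributions are an assumption, fitted to the P50/P99 figures above; the comparison is between a coordinator that fans out to all three replicas and one that happens to pick the remote replica as one of its two targets:

```python
import math
import random

# Quorum read (2 of 3) latency when one replica lives in another region.
# Lognormal latencies fitted to the P50/P99 figures above (an assumption).
def sample_ms(p50: float, p99: float) -> float:
    mu = math.log(p50)
    sigma = math.log(p99 / p50) / 2.326  # 2.326 = z-score of the 99th percentile
    return random.lognormvariate(mu, sigma)

def quorum_read_ms(targets) -> float:
    # The read completes when the second-fastest contacted replica responds.
    return sorted(sample_ms(p50, p99) for p50, p99 in targets)[1]

LOCAL, REMOTE = (5, 15), (60, 200)
lucky = sorted(quorum_read_ms([LOCAL, LOCAL, REMOTE]) for _ in range(50_000))
unlucky = sorted(quorum_read_ms([LOCAL, REMOTE]) for _ in range(50_000))
print(f"fan out to all three, P50:  {lucky[25_000]:.0f} ms")    # mostly served locally
print(f"one local, one remote, P50: {unlucky[25_000]:.0f} ms")  # pays the geography tax
```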
Now add cascading failures. When one region starts running hot, it slows down. Clients time out and retry. The retries add load, slowing things further. The slow region starts failing health checks and gets removed from the load balancer, so the remaining regions absorb even more load. This is how a localized latency spike becomes a multi-region outage[6].
The tail behavior of distance-related latency is particularly nasty because it’s unpredictable. A packet taking the “wrong” route through the internet backbone, a brief spike in cross-region traffic, a BGP flap—any of these can cause a latency outlier. In a same-rack deployment, your latency variance is microseconds. Cross-region, it’s tens or hundreds of milliseconds.
Packet Loss: The Probability of Silence
Latency is what happens when things work. Packet loss is what happens when they don’t.
Typical packet loss rates by distance:
Same rack: 0.001% (1 in 100,000 packets)
Same datacenter: 0.01% (1 in 10,000 packets)
Same region: 0.1% (1 in 1,000 packets)
Cross-region: 0.5-2% (5-20 in 1,000 packets)
Transoceanic: 1-5% (10-50 in 1,000 packets)
These seem like small numbers until you consider what happens when TCP encounters packet loss. TCP interprets loss as congestion and cuts its congestion window in half, so every detected loss roughly halves your sending rate, which then has to climb back up[7].
For a cross-continental connection losing 1% of packets, you might see:
Average throughput: 70-80% of theoretical maximum
P99 request latency: 2-3× the baseline (due to retransmits)
Connection stalls: occasional multi-second freezes when multiple retransmits are needed
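A single long-lived TCP flow’s throughput on a lossy path is roughly bounded by the classic approximation attributed to Mathis et al.: throughput ≈ (MSS/RTT) × 1.22/√p. A quick sketch (the MSS and RTT values are illustrative assumptions) shows why 1% loss barely matters inside a region but cripples a cross-continental flow; high-volume services work around this by spreading traffic across many parallel connections:

```python
import math

# Single-flow TCP throughput bound (Mathis approximation):
#   throughput <= (MSS / RTT) * (1.22 / sqrt(loss_rate))
# MSS and RTT values here are illustrative assumptions.
def tcp_flow_mbps(rtt_ms: float, loss_rate: float, mss_bytes: int = 1460) -> float:
    bytes_per_s = (mss_bytes / (rtt_ms / 1_000)) * (1.22 / math.sqrt(loss_rate))
    return bytes_per_s * 8 / 1_000_000

print(f"same region, 1 ms RTT, 0.1% loss:    {tcp_flow_mbps(1, 0.001):,.0f} Mbps")
print(f"cross-continent, 70 ms RTT, 1% loss: {tcp_flow_mbps(70, 0.01):,.1f} Mbps")
```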
This is why UDP-based protocols like QUIC have become popular for long-distance communication—they can handle packet loss more gracefully than TCP’s conservative approach[8].
Distance Affects Everything, Not Just Queries
We’ve focused on database queries, but distance impacts every distributed operation:
Service mesh health checks: If services need to health-check across regions, you’re burning CPU and network capacity on cross-region heartbeats every second.
Distributed locks: Consensus protocols (Raft, Paxos) need at least one round trip to a majority of nodes for every commit, and more during leader changes or conflicts. Keep that majority inside a region and a commit costs a millisecond or two; stretch it across regions and every commit pays tens of milliseconds (see the sketch after this list).
Cache invalidation: Sending cache invalidation messages across regions is slow. By the time the invalidation arrives, the stale data might have been read thousands of times.
Log shipping: Streaming logs across regions for observability means you’re paying both the latency tax (logs arrive delayed) and the bandwidth tax (logs are typically high-volume).
Backup and disaster recovery: Taking a backup from one region and shipping it to another for DR means moving hundreds of gigabytes or terabytes across expensive, high-latency links.
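Here is the consensus point from the list above as a rough sketch. In a Raft-style protocol the leader can commit once a majority of the cluster (itself included) has the entry, so per-commit latency is roughly one round trip to the closest followers that complete that majority. The node placement and one-way latencies below are illustrative assumptions drawn from the figures earlier in this chapter, and disk and processing time are ignored:

```python
# Wire-time-only commit latency for a Raft-style leader: one round trip to
# the closest followers needed to complete a majority. Placement and one-way
# latencies (ms) are illustrative; fsync and processing time are ignored.
def commit_latency_ms(follower_one_way_ms: list[float]) -> float:
    n = len(follower_one_way_ms) + 1       # cluster size, leader included
    majority = n // 2 + 1                  # votes needed to commit
    acks_needed = majority - 1             # the leader's own vote is free
    follower_rtts = sorted(2 * d for d in follower_one_way_ms)
    return follower_rtts[acks_needed - 1]  # wait for the last ack we need

cross_az = [0.05, 0.05]      # 3 nodes, one per AZ, ~10 km apart
cross_region = [20.5, 47.0]  # 3 nodes: leader in SF, followers in NYC and London
print(f"cross-AZ commit (wire time only):     ~{commit_latency_ms(cross_az):.2f} ms")
print(f"cross-region commit (wire time only): ~{commit_latency_ms(cross_region):.1f} ms")
```

Every cross-region commit in that placement pays at least the 41ms SF-to-NYC floor; a leader failover or a straggling follower pushes it toward the 94ms SF-to-London path instead.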
Reframing Distributed Design as Applied Physics
Here’s the uncomfortable truth: distributed systems design is not primarily a software engineering problem. It’s a physics problem with a software interface.
When you design a distributed system, you’re not really choosing algorithms and data structures. You’re choosing which physical constraints to accept and which to fight against. Every architectural decision is a bet on physics:
“We’ll use synchronous replication across three regions” = We’re willing to pay 100-200ms of latency on every write in exchange for not having to think about eventual consistency.
“We’ll cache aggressively at the edge” = We’re willing to serve stale data sometimes in exchange for avoiding the 50-150ms cross-continent round trip.
“We’ll shard by geography and pin users to their home region” = We’re willing to complicate our routing logic and deal with cross-shard queries in exchange for keeping most operations local.
“We’ll embed the database in the application” = We’re willing to deal with write amplification and complex state reconciliation in exchange for eliminating network latency entirely.
None of these choices is “correct” in the abstract. They’re all trade-offs between different physical constraints: latency vs. consistency, bandwidth costs vs. operational complexity, storage redundancy vs. write amplification.
The systems that succeed are the ones that explicitly acknowledge these constraints and design around them, rather than pretending they can be optimized away. You cannot optimize away the speed of light. You cannot optimize away packet loss on transoceanic cables. You cannot optimize away the bandwidth costs of replicating terabytes across regions.
What you can do is architect systems that minimize unnecessary distance, accept necessary distance where it provides value, and have graceful degradation when distance inevitably causes problems.
The Path Forward
In Chapter 1, we established the data-locality spectrum. In this chapter, we’ve quantified its costs. The numbers are unforgiving: every hop across a network boundary adds milliseconds; every cross-region link adds tens or hundreds of milliseconds; every reliability mechanism adds retries and exponential backoff.
But here’s the interesting question: if the constraints are immutable, can the architecture be adaptive? If data access patterns change—if your European users suddenly spike, if your US users drop off, if a new feature makes certain queries hot—can your data placement evolve to match?
The traditional answer has been “no, pick a topology and live with it.” You architect for your expected distribution of traffic, provision accordingly, and hope you got it right. If you didn’t, you’re stuck with expensive re-sharding or slow queries.
But what if the answer could be “yes”?
In the next chapter, we’ll explore the architectures that push data-locality to its logical extreme: systems where the database lives inside the application, where every query is a local operation, and where network failures are theoretically impossible. We’ll see what you gain—and what you lose—when you refuse to accept any distance at all.
References
[1] P. A. Humblet and S. R. Azzouz, “Performance Analysis of Optical Fiber Communication Systems,” IEEE Journal on Selected Areas in Communications, vol. 4, no. 9, pp. 1547-1556, 1986.
[2] V. Jacobson, “Congestion Avoidance and Control,” ACM SIGCOMM Computer Communication Review, vol. 18, no. 4, pp. 314-329, 1988.
[3] R. Lychev et al., “Quantifying the Latency Overhead of TLS,” Proc. IEEE Conference on Computer Communications, pp. 1-9, 2015.
[4] K. Nichols and V. Jacobson, “Controlling Queue Delay,” Communications of the ACM, vol. 55, no. 7, pp. 42-50, 2012.
[5] J. Dean and L. A. Barroso, “The Tail at Scale,” Communications of the ACM, vol. 56, no. 2, pp. 74-80, 2013.
[6] M. Chow et al., “The Mystery Machine: End-to-end Performance Analysis of Large-scale Internet Services,” Proc. 11th USENIX Symposium on Operating Systems Design and Implementation, pp. 217-231, 2014.
[7] M. Allman, V. Paxson, and E. Blanton, “TCP Congestion Control,” IETF RFC 5681, 2009.
[8] J. Iyengar and M. Thomson, “QUIC: A UDP-Based Multiplexed and Secure Transport,” IETF RFC 9000, 2021.
Next in this series: Chapter 3 - Locality and the Edge, where we’ll examine what happens when you refuse to accept any distance at all—and discover why “zero latency” creates its own set of impossible problems.

