Last month I wrote a blog post about implementing an LRU cache after working on a LeetCode problem. It seemed like a worthwhile implementation to know, so I wrote it up to help me memorize the concept. Now that those thoughts have settled, I have been wondering: where would I actually use this in modern cloud systems designed for scalability with stateless servers?

Here’s what I figured out.

The Stateless Server Problem

In a world of horizontally scaled, stateless servers, you can’t just cache things on individual instances and call it a day. If requests are load-balanced across multiple servers, each one having its own isolated cache creates more problems than it solves. Different instances would have different caches, leading to inconsistent performance and lower hit rates.

So where does LRU caching actually fit?

Per-Request Caching (Where You Might Implement It)

This is the one place where implementing LRU yourself might actually make sense in application code. Per-request caching lives only for the duration of a single request and gets discarded when you’re done. Some examples that come to mind are:

  • Memoizing a function result that might be called multiple times during one request
  • Caching a database lookup you need to reference several times while processing
  • Deduplicating API calls within a single operation

Since this cache doesn’t survive past the request, it doesn’t violate statelessness.
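Here is a minimal sketch of what that can look like in Python. The handle_request and fetch_user names are placeholders I am inventing for illustration; the point is that the cache (here Python's built-in functools.lru_cache) is created inside the handler and thrown away as soon as the request finishes.

from functools import lru_cache

def handle_request(user_ids):
    # Hypothetical request handler. The cache below is created per request
    # and discarded when the handler returns, so no cross-request state is kept.

    @lru_cache(maxsize=128)            # stdlib LRU cache, bounded to 128 entries
    def fetch_user(user_id):
        # Stand-in for a database or downstream API call.
        print(f"fetching {user_id}")
        return {"id": user_id, "name": f"user-{user_id}"}

    # Duplicate ids only trigger one "fetch" each within this request.
    return [fetch_user(uid) for uid in user_ids]

if __name__ == "__main__":
    handle_request([1, 2, 1, 3, 2])    # fetches 1, 2, and 3 once each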

Cross-Request Caching (Where You Don’t Implement It)

For caching that spans multiple requests in a multi-instance architecture, you need shared infrastructure:

  • Shared distributed caches like Redis or Memcached - so all instances see the same cache
  • Client-side caching - where the caller maintains the cache
  • CDN/proxy layer caching - before requests even reach your servers

Individual server caching across requests only makes sense in rare cases: single-instance deployments (rare in production), truly immutable data like configuration, or when you’re willing to accept inconsistency for performance gains.

Redis

So now I am wondering: what do I need to know about LRU caching when using Redis?

LRU eviction is built into Redis. You just set a memory limit and choose an eviction policy (allkeys-lru applies LRU across all keys; volatile-lru only evicts keys that have a TTL set):

maxmemory 2gb
maxmemory-policy allkeys-lru

Redis handles all the eviction logic automatically. As a developer, you’re just:

  • Setting the eviction policy (one-time configuration)
  • Setting appropriate TTLs on keys
  • Choosing what to cache and structuring keys effectively
  • Monitoring cache hit rates

The actual LRU algorithm is Redis’s job.
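To make that division of labor concrete, here is a rough cache-aside sketch using the redis-py client. The product key format, the five-minute TTL, and the load_product function are assumptions for illustration; the eviction itself still happens inside Redis under whatever maxmemory-policy is configured.

import json
import redis

r = redis.Redis(host="localhost", port=6379)    # assumes a local Redis instance

def load_product(product_id):
    # Stand-in for the real data source (database, downstream service, etc.).
    return {"id": product_id, "name": f"product-{product_id}"}

def get_product(product_id):
    key = f"product:{product_id}"               # structured, predictable key
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)               # cache hit

    product = load_product(product_id)          # cache miss: go to the source
    r.setex(key, 300, json.dumps(product))      # store with a 5-minute TTL
    return product

def hit_rate():
    # Rough hit-rate check from Redis's own counters (INFO stats).
    stats = r.info("stats")
    hits, misses = stats["keyspace_hits"], stats["keyspace_misses"]
    return hits / (hits + misses) if hits + misses else 0.0

Notice that nothing in this code decides what to evict; that decision stays inside Redis.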

Where LRU Still Lives (Even If You Don’t Write It)

Even though we don’t implement it ourselves, LRU is everywhere in the infrastructure we use:

  • Distributed caches - Redis and Memcached use LRU-like policies to manage finite memory
  • CDN and edge caching - Content delivery networks use LRU variants at edge locations to decide what to keep cached
  • Database query caches - Even cloud-managed databases like RDS use LRU internally to keep hot data in memory
  • Client-side caching - Browsers, mobile apps, and API clients use LRU to minimize network requests
  • Serverless functions - Individual function instances can cache data between invocations using LRU during the container’s lifetime
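The serverless case is worth a quick sketch. In Python, anything defined at module level, including a functools.lru_cache, survives across warm invocations of the same container and disappears when the container is recycled. The get_config name and the handler shape below are assumptions for illustration, not any particular provider's API.

from functools import lru_cache

@lru_cache(maxsize=64)
def get_config(name):
    # Stand-in for a slow lookup (parameter store, secrets manager, etc.).
    print(f"loading {name}")
    return {"name": name, "value": "..."}

def handler(event, context):
    # Module-level state persists while this container stays warm,
    # so repeat invocations reuse the cached entries above.
    cfg = get_config("feature-flags")
    return {"statusCode": 200, "flags": cfg["value"]}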

The Real Lesson

Working through this reminded me that a lot of the classic computer science algorithms we learn are already baked into the tools we use daily. The skill isn’t necessarily implementing LRU from scratch - it’s knowing:

  • Which tool uses which strategy
  • When to reach for that tool
  • How to configure it properly

That LeetCode problem was still worth doing. Understanding how LRU works improved my decision-making about when and where to apply it. But in practice, I would be more likely to be configuring Redis than writing a doubly-linked list with a hash map.

An exception would be working a level deeper, on the Redis codebase itself or on other infrastructure tooling. In that case, a deep understanding of the implementation would be essential.

The real implementation opportunities for LRU are in specialized cases: custom application caches where existing tools don’t fit, client-side caching logic, or per-request memoization.

With these two short blog posts completed, I am grateful for the LeetCode problem, as it motivated me to expand my understanding of caching and the technologies we use to support cloud systems at scale.

Cheers!
