Engineering · 7 min read · March 3, 2026

Web Caching Strategies: HTTP Cache, CDN, and Application Cache

A cached page is the fastest page you can serve. Here's a practical guide to HTTP caching headers, CDN configuration, and application-level caching strategies that actually work.

James Ross Jr.

Strategic Systems Architect & Enterprise Software Developer

The Fastest Request Is the One You Don't Make

Caching is the highest-leverage performance optimization in web development. A cached response served from browser memory takes microseconds. A cached response from a CDN edge node might take 10-30ms. An uncached response from an origin server takes hundreds of milliseconds or more, and it consumes server resources on every request.

The hierarchy of caching from fastest to slowest:

  1. Memory cache (browser in-memory cache) — near zero latency
  2. Disk cache (browser disk cache) — single-digit milliseconds
  3. Service worker cache — varies, can be near memory cache speed
  4. CDN edge cache — 10-50ms from nearby nodes
  5. Application cache (Redis, Memcached) — 5-20ms network hop
  6. Database query cache — depends on query complexity
  7. Full database query — 10-500ms+ depending on complexity

Effective caching strategy means pushing as many requests as possible to higher cache layers.


HTTP Cache Headers

HTTP caching is controlled by headers that you set on your server responses. Understanding these headers is the foundation of web caching.

Cache-Control is the primary caching directive. The values that matter most:

Cache-Control: public, max-age=31536000, immutable

Use this for versioned static assets (JS bundles, CSS files, images with content-hash filenames). max-age=31536000 tells browsers and CDNs to cache this resource for one year. immutable tells the browser not to revalidate the cache even when the user forces a refresh. This is correct when the filename changes on each build (content hashing) — the old URL is cached forever, and the new URL is fetched fresh.

Cache-Control: no-cache

Counterintuitively, no-cache doesn't mean "don't cache." It means "always revalidate before serving from cache." The browser caches the response but checks with the server on every request. If the server returns 304 Not Modified, the browser uses the cached version. If the content changed, it downloads the new version. Use this for HTML documents.

Cache-Control: no-store

This means truly don't cache: not in browser, not in CDN, not anywhere. Use this for sensitive data (account details, private documents, authentication responses).

Cache-Control: private, max-age=3600

private means only the browser can cache this, not CDNs. Use this for authenticated content that is user-specific but not sensitive enough to warrant no-store.
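Taken together, these directives map naturally onto URL patterns. A minimal sketch of that mapping (the paths and rules here are illustrative, not prescriptive — adapt them to your own routes):

```typescript
// Sketch: choose a Cache-Control value per URL pattern, mirroring the
// directives described above.
function cacheControlFor(path: string): string {
  // Content-hashed static assets: cache for a year, never revalidate.
  if (/^\/assets\/.*\.[0-9a-f]{8,}\./.test(path)) {
    return "public, max-age=31536000, immutable";
  }
  // Authentication endpoints: never stored anywhere.
  if (path.startsWith("/auth/")) {
    return "no-store";
  }
  // Authenticated, user-specific but non-sensitive API responses:
  // browser-only cache, one hour.
  if (path.startsWith("/api/account")) {
    return "private, max-age=3600";
  }
  // HTML documents: cache, but revalidate on every request.
  return "no-cache";
}
```

A middleware that sets this header on every response is then a one-liner in whatever server framework you use.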

ETag is a fingerprint of the response content, used for conditional requests:

ETag: "33a64df551425fcc55e4d42a148795d9f25f89d4"

On subsequent requests, the browser sends If-None-Match: "33a64df...". If the content hasn't changed, the server responds with 304 Not Modified (no body, very fast). If it has changed, the server sends the new content with a new ETag.

Last-Modified works similarly to ETag but uses timestamps. ETags are generally preferred because HTTP dates have only one-second precision, which creates edge cases when a resource changes twice within the same second.
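The conditional-request flow can be sketched as a pair of functions: a strong ETag derived from the response body, and the server-side check against If-None-Match. Function names here are illustrative:

```typescript
import { createHash } from "node:crypto";

// A strong ETag: a quoted hash of the response body.
function etagFor(body: string): string {
  return `"${createHash("sha1").update(body).digest("hex")}"`;
}

// The server's conditional check: 304 when the client's cached copy
// is still current, 200 (with a full body) otherwise.
function conditionalStatus(body: string, ifNoneMatch?: string): 200 | 304 {
  return ifNoneMatch === etagFor(body) ? 304 : 200;
}
```

In practice most frameworks and reverse proxies generate and check ETags for you; the sketch just shows what they are doing under the hood.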


CDN Configuration

A CDN (Content Delivery Network) is a distributed network of servers that cache your content close to users geographically. The three things you need to configure correctly:

Cache rules by URL pattern. Your CDN needs to know what to cache and for how long. Typical configuration:

  • /assets/* (hashed filenames) → cache forever, respect origin Cache-Control headers
  • /images/* → cache for 7-30 days, convert to WebP if requested
  • /_nuxt/*, /_next/* → cache forever (framework assets with content hashes)
  • /api/* → don't cache (or cache very selectively with short TTLs)
  • /* (HTML) → honor the origin's no-cache policy (revalidate on every request, but serve stale content if the origin is down)

Cache invalidation. When you deploy a new version, how does the CDN know to serve the new HTML? Options:

  • Purge by URL: explicitly tell the CDN to invalidate specific URLs after deployment
  • Stale-while-revalidate: serve the stale version immediately while fetching the new version in the background
  • Cache-busting URLs: include a deployment ID in your HTML URL (non-standard, messy)

Most production deployments purge HTML files on deploy (via CDN API in the deployment pipeline) and rely on content-hashed assets for everything else.
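As one concrete example, Cloudflare exposes a cache-purge endpoint that a deploy pipeline can call after a release; other CDNs offer similar APIs. ZONE_ID, API_TOKEN, and the URLs below are placeholders:

```shell
# Purge specific HTML entry points after a deploy.
# ZONE_ID and API_TOKEN come from your CDN account; the URLs are examples.
curl -X POST \
  "https://api.cloudflare.com/client/v4/zones/$ZONE_ID/purge_cache" \
  -H "Authorization: Bearer $API_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"files": ["https://example.com/", "https://example.com/app"]}'
```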

Vary header for content negotiation. If you serve different content based on request headers (e.g., different image formats based on Accept: image/avif), you need to configure the CDN to cache separate versions:

Vary: Accept-Encoding, Accept

Without this, the CDN might serve a WebP image to a browser that requested AVIF, or GZIP-compressed content to a client that can't decompress it.
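The negotiation itself might look like this hypothetical helper — exactly the kind of per-request branching that makes Vary: Accept necessary, because the response now depends on a request header:

```typescript
// Sketch: pick the best image format the client advertises support for.
// Responses produced this way must be cached per Accept value.
function pickImageFormat(accept: string): "avif" | "webp" | "jpeg" {
  if (accept.includes("image/avif")) return "avif";
  if (accept.includes("image/webp")) return "webp";
  return "jpeg"; // universally supported fallback
}
```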


Application-Level Caching

When your API endpoints are slow because of database queries, application-level caching (Redis, Memcached) reduces the database load and improves response times.

The caching patterns:

Cache-aside (lazy loading): Check the cache first. If not found, query the database, store the result in cache, return the result.

async function getUser(userId: string) {
  const cacheKey = `user:${userId}`

  // Check the cache first
  const cached = await redis.get(cacheKey)
  if (cached) return JSON.parse(cached)

  // Cache miss: fall back to the database
  const user = await db.users.findUnique({ where: { id: userId } })

  // Populate the cache for subsequent requests; skip misses so a
  // missing user isn't cached as the string "null"
  if (user) await redis.set(cacheKey, JSON.stringify(user), 'EX', 300) // 5 min TTL
  return user
}

Write-through: Update the cache when you update the database, so the cache is always current.
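A write-through update can be sketched like this. To keep the example self-contained, two Maps stand in for the cache and the database, and the function names are illustrative:

```typescript
// Stand-ins for Redis and the database, so the sketch runs anywhere.
const cache = new Map<string, string>();
const dbStore = new Map<string, string>();

function saveToDb(key: string, value: object): void {
  dbStore.set(key, JSON.stringify(value));
}

// Write-through: every write updates the source of truth AND the cache
// in the same operation, so reads never see a stale entry.
function updateUserWriteThrough(userId: string, user: object): void {
  const key = `user:${userId}`;
  saveToDb(key, user);                  // write to the source of truth…
  cache.set(key, JSON.stringify(user)); // …and keep the cache current
}
```

The trade-off versus cache-aside: reads are always warm, but every write pays the cost of a cache update, even for data that is never read again.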

Cache invalidation: The hard problem. When a user's record updates, you need to invalidate the cached version. Options: TTL-based expiry (accept some staleness), explicit invalidation on write (update the cache when you update the database), or event-driven invalidation.

What's worth caching:

  • Query results that are expensive to compute and infrequently changing (user preferences, product catalogs)
  • API responses that aggregate data from multiple database queries
  • Session data
  • Rate limit counters (Redis TTL makes this natural)
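The rate-limit case deserves a sketch: Redis makes fixed-window counting natural with INCR plus EXPIRE. Here a Map stands in for Redis so the example is self-contained:

```typescript
// Sketch: fixed-window rate limiting. In Redis this would be INCR on a
// per-client key with EXPIRE setting the window; a Map emulates that here.
const counters = new Map<string, { count: number; resetAt: number }>();

function allowRequest(clientId: string, limit = 100, windowMs = 60_000): boolean {
  const now = Date.now();
  const entry = counters.get(clientId);
  if (!entry || now >= entry.resetAt) {
    // New window: start the counter; it "expires" at resetAt
    counters.set(clientId, { count: 1, resetAt: now + windowMs });
    return true;
  }
  entry.count += 1;
  return entry.count <= limit;
}
```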

What's not worth caching:

  • Simple primary-key lookups (fast enough without cache, complex invalidation)
  • User-specific data that changes frequently (cache hit rate will be low)
  • Data with complex invalidation logic that's error-prone to implement correctly

Stale-While-Revalidate

The stale-while-revalidate cache directive is one of the most useful modern caching primitives. It allows serving a stale cached response immediately while refreshing the cache in the background:

Cache-Control: max-age=60, stale-while-revalidate=300

This means: serve this response from cache for 60 seconds. After 60 seconds, if the cache is stale, serve the stale version immediately (for up to 5 minutes) while fetching a fresh version in the background. The user gets a fast response; the cache gets updated asynchronously.

This is particularly valuable for data that changes periodically (news feeds, product listings) where you want performance without excessive staleness. The user never waits for a cache miss; they might see data that's a few minutes old, but the next request will have fresh data.
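The behavior is easy to model in code. Below is a minimal, illustrative stale-while-revalidate cache; for simplicity it refreshes synchronously within the call rather than in a true background task, and it takes the current time as a parameter so the windows are explicit:

```typescript
type Entry<T> = { value: T; storedAt: number };

// Minimal SWR cache: fresh within maxAgeMs, served-stale-then-refreshed
// within maxAgeMs + swrMs, refetched (blocking) after that.
function makeSwrCache<T>(fetcher: () => T, maxAgeMs: number, swrMs: number) {
  let entry: Entry<T> | undefined;
  return (now: number): T => {
    if (!entry) {
      entry = { value: fetcher(), storedAt: now }; // cold miss: fetch and wait
      return entry.value;
    }
    const age = now - entry.storedAt;
    if (age <= maxAgeMs) return entry.value;       // fresh: serve from cache
    if (age <= maxAgeMs + swrMs) {
      const stale = entry.value;                   // stale: serve immediately…
      entry = { value: fetcher(), storedAt: now }; // …and revalidate
      return stale;
    }
    entry = { value: fetcher(), storedAt: now };   // too old: fetch and wait
    return entry.value;
  };
}
```

With maxAge=60s and swr=300s this reproduces the Cache-Control example above: the caller never waits during the stale window, and the refresh happens on their request's dime only in the simplified synchronous version.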


Service Workers for Offline and Advanced Caching

Service workers are JavaScript processes that run in the browser background and can intercept network requests. They enable:

  • Offline support (serve cached content when offline)
  • Custom caching strategies per request type
  • Background sync for deferred network operations

For most web applications, service workers are overkill for pure performance optimization — HTTP caching and CDN are more straightforward and easier to reason about. Service workers become valuable for PWAs that need offline support or apps with very specific per-resource caching requirements.


Caching is one of the oldest performance techniques in the web developer's toolkit, and it's still the most impactful. Getting HTTP headers right, configuring the CDN correctly, and adding application-level caching where it matters can cut your server load and page load times dramatically. If you're diagnosing performance issues and want to evaluate your caching strategy, book a call at calendly.com/jamesrossjr.

