Website Speed Optimization: Beyond the Basics
You've compressed images and minified scripts. Here are the advanced optimization techniques that separate fast sites from truly fast sites.
Strategic Systems Architect & Enterprise Software Developer
You Already Know the Basics
Every web performance guide starts with the same advice: compress images, minify CSS and JavaScript, enable gzip/brotli compression, use a CDN. If you are reading this article, you have probably done those things already — and you are wondering why your site still is not as fast as you want it to be.
The basics matter. But they are table stakes. Once images are compressed, scripts are minified, and a CDN is in place, the remaining performance gains come from deeper architectural decisions: how your server generates responses, how the browser prioritizes resource loading, how your JavaScript executes, and how your caching strategy evolves from "cache everything" to nuanced per-resource policies.
These optimizations require understanding the browser's rendering pipeline at a level most developers do not engage with. They require profiling real user sessions, not just running Lighthouse. And they require making tradeoffs — some optimizations improve one metric while degrading another, and knowing which metric matters more for your specific application is judgment work, not tooling work.
Server-Side Speed: The Forgotten Layer
Frontend performance optimization gets the most attention, but your server response time sets the floor for how fast anything can be. If your server takes 800ms to generate the HTML document, no amount of frontend optimization can achieve a sub-1-second Largest Contentful Paint.
Measure Time to First Byte (TTFB) across your key pages. TTFB includes DNS resolution, TCP connection, TLS handshake, and server processing time. The connection overhead is largely addressed by a CDN and HTTP/2 connection reuse; the server processing time is what you control.
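In the field, TTFB can be read from the Navigation Timing API. A minimal sketch; the `timeToFirstByte` helper name is mine, not a platform API:

```javascript
// Derive TTFB from a PerformanceNavigationTiming entry.
// In a browser: const [nav] = performance.getEntriesByType('navigation');
function timeToFirstByte(nav) {
  // responseStart marks the arrival of the first response byte;
  // startTime is the navigation start (0 for the main document).
  return nav.responseStart - nav.startTime;
}

// Example with a mocked entry representing an 820 ms TTFB:
timeToFirstByte({ startTime: 0, responseStart: 820 }); // → 820
```

Collecting this value from real user sessions (rather than lab runs) shows how TTFB varies by geography and connection quality.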
For server-rendered applications (SSR with Nuxt, Next.js, etc.), profiling the server rendering path reveals optimization opportunities. Common bottlenecks: database queries that execute serially when they could run in parallel, API calls to external services that block rendering, template rendering that computes values already available in cache, and missing database indexes on frequently queried fields.
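The serial-query bottleneck in particular has a mechanical fix: start independent queries together. A sketch, where `db.fetchUser` and `db.fetchPosts` are hypothetical stand-ins for real data loaders:

```javascript
// Serial awaits: each query waits for the previous one to finish.
async function loadPageDataSerial(db, id) {
  const user = await db.fetchUser(id);   // e.g. 80 ms
  const posts = await db.fetchPosts(id); // e.g. 120 ms
  return { user, posts };                // roughly 200 ms total
}

// Parallel: independent queries start at the same time, so total
// latency is bounded by the slowest query, not the sum.
async function loadPageDataParallel(db, id) {
  const [user, posts] = await Promise.all([
    db.fetchUser(id),
    db.fetchPosts(id),
  ]);
  return { user, posts };                // roughly 120 ms total
}
```

This only applies when the queries genuinely do not depend on each other's results; dependent queries must stay sequential.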
Implement response caching at the server level. Full-page caching with a CDN like Cloudflare can serve cached HTML responses in under 50ms globally. For dynamic pages, use stale-while-revalidate cache policies that serve cached content immediately and refresh the cache in the background. Nuxt's route rules allow per-route caching configuration — static marketing pages can be cached for hours while dashboard pages bypass the cache entirely.
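Per-route rules might be sketched like this in nuxt.config (option names follow Nuxt 3's routeRules; the paths and TTLs are illustrative, not taken from any real project):

```javascript
// nuxt.config.js — a sketch of per-route caching configuration
export default defineNuxtConfig({
  routeRules: {
    // Static marketing pages: serve from cache for an hour,
    // refreshing in the background (stale-while-revalidate).
    '/': { swr: 3600 },
    '/pricing': { swr: 3600 },
    // Dashboard pages: always rendered fresh, never cached.
    '/dashboard/**': { headers: { 'cache-control': 'no-store' } },
  },
});
```

Check the exact option semantics against the Nuxt documentation for your version; the useful property is that caching policy lives in one declarative place rather than scattered across handlers.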
Edge computing pushes server logic closer to users. Running your application on Cloudflare Workers or similar edge platforms reduces the physical distance between server and client, cutting 100-300ms of network latency for geographically distributed users. This is not a marginal improvement — for users on the other side of the world from your origin server, it can halve the TTFB.
Resource Loading Priority
The browser loads resources in a priority order that may not match your application's actual priorities. Understanding and influencing this order produces significant performance improvements without changing a single line of application code.
Preconnect to critical third-party origins. If your page loads fonts from Google Fonts and analytics from a third-party domain, the browser must establish separate connections (DNS + TCP + TLS) to each origin. Each connection takes 100-300ms. <link rel="preconnect"> starts these connections early:
<link rel="preconnect" href="https://fonts.googleapis.com" />
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />
Preload critical resources that the browser discovers late. The browser discovers CSS resources by parsing HTML, and it discovers font files by parsing CSS. A font referenced in a CSS file is not discovered until the CSS is downloaded and parsed — potentially hundreds of milliseconds after the page load begins. Preloading the font in the HTML head starts the download immediately:
<link rel="preload" href="/fonts/inter-var.woff2" as="font" type="font/woff2" crossorigin />
The fetchpriority attribute gives you explicit control over resource priority within the same resource type. Use fetchpriority="high" on the LCP image to ensure the browser prioritizes it over other images. Use fetchpriority="low" on below-the-fold images that are not lazy loaded but also not critical.
Script loading strategy has a massive impact. async scripts download in parallel with HTML parsing and execute as soon as they arrive, pausing the parser while they run; their execution order is unpredictable. defer scripts download in parallel and execute in order after HTML parsing completes. For most scripts, defer is the correct choice because it prevents render blocking while preserving execution order. The only exception is scripts that must run before first render (theme detection, A/B testing scripts).
JavaScript Performance Deep Dive
JavaScript is usually the largest bottleneck in modern web applications. Not because of download size (though that matters) but because of execution time. The browser must parse, compile, and execute JavaScript on the main thread, and during that time, the page is unresponsive.
Audit your JavaScript bundle with your bundler's analysis tool (vite-bundle-visualizer for Vite, @next/bundle-analyzer for Next.js). Identify the largest modules. Common offenders: date libraries (moment.js is roughly 300KB with locales; use date-fns or, where runtime support allows, the native Temporal API), charting libraries loaded on every page when charts only exist on one page, CSS-in-JS runtime overhead, and polyfills for APIs your target browsers already support.
Code splitting at the route level ensures each page loads only the JavaScript it needs. But route-level splitting is the minimum. Within a page, dynamically import heavy components that are not visible on initial load — modal dialogs, chart widgets, rich text editors, date pickers. These components often account for 30-50% of a page's JavaScript but are not needed until the user takes a specific action.
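One framework-agnostic way to wire this up: memoize a dynamic import() so the heavy module is fetched on first use and reused afterwards. `./ChartWidget.js` and `mountChart` are hypothetical names for illustration:

```javascript
// Returns a loader that runs `loader` once and caches the promise,
// so repeated triggers don't re-download or re-evaluate the module.
function lazy(loader) {
  let promise = null;
  return () => (promise ??= loader());
}

// Hypothetical heavy component, imported only when needed:
const loadChart = lazy(() => import('./ChartWidget.js'));

// Usage: fetch and mount the widget on first interaction.
// button.addEventListener('click', async () => {
//   const { mountChart } = await loadChart();
//   mountChart(document.querySelector('#chart'));
// });
```

Most frameworks offer a wrapper for this pattern (defineAsyncComponent in Vue, React.lazy in React), but the underlying mechanism is the same memoized dynamic import.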
Tree shaking eliminates unused code from your bundles, but only if your dependencies are properly structured. Check whether your imported libraries support ES modules and tree-shakeable exports. A library imported as import { debounce } from 'lodash' may include the entire lodash library if the package does not support tree shaking. Use import debounce from 'lodash-es/debounce' or switch to a tree-shakeable alternative.
Third-party scripts deserve special scrutiny. Load them after your core experience is interactive. Use the loading="lazy" approach for third-party embeds (YouTube iframes, social media widgets) and defer analytics scripts using requestIdleCallback or the defer attribute. Measure the impact of each third-party script independently — you may find that a chat widget nobody uses adds 400ms to every page load.
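Deferring a third-party script until the browser is idle can be sketched as follows; `loadWhenIdle` and `scriptAttrs` are hypothetical helpers, not platform APIs:

```javascript
// Attributes for the injected tag, kept as a pure helper so the
// policy is easy to test and adjust.
function scriptAttrs(src) {
  return { src, async: true };
}

// Inject a third-party script once the main thread is idle,
// falling back to a plain timeout in browsers without
// requestIdleCallback support.
function loadWhenIdle(src, timeout = 2000) {
  const inject = () => {
    const el = document.createElement('script');
    Object.assign(el, scriptAttrs(src));
    document.head.appendChild(el);
  };
  if (typeof requestIdleCallback === 'function') {
    requestIdleCallback(inject, { timeout });
  } else {
    setTimeout(inject, timeout);
  }
}
```

The timeout option bounds how long the script can be postponed on a busy page; the right value depends on how essential the script is to the experience.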
Caching Architecture
Effective caching goes beyond setting Cache-Control: max-age=31536000. Different resource types require different caching strategies, and the wrong strategy creates either stale content or unnecessary re-downloads.
Immutable static assets (JavaScript bundles, CSS files, images with content hashes in filenames): Cache for one year with immutable directive. The content hash in the filename changes when the content changes, so the cached version is always correct. Cache-Control: public, max-age=31536000, immutable.
HTML documents: Do not cache aggressively. Use Cache-Control: no-cache (which means "validate before using the cache," not "do not cache") so the browser always checks for a fresh version but can use the cached version if the server responds with 304 Not Modified. For static-generated pages, a short max-age (60-300 seconds) with stale-while-revalidate provides a balance between freshness and CDN cache efficiency.
API responses: Cache duration depends on data volatility. User-specific data should not be cached by shared caches (CDNs). Public data that changes infrequently (product catalogs, blog feeds) can be cached at the CDN edge with short TTLs and purged on update via webhook-triggered cache invalidation.
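These per-resource-type policies can live in one place as a path-to-header mapping. A sketch with illustrative paths and TTLs:

```javascript
// Map a request path to a Cache-Control policy. The file-extension
// test assumes content-hashed filenames for static assets.
function cacheControlFor(path) {
  if (/\.(js|css|woff2|png|jpg|svg)$/.test(path)) {
    // Content-hashed static assets: safe to cache for a year.
    return 'public, max-age=31536000, immutable';
  }
  if (path.startsWith('/api/')) {
    // User-specific API data: keep it out of shared caches.
    return 'private, no-store';
  }
  // HTML documents: short TTL with background revalidation.
  return 'public, max-age=60, stale-while-revalidate=300';
}
```

A server or edge middleware would call this once per response and set the header; centralizing the mapping makes the caching strategy auditable instead of implicit.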
Service Worker caching adds a local cache layer that serves content when the network is unavailable or slow. Use it as a complement to HTTP caching, not a replacement. The service worker cache handles offline scenarios and instant repeat visits, while HTTP caching handles CDN-level efficiency.
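A stale-while-revalidate fetch handler might be sketched like this in sw.js. The cache name is arbitrary, and the registration is guarded with a typeof check so the helpers can be exercised outside a worker context:

```javascript
const CACHE_NAME = 'swr-v1'; // arbitrary cache bucket name

// Only idempotent GET requests are safe to cache this way.
function isCacheable(request) {
  return request.method === 'GET';
}

async function staleWhileRevalidate(cache, request) {
  const cached = await cache.match(request);
  // Kick off a background refresh regardless of a cache hit;
  // if the network fails, fall back to whatever was cached.
  const network = fetch(request)
    .then((response) => {
      cache.put(request, response.clone());
      return response;
    })
    .catch(() => cached);
  // Serve the cached copy instantly; otherwise wait for the network.
  return cached || network;
}

// Register the handler when running inside a service worker.
if (typeof self !== 'undefined' && 'caches' in globalThis) {
  self.addEventListener('fetch', (event) => {
    if (!isCacheable(event.request)) return;
    event.respondWith(
      caches.open(CACHE_NAME).then((cache) =>
        staleWhileRevalidate(cache, event.request)
      )
    );
  });
}
```

This pattern trades a potentially stale first response for instant repeat visits, which matches the division of labor described above: the HTTP cache and CDN handle freshness, the service worker handles speed and offline resilience.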
Performance optimization is iterative. Measure, identify the biggest bottleneck, fix it, measure again. The diminishing returns curve eventually tells you when to stop optimizing and focus on other aspects of the application.