The Strangler Fig Pattern: Migrating Legacy Systems Incrementally

Why Big-Bang Rewrites Fail

The instinct when facing a legacy system is to rewrite it. Start fresh, do it right this time, use modern tools. The reasoning feels sound: the old system is a mess, patching it is expensive, and a clean start would be faster than continuing to maintain something built with outdated technology.

In practice, big-bang rewrites fail more often than they succeed. The new system must reach feature parity with the old one before it can replace it. Feature parity takes longer than estimated because the old system's behavior, including its undocumented quirks and edge cases, is the specification. During the rewrite, the old system continues evolving because the business cannot wait, so the target keeps moving. A typical outcome: eighteen months in, the new system handles 70% of what the old one does, the team is exhausted, and the project gets cancelled or descoped.

The strangler fig pattern, named by Martin Fowler after the tropical tree that grows around its host and gradually replaces it, takes a different approach. Instead of replacing the entire system at once, you replace it one capability at a time. The old system continues running. New functionality routes through new code. Over time, the new system handles more and more of the traffic until the old system can be decommissioned.

How the Pattern Works

The strangler fig pattern has three repeating phases: identify, implement, and redirect.

Identify a discrete piece of functionality in the legacy system that can be extracted. Good candidates are features that are relatively self-contained, have clear inputs and outputs, and are actively being modified (so the investment pays off immediately). A user authentication flow, a product search endpoint, a reporting module — something with defined boundaries.

Implement that functionality in the new system. The new implementation can use modern technology, better architecture, improved data models — whatever is appropriate. It does not need to replicate the old implementation's internal structure. It needs to produce the same external behavior for the same inputs.

Redirect traffic for that functionality from the old system to the new one. This is typically done through a routing layer — a reverse proxy, an API gateway, or a load balancer — that directs requests to the appropriate system based on the URL path, headers, or other criteria. The redirect can be gradual: send 10% of traffic to the new system first, verify correctness, then increase.
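A gradual redirect can be sketched in a few lines. The example below is a minimal illustration, not a production router: the ROLLOUT_PERCENT table, the path prefixes, and the function names are all hypothetical. It assumes each request carries a stable key such as a user ID, and hashes that key so the same user consistently lands on the same system while the percentage is ramped up.

```python
import hashlib

# Hypothetical rollout table: path prefix -> percentage of traffic sent to the new system.
ROLLOUT_PERCENT = {
    "/search": 10,    # 10% of search traffic goes to the new system
    "/reports": 100,  # fully migrated
}

def bucket_for(request_key: str) -> int:
    """Map a stable request key (for example a user ID) to a bucket in 0..99."""
    digest = hashlib.sha256(request_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100

def choose_backend(path: str, request_key: str) -> str:
    """Return "new" or "legacy" for a request."""
    for prefix, percent in ROLLOUT_PERCENT.items():
        if path.startswith(prefix):
            return "new" if bucket_for(request_key) < percent else "legacy"
    return "legacy"  # anything not yet migrated stays on the old system
```

Hashing a stable key instead of picking a random number per request keeps each user's experience consistent during the rollout, which makes discrepancies easier to trace. In a real deployment this decision would usually live in the proxy or gateway configuration rather than in application code.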

Once the redirected functionality is stable in the new system, the corresponding code in the old system becomes dead code. It is still there but no longer receives traffic. Eventually, it can be removed.

Repeat for the next piece of functionality.

The Routing Layer Is Critical

The mechanism that decides whether a request goes to the old system or the new system is the most important piece of infrastructure in a strangler fig migration. It needs to be:

Configurable without deployment. You need to be able to switch routing rules quickly — especially to roll back if the new system has issues. Feature flags or a configuration-driven routing table work well. Hardcoded routing logic that requires a deployment to change defeats the purpose.

Observable. You need to know how much traffic each system is handling, what the error rates are for each, and how latency compares. Without this visibility, you are migrating blind.

Transparent to clients. The clients making requests should not need to know whether they are talking to the old system or the new one. The routing layer abstracts this. If clients need to change their behavior based on which system handles their request, the abstraction is leaking.
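The first two properties can be illustrated with a small sketch. It assumes routing rules arrive as a JSON document mapping path prefixes to a backend name. The RoutingTable class and its methods are hypothetical names for illustration, not the API of any particular gateway.

```python
import json
from collections import Counter

class RoutingTable:
    """Hypothetical config-driven routing table with basic per-backend counters."""

    def __init__(self, config_text: str):
        self.metrics = Counter()
        self.load(config_text)

    def load(self, config_text: str) -> None:
        """Replace the routing rules at runtime, with no redeploy."""
        self.rules = json.loads(config_text)  # {"path prefix": "new" or "legacy"}

    def route(self, path: str) -> str:
        # Longest matching prefix wins; unmatched paths stay on the legacy system.
        matches = [prefix for prefix in self.rules if path.startswith(prefix)]
        backend = self.rules[max(matches, key=len)] if matches else "legacy"
        self.metrics[f"{backend}.requests"] += 1
        return backend

    def record_error(self, backend: str) -> None:
        self.metrics[f"{backend}.errors"] += 1

    def error_rate(self, backend: str) -> float:
        requests = self.metrics[f"{backend}.requests"]
        return self.metrics[f"{backend}.errors"] / requests if requests else 0.0
```

Rolling back is then a configuration change: calling load with a rule set that maps "/auth" back to "legacy" takes effect on the next request. Comparing error_rate("new") with error_rate("legacy") gives a rough version of the visibility described above. A real setup would export these counters to a metrics system rather than hold them in memory.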

An API gateway often serves this role naturally. If you already have one, it is the logical place to implement strangler fig routing. If you do not, a reverse proxy like Nginx or Envoy with path-based routing is sufficient for most cases.

When It Works and When It Struggles

The strangler fig pattern works best when the legacy system has clear functional boundaries that can be isolated. A system with a well-defined API surface — even if the internals are messy — is a good candidate because each API endpoint or group of endpoints can be migrated independently.

It struggles when the legacy system's data is deeply entangled. If migrating the orders functionality requires the new orders service to access the same database tables that the legacy inventory and billing code depend on, extracting orders without breaking inventory and billing is difficult. In these cases, the data migration strategy matters more than the routing strategy. Two techniques help manage this entanglement: database views that abstract the underlying table structure, and event-based synchronization between the old and new data stores.
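As a rough illustration of event-based synchronization, the sketch below applies change events from the legacy system to a new data store. Everything here is an assumption made for the example: the ChangeEvent shape, the NewOrderStore class, and the premise that the legacy system can attach a per-order version number that only increases.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ChangeEvent:
    order_id: str
    version: int  # assumed to increase with every change to the order in the legacy system
    status: str

class NewOrderStore:
    """Hypothetical in-memory stand-in for the new system's order table."""

    def __init__(self):
        self.rows = {}  # order_id -> (version, status)

    def apply(self, event: ChangeEvent) -> bool:
        """Apply an event unless an equal or newer version is already stored."""
        current = self.rows.get(event.order_id)
        if current is not None and current[0] >= event.version:
            return False  # duplicate or out-of-order delivery; safe to ignore
        self.rows[event.order_id] = (event.version, event.status)
        return True
```

The version check makes the handler idempotent, so events that are delivered twice or arrive out of order do not overwrite newer data. That matters because most event transports do not guarantee exactly-once, in-order delivery.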

It also struggles when the legacy system's behavior is undocumented and inconsistent. The new system needs to match the old system's behavior at the boundary. If nobody knows exactly what the old system does in all cases, verifying correctness during the migration is guesswork. Characterization tests — tests written against the old system's actual behavior, not its intended behavior — are essential groundwork before starting the migration.
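A characterization test can be as simple as recording what the old system returns for a set of inputs and replaying those inputs against the new implementation. The sketch below uses a made-up shipping cost function as a stand-in for a call into the legacy system. The function names, the pricing rule, and the truncation quirk are invented for illustration.

```python
def legacy_shipping_cost(weight_kg: float) -> float:
    """Hypothetical stand-in for a call into the legacy system."""
    if weight_kg > 20:
        return 25.0                      # flat rate for heavy parcels
    return 5.0 + 0.5 * int(weight_kg)    # quirk: fractional kilograms are truncated

def naive_shipping_cost(weight_kg: float) -> float:
    """Hypothetical first attempt at the new code, written from the intended rule."""
    if weight_kg > 20:
        return 25.0
    return 5.0 + 0.5 * weight_kg         # charges for fractional kilograms

def record_behavior(func, inputs):
    """Capture what the function actually returns for each input."""
    return {value: func(value) for value in inputs}

def find_mismatches(golden, candidate):
    """Return {input: (expected, actual)} wherever the candidate differs from the recording."""
    mismatches = {}
    for value, expected in golden.items():
        actual = candidate(value)
        if actual != expected:
            mismatches[value] = (expected, actual)
    return mismatches

golden = record_behavior(legacy_shipping_cost, [0, 1, 2.9, 20, 20.5])
mismatches = find_mismatches(golden, naive_shipping_cost)
```

Here the only mismatch is the 2.9 kg input, which exposes the undocumented truncation. Whether the new system should reproduce that quirk or deliberately change it is a business decision, but the recording makes the difference visible before traffic is redirected.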

The pattern rewards patience. Each increment is small, testable, and reversible. The business gets value from each step rather than waiting for a complete rewrite. And if the migration stalls or priorities change, you are left with a working, partially modernized system rather than an unfinished rewrite that delivers nothing while the legacy system still carries the full load.

If you have a legacy system that needs modernization and want to plan an incremental migration that manages risk, let's talk.