Enterprise Integration Patterns That Actually Work in Production
Enterprise integration patterns from textbooks look clean. Production systems are messier. Here's what actually works when integrating enterprise software at scale.

James Ross Jr.
Strategic Systems Architect & Enterprise Software Developer
The Gap Between Theory and Production
The classic book on the subject, Gregor Hohpe and Bobby Woolf's Enterprise Integration Patterns, was published in 2003. The patterns it describes — message channels, message routers, correlation identifiers, dead letter channels — are still valid. The book is worth reading. But it describes patterns in a vacuum, and production integration systems don't operate in a vacuum.
They operate in environments where vendors change their APIs without warning, where network partitions happen at 2 AM, where a message that should have been processed once gets processed three times because of a retry storm, where the documentation says the endpoint accepts JSON but it actually returns XML with a JSON Content-Type header.
This is a practical guide to integration patterns that hold up in production, written by someone who has maintained them when things went wrong at 2 AM.
The Foundational Decision: Synchronous vs. Asynchronous
Before you design any integration, you need to answer one question: does the caller need an immediate response?
Synchronous integrations — REST, gRPC, GraphQL — are appropriate when:
- The caller needs the result to continue processing
- The operation completes quickly (under a few seconds)
- Failure handling is simple (return an error, let the caller retry)
- The volume is low enough that blocking waits don't cause resource exhaustion
Asynchronous integrations — message queues, event streams, webhooks — are appropriate when:
- The operation is long-running or involves multiple systems
- The caller doesn't need an immediate result
- You want to decouple the producer from the consumer's availability
- High volume requires buffering to handle load spikes
Most enterprise systems need both. An order submission might be synchronous (the user needs confirmation), but the fulfillment workflow triggered by that order is asynchronous (warehouse picking, inventory reservation, shipping label generation — all independent operations that don't need to block the customer).
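As a sketch of that split: the synchronous handler persists the order and returns a confirmation immediately, while fulfillment is triggered by an event rather than awaited. The `OrderStore` and `EventBus` interfaces here are illustrative assumptions, not a specific library.

```typescript
import { randomUUID } from 'node:crypto';

interface OrderRequest { customerId: string; items: string[] }
interface OrderConfirmation { orderId: string; status: 'accepted' }

// Assumed interfaces standing in for your persistence layer and message broker.
interface OrderStore { save(order: { id: string; customerId: string; items: string[] }): Promise<void> }
interface EventBus { publish(topic: string, payload: unknown): Promise<void> }

// Synchronous path: the caller blocks only for validation and persistence.
async function submitOrder(
  req: OrderRequest,
  store: OrderStore,
  bus: EventBus,
): Promise<OrderConfirmation> {
  const orderId = randomUUID();
  await store.save({ id: orderId, customerId: req.customerId, items: req.items });

  // Asynchronous path: fulfillment reacts to this event later;
  // the customer never waits on warehouse or shipping systems.
  await bus.publish('order.submitted', { orderId, items: req.items });

  return { orderId, status: 'accepted' };
}
```

The customer-facing latency is bounded by the database write and the publish, not by any downstream fulfillment step.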
Pattern 1: The Anti-Corruption Layer
This is the pattern I use most often, and the one most teams skip because it feels like over-engineering.
When you integrate with an external system — a vendor's ERP, a payment gateway, a third-party API — that system has its own data model and business semantics. If you let those external semantics leak into your core domain, your internal code becomes coupled to the vendor's data model. When the vendor changes their API or you switch vendors, the change ripples through your entire codebase.
The anti-corruption layer (ACL) is a translation boundary between the external system's model and your internal domain model. Your code talks to your model. The ACL translates between your model and the vendor's API.
// External vendor model (their API's shape)
interface VendorOrderResponse {
  ord_id: string;
  ord_status: 'O' | 'C' | 'X'; // vendor's codes
  line_items: Array<{ sku: string; qty: number; unit_prc: number }>;
}

// Your internal domain model
// (Money is a minimal stand-in here for illustration)
interface Money { amount: number; currency: string }

interface Order {
  id: string;
  status: 'open' | 'closed' | 'cancelled';
  items: Array<{ productSku: string; quantity: number; unitPrice: Money }>;
}

// The ACL translates between them
function translateStatus(code: VendorOrderResponse['ord_status']): Order['status'] {
  const map = { O: 'open', C: 'closed', X: 'cancelled' } as const;
  return map[code];
}

function translateLineItem(line: VendorOrderResponse['line_items'][number]): Order['items'][number] {
  // Vendor prices assumed to be USD here for illustration
  return { productSku: line.sku, quantity: line.qty, unitPrice: { amount: line.unit_prc, currency: 'USD' } };
}

function translateVendorOrder(vendor: VendorOrderResponse): Order {
  return {
    id: vendor.ord_id,
    status: translateStatus(vendor.ord_status),
    items: vendor.line_items.map(translateLineItem),
  };
}
When the vendor changes ord_status codes in their next API version, you change the ACL. The rest of your codebase doesn't know anything changed.
Pattern 2: Idempotent Message Consumers
Distributed systems deliver messages at least once. This is not a bug — it's a property of reliable distributed systems. A message that's not acknowledged before a timeout gets redelivered. Network partitions cause duplicate deliveries. Your consumer will process the same message more than once.
If your processing is not idempotent — if processing the same message twice produces different results — you will corrupt data.
The implementation pattern is straightforward: every message needs an idempotency key, and your consumer tracks which keys have been processed.
async function processOrderCreatedEvent(event: OrderCreatedEvent): Promise<void> {
  const idempotencyKey = `order-created-${event.orderId}`;

  // Check if already processed
  const alreadyProcessed = await idempotencyStore.exists(idempotencyKey);
  if (alreadyProcessed) {
    return; // Safe to skip, already handled
  }

  // Process the event
  await fulfillmentService.createFulfillmentOrder(event.orderId);

  // Mark as processed
  await idempotencyStore.set(idempotencyKey, { processedAt: new Date() }, { ttl: 30 * 24 * 3600 });
}
The idempotency store can be Redis, a database table, or any persistent store. The TTL should be longer than your longest possible message redelivery window.
Critical: the idempotency check and the business operation should be in the same transaction where possible. If you check, process, and then fail to record — you'll process again on retry. Atomic operations or database-level idempotency constraints are your friend here.
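One way to get that database-level guarantee is a unique constraint on the idempotency key, claimed inside the same transaction as the business write. A minimal sketch, assuming a hypothetical `processed_messages` table with a UNIQUE `idempotency_key` column and a generic `Tx` interface (not a specific driver):

```typescript
// Assumed minimal transaction interface; any SQL driver's transactional
// query method would fit this shape.
interface Tx {
  query(sql: string, params?: unknown[]): Promise<{ rowCount: number }>;
}

// Returns true if this call claimed the key, i.e. the message is new.
async function claimIdempotencyKey(tx: Tx, key: string): Promise<boolean> {
  // ON CONFLICT DO NOTHING makes the insert a no-op for a duplicate key,
  // so rowCount is 0 when the key was already processed.
  const result = await tx.query(
    'INSERT INTO processed_messages (idempotency_key) VALUES ($1) ON CONFLICT DO NOTHING',
    [key],
  );
  return result.rowCount === 1;
}

async function handleOnce(tx: Tx, key: string, work: () => Promise<void>): Promise<void> {
  if (!(await claimIdempotencyKey(tx, key))) {
    return; // duplicate delivery, already handled
  }
  // The business operation commits or rolls back atomically with the claim,
  // so "claimed but not processed" cannot be left behind.
  await work();
}
```

Because the claim and the business write share one transaction, a crash mid-processing rolls back both, and the redelivered message is processed cleanly.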
Pattern 3: The Outbox Pattern for Reliable Event Publishing
Here's a failure mode I've seen more than once: your service updates a database record and then publishes an event to a message queue. The database write succeeds. The queue publish fails. Or the process crashes between the two. Now your database says the order was created but no event was published, and downstream services never know.
The outbox pattern solves this by making event publishing part of the database transaction.
-- Both operations are in the same transaction
BEGIN;
INSERT INTO orders (id, customer_id, status) VALUES ($1, $2, 'pending');
INSERT INTO outbox_events (id, event_type, payload, published_at)
VALUES (gen_random_uuid(), 'order.created', $3, NULL);
COMMIT;
A separate background process (the outbox relay) reads unpublished outbox events and publishes them to the message queue, then marks them as published. If the relay fails, it retries. The event is never lost because it's in the database — atomic with the business operation.
This pattern adds a bit of latency (the relay polling interval), and it delivers events at least once: the relay can crash after publishing but before marking an event as published, producing a duplicate on restart. Pair it with idempotent consumers (Pattern 2) to get effectively exactly-once processing. For high-volume systems, the relay can be a separate service using CDC (Change Data Capture) on the outbox table instead of polling.
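The relay itself can be small. A minimal polling sketch, with `OutboxDb` and `Queue` as assumed interfaces rather than a real library:

```typescript
interface OutboxEvent { id: string; event_type: string; payload: string }

// Assumed interfaces over the outbox table and the message broker.
interface OutboxDb {
  fetchUnpublished(limit: number): Promise<OutboxEvent[]>;
  markPublished(id: string): Promise<void>;
}
interface Queue { publish(type: string, payload: string): Promise<void> }

// One relay pass: publish pending events, then mark them. Returns the
// number of events relayed so a caller can decide whether to poll again soon.
async function relayOnce(db: OutboxDb, queue: Queue): Promise<number> {
  const events = await db.fetchUnpublished(100);
  for (const event of events) {
    // Publish first, then mark. A crash between the two steps causes a
    // duplicate publish on restart, never a lost event.
    await queue.publish(event.event_type, event.payload);
    await db.markPublished(event.id);
  }
  return events.length;
}
```

Run this in a loop on a timer; the ordering of publish-then-mark is what biases failures toward duplicates instead of losses.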
Pattern 4: Circuit Breakers for Unstable Downstream Systems
Enterprise integrations involve calling systems you don't control. Those systems go down, get slow, return errors. Without protective patterns, a slow downstream system can cascade failures into your system — requests pile up, connections exhaust, your system degrades.
The circuit breaker pattern sits between your code and the downstream call. When failures exceed a threshold, the circuit "opens" and subsequent calls fail fast without attempting the downstream call. After a configured timeout, the circuit enters a "half-open" state and tries one request. If it succeeds, the circuit closes. If it fails, it stays open.
const circuitBreaker = new CircuitBreaker(callExternalAPI, {
  timeout: 3000,                // Timeout threshold (ms)
  errorThresholdPercentage: 50, // Open circuit if >50% fail
  resetTimeout: 30000,          // Try again after 30 seconds
});

// Your code calls the breaker, not the API directly
try {
  const result = await circuitBreaker.fire(requestPayload);
  return result;
} catch (error) {
  // (the exact error name depends on your breaker library)
  if (error.name === 'OpenCircuitError') {
    // Circuit is open — return cached result or degrade gracefully
    return getCachedResult();
  }
  throw error;
}
The critical companion to circuit breakers is graceful degradation: when the circuit is open, what does your system do? Return a cached result? Queue the request for later? Return a default value? This needs to be designed, not improvised at 2 AM.
Pattern 5: Event Sourcing for Audit-Critical Integrations
In integrations where auditability matters — financial systems, compliance-driven domains, medical records — event sourcing provides a pattern that makes the audit trail intrinsic rather than bolted on.
Instead of recording only current state, event sourcing records every state change as an immutable event. The current state is derived by replaying the events.
// Events are the source of truth
type OrderEvent =
| { type: 'OrderCreated'; customerId: string; items: OrderItem[] }
| { type: 'OrderPaid'; amount: Money; paymentRef: string }
| { type: 'OrderShipped'; trackingNumber: string; shippedAt: Date }
| { type: 'OrderCancelled'; reason: string; cancelledAt: Date };
// Current state is derived from events
function deriveOrderState(events: OrderEvent[]): Order {
return events.reduce(applyEvent, initialOrderState());
}
The audit trail isn't a log — it's the system. You can reconstruct the state of any order at any point in time by replaying events up to that moment. This is valuable for debugging integration issues (you can see exactly what happened and in what sequence) and for compliance (the complete history is always available).
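To make the replay concrete, here is a minimal `applyEvent` reducer over a deliberately simplified state shape (the event and state fields are illustrative assumptions, not the full model above):

```typescript
// Simplified events and state for illustration only.
type OrderEvt =
  | { type: 'OrderCreated'; customerId: string }
  | { type: 'OrderPaid'; paymentRef: string }
  | { type: 'OrderCancelled'; reason: string };

interface OrderState {
  status: 'none' | 'created' | 'paid' | 'cancelled';
  customerId?: string;
  paymentRef?: string;
}

const initialState: OrderState = { status: 'none' };

// Each event maps the previous state to the next; replaying the full event
// list from initialState reconstructs the current state deterministically.
function applyEvent(state: OrderState, event: OrderEvt): OrderState {
  switch (event.type) {
    case 'OrderCreated':
      return { ...state, status: 'created', customerId: event.customerId };
    case 'OrderPaid':
      return { ...state, status: 'paid', paymentRef: event.paymentRef };
    case 'OrderCancelled':
      return { ...state, status: 'cancelled' };
  }
}

function derive(events: OrderEvt[]): OrderState {
  return events.reduce(applyEvent, initialState);
}
```

Replaying a prefix of the event list gives you the state as of that point in time, which is exactly the "reconstruct any order at any moment" property described above.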
The Integration That's Not Worth Building
One pattern I want to name explicitly: the point-to-point integration that accumulates over years until your architecture is a web of pairwise connections between systems, each integration built in isolation, none of them discoverable or manageable as a whole.
This happens when integrations are built tactically — each one seemed reasonable at the time — without an architectural view of the overall integration topology.
The solution is not necessarily an ESB (Enterprise Service Bus) — those have their own problems. But it is intentional: define your integration layer as an explicit architectural concern, choose whether that's a shared event bus, a dedicated integration service, or a mesh pattern, and apply standards consistently.
Integration work done in isolation creates integration debt that is extraordinarily expensive to untangle.
What Successful Enterprise Integration Looks Like
The integrations that hold up in production have a few things in common: clear ownership, documented behavior, observable systems, and graceful failure modes. The patterns above are tools toward those goals, not ends in themselves.
If you're designing an integration architecture for an enterprise system and want to work through the patterns with someone who has built and debugged these systems in production, schedule time at calendly.com/jamesrossjr.