Building Reliable Webhook Consumers

Webhooks Are Harder Than They Look

Receiving a webhook seems simple: listen for a POST request, parse the payload, do something with the data. In a tutorial, this takes ten lines of code. In production, it takes a system — because the internet is unreliable, webhook providers have different retry behaviors, payloads can arrive out of order, and your processing logic can fail partway through.

Every developer who has integrated with Stripe, GitHub, or Shopify webhooks has encountered the gap between the documentation's simplicity and the operational reality. Events arrive twice. Events arrive out of order. Your database is temporarily unavailable when a critical event arrives. The webhook provider retries, and your handler processes the same event again, creating duplicate records.

Building a webhook consumer that handles these realities requires a set of patterns that go beyond basic request handling. These patterns add complexity, but the alternative — debugging mysterious data inconsistencies in production — is worse.

Acknowledge First, Process Later

The most important pattern for reliable webhook handling is separating acknowledgment from processing. When a webhook request arrives, perform only the minimum validation (typically signature verification and basic payload parsing), persist the event, and return a 200 response right away. Then process the event asynchronously.

This separation matters because webhook providers have timeout thresholds. If your handler takes ten seconds to process an event (querying a database, calling another API, sending an email), the provider may time out and retry. Now you're processing the same event twice, potentially concurrently. By acknowledging immediately and queuing the event for background processing, you keep response time independent of processing time, so slow processing no longer triggers provider timeouts.

Store the raw event payload in a durable queue or database table before returning the 200 response. If your background processing fails, you can retry from the stored payload without relying on the webhook provider's retry behavior. This gives you control over retry timing, backoff strategy, and error handling — rather than being at the mercy of the provider's retry schedule.
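
A minimal sketch of this flow, assuming SQLite as the durable store and assuming signature verification has already happened upstream. The table name webhook_inbox and the functions init_store, handle_webhook, and process_pending are hypothetical names chosen for illustration, not part of any provider's SDK.

```python
import json
import sqlite3


def init_store(conn: sqlite3.Connection) -> None:
    # Hypothetical inbox table: one row per received webhook.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS webhook_inbox (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               raw_body BLOB NOT NULL,
               status TEXT NOT NULL DEFAULT 'pending',
               attempts INTEGER NOT NULL DEFAULT 0
           )"""
    )
    conn.commit()


def handle_webhook(conn: sqlite3.Connection, raw_body: bytes) -> int:
    """Return the HTTP status code to send back to the provider.

    Signature verification is assumed to have passed before this call.
    """
    try:
        json.loads(raw_body)  # basic parse check only, no business logic
    except ValueError:
        return 400
    # Persist the raw payload before acknowledging.
    conn.execute("INSERT INTO webhook_inbox (raw_body) VALUES (?)", (raw_body,))
    conn.commit()
    return 200


def process_pending(conn: sqlite3.Connection, process) -> None:
    """Background step: run `process` on each stored event still pending."""
    rows = conn.execute(
        "SELECT id, raw_body FROM webhook_inbox WHERE status = 'pending'"
    ).fetchall()
    for row_id, raw_body in rows:
        try:
            process(json.loads(raw_body))
            conn.execute(
                "UPDATE webhook_inbox SET status = 'done' WHERE id = ?", (row_id,)
            )
        except Exception:
            # Leave the row pending and count the attempt; a later run retries it.
            conn.execute(
                "UPDATE webhook_inbox SET attempts = attempts + 1 WHERE id = ?",
                (row_id,),
            )
        conn.commit()
```

In a real deployment the background step would likely run in a separate worker and use the attempts counter to drive backoff and a dead-letter threshold; both are left out here to keep the sketch short.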

This is the same principle that applies to background job architecture in general: accept the work quickly, confirm receipt, and process it reliably in a separate flow.

Idempotency: Handle Duplicates Gracefully

Most webhook providers offer at-least-once delivery, not exactly-once delivery. This means your consumer will occasionally receive duplicate events. Designing for idempotency, meaning that processing the same event multiple times produces the same result as processing it once, is non-negotiable for production systems.

The simplest approach is deduplication using the event ID. Most webhook providers include a unique identifier in each event. Store processed event IDs in a database table and check for duplicates before processing. If the event ID already exists, skip processing and return success.

1. Receive event with ID "evt_abc123"
2. Check: does "evt_abc123" exist in processed_events table?
3. If yes: return 200, skip processing
4. If no: insert "evt_abc123" into processed_events, then process

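The duplicate check and the insertion must be atomic: wrap them in a transaction, or rely on a unique constraint with an insert that ignores conflicts, so that two simultaneous deliveries of the same event cannot both pass the check. Also make sure an event whose processing fails after its ID has been recorded can still be retried, for example by recording the ID in the same transaction as the processing's own writes.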
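
The steps above can be sketched with a unique constraint doing the duplicate check. This is a minimal example assuming SQLite; the processed_events table comes from the steps above, while init_dedup and claim_event are hypothetical helper names.

```python
import sqlite3


def init_dedup(conn: sqlite3.Connection) -> None:
    # The primary key makes the database enforce uniqueness of event IDs.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_events (event_id TEXT PRIMARY KEY)"
    )
    conn.commit()


def claim_event(conn: sqlite3.Connection, event_id: str) -> bool:
    """Return True only for the first caller to record this event ID."""
    # INSERT OR IGNORE combines the check and the insert in one statement,
    # so there is no window between "check" and "insert".
    cur = conn.execute(
        "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
        (event_id,),
    )
    conn.commit()
    return cur.rowcount == 1
```

A handler would call claim_event first and skip processing when it returns False. Note that this sketch commits the claim on its own, so it does not cover the case where processing fails after the claim.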

For events that don't include a provider-assigned ID, create your own deduplication key from the event's content, typically a hash of the event type, the resource identifier, and the timestamp. This is less reliable than a provider ID: two distinct events that share the same type, resource, and timestamp will collide and one will be dropped, while a redelivery whose hashed fields differ (for example, a regenerated timestamp) will slip through as new. It is still better than no deduplication at all.
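
One possible way to derive such a key, assuming SHA-256 and assuming the three fields are available as strings. The function name dedup_key and the choice of separator are illustrative.

```python
import hashlib


def dedup_key(event_type: str, resource_id: str, timestamp: str) -> str:
    """Derive a deterministic deduplication key from event fields."""
    # Join with the ASCII unit separator so that ("ab", "c", ...) and
    # ("a", "bc", ...) cannot produce the same input to the hash.
    material = "\x1f".join((event_type, resource_id, timestamp)).encode("utf-8")
    return hashlib.sha256(material).hexdigest()
```

The resulting hex string can be stored in the same table as provider-assigned event IDs.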

Beyond deduplication, make your processing logic itself idempotent. If the event updates a record, use an upsert rather than a conditional insert-or-update that might fail on race conditions. If the event creates a resource, check whether it already exists. If it triggers a notification, verify the notification hasn't already been sent. Defense in depth — deduplication at the consumer level and idempotency at the processing level — protects against the scenarios that either layer alone would miss.
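
As an illustration of an idempotent write, here is a sketch of an upsert, assuming SQLite 3.24 or newer (which supports ON CONFLICT ... DO UPDATE). The customers table and the function names are hypothetical.

```python
import sqlite3


def init_customers(conn: sqlite3.Connection) -> None:
    conn.execute(
        """CREATE TABLE IF NOT EXISTS customers (
               customer_id TEXT PRIMARY KEY,
               email TEXT NOT NULL
           )"""
    )
    conn.commit()


def apply_customer_event(conn: sqlite3.Connection, customer_id: str, email: str) -> None:
    """Insert the customer or update the existing row in one statement."""
    # Running this twice with the same arguments leaves exactly one row
    # with the same contents, which is what makes the handler idempotent.
    conn.execute(
        """INSERT INTO customers (customer_id, email) VALUES (?, ?)
           ON CONFLICT(customer_id) DO UPDATE SET email = excluded.email""",
        (customer_id, email),
    )
    conn.commit()
```

Because the statement never fails on an existing row, a duplicate that slips past deduplication does no harm.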

Handling Out-of-Order Delivery

Webhook events don't always arrive in the order they occurred. A subscription "cancelled" event might arrive before the "created" event. An order "shipped" event might arrive before "paid." Your consumer needs to handle these sequences without corrupting data.

State machine validation is the most robust approach. Define the valid states for each resource and the valid transitions between them. When an event arrives that implies a state transition, verify that the transition is valid from the current state. If the resource doesn't exist yet (because the creation event hasn't arrived), either queue the event for later reprocessing or create the resource in the implied state.
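
A small sketch of transition validation for an order-like resource. The states, the transition table, and the function next_action are assumptions made up for this example; a real integration would take its states from the provider's event model.

```python
from typing import Optional

# Hypothetical transition table: current state -> states it may move to.
# None stands for "resource does not exist yet".
VALID_TRANSITIONS = {
    None: {"created"},
    "created": {"paid", "cancelled"},
    "paid": {"shipped", "cancelled"},
    "shipped": set(),
    "cancelled": set(),
}


def next_action(current_state: Optional[str], event_state: str) -> str:
    """Decide what to do with an event that implies `event_state`."""
    if event_state in VALID_TRANSITIONS.get(current_state, set()):
        return "apply"
    if current_state is None:
        # The creation event has not arrived yet: queue for reprocessing.
        return "defer"
    # Not a valid move from a known state: do not touch the record.
    return "reject"
```

This sketch takes the "queue for later reprocessing" option for missing resources. What to do with a rejected event (drop it, log it, park it for review) depends on the application.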

For simpler scenarios, timestamp-based ordering works: include a timestamp comparison in your update logic. Use the timestamp the provider assigned to the event, not the time you received it, and only apply an update if that timestamp is newer than the one on the last update you applied. This prevents older events from overwriting newer state, regardless of arrival order.
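
The comparison can be sketched as follows, using an in-memory dict as a stand-in for the database and assuming event timestamps are Unix seconds. The Subscription class and apply_if_newer are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Subscription:
    status: str
    last_event_at: int  # timestamp of the newest event applied so far


def apply_if_newer(
    store: Dict[str, Subscription], sub_id: str, status: str, event_at: int
) -> bool:
    """Apply the update only if the event is newer than the stored state."""
    current = store.get(sub_id)
    if current is not None and event_at <= current.last_event_at:
        return False  # stale (or same-timestamp) event: keep existing state
    store[sub_id] = Subscription(status=status, last_event_at=event_at)
    return True
```

With a real database, the comparison and the write would need to happen in a single statement or transaction so that concurrent events cannot interleave. Events with equal timestamps are skipped here, which is a simplification.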

Security and Verification

Never process a webhook payload without verifying its authenticity. Without verification, anyone who discovers your webhook endpoint can send fabricated events that your system will process as legitimate.

Most webhook providers sign their payloads using HMAC with a shared secret. Verify the signature before any processing — including before storing the event for asynchronous processing. A forged event should be rejected with a 401 response immediately.

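Compute the expected signature from the raw request body, exactly as received and before any parsing or re-serialization, and compare it to the signature the provider sent using a timing-safe comparison function. Standard string comparison is vulnerable to timing attacks, where an attacker can determine the correct signature byte by byte based on response time differences. Most major languages provide a constant-time comparison function in their standard or cryptographic libraries; use it.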
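
A minimal sketch of signature verification using Python's standard library. It assumes the provider sends a hex-encoded HMAC-SHA256 of the raw body; actual schemes vary (some add a timestamp to the signed content or a prefix to the header value), so check the provider's documentation. The function name verify_signature is illustrative.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, raw_body: bytes, signature_hex: str) -> bool:
    """Check a hex-encoded HMAC-SHA256 signature over the raw request body."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # compare_digest runs in time independent of where the inputs differ.
    # Both sides are encoded to bytes so non-ASCII input cannot raise.
    return hmac.compare_digest(
        expected.encode("utf-8"), signature_hex.encode("utf-8")
    )
```

The handler would return 401 when this function returns False and would not store or queue the event.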

Restrict your webhook endpoint to the expected IP addresses if the provider publishes their IP ranges. This adds a network-level verification layer on top of signature verification. Also, use HTTPS exclusively. A webhook payload transmitted over HTTP can be intercepted and read — or modified — by any intermediary.
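
If the check is done in application code rather than at a firewall or load balancer, it might look like the sketch below. The networks listed are reserved documentation ranges used as placeholders, not any provider's real addresses, and is_allowed_source is a hypothetical helper.

```python
import ipaddress

# Placeholder ranges (reserved for documentation); replace with the
# ranges your provider actually publishes.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("2001:db8::/32"),
]


def is_allowed_source(remote_addr: str) -> bool:
    """Return True if the address falls inside an allowed network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        return False  # not a parseable IP address
    return any(addr in network for network in ALLOWED_NETWORKS)
```

Behind a proxy, the address to check is the client address the proxy reports, which is only trustworthy if the proxy is under your control.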

Log all received webhooks, including rejected ones. If someone is attempting to send forged webhooks, the logs will show the pattern. If legitimate webhooks are failing signature verification, the logs will help you diagnose configuration issues — typically a mismatched signing secret between your provider configuration and your consumer code. Comprehensive logging turns debugging webhook issues from guesswork into investigation.