Introduction to webhook troubleshooting
Webhook troubleshooting means tracing a delivery from the provider to your endpoint and through the code that handles it, then finding where the chain breaks. A webhook can fail before it leaves the sender, arrive at your server but be rejected, or reach your app and still never be processed correctly. The goal is to follow the delivery path step by step instead of guessing.
The fastest way to isolate the problem is to separate sender-side issues from receiver-side issues. On the sender side, the provider may never send the event, may target the wrong URL, or may be retrying after repeated failures. On the receiver side, your app may time out, return an error, fail signature verification, or drop the payload before it reaches business logic.
Webhooks are harder to debug than polling APIs because delivery is asynchronous, retries are common, and network hops add uncertainty. Many systems use at-least-once delivery, so duplicates and delayed arrivals are normal behaviors, not always bugs. For background on the concept, see what is a webhook. For a hands-on walkthrough, see how to debug webhook requests.
What webhook troubleshooting covers
Webhook troubleshooting usually means answering four questions: why the webhook was not received, why it was received but not processed, why it was duplicated, and why it was delayed. It also includes checking the endpoint, request method, headers, payload, signature, logs, retries, and downstream processing.
If you never see a request in logs, check whether the failure happened before your server accepted it: DNS, TLS, SSL certificate errors, WAF or firewall blocks, reverse proxy rules, load balancer routing, or CDN behavior. If the provider got a response, look at the HTTP status code. 2xx status codes usually mean success, while 3xx redirects, 4xx client errors, and 5xx server errors can trigger retries or stop processing depending on the provider.
When the request reaches your app but still fails, inspect the request body; see webhook inspection and how to debug webhook requests. Common causes include invalid JSON, payload validation failures, signature verification errors, auth failures, and downstream queue or service outages.
Why webhooks are not received
Start with the exact URL, environment, and HTTP method. A provider sending POST to an endpoint that only accepts GET, PUT, or DELETE can trigger 405 Method Not Allowed. A 301 or 302 redirect can also cause problems if the provider does not follow redirects cleanly or if headers are dropped during the redirect chain.
If there is no server log entry, focus on sender or network issues first. Verify DNS resolution, confirm the TLS connection succeeds, and check whether a WAF, firewall, reverse proxy, or CDN such as Cloudflare is blocking the request before it reaches your app. In production, also confirm that the load balancer is routing traffic to the correct backend and that the endpoint is publicly reachable.
Use curl or Postman to send a test POST request to the endpoint, and compare the response with what the provider sees. If you need a temporary public endpoint for testing, use ngrok or a request bin service to confirm the provider can reach your URL.
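If you would rather script the test request than run it by hand, the sketch below sends a JSON POST using only the Python standard library. The function names, the example URL, and the sample payload are hypothetical; swap in your own endpoint and a payload shaped like the provider's.

```python
import json
import urllib.error
import urllib.request


def build_test_request(url: str, payload: dict) -> urllib.request.Request:
    """Build a JSON POST request shaped like a typical webhook delivery."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_test_request(request: urllib.request.Request, timeout: float = 5.0) -> int:
    """Send the request and return the HTTP status code the endpoint answered with."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx and 5xx answers raise HTTPError; the code is still worth reporting.
        return err.code


# Hypothetical endpoint and payload, for illustration only.
request = build_test_request("https://example.com/webhooks/orders", {"event": "ping"})
# status = send_test_request(request)  # uncomment to actually send it
```

Connection-level failures such as DNS or TLS errors are raised as `urllib.error.URLError` and are deliberately left uncaught here, since they point at the network path rather than at your handler.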
Why a webhook is received but not processed
If the request reaches your server, inspect the response code, app exceptions, JSON parsing, signature verification, missing fields, queue failures, and downstream outages. A 4xx or 5xx response can stop processing, and an exception raised after your app has already replied with a 2xx can leave the provider thinking delivery succeeded while your app never finished the work.
Check whether the handler is doing too much work inline. Webhook receivers should usually validate the request, enqueue the event, and return a 2xx response quickly. Heavy processing belongs in a queue or background jobs so the provider does not time out waiting for your app.
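The validate, enqueue, and respond pattern can be sketched as follows. This is a minimal illustration, not a production handler: the in-process `queue.Queue` stands in for a real message broker, and the `handle_webhook` name and the required `id` field are assumptions.

```python
import json
import queue

# In-process queue standing in for a real broker or job system (assumption).
event_queue: "queue.Queue[dict]" = queue.Queue()


def handle_webhook(raw_body: bytes) -> int:
    """Validate the payload, enqueue it, and return a status code without heavy work."""
    try:
        event = json.loads(raw_body)
    except (json.JSONDecodeError, UnicodeDecodeError):
        return 400  # body is not valid JSON
    if not isinstance(event, dict) or "id" not in event:
        return 400  # missing the field the worker needs (hypothetical "id")
    event_queue.put(event)
    return 202  # accepted; a background worker processes it later
```

Returning before the real work starts keeps the response time short, at the cost of needing separate monitoring for the worker that drains the queue.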
If the payload looks valid but processing still fails, compare the raw request body against your parser’s expectations. A common issue is verifying a modified body instead of the exact raw payload that was signed.
Why webhook events get duplicated
Duplicates are often expected under at-least-once delivery. Providers retry after non-2xx responses, timeouts, or transient network failures, and retry logic often uses exponential backoff. That means the same event may arrive more than once even when nothing is broken.
Design consumers for idempotency. Store an event ID or idempotency key in a deduplication table, check current state before writing, and use safe upserts instead of blind inserts. If a provider sends the same event twice, your handler should ignore the duplicate after the first successful write.
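One way to sketch a deduplication table is with SQLite and a primary key on the event ID. The table name `processed_events` and the function names are hypothetical, and a real deployment would use a database shared by every app instance rather than an in-memory one.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the dedup store and create the table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_events (event_id TEXT PRIMARY KEY)"
    )
    return conn


def claim_event(conn: sqlite3.Connection, event_id: str) -> bool:
    """Return True the first time an event ID is seen, False for a duplicate."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT OR IGNORE INTO processed_events (event_id) VALUES (?)",
            (event_id,),
        )
    # rowcount is 1 when the row was inserted and 0 when the insert was ignored.
    return cursor.rowcount == 1


store = open_store()
print(claim_event(store, "evt_1"))  # first delivery
print(claim_event(store, "evt_1"))  # retry of the same event
```

In practice the claim and the business write should share one transaction, so that a failed write does not leave the event marked as processed.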
If duplicates appear only in production, check whether retries are being triggered by slow responses, intermittent 408 Request Timeout errors, or 429 Too Many Requests responses. Also confirm that your dedupe logic is shared across all app instances, not just one process.
Why webhook deliveries are delayed
Delays usually come from provider backoff, endpoint latency, cold starts, rate limiting, or queue congestion. If your logs show the request arriving on time, focus on application processing; if it arrived late or not at all, focus on sender or network issues.
A delayed webhook can also be caused by a slow database, a saturated worker pool, or a dead-letter queue that is filling up because retries keep failing. In serverless setups such as AWS Lambda behind AWS API Gateway, cold starts and concurrency limits can make delivery appear delayed even when the provider sent the event on time.
If the provider supports a delivery dashboard, compare the event timestamp, retry count, and response history with your own logs. That often shows whether the delay happened before the request reached your system or after your app accepted it.
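As a rough illustration of that comparison, the sketch below splits the total delay into time spent before the request arrived and time spent after. The function name and the three timestamp inputs are assumptions; the first would come from the provider dashboard and the other two from your own logs, all as ISO 8601 strings with an explicit UTC offset.

```python
from datetime import datetime


def delivery_gaps(event_created: str, request_received: str, processing_finished: str) -> dict:
    """Split a delivery's total delay into before-arrival and after-arrival seconds."""
    created = datetime.fromisoformat(event_created)
    received = datetime.fromisoformat(request_received)
    finished = datetime.fromisoformat(processing_finished)
    return {
        "before_arrival_seconds": (received - created).total_seconds(),
        "after_arrival_seconds": (finished - received).total_seconds(),
    }


gaps = delivery_gaps(
    "2024-01-01T12:00:00+00:00",  # provider's event timestamp
    "2024-01-01T12:00:45+00:00",  # first line in your access log
    "2024-01-01T12:00:47+00:00",  # worker finished processing
)
print(gaps)
```

A large first number points toward provider backoff or the network path; a large second number points toward your handler, queue, or database.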
How to debug webhook requests step by step
- Confirm the exact endpoint URL, environment, scheme, path, and trailing slash. Check for staging versus production mix-ups and redirect chains.
- Send a test request with curl or Postman and verify the endpoint accepts the expected HTTP method and returns the expected status code.
- Inspect server logs, reverse proxy logs, and application logs. Look for access logs, error logs, and any structured logging fields that include a correlation ID, request ID, or event ID.
- Verify the payload and headers. Confirm Content-Type, JSON formatting, HMAC signature verification, SHA-256 usage, and the shared secret used to compute the signature.
- Compare the raw request body with the signed payload. If the body is altered by middleware, the signature check will fail even when the provider sent a valid request.
- Reproduce the event with a webhook inspector, webhook debugger, replay tool, or request bin so you can compare a known-good delivery with the failing one.
- Check downstream systems. If the webhook is accepted but not processed, inspect the queue, background jobs, database writes, and any service dependencies.
- If the issue persists, review provider retries, delivery history, and error messages before escalating.
For more detail, see how to debug webhook requests, webhook endpoint testing, and webhook testing for developers.
How to verify a webhook URL and endpoint
Verification starts with the URL itself. Confirm the scheme is https, the host is correct, the path matches the provider configuration, and the endpoint is reachable from the public internet. Then verify the route accepts the provider’s method, usually POST, and returns a fast 2xx response.
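The static part of that check (scheme, host, and path) can be sketched with the standard library before any request is sent. The function name and the example host and path are hypothetical, and reachability still has to be tested with a real request.

```python
from urllib.parse import urlsplit


def check_webhook_url(configured: str, expected_host: str, expected_path: str) -> list[str]:
    """Return the problems found in a configured webhook URL (empty list if none)."""
    parts = urlsplit(configured)
    problems = []
    if parts.scheme != "https":
        problems.append(f"scheme is {parts.scheme!r}, expected 'https'")
    if parts.hostname != expected_host:
        problems.append(f"host is {parts.hostname!r}, expected {expected_host!r}")
    if parts.path != expected_path:
        # Catches trailing-slash and staging-versus-production path mix-ups.
        problems.append(f"path is {parts.path!r}, expected {expected_path!r}")
    return problems


print(check_webhook_url(
    "http://api.example.com/webhooks/orders/",
    "api.example.com",
    "/webhooks/orders",
))
```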
Check the endpoint in the actual runtime environment, not just locally. A route that works in development may fail in production because of DNS differences, TLS certificate problems, WAF rules, reverse proxy configuration, or a load balancer sending traffic to the wrong service.
If the provider offers a verification or test delivery feature, use it before enabling live traffic. You can also use webhook endpoint testing and webhook delivery testing to confirm the endpoint responds correctly under realistic conditions.
What server logs to check for webhook issues
Check access logs first to confirm whether the request arrived at all. Then review application logs for parsing errors, signature failures, authorization failures, and exceptions in the handler. If you use a reverse proxy such as nginx or Apache, check its logs too, because the request may fail before it reaches the app.
If your stack includes Express.js on Node.js, Flask or Django on Python, Ruby on Rails, or PHP/Laravel, make sure the logs include the request path, method, status code, and a correlation ID. Structured logging makes it much easier to trace one event across the web server, queue, and database.
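A framework-neutral sketch of such a log line is shown below, using Python's standard `logging` and `json` modules. The field names, the `webhooks` logger name, and the sample values are assumptions; most frameworks and logging libraries offer their own structured formatters.

```python
import json
import logging
from typing import Optional


def format_delivery_log(
    method: str,
    path: str,
    status: int,
    correlation_id: str,
    event_id: Optional[str] = None,
) -> str:
    """Build one JSON log line describing a single webhook delivery."""
    record = {
        "correlation_id": correlation_id,
        "event_id": event_id,
        "method": method,
        "path": path,
        "status": status,
    }
    return json.dumps(record, sort_keys=True)


logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger("webhooks")
logger.info(format_delivery_log("POST", "/webhooks/orders", 202, "req-123", "evt_1"))
```

Passing the same correlation ID along to the queue message and the worker's logs is what lets you follow one event end to end.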
Observability tools such as Sentry, Datadog, New Relic, and OpenTelemetry can help correlate webhook failures with application errors and latency spikes. Use them to confirm whether the failure is in the network layer, the request parser, or downstream processing.
How to confirm a webhook payload and signature
Start by capturing the raw payload exactly as it was received. Then compare it with the provider’s expected format, usually JSON. Confirm the headers include the correct Content-Type, event type, and any provider-specific metadata.
For signature verification, use the provider’s documented algorithm, often HMAC with SHA-256 and a shared secret. Some providers also use a Bearer token or API key for authentication, but that is separate from payload signing. The important point is to verify the raw body before any transformation, whitespace normalization, or JSON reserialization.
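A minimal sketch of HMAC SHA-256 verification over the raw body follows. It assumes the provider sends a hex-encoded digest of the body alone; some providers use base64, add a prefix, or sign a timestamp together with the body, so check their documentation. The function names and sample values are hypothetical.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, raw_body: bytes) -> str:
    """Compute the hex HMAC SHA-256 digest of the exact bytes received."""
    return hmac.new(secret, raw_body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, raw_body: bytes, received_signature: str) -> bool:
    """Compare the expected and received signatures in constant time."""
    expected = sign_payload(secret, raw_body)
    return hmac.compare_digest(expected.encode("utf-8"), received_signature.encode("utf-8"))


secret = b"example-shared-secret"           # hypothetical secret
raw = b'{"id":"evt_1","amount":10}'          # bytes exactly as received
signature = sign_payload(secret, raw)        # what the sender would attach

print(verify_signature(secret, raw, signature))
# The same data with different whitespace no longer matches the signature.
print(verify_signature(secret, raw.replace(b",", b", "), signature))
```

The second call illustrates why the check must run on the raw bytes: parsing and reserializing the JSON can change whitespace or key order and invalidate an otherwise valid signature.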
If the signature fails, check for the wrong secret, a rotated secret that was not updated, or middleware that changed the request body. A replay tool or webhook inspector can help you compare a successful delivery with a failed one and isolate the exact mismatch.
How webhook retries work
Most providers retry failed deliveries automatically. A retry is usually triggered by a timeout, a network failure, or a non-2xx response. Many providers use exponential backoff so the delay between attempts increases over time.
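To make the growth of those delays concrete, the sketch below computes a capped exponential backoff schedule. The base delay, multiplier, and cap are illustrative assumptions, not any specific provider's values, and many providers also add random jitter.

```python
def backoff_schedule(
    attempts: int,
    base_seconds: float = 1.0,
    factor: float = 2.0,
    max_seconds: float = 3600.0,
) -> list[float]:
    """Return the wait before each retry, growing by `factor` and capped at `max_seconds`."""
    return [min(base_seconds * factor ** n, max_seconds) for n in range(attempts)]


print(backoff_schedule(5))
# → [1.0, 2.0, 4.0, 8.0, 16.0]
```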
Retries are useful because they protect against temporary outages, but they also mean your handler must be safe to run more than once. That is why idempotency matters. If the first attempt succeeded but the response was lost, the provider may send the same event again.
If retries continue for too long, some providers move the event to a dead-letter queue or mark it as permanently failed. At that point, you should inspect the event manually and decide whether to replay it.
What a webhook inspector or replay tool does
A webhook inspector or replay tool lets you capture, inspect, and resend webhook requests. These tools are useful when you need to see the exact headers, payload, and response code without waiting for the provider to send another live event.
Use a webhook inspector during development to confirm the provider is sending the expected JSON and signature headers. Use a replay tool when you want to resend a known event after fixing your code. Tools like ngrok, request bin, and dedicated webhook debugger products are especially helpful when you need to test against a temporary public URL.
For more on this workflow, see webhook debugger and webhook inspection.
Can firewalls or proxies block webhooks?
Yes. Firewalls, WAFs, reverse proxies, load balancers, and CDNs can all block or alter webhook traffic before it reaches your application. TLS termination, SSL certificate issues, IP allowlists, and request size limits can also interfere with delivery.
If your provider says the event was sent but your app never logs it, check whether the request was blocked by Cloudflare, nginx, Apache, or a cloud security group. In AWS, also review AWS API Gateway, AWS Lambda, and any network controls in front of the function or service.
Why webhooks work in staging but fail in production
This usually means the environments are not actually equivalent. Staging may use different DNS records, a different SSL certificate, looser firewall rules, a separate reverse proxy config, or a smaller data set that makes processing faster.
Production can also fail because of stricter WAF rules, rate limits, queue saturation, or a different auth secret. If staging works but production fails, compare the full request path, not just the application code.
How to test webhook endpoints before production
Test the endpoint with real requests before you turn on live traffic. Use curl, Postman, ngrok, or a webhook debugger to send sample payloads and confirm the endpoint returns the correct status code, validates signatures, and handles invalid input safely.
Include negative tests too: bad JSON, missing headers, wrong signature, expired secret, duplicate event ID, slow downstream dependency, and timeout behavior. That gives you confidence that the endpoint can handle both normal deliveries and failure cases.
For more guidance, see webhook testing for developers, webhook endpoint testing, and webhook delivery testing.
When to contact the webhook provider for help
Contact provider support when your logs show successful processing but the provider dashboard still reports failures, or when you see evidence of provider outages, malformed payloads, or account-level restrictions. Send event IDs, request IDs, timestamps, response codes, screenshots, and relevant log excerpts.
Also contact the provider if you have confirmed that your endpoint is reachable, signatures verify correctly, retries are behaving as expected, and the problem still appears to originate on their side. That evidence helps the provider compare their delivery record with your server-side trace and separate your issue from theirs.
Final checklist
Before you close a webhook incident, confirm the following:
- The endpoint URL, method, and environment are correct
- DNS, TLS, SSL certificate, WAF, firewall, reverse proxy, and load balancer checks are complete
- The raw JSON payload matches what was signed
- HMAC SHA-256 signature verification passes with the correct shared secret
- Logs include a correlation ID, request ID, or event ID
- Provider retry behavior and exponential backoff are understood
- Idempotency key handling prevents duplicate side effects
- Queue, background jobs, and dead-letter queue behavior are verified
- The endpoint was tested with curl, Postman, ngrok, a webhook inspector, or a replay tool
- Provider support is contacted only after you have evidence from your own logs