Degraded performance on dashboard and workflow execution
Incident Report for Knock
Resolved
This issue is now resolved and all services are fully operational. We’ll be sharing a postmortem report on this outage here soon.
Posted Mar 02, 2023 - 19:43 UTC
Update
We’ve finished processing our backlog of delayed workflows, workflow runs/API logs, and webhooks.
Posted Mar 02, 2023 - 19:09 UTC
Update
We’re no longer seeing 500 errors and our API latency is back to normal levels. We continue to process workflow runs normally and you should no longer see delays in workflow run processing time. We’re continuing to process our backlog of workflow and API logs.
We’re continuing to monitor our service. We’ll update our status to fully operational in fifteen minutes if our current status persists.
Posted Mar 02, 2023 - 18:35 UTC
Update
We’re seeing 500 errors and API latency come down to normal levels. We continue to process workflows normally, though you may occasionally see delays.
We’re now working on processing our delayed backlog in our workflow and API logging services.
We’ll post an update here as we see our API return to its normal service level and when we’ve finished processing the backlog in our workflow and API logging services.
Posted Mar 02, 2023 - 17:44 UTC
Update
We’re seeing 500 errors and API latency come down to normal levels. We continue to process workflows normally, though you may occasionally see delays.
We’re now working on processing our delayed backlog in our workflow and API logging services.
We’ll post an update here as we see our API return to its normal service level and when we’ve finished processing the backlog in our workflow and API logging services.
Posted Mar 02, 2023 - 16:58 UTC
Update
We’re still seeing intermittent 500 errors on our service, and API processing times are higher than normal.
We’re still processing workflows normally.
Our workflow logs and metrics are still delayed but will be processed once this issue is fully resolved.
Posted Mar 02, 2023 - 16:26 UTC
Update
Our service is still experiencing intermittent 5xx errors, but the error rate is low.
Our workflows are processing as normal.
No workflow logs, Segment metrics, DataDog metrics, or usage metrics are being processed right now (but we will work through the backlog when we can).
Posted Mar 02, 2023 - 15:40 UTC
Monitoring
Our service has returned to a normal state and we are processing workflows and delivering notifications. Our dashboard logs will continue to be delayed. We are monitoring the situation closely before we declare this incident resolved.
Posted Mar 02, 2023 - 15:01 UTC
Update
The incident is still ongoing. We will provide an update in 15 minutes.
Posted Mar 02, 2023 - 14:34 UTC
Update
We are seeing an uptick in 500 errors on Knock's APIs and are working on stabilizing the situation.
We are experiencing delays in our logging services.
We are experiencing delays in sending notifications.
Posted Mar 02, 2023 - 13:59 UTC
Update
We are seeing connection errors in our infrastructure related to our provider. We are still investigating why this is happening to get it resolved.
Posted Mar 02, 2023 - 13:18 UTC
Update
We are continuing to investigate this issue.
Posted Mar 02, 2023 - 12:53 UTC
Investigating
At 6 AM EST we observed an uptick in errors on our Dashboard and in requests to the Knock API. We are still seeing some errors and delays in processing workflows. We are working to bring Knock back to full health and will provide updates soon.
Posted Mar 02, 2023 - 12:42 UTC
This incident affected: Knock systems (Notification delivery, Dashboard, API).