Service Degradation on US Control Plane

Incident Report for StrongDM

Postmortem

Summary
On December 11th, 2025, StrongDM experienced a service degradation affecting the US Control Plane. The issue was caused by resource exhaustion from an internal monitoring component and lasted approximately 73 minutes before service was fully restored.
What Happened
An internal monitoring component configured to collect operational telemetry was unable to process data efficiently under US production load. This led to resource exhaustion on the Control Plane. The UK and EU Control Planes were not affected because their total throughput is lower.
Resolution
Infrastructure rolled back the monitoring component's configuration, immediately relieving resource pressure and restoring normal service.
Prevention & Remediation
To prevent recurrence, StrongDM is updating its testing processes to better catch issues of this kind and revising its configuration review processes for updates to internal tooling.

Posted Dec 12, 2025 - 19:58 UTC

Resolved

The incident has been resolved, and we will provide an RCA within the next 7 days.
Posted Dec 12, 2025 - 02:12 UTC

Monitoring

We have identified a problem that is degrading performance in our product DB. We are monitoring, and a more detailed update will follow.
Posted Dec 12, 2025 - 02:10 UTC

Investigating

We are currently investigating the issue. We will post updates here as we learn more.
Posted Dec 12, 2025 - 00:49 UTC
This incident affected: US Control Plane (Admin UI, API).