We are investigating a problem with authorizations. Updates to follow.

Incident Report for StrongDM

Postmortem

Cause:

  • A control plane change was rolled out that altered the default policy used to authorize connections when the policy editor is not enabled for an organization. This policy was synced down to customer infrastructure for local authorization but, critically, the new entities referenced by that policy were not always synchronized along with it.
  • This was caused by an incomplete implementation of a database synchronization function on the control plane, which, in some cases, informed clients about the policy but not about the new entities.
  • The policy began to deny authorizations on customer infrastructure because the missing entity references prevented it from applying. This issue affected customers who had policy disabled (either by SKU or by toggle in the Admin UI), because the globalpolicy is intended to stand in for hand-written policy when policy is not enabled.
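
The failure mode described above can be illustrated with a minimal sketch. It is not StrongDM's implementation: the `Policy` and `NodeCache` types, the `authorize` function, and the two sync helpers are hypothetical names chosen for illustration. The sketch assumes only what the report states, namely that a node authorizes locally and that a policy whose entity references cannot be resolved does not apply, so the request is denied.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical policy record: a name plus the entity IDs it references."""
    name: str
    entity_refs: set[str]


@dataclass
class NodeCache:
    """Hypothetical local store on a node, filled by synchronization."""
    policies: dict[str, Policy] = field(default_factory=dict)
    entities: set[str] = field(default_factory=set)


def authorize(cache: NodeCache, policy_name: str) -> bool:
    """Allow only if the policy is present and every referenced entity resolves."""
    policy = cache.policies.get(policy_name)
    if policy is None:
        return False
    missing = policy.entity_refs - cache.entities
    # An unresolved reference means the policy cannot be applied, so deny.
    return not missing


def incomplete_sync(cache: NodeCache, policy: Policy) -> None:
    """Models the bug: the policy arrives, its referenced entities do not."""
    cache.policies[policy.name] = policy


def complete_sync(cache: NodeCache, policy: Policy) -> None:
    """Models the intended behavior: entities arrive together with the policy."""
    cache.entities |= policy.entity_refs
    cache.policies[policy.name] = policy
```

In this sketch, a node that has received the policy through `incomplete_sync` denies every request under that policy until something delivers the missing entities, which matches the report's observation that a full resynchronization restored service.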

Resolution:

  • To fix the issue, the control plane was rolled back to a version that did not include the new globalpolicy plugin. Because the bad policy could already have synchronized to customer nodes, we later merged a change that bumped the authsync version. This forced all customers' nodes to fully resynchronize their policies and entities with the control plane, ensuring the bad policy was purged.
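
As a rough sketch of why a version bump purges stale state, consider a node that compares its stored sync version with the one advertised by the control plane. The `Node` type, the `sync` function, and the way incremental updates are modeled here are assumptions made for illustration; the report only says that bumping the authsync version forces a full resynchronization.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node state: a sync version plus locally cached data."""
    authsync_version: int
    policies: dict = field(default_factory=dict)
    entities: dict = field(default_factory=dict)


def sync(node: Node, server_version: int,
         server_policies: dict, server_entities: dict) -> str:
    """Apply an update from the control plane and report which path was taken."""
    if node.authsync_version != server_version:
        # Version mismatch: discard the local cache and take the full set,
        # so anything stale (such as a bad policy) is purged.
        node.policies = dict(server_policies)
        node.entities = dict(server_entities)
        node.authsync_version = server_version
        return "full"
    # Same version: only merge what was sent; existing entries are kept.
    node.policies.update(server_policies)
    node.entities.update(server_entities)
    return "incremental"
```

Under these assumptions, a rollback alone does not remove a bad policy that a node has already cached, because the incremental path never deletes anything. Only the version mismatch triggers the purge.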

Downtime:

  • Total downtime was just over an hour. Restarting nodes helped some customers because a restart forces a resynchronization of the full set of policies and entities.

Prevention:

  • StrongDM now automatically forces a full resynchronization whenever code involved in synchronizing policies and entities to the nodes is modified.
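
One way such an automatic trigger could work is to derive the sync version from a fingerprint of the synchronization code, so that any modification changes the version without a manual bump. This is an illustrative assumption, not a description of StrongDM's mechanism; `sync_fingerprint`, `needs_full_resync`, and the file names used with them are hypothetical.

```python
from __future__ import annotations

import hashlib


def sync_fingerprint(sources: dict[str, bytes]) -> str:
    """Hash the paths and contents of sync-related source files (hypothetical)."""
    digest = hashlib.sha256()
    # Sort by path so the result does not depend on dictionary ordering.
    for path in sorted(sources):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(sources[path])
        digest.update(b"\0")
    return digest.hexdigest()


def needs_full_resync(node_fingerprint: str, current_fingerprint: str) -> bool:
    """A node resynchronizes fully whenever the fingerprint it last saw differs."""
    return node_fingerprint != current_fingerprint
```

The design choice in this sketch is that the trigger depends on the code itself and not on a developer remembering to bump a number, which is the class of omission the prevention step is meant to rule out.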
Posted Sep 26, 2025 - 17:13 UTC

Resolved

This incident has been resolved.

If you are still seeing issues please reach out to StrongDM Support via our Help Center: https://help.strongdm.com/hc/en-us/requests/new

An RCA will be posted within the next 7 business days.
Posted Sep 18, 2025 - 19:11 UTC

Monitoring

A fix has been implemented and we are monitoring the results.

If you are still seeing issues at this point, restarting your relays/gateways should return them to a good state.

Let us know if you are still having authorization failures via our ticket portal: https://help.strongdm.com/hc/en-us/requests/new
Posted Sep 18, 2025 - 18:40 UTC

Identified

The issue has been identified and a fix is being implemented.
Posted Sep 18, 2025 - 18:22 UTC

Update

We are continuing to investigate this issue.
Posted Sep 18, 2025 - 18:10 UTC

Investigating

We are currently investigating this issue. Updates to follow.
Posted Sep 18, 2025 - 18:09 UTC
This incident affected: US Control Plane (Admin UI, API), UK Control Plane (Admin UI, API), and EU Control Plane (Admin UI, API).