tag:status.strongdm.com,2005:/historystrongDM Status - Incident History2024-03-19T02:29:58ZstrongDMtag:status.strongdm.com,2005:Incident/202344732024-03-12T22:00:00Z2024-03-13T14:57:38ZSAML-based authentications were failing.<p><small>Mar <var data-var='date'>12</var>, <var data-var='time'>22:00</var> UTC</small><br><strong>Resolved</strong> - March 12, 22:09 UTC: SDM revoked a set of older encryption keys.<br />March 13, 00:14 UTC: A signing certificate used to verify SAML-based authentications fell out of cache and had to be retrieved and decrypted again, but the decryption failed because the certificate was encrypted with the revoked keys.<br />March 13, 10:12 UTC: SDM was alerted to failures authenticating via SAML, affecting SSO logins. This also affected access to Snowsight resources, which use SAML for authentication.<br />March 13, 11:24 UTC: The issue was escalated and SDM began restoring the relevant revoked keys.<br />March 13, 12:07 UTC: Revoked keys were restored. Issue resolved.</p>tag:status.strongdm.com,2005:Incident/196499502024-01-08T17:00:00Z2024-01-08T20:27:34ZA bug was pushed out impacting some customers using drivers related to MySQL.<p><small>Jan <var data-var='date'> 8</var>, <var data-var='time'>17:00</var> UTC</small><br><strong>Resolved</strong> - At 17:00 UTC an sdm-cli update containing a bug began rolling out. The bug caused some drivers to stop functioning, primarily drivers that are aliases for mysql:<br />* aurora-mysql<br />* aurora-postgres<br />* clustrix<br />* cockroach<br />* greenplum<br />* maria<br />* memsql<br />* singlestore<br />* a mongo replica-set variant<br /><br />We were alerted to this problem at 19:38 UTC and executed a rollback for those who had already received the update through the slow rollout; that rollback will be complete at 20:38 UTC.</p>tag:status.strongdm.com,2005:Incident/184374982023-09-08T16:57:16Z2023-09-08T22:01:26ZRouting Issues<p><small>Sep <var data-var='date'> 8</var>, <var data-var='time'>16:57</var> 
UTC</small><br><strong>Resolved</strong> - We made an enhancement to our routing protocol for compatibility with a new operating mode, which caused the current routing system to fail. The change has since been rolled back, and the issue has been confirmed resolved. The service should be functioning normally at this time.</p><p><small>Sep <var data-var='date'> 8</var>, <var data-var='time'>16:45</var> UTC</small><br><strong>Investigating</strong> - Some organizations may experience connection issues. We are currently investigating the issue and will update here with additional information.</p>tag:status.strongdm.com,2005:Incident/181440182023-08-11T04:00:00Z2023-08-15T22:34:02ZAWS Maintenance Outage<p><small>Aug <var data-var='date'>11</var>, <var data-var='time'>04:00</var> UTC</small><br><strong>Resolved</strong> - On August 11th, 2023, from 11:14pm to 11:15pm Pacific, the StrongDM production database was taken offline for required maintenance by our service provider, which caused a brief interruption of service.<br /><br />We apologize for the inconvenience.</p>tag:status.strongdm.com,2005:Incident/169221832023-04-20T16:26:20Z2023-04-20T16:26:20ZSystem email provider is currently inoperable<p><small>Apr <var data-var='date'>20</var>, <var data-var='time'>16:26</var> UTC</small><br><strong>Resolved</strong> - We have implemented a resolution with our system email provider and the issue should now be resolved.</p><p><small>Apr <var data-var='date'>20</var>, <var data-var='time'>15:43</var> UTC</small><br><strong>Investigating</strong> - We are having an issue with our system email provider. This impacts password reset emails and other account-related emails. 
We are currently working towards a resolution and will provide further updates as soon as possible.</p>tag:status.strongdm.com,2005:Incident/169156062023-04-19T16:30:00Z2023-04-19T23:12:46ZSystem Outage<p><small>Apr <var data-var='date'>19</var>, <var data-var='time'>16:30</var> UTC</small><br><strong>Resolved</strong> - We released a server change this morning which increased the amount of data in each report generated by the reports library. The resulting increase in database activity eventually led to operations timing out, which caused a brief outage in some server operations, primarily related to authentication. The outage lasted from 16:21 UTC to 16:29 UTC, at which point the issue was resolved.<br /><br />During this time period, end user queries completed successfully and no logging data was lost.<br /><br />We deployed a fix that prevents future impact on the database from operations related to the report library.</p>tag:status.strongdm.com,2005:Incident/157745902023-01-05T03:02:35Z2023-01-05T03:02:35ZSystem email provider is currently down<p><small>Jan <var data-var='date'> 5</var>, <var data-var='time'>03:02</var> UTC</small><br><strong>Resolved</strong> - Our email provider has provided a fix and we have verified that the issue is resolved. We are closing the incident as our provider has closed it on their side.</p><p><small>Jan <var data-var='date'> 5</var>, <var data-var='time'>00:21</var> UTC</small><br><strong>Identified</strong> - We are having an issue with our system email provider. This impacts password reset emails and other account-related emails. 
We are working with the provider to stay up to date and will provide further updates as soon as possible.</p>tag:status.strongdm.com,2005:Incident/144361842022-12-05T21:22:44Z2022-12-05T21:22:44ZNetwork issue preventing access to StrongDM<p><small>Dec <var data-var='date'> 5</var>, <var data-var='time'>21:22</var> UTC</small><br><strong>Resolved</strong> - AWS resolved the issue on their side. Our indicators show our service is fully operational for all our customers.<br />The issue has been resolved.</p><p><small>Dec <var data-var='date'> 5</var>, <var data-var='time'>21:10</var> UTC</small><br><strong>Monitoring</strong> - AWS is beginning to see signs of recovery. Our indicators currently show that StrongDM service is back to normal, but we continue to monitor it closely.</p><p><small>Dec <var data-var='date'> 5</var>, <var data-var='time'>21:01</var> UTC</small><br><strong>Identified</strong> - AWS confirmed there is an issue which is impacting Internet connectivity for the US-EAST-2 Region. AWS is working on fixing the issue.<br />We will continue to follow up and let you know once the issue is resolved in AWS.</p><p><small>Dec <var data-var='date'> 5</var>, <var data-var='time'>20:24</var> UTC</small><br><strong>Update</strong> - We are continuing to investigate this issue.</p><p><small>Dec <var data-var='date'> 5</var>, <var data-var='time'>20:21</var> UTC</small><br><strong>Investigating</strong> - We have noticed a network issue preventing some customers and users from accessing the StrongDM client and portal. 
We are investigating the issue and you can follow the status of our investigation at https://strongdm.statuspage.io/</p>tag:status.strongdm.com,2005:Incident/126392752022-10-27T20:05:50Z2022-10-27T20:15:42ZOkta SCIM sync issue<p><small>Oct <var data-var='date'>27</var>, <var data-var='time'>20:05</var> UTC</small><br><strong>Resolved</strong> - Our system was running into issues connecting to Okta SCIM, which may have caused disruptions in user creation and deletion. Users created or deleted on the Okta side may not have synced to SDM. Logging of IPs might also have been affected.</p>tag:status.strongdm.com,2005:Incident/100884932022-05-27T12:00:00Z2022-05-27T13:35:05ZstrongDM Service Degradation<p><small>May <var data-var='date'>27</var>, <var data-var='time'>12:00</var> UTC</small><br><strong>Resolved</strong> - We experienced a slight degradation in services from a version promotion yesterday, which resulted in a higher connection count across the strongDM fleet. This led to some servers exceeding their maximum connection limit. Some users were briefly unable to access strongDM resources as a result. This release has been rolled back to a previous version to mitigate stress on the fleet.</p>tag:status.strongdm.com,2005:Incident/82576662021-10-18T19:49:40Z2021-10-18T19:49:40ZHeavy Database Load<p><small>Oct <var data-var='date'>18</var>, <var data-var='time'>19:49</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Oct <var data-var='date'>18</var>, <var data-var='time'>18:21</var> UTC</small><br><strong>Monitoring</strong> - The high database load condition has been resolved. 
Engineering continues to investigate the root cause, and we are monitoring all systems actively.</p><p><small>Oct <var data-var='date'>18</var>, <var data-var='time'>18:09</var> UTC</small><br><strong>Update</strong> - The database load has been resolved and the engineering team continues to investigate the root cause.</p><p><small>Oct <var data-var='date'>18</var>, <var data-var='time'>18:07</var> UTC</small><br><strong>Investigating</strong> - strongDM is experiencing heavy database load which may manifest in service degradation. The strongDM engineering team is actively investigating and we will update this incident as more information becomes available.</p>tag:status.strongdm.com,2005:Incident/81224512021-10-01T00:04:18Z2021-10-01T00:04:18ZSDK: Service Degradation<p><small>Oct <var data-var='date'> 1</var>, <var data-var='time'>00:04</var> UTC</small><br><strong>Resolved</strong> - DNS has successfully propagated and all systems are operational.</p><p><small>Sep <var data-var='date'>30</var>, <var data-var='time'>18:57</var> UTC</small><br><strong>Update</strong> - strongDM has identified the source of the SDK degradation as an expired certificate issued by Let's Encrypt. strongDM has released remediating steps to update the DNS records and issued requests to DNS providers to refresh relevant entries. strongDM is awaiting DNS propagation.<br /><br />If any customer wishes to accelerate their ability to use our API before DNS propagates, please reach out to support@strongdm.com for a workaround.</p><p><small>Sep <var data-var='date'>30</var>, <var data-var='time'>18:05</var> UTC</small><br><strong>Update</strong> - Remediation steps are under way to push a new certificate CA to our production servers. 
This may cause a service outage of a few minutes.</p><p><small>Sep <var data-var='date'>30</var>, <var data-var='time'>17:49</var> UTC</small><br><strong>Update</strong> - Further remediation steps are being investigated</p><p><small>Sep <var data-var='date'>30</var>, <var data-var='time'>17:31</var> UTC</small><br><strong>Identified</strong> - Remediation steps are being initiated which we expect to result in a service outage for a few minutes.</p><p><small>Sep <var data-var='date'>30</var>, <var data-var='time'>15:48</var> UTC</small><br><strong>Investigating</strong> - strongDM has identified a known issue that seems to be related to Let's Encrypt certificate expiration. strongDM is actively remediating.</p>tag:status.strongdm.com,2005:Incident/74430902021-07-08T07:00:00Z2021-07-08T13:09:15ZService degradation to strongDM API<p><small>Jul <var data-var='date'> 8</var>, <var data-var='time'>07:00</var> UTC</small><br><strong>Resolved</strong> - From 3:03 AM ET to 5:11 AM ET, strongDM service was affected due to high load on API servers. 
A remediation has been applied and full service has been restored.</p>tag:status.strongdm.com,2005:Incident/72001742021-06-08T18:48:04Z2021-06-08T18:48:04ZService Degradation<p><small>Jun <var data-var='date'> 8</var>, <var data-var='time'>18:48</var> UTC</small><br><strong>Resolved</strong> - After releasing the hotfix, Engineering has observed all systems return to Operational state.</p><p><small>Jun <var data-var='date'> 8</var>, <var data-var='time'>18:41</var> UTC</small><br><strong>Update</strong> - Engineering has released a fix and is actively monitoring system performance.</p><p><small>Jun <var data-var='date'> 8</var>, <var data-var='time'>18:27</var> UTC</small><br><strong>Update</strong> - We are continuing to monitor for any further issues.</p><p><small>Jun <var data-var='date'> 8</var>, <var data-var='time'>18:27</var> UTC</small><br><strong>Monitoring</strong> - strongDM is currently experiencing intermittent service degradation and the engineering team is actively investigating.</p>tag:status.strongdm.com,2005:Incident/68651542021-04-28T14:00:00Z2021-04-28T18:53:43ZTerraform GPG Certificate Rotation<p><small>Apr <var data-var='date'>28</var>, <var data-var='time'>14:00</var> UTC</small><br><strong>Resolved</strong> - HashiCorp has rotated its GPG certificates in order to remediate a potential breach of its systems. As a result, strongDM's Terraform provider signatures were invalidated. We have updated the certificate and released a new version. 
To inherit these changes, you will need to complete the following steps:<br /><br />1) In the required_providers block, update the minimum version to "1.0.21", as seen below.<br />2) Once you complete this step, run terraform init before running any form of terraform plan or terraform apply.<br /><br />terraform {<br /> required_providers {<br /> sdm = {<br /> source = "strongdm/sdm"<br /> version = "1.0.21"<br /> }<br /> }<br />}<br />provider "sdm" {<br /> # Configuration options<br />}<br /><br />Our Support team is standing by to help troubleshoot if any questions come up.</p>tag:status.strongdm.com,2005:Incident/62471762021-02-11T21:34:52Z2021-02-11T21:34:52ZClient in reconnecting state<p><small>Feb <var data-var='date'>11</var>, <var data-var='time'>21:34</var> UTC</small><br><strong>Resolved</strong> - After investigation we are confident that this network connectivity issue is due to an interaction between Sophos and a recent update to macOS Big Sur v11.2.1.<br />Contact your Sophos representative for more information.</p><p><small>Feb <var data-var='date'>11</var>, <var data-var='time'>20:34</var> UTC</small><br><strong>Identified</strong> - We identified a known issue that seems to be related to an interaction between the latest Big Sur update and Sophos endpoint protection software.</p><p><small>Feb <var data-var='date'>11</var>, <var data-var='time'>18:54</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating an issue. 
Initial investigation is focused on two potential sources: <br />1) Big Sur update version 11.2.1<br />2) Anti-virus software</p>tag:status.strongdm.com,2005:Incident/57535962020-12-08T17:47:09Z2020-12-08T17:47:09ZLonger Than Expected Duration to Process Query Logs<p><small>Dec <var data-var='date'> 8</var>, <var data-var='time'>17:47</var> UTC</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Dec <var data-var='date'> 8</var>, <var data-var='time'>17:20</var> UTC</small><br><strong>Monitoring</strong> - A fix has been implemented and we are monitoring the results.</p><p><small>Dec <var data-var='date'> 8</var>, <var data-var='time'>15:59</var> UTC</small><br><strong>Investigating</strong> - strongDM is experiencing longer than expected duration to process query logs and the engineering team is actively investigating.</p>tag:status.strongdm.com,2005:Incident/52754402020-10-08T06:19:19Z2020-10-08T06:19:19ZstrongDM is currently experiencing intermittent service degradation and the engineering team is actively investigating.<p><small>Oct <var data-var='date'> 8</var>, <var data-var='time'>06:19</var> UTC</small><br><strong>Resolved</strong> - Engineering has identified and addressed the cause of the service degradation. 
Normal service is now restored and we will continue to monitor system performance.</p><p><small>Oct <var data-var='date'> 8</var>, <var data-var='time'>05:47</var> UTC</small><br><strong>Investigating</strong> - We are currently investigating this issue.</p>tag:status.strongdm.com,2005:Incident/46943152020-07-24T02:22:14Z2020-07-24T02:22:14ZstrongDM service outage<p><small>Jul <var data-var='date'>24</var>, <var data-var='time'>02:22</var> UTC</small><br><strong>Resolved</strong> - We are continuing to monitor for any further issues.</p><p><small>Jul <var data-var='date'>24</var>, <var data-var='time'>02:03</var> UTC</small><br><strong>Update</strong> - We are continuing to monitor for any further issues.</p><p><small>Jul <var data-var='date'>24</var>, <var data-var='time'>02:03</var> UTC</small><br><strong>Monitoring</strong> - A brief service outage has been resolved and we are monitoring the system status</p>tag:status.strongdm.com,2005:Incident/46761022020-07-21T22:34:49Z2020-07-21T22:34:58ZstrongDM service degradation<p><small>Jul <var data-var='date'>21</var>, <var data-var='time'>22:34</var> UTC</small><br><strong>Resolved</strong> - Engineering has identified and addressed the cause of the service degradation. 
Normal service is now restored and we will continue to monitor system performance.</p><p><small>Jul <var data-var='date'>21</var>, <var data-var='time'>22:09</var> UTC</small><br><strong>Monitoring</strong> - A fix has been implemented and we are monitoring the results.</p><p><small>Jul <var data-var='date'>21</var>, <var data-var='time'>22:00</var> UTC</small><br><strong>Identified</strong> - The issue has been identified and a fix is being implemented.</p><p><small>Jul <var data-var='date'>21</var>, <var data-var='time'>21:48</var> UTC</small><br><strong>Investigating</strong> - strongDM is currently experiencing intermittent service degradation and the engineering team is actively investigating.</p>tag:status.strongdm.com,2005:Incident/41650642020-05-22T06:06:11Z2020-05-22T06:06:11ZstrongDM service degradation<p><small>May <var data-var='date'>22</var>, <var data-var='time'>06:06</var> UTC</small><br><strong>Resolved</strong> - Reprocessing of audit logs is complete. Service restored. All systems available.</p><p><small>May <var data-var='date'>21</var>, <var data-var='time'>21:20</var> UTC</small><br><strong>Update</strong> - Access to resources, our API, and the admin UI is available and stable. Reprocessing of audit logs has been deferred, meaning query and session history will not be updated until the process is resumed. Reprocessing will resume no earlier than May 22, 05:00 UTC.</p><p><small>May <var data-var='date'>21</var>, <var data-var='time'>21:02</var> UTC</small><br><strong>Monitoring</strong> - Customers should be able to successfully access servers and datasources through strongDM. However, some functionality within the admin UI may not be available.</p><p><small>May <var data-var='date'>21</var>, <var data-var='time'>20:12</var> UTC</small><br><strong>Update</strong> - Customers should be able to successfully access servers and datasources through strongDM. 
However, some functionality within the admin UI may not be available.</p><p><small>May <var data-var='date'>21</var>, <var data-var='time'>19:46</var> UTC</small><br><strong>Update</strong> - We are continuing to work on a fix for this issue.</p><p><small>May <var data-var='date'>21</var>, <var data-var='time'>19:41</var> UTC</small><br><strong>Identified</strong> - The issue has been identified and a fix is being implemented.</p>