2/15/2021 12:27 PM
The LMS365 platform was unfortunately impacted by two incidents from the Microsoft Azure infrastructure:
* RCA - Azure Front Door - North Europe CKTV-TT0
* RCA - App Service - North Europe 9KZV-TT8
Root cause analysis provided by Microsoft:
Root Cause: We determined that instances of a backend service had become unhealthy preventing requests from completing.
Mitigation: We removed the unhealthy devices from rotation and rerouted the network traffic. This allowed for the requests to complete as expected.
As a result, users in the North Europe region experienced slow login requests. To restore service, we reconfigured our application to make sure it used only healthy instances. After Microsoft mitigated the issue on the Azure side, we reverted to our normal configuration, and all services are running optimally again.
Although the issue was on the Azure side, we will investigate how to avoid a similar situation in the future.
Thank you for your patience while we resolved the issue impacting performance.
We’re happy to report that the issues affecting performance for customers in our North Europe region have now been resolved.
Thank you for your patience and please let us know if you see any further issues.
Post-mortem to follow:
The Microsoft Product Team is working on mitigating the issue in the North Europe region. Additional Microsoft engineering teams have been engaged, and they believe they have identified a potential root cause. They are in the process of applying mitigation steps.
2/12/2021 9:41 AM
We have identified the root cause of the performance degradation caused by the Azure Front Door service in our European region. Our Operations Team is in close contact with Microsoft and has been able to temporarily reroute traffic and improve overall performance. We will continue to provide updates and work on restoring all services to the expected high level.
2/12/2021 8:32 AM
We have received reports of issues with slow performance in the North Europe Region (service). We will keep you updated as we investigate further.
FOR MORE INFORMATION
For current system status information about LMS365, check out our system status page. During an incident, you can also receive status updates by subscribing to updates available on our status page. The summary of our post-mortem investigation is usually posted here a few days after the incident has ended. If you have additional questions about this incident, please log a ticket with us.