



Users hosted in Africa, Europe, and the Middle East were struggling to access the Microsoft 365 Admin Portal today amid widespread issues triggered by a buggy update making it to production.

Microsoft said it was having to manually restart infrastructure to help roll back an update that appeared to have caused issues. “The revert of the update is taking longer than expected to complete. We’re continuing our efforts to resolve the issue by also manually restarting the affected infrastructure to expedite the recovery,” the company said shortly after 11am on February 3 as customers queried the regular time-outs that they were seeing on the portal.

An earlier Microsoft update said - to the tune of a well known Alanis Morissette song - that a “recent service update designed to improve user experience is causing impact”.

“While we’re focused on remediation, admins can use the following: Azure portal for User management, Licence management. Exchange admin center for Mailbox migration and Distribution lists. Exchange admin center, Teams admin center and other service specific admin centers can also be accessed directly through their links,” Microsoft added in an update at 11:46 GMT. Reports of issues stretched back to early evening on Wednesday.

It’s been a bumpy start to the year for Microsoft on the resilience front: recent outages include a January issue stemming from the Azure Resource Manager, blamed by the company on “a code modification which started rolling out on […] exposed a latent defect in the infrastructure used to process long running operations… over the course of hours, the job executions shifted entirely away from the regions that had received the new code to their backup paired regions. The impact spread to backup paired regions as the new code was deployed, resulting in job queue up, latency delays, and timeouts. In some cases, the jobs executed with such prolonged delays that they were unable to succeed, and customers will have seen failures in these cases.”

“As a result of the way that the job execution infrastructure was implemented, the compounding failures were not visible in our telemetry - leading to engineer’s mis-identifying the cause initially and attempting mitigations which did not improve the underlying health of the service,” Microsoft admitted last month.

Meanwhile, just yesterday (February 2, 2022), Microsoft admitted that “customers using Azure AD may have experienced failures when performing any service management operations”, saying (quelle surprise) that “a recent change to a dependent service caused the impact”, which it promptly rolled back.

A 2021 RCA from Microsoft suggests that, in theory, this shouldn’t be happening.
