Summary:
On April 30th, 2024, starting at 8:18 AM EDT, we began receiving reports that User Sync files were failing to process and returning the following error message:
The encryption key expected to be used is [Key Fingerprint].
A platform incident was declared at 10:23 AM EDT and was fully mitigated by 10:59 AM EDT.
Scope and Impact:
This incident was limited to customers who encrypt their User Sync file before uploading it. Only customers who uploaded an encrypted User Sync file between 10:03 PM EDT on April 29th, 2024, and 10:59 AM EDT on April 30th, 2024, were affected.
Root Cause:
The incident response team traced this incident to a regression in a software release on April 29th, 2024. The OS image used to deploy the upgrade lacked packages required for decryption.
Mitigation:
At 10:59 AM EDT, the upgrade was rolled back to the previous version, which contained the decryption packages, restoring normal decryption of encrypted User Sync files. We also identified and reprocessed any encrypted customer User Sync files that had failed to process during the incident.
Recurrence Prevention:
A technical team post-mortem meeting found that the zip-based deployment of the OS provided no controls over updating or re-deploying the upgrade. We therefore transitioned to an image-based deployment, which allows greater control over the OS image and its necessary dependencies. The upgrade was redeployed on May 6th, 2024, using an OS image that included the necessary decryption packages.
We also:
Added monitoring and alerting for the health of external-registration (User Sync file processing).
Updated regression test packs to include testing User Sync with encrypted files.
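
For illustration, a missing decryption dependency of the kind described under Root Cause could be caught by a preflight check that runs against the OS image before deployment. The following is a minimal sketch only and does not represent the actual implementation: the tool name `gpg` and the function names `missing_tools` and `preflight` are assumptions, since the specific decryption packages are not named in this report.

```python
import shutil
import sys

# Hypothetical list of executables the decryption step depends on.
# "gpg" is an assumption for illustration; the real packages are not named here.
REQUIRED_TOOLS = ["gpg"]


def missing_tools(required):
    """Return the names in `required` that cannot be found on PATH."""
    return [name for name in required if shutil.which(name) is None]


def preflight(required=REQUIRED_TOOLS):
    """Return 0 when every required tool is present, 1 otherwise."""
    missing = missing_tools(required)
    if missing:
        # Report what is absent so the deployment log shows the reason.
        print("Missing decryption dependencies: " + ", ".join(missing), file=sys.stderr)
        return 1
    return 0
```

A deployment pipeline could call `preflight()` inside the candidate image and stop the rollout on a nonzero return value, so an image without the required tools would not reach production.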