14 votes

38TB of data accidentally exposed by Microsoft AI researchers

1 comment

  1. Amun

    Hillai Ben-Sasson, Ronny Greenberg


    Wiz Research found a data exposure incident on Microsoft’s AI GitHub repository, including over 30,000 internal Microsoft Teams messages – all caused by one misconfigured SAS token.

    • Microsoft’s AI research team, while publishing a bucket of open-source training data on GitHub, accidentally exposed 38 terabytes of additional private data — including a disk backup of two employees’ workstations.

    • The backup includes secrets, private keys, passwords, and over 30,000 internal Microsoft Teams messages.

    • The researchers shared their files using an Azure feature called SAS tokens, which allows you to share data from Azure Storage accounts.

    • The access level can be limited to specific files only; however, in this case, the link was configured to share the entire storage account — including another 38TB of private files (see the sketch after this list).

    • This case is an example of the new risks organizations face when starting to leverage the power of AI more broadly: more of their engineers now work with massive amounts of training data, and as data scientists and engineers race to bring new AI solutions to production, that data requires additional security checks and safeguards.
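    The scoping mistake is easy to make because SAS tokens are signed client-side with the storage account key, and an account-level token covers every container and blob in the account. Below is a minimal sketch of the difference between an account-scoped token and one scoped to a single blob, assuming the azure-storage-blob Python SDK (v12); the account, container, and blob names are illustrative, not the ones from the incident.

    ```python
    from datetime import datetime, timedelta, timezone

    from azure.storage.blob import (
        AccountSasPermissions,
        BlobSasPermissions,
        ResourceTypes,
        generate_account_sas,
        generate_blob_sas,
    )

    ACCOUNT = "contosoai"   # illustrative storage account name
    KEY = "<account-key>"   # SAS tokens are HMAC-signed with this key client-side

    # Overly permissive (the failure mode described above): an account-level
    # SAS grants access to every container and blob in the storage account.
    risky_sas = generate_account_sas(
        account_name=ACCOUNT,
        account_key=KEY,
        resource_types=ResourceTypes(service=True, container=True, object=True),
        permission=AccountSasPermissions(read=True, list=True),
        expiry=datetime.now(timezone.utc) + timedelta(days=365 * 10),  # far-future expiry
    )

    # Narrowly scoped: read-only access to a single blob, with a short expiry.
    scoped_sas = generate_blob_sas(
        account_name=ACCOUNT,
        container_name="public-models",
        blob_name="training-data.zip",
        account_key=KEY,
        permission=BlobSasPermissions(read=True),
        expiry=datetime.now(timezone.utc) + timedelta(hours=24),
    )

    share_url = (
        f"https://{ACCOUNT}.blob.core.windows.net/"
        f"public-models/training-data.zip?{scoped_sas}"
    )
    ```

    Because the signature is computed from the account key rather than registered anywhere in Azure, a token like `risky_sas` cannot be inventoried or revoked short of rotating the key itself — part of why account-level SAS tokens with long expiries are flagged as risky.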

    Link to Microsoft blog

    Microsoft mitigated exposure of internal information in a storage account due to overly-permissive SAS token

    5 votes