14 votes

38TB of data accidentally exposed by Microsoft AI researchers

1 comment

  1. Amun

    Hillai Ben-Sasson, Ronny Greenberg


    Wiz Research found a data exposure incident on Microsoft’s AI GitHub repository, including over 30,000 internal Microsoft Teams messages – all caused by one misconfigured SAS token.

    • Microsoft’s AI research team, while publishing a bucket of open-source training data on GitHub, accidentally exposed 38 terabytes of additional private data — including a disk backup of two employees’ workstations.

    • The backup includes secrets, private keys, passwords, and over 30,000 internal Microsoft Teams messages.

    • The researchers shared their files using an Azure feature called SAS tokens, which allows you to share data from Azure Storage accounts.

    • The access level can be limited to specific files only; however, in this case, the link was configured to share the entire storage account — including another 38TB of private files (see the sketch after this list).

    • This case is an example of the new risks organizations face when starting to leverage the power of AI more broadly: more of their engineers now work with massive amounts of training data, and as data scientists and engineers race to bring new AI solutions to production, that data requires additional security checks and safeguards.
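    The scoping mistake is easy to make because SAS tokens are signed client-side with the storage account key, and an account-level token covers every container and blob in the account. Below is a minimal sketch of the difference between an account-scoped token and one scoped to a single blob, assuming the azure-storage-blob Python SDK (v12); the account, container, and blob names are illustrative, not the ones from the incident.

    ```python
    from datetime import datetime, timedelta, timezone

    from azure.storage.blob import (
        AccountSasPermissions,
        BlobSasPermissions,
        ResourceTypes,
        generate_account_sas,
        generate_blob_sas,
    )

    ACCOUNT = "contosoai"   # illustrative storage account name
    KEY = "<account-key>"   # SAS tokens are HMAC-signed with this key client-side

    # Overly permissive (the failure mode described above): an account-level
    # SAS grants access to every container and blob in the storage account.
    risky_sas = generate_account_sas(
        account_name=ACCOUNT,
        account_key=KEY,
        resource_types=ResourceTypes(service=True, container=True, object=True),
        permission=AccountSasPermissions(read=True, list=True),
        expiry=datetime.now(timezone.utc) + timedelta(days=365 * 10),  # far-future expiry
    )

    # Narrowly scoped: read-only access to a single blob, with a short expiry.
    scoped_sas = generate_blob_sas(
        account_name=ACCOUNT,
        container_name="public-models",
        blob_name="training-data.zip",
        account_key=KEY,
        permission=BlobSasPermissions(read=True),
        expiry=datetime.now(timezone.utc) + timedelta(hours=24),
    )

    share_url = (
        f"https://{ACCOUNT}.blob.core.windows.net/"
        f"public-models/training-data.zip?{scoped_sas}"
    )
    ```

    Because the signature is computed from the account key rather than registered anywhere in Azure, a token like `risky_sas` cannot be inventoried or revoked short of rotating the key itself — part of why account-level SAS tokens with long expiries are flagged as risky.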

    Link to Microsoft blog

    Microsoft mitigated exposure of internal information in a storage account due to overly-permissive SAS token

    5 votes