- ~tech - Tildes

4 votes

Posted July 16, 2018 by unknown user

Topic deleted by author

2 comments

eYredWkae3QVaX8b
July 16, 2018
There is only /one/ possible way to anonymize a dataset: randomizing the dataset D such that the resulting dataset D' is drawn from the same distribution as D. Basically, this means that the table rows don't mean anything on their own, but you can learn the same things from D' as from D.

But there is always the utility-privacy trade-off. When there is little data, you cannot randomize much before you start to lose information. On the other hand, if the dataset is large, it becomes possible to make very strong privacy guarantees without throwing information away.
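
To make the trade-off concrete, here is a minimal sketch using randomized response, which is one well-known randomization technique and not necessarily the one meant above. Each row's true yes/no value is reported only with probability `p_truth`; otherwise a coin flip is reported. The function names, `true_rate`, and `p_truth` are all made up for illustration, and the data is simulated.

```python
import random


def randomized_response(truth: bool, p_truth: float, rng: random.Random) -> bool:
    """Report the true value with probability p_truth, otherwise a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5


def estimate_rate(reports: list, p_truth: float) -> float:
    """Recover an unbiased estimate of the true 'yes' rate from noisy reports.

    E[observed] = p_truth * rate + (1 - p_truth) * 0.5, solved for rate.
    """
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth


def simulate(n: int, true_rate: float = 0.3, p_truth: float = 0.5, seed: int = 0) -> float:
    """Simulate n individuals and return the rate estimated from randomized rows."""
    rng = random.Random(seed)
    data = [rng.random() < true_rate for _ in range(n)]  # hypothetical private data
    reports = [randomized_response(x, p_truth, rng) for x in data]
    return estimate_rate(reports, p_truth)


if __name__ == "__main__":
    # Same amount of noise per row, different dataset sizes.
    for n in (100, 10_000, 200_000):
        print(n, round(simulate(n), 3))
```

No single reported row can be trusted on its own, yet the aggregate estimate should get closer to the true rate as `n` grows. With small `n`, the same noise level tends to swamp the signal, which is the trade-off described above.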

Randomization is the only way for micro-data to be shared. And as always, there is no free lunch in data mining. It is possible, though: research on this topic has been going on since the late '70s. Companies are just not doing it.

2 votes
Catt
July 16, 2018
I believe this post is a duplicate of this one.

1 vote