IM histories and to some extent personal posts can go and no one will care but all the actual documents on the internet should be preserved just like we preserve books. The internet has generated mountains of historical data which once would have been spoken conversation and never recorded and largely a lot of this is junk that could go. Archive.org is essential for when a document you need is gone or changed and you need to find it again. I just wish they had some way to search the internet at certain points because sometimes I want to research something old and I don't have the links to the pages which would have hosted it.
The internet has generated mountains of historical data which once would have been spoken conversation and never recorded and largely a lot of this is junk that could go.
We don't care about it now (and often we may feel more comfortable if that sort of thing would disappear) but I think in the future this kind of information will be invaluable to historians trying to understand our time. What would we give now to have transcripts of conversations between Han dynasty peasants or to be able to read the comments from a Babylonian Reddit equivalent? Even to take the example from the article of something trivially useless—commercial invoices from the 1800s—it brought to mind a book I read recently that used financial records from the 1700s to try to make an argument for the causes of the American Revolution.
None of this is to say that we necessarily should save all this disappearing information or that there aren't problems associated with keeping it, but I don't think it's as easy to write off as useless as this article says.
To add more weight to what you're saying... most of what we know about Bronze Age Mesopotamia (Sumer, Akkad, Babylon, Assyria, etc.), which is unfortunately not a whole lot, was gleaned almost entirely from various, often broken scraps of pottery with proto-cuneiform on them, and an incredibly limited number of still-legible cuneiform tablets. Those tablets, more often than not, were merely used for recording mundane business/legal documentation and correspondence. And yet even the sparse amount of information we have managed to glean from them completely changed the way historians have come to understand human development, our earliest civilizations, and even the way serious scholars now view the Bible. So who knows which data we consider insignificant and disposable now will ultimately turn out to be invaluable for us in the future.
And as a fan of history, the amount of information we as a species have already lost throughout the ages is heartbreaking, and this article honestly makes me mad. It's one thing to respect someone's digital privacy and their right to be forgotten should they choose to invoke it (which I entirely agree with and support) but it's another to argue that online data degradation should not only be embraced but also actively encouraged. That is incredibly shortsighted lunacy, IMO.
p.s. Which is precisely why I have a recurring donation set up to archive.org and actively encourage everyone who cares about data preservation to do the same.
I do believe that the right to be forgotten online is important. Sure, you should be able to delete your Facebook profile and your Twitter profile. Maybe even remove your name from the internet in general in some cases.
That is different from, say, deleting a website full of conferences in this case, or images, or blogs. What harm is keeping this data online doing? It isn't harming anybody in any way to allow this data to continue to exist and be viewable. It's not that we chose not to keep records of every invoice from the 1800s; it didn't happen because we had no way to. But with storage being so cheap and readily available, I see no reason not to allow this to exist. And hey, it might not be useful now, but in 100 years somebody might be interested in using that information in some way.
But with storage being so cheap and readily available, I see no reason not to allow this to exist.
Storage is just one small part of the equation, though. Someone needs to run the web server and, more importantly, keep the website secure from constant, mainly automated, spamming and hacking attempts. There are updates that you need to run, some of which may require major changes to the website, even if you are not otherwise actively developing it anymore. Legal developments like the recent European cookie and GDPR laws may also require you to make changes to an otherwise dormant website. All this takes time, effort and money. Maintaining something is a responsibility. While it may not be much for one static website, the burden does accumulate if you run several. It isn't like just storing a book (or a hard drive) somewhere. Sometimes letting go is the only sane decision.
Legal developments like the recent European cookie and GDPR laws may also require you to make changes to an otherwise dormant website.
More importantly, when you’re holding data that pertains to other people you’re exposing those people to risk. We really need to start thinking of data the way we do radioactive waste. We ought to focus on minimizing how much we generate and hold and worry about where it goes and who gets their hands on it.
People’s privacy is being compromised by all these old, orphaned data caches lying around, still connected to the internet and way behind on their security. It doesn’t even have to be anything especially sensitive in itself. You can slowly build up from irrelevant data to get at relevant data: a recycled password, a minor life detail that helps find other accounts, something that helps answer a secret question or supports a social engineering attack, etc.
I don't know why, but thinking of the internet with missing pieces reminds me of Alzheimer's. The information was once there, but now it is inaccessible. What I wonder about, reading the other comments left here, is information that seems innocent enough being used for evil ends in the future.
Just look at any website that works like websites are supposed to work (i.e. linking to information on other websites): if it's more than, say, 5 or 7 years old, at least half of the links won't work. It's extreme with Metacritic: check out a game's page from the PS2 era and a good 90% of the review links won't work, even for websites that still exist but decided to restructure their URLs. It's fascinating and a bit soothing to see that even such a technical construct as the internet displays the same kind of "rot" as real-world data or objects. Even if it can get occasionally annoying, I think it's healthy. Information that's actually useful (encyclopedic knowledge like Wikipedia, high-quality newspaper articles, ...) has long been kept around; it's a question of what the internet adds to these quality categories, and I'd say that's ultimately very little.
The problem is rather social media websites and companies like Google, who have an interest in keeping trash data around just for the possible benefit of eventually feeding it into an advertising algorithm or some machine learning. On a gut-feeling level, I think all of us expect stupid social media posts from 10 years ago to no longer exist, but they do. It shouldn't really be profitable or desirable to keep this stuff up. Those "right to be forgotten" laws make a lot of sense. Currently, it's handled as a "right to be informed" kind of situation, maybe with the possibility to request deletion. I think it should be automated. Why should Facebook be keeping your posts from 10 years ago if you haven't logged in in years and probably don't even remember your password? I don't think it's unreasonable to require websites with gatekeeper status (say, more than 10,000,000 users) to delete information automatically after a while. It used to be that just the cost of hosting and storing it made them do so, but with costs of probably cents per terabyte, I don't think that's a factor anymore.