35 votes

Delete never: The digital hoarders who collect Tumblrs, medieval manuscripts, and terabytes of text files

17 comments

  1. [2]
    Octofox
    Link
    I'm a data hoarder. I have somewhere over 20 TB of storage right now. About 5 TB of this is useful data (games I play, music, movies) and the rest is random stuff that could be useful but that I will probably never use. I have 500 GB of every Reddit comment up until 2016 that could be used for some cool stats if I ever had loads of free time.

    It's super cool to see how far storage has come. This year I purchased a 10 TB HDD and it wasn't even the biggest one at the store. Pretty crazy stuff.

    I wouldn't really classify it as the same thing as physical item hoarding. It's a similar kind of activity but it has no real negative effects. I can easily afford the hard drives and they take very little space.

    17 votes
    1. unknown user
      Link Parent
      About 5 TB of this is useful data (Games I play, music, movies)

      That's a lot of useful data. I imagine I could max it out with full FLAC discographies, BD Remux-quality films, and a couple of the latest open-world games...

      What do you have in there?

      1 vote
  2. [6]
    asoftbird
    Link
    The internet is actually pretty bad at retaining information compared to offline analog methods: files dissipate, URLs change, and quality can degrade immensely. Stuff's still out there but nobody can access it, for instance.
    I believe this effect had a name but I'm not sure what it was.

    13 votes
    1. Deimos
      Link Parent
      I'm not sure if it's what you're talking about because it's not quite the same, but the idea of "deep web" / "invisible web" is similar in some ways.

      5 votes
    2. Atvelonis
      (edited)
      Link Parent
      Definitely. I help run a wiki and this is an issue we have to deal with constantly when using external links. My long-term (and not comprehensive) solution as of a few months ago has been a browser extension that allows me to quickly archive the page that I'm viewing. I do this for pretty much anything off-site I link to from a wiki article. Then, in the event that the link dies in the future, assuming someone notices, we can just plug in the archived version of it.
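
      The archive-then-swap workflow described above can be sketched roughly as follows. This is only an illustration, not the extension mentioned in the comment: it assumes the Wayback Machine's public "save" URL prefix and its availability endpoint, and the helper names (`save_request_url`, `availability_query_url`, `closest_snapshot`) are made up for the example. Nothing here touches the network; it only builds the URLs and reads a response you have already fetched.

```python
import json
from typing import Optional
from urllib.parse import urlencode

# Assumed endpoints: the Wayback Machine's "Save Page Now" prefix and
# its availability lookup.
SAVE_PREFIX = "https://web.archive.org/save/"
AVAILABILITY_API = "https://archive.org/wayback/available"


def save_request_url(page_url: str) -> str:
    """Build the URL that asks the Wayback Machine to capture page_url."""
    return SAVE_PREFIX + page_url


def availability_query_url(page_url: str) -> str:
    """Build the URL that asks whether a snapshot of page_url exists."""
    return AVAILABILITY_API + "?" + urlencode({"url": page_url})


def closest_snapshot(response_text: str) -> Optional[str]:
    """Return the closest snapshot URL from an availability response.

    Returns None when the response lists no available snapshot, which is
    the case for pages the crawler could not capture.
    """
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest["url"]
    return None
```

      When a link on the wiki dies, the snapshot URL returned by `closest_snapshot` is what would be plugged in as the replacement; a `None` result means the page was never archived.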

      The problem is that certain websites are not set up to allow proper web crawling. Most of Bethesda's news blogs, for example, are impossible to archive because the site has an age gate on it. The Wayback Machine also has trouble with interactive elements, which are sometimes important. I've come to accept by this point that, if I want to be as thorough as possible, I need to document on-site everything that the company says. The alternative is to accept that, eventually, some information will simply be lost, which makes me sad.

      2 votes
    3. crdpa
      Link Parent
      I can confirm. One day I remembered a porn video that I really liked in the past. It was a real adventure to track it down and find it again.

  3. [7]
    Macil
    (edited)
    Link
    Sometimes I go through my bookmarks, and find lots of dead links. Not all of them show up on archive.org. I'm tempted to set up a script to periodically wget all of my bookmarks, but it disappoints me that my personal archive is unlikely to be found by anyone else who coincidentally tries to follow broken links to the same resources that I've archived. Also, many sites aren't built in a way that wget works well on.
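
    A minimal sketch of such a bookmark-mirroring script is below. It assumes the bookmarks have been exported from the browser as an HTML file (the Netscape-style format that Firefox and others produce) and that `wget` is installed; the function names and the export path passed in are hypothetical, and scheduling is left to something like cron.

```python
import subprocess
from html.parser import HTMLParser
from pathlib import Path


class BookmarkLinks(HTMLParser):
    """Collect http(s) links from an exported bookmarks HTML file."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...>
        # from the export arrives here as tag "a" with attribute "href".
        if tag == "a":
            href = dict(attrs).get("href")
            if href and href.startswith(("http://", "https://")):
                self.urls.append(href)


def bookmark_urls(export_html):
    """Return the bookmarked URLs, de-duplicated, in original order."""
    parser = BookmarkLinks()
    parser.feed(export_html)
    return list(dict.fromkeys(parser.urls))


def wget_command(url, dest_dir):
    """Build a wget invocation that saves one page plus its assets."""
    return [
        "wget",
        "--page-requisites",   # also fetch images, CSS, and scripts
        "--convert-links",     # rewrite links so the copy works offline
        "--adjust-extension",  # add .html where the server omits it
        "--directory-prefix", dest_dir,
        url,
    ]


def archive_all(export_path, dest_dir):
    """Mirror every bookmark; return the URLs that wget failed on."""
    html = Path(export_path).read_text(encoding="utf-8", errors="replace")
    failed = []
    for url in bookmark_urls(html):
        result = subprocess.run(wget_command(url, dest_dir))
        if result.returncode != 0:
            failed.append(url)
    return failed
```

    Calling `archive_all` on the exported file from a scheduled job would refresh the local copies; the returned list of failures is a rough indicator of which bookmarks have already died or are on sites that wget cannot handle.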

    I really hope IPFS takes off. If all of my bookmarks pointed to sites using IPFS, then I could mirror their content and help serve them on IPFS. If the original host goes down, I'll still be able to help host the content at its original URL, and the URL will still work for anyone else in the world who tries to follow it. And then maybe people will re-host my own content in the same way, even long after I'm gone, if my content is good enough.

    6 votes
    1. zaarn
      Link Parent
      I can recommend ArchiveBox; it can read an export of your Firefox bookmarks and then generate a pretty good archive from that.

      4 votes
    2. [5]
      mrbig
      Link Parent
      I'm tempted to set up a script to periodically wget all of my bookmarks

      I think that is a neat idea and other people would be grateful if you did this!

      1 vote
      1. [4]
        Adys
        Link Parent
        For what it's worth, that's one of the things Pocket Premium does.

        1 vote
        1. [3]
          pew
          Link Parent
          There's just no way to get it out of Pocket, unfortunately.

          3 votes
          1. [2]
            JuniperMonkeys
            Link Parent
            I think Pinboard's archives are exportable -- I don't use an archiving account, though, so I've never tested it.

            1. pew
              Link Parent
              Can't recommend Pinboard any longer; I think the service is just running as-is and is dead otherwise. I was a paying customer for years exactly for the export feature. One day I requested the archive, and after two months of it still not being done I contacted him and never got a reply.

              1 vote
  4. zaarn
    Link
    Hi, I'm a datahoarder myself, currently sitting on 40 TB of raw storage (30 TB usable, 18 TB used), most of which is dedicated to family videos, entertainment and hoarding. My biggest data piece is a 3 TB folder full of YouTube videos, of which 20% are no longer available online.

    I see datahoarding as a form of archiving, though unlike the Internet Archive / ArchiveTeam (who are amazing people, don't get me wrong), I curate a lot of data. Most of the time when I download big datasets, I comb through them and delete the larger parts I find uninteresting.

    It's a very interesting hobby to have.

    6 votes
  5. crdpa
    Link
    The only things I archive are my photos (only recently; I used to delete them all after some time) and music (FLAC files).

    I delete movies and series after I watch them.

    1 vote