All because Reddit is in the business of selling user comments and information to AI companies. Can't have rogue companies stealing your data that Reddit never paid for.
I love the Internet Archive but I’ve been saying for years that we shouldn’t be putting all our eggs in one basket. We’re one censorious regime (or hostile takeover, or destructive wildfire, or malicious hacker, or funding depletion, or leadership retirement, etc.) away from the collapse of the internet’s most valuable resource maybe next to Wikipedia.
We have BitTorrent and blockchains, but I’m honestly surprised that decentralized, distributed computing never really became “a thing” on any bigger scale. We should all be running nodes in a giant swarm of web scrapers, making our own collectively shared Wayback Machine. One that’s resistant to server blocks, censorship, and takedowns. Frustrating that it’s 2025 and we still don’t have anything like this.
One that’s resistant to server blocks, censorship, and takedowns.
A lot of sites will take action against residential IP addresses that are scraping data. That could be anything from throttling to an outright ban.
There are other ways to decentralize, such as using server resources (VPS, bare metal, or even cloud compute) from various providers. But this starts to become a bit more centralized and dependent on these hosting companies, as they can close accounts however they see fit.
I love the Internet Archive but I’ve been saying for years that we shouldn’t be putting all our eggs in one basket.
Long ago I had the idea of basically creating a decentralized crowdsourced internet archive that shares caches of sites with each other via P2P.
Basically, people install the plugin/server and configure how often (every hour, day, etc.) it downloads the webpages they frequent, or partials, HTML, etc., all configurable. It keeps that copy for X days and looks for incremental updates to the page while filtering out ads and stuff. Of course, it removes cookies and personally identifiable information too.
Completely configurable, to give people control over how often to take a backup, whether or not to include pictures and images, whether to compress images, whitelists for sites you want to back up, blacklists for sites you don't, how many incremental backups to keep, how much memory to use before deleting older backups, etc. I was even thinking of a plugin that used the same processes as JDownloader2 to download videos off of sites too, so doing the same for YouTube.
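To make the configuration idea a bit more concrete, here is a minimal sketch of what those settings could look like. The project was never built, so every name here (`ArchiveConfig`, `should_archive`, `prune`, and all the fields) is hypothetical, and the rule that the blacklist beats the whitelist is just one assumed way to resolve conflicts.

```python
from dataclasses import dataclass, field
from urllib.parse import urlparse


@dataclass
class ArchiveConfig:
    """Hypothetical per-user settings; all field names are made up."""
    interval_hours: int = 24        # how often to take a backup
    include_images: bool = False    # whether to store pictures and images
    whitelist: set[str] = field(default_factory=set)  # hosts to back up
    blacklist: set[str] = field(default_factory=set)  # hosts to never back up
    max_incrementals: int = 5       # how many incremental backups to keep


def should_archive(url: str, cfg: ArchiveConfig) -> bool:
    """Blacklist wins; an empty whitelist means any non-blacklisted host."""
    host = urlparse(url).hostname or ""
    if host in cfg.blacklist:
        return False
    return not cfg.whitelist or host in cfg.whitelist


def prune(snapshots: list[str], cfg: ArchiveConfig) -> list[str]:
    """Keep only the newest snapshots; the input list is oldest-first."""
    if cfg.max_incrementals <= 0:
        return []
    return snapshots[-cfg.max_incrementals:]
```

With a whitelist of `{"example.org"}`, only pages on that host would be kept, and `prune` would drop the oldest snapshots once the limit is reached.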
It would work in the background via a browser plugin that connects to a hosted server that manages the backups, and basically just uses what you have in your cache and already basically downloaded, instead of like remotely downloading a website or scraping data.
Those downloaded incremental backups of websites are then shared via a P2P network, so people can browse backups and download backups of said website.
The idea was to have people be able to host a server where these backups can be stored and sorted, and you can configure it to download incremental backups of a website from the P2P network, and to store yours. Could be self hosted on Docker.
Then of course you'd have places like universities and the Internet Archive, with immense amounts of storage, that would piggyback off the network too, giving them the ability to make daily backups in some cases and to track changes over time.
Hell, I even came up with a fun button that will send you to a random website that hadn't been archived in a while so people can click it when they're bored and it'll help out the network.
I was talking to a few people about starting that project and the various ways we could get it to work, but like most project talk, people lose interest and move on. Never came to fruition.
It would have been great for data hoarders like myself, and others that potentially need to have access to websites offline for various reasons. Also just generally good for the health of the internet; it would have made this article a moot point.
I mean, most of the stuff on the Internet Archive is available via torrents already. Almost every item I've come across has a torrent download option.
That doesn't mean you can't lose an item because no one is seeding it, though. So you're up against the same problem. But obviously with a torrent, you'd need everyone seeding to disappear, not just the Internet Archive.
You’re talking about content that was scraped by IA and then shared in torrent form, right? So that would still be stymied by blocks like what Reddit’s doing.
You’re talking about content that was scraped by IA and then shared in torrent form, right? So that would still be stymied by blocks like what Reddit’s doing.
Yes, but presumably Reddit isn't just going to block the Internet Archive, but any scraper doing something similar.
My original comment was responding to your comment that we need to diversify away from Internet Archive being the primary source for all things "old internet" and have multiple options. If they got taken down and poofed out of existence, it would be devastating, but it wouldn't mean the immediate loss of all their knowledge since it's all also shared via torrents.
Oh I see what you mean. I think the “old internet” archives are in decent hands at this point because of that; there should be enough copies scattered around the web at this point to provide some resilience.
I’m just concerned about future archives of today’s web, which will be “old internet” soon enough too. If we lose the primary way these archives are created, the project is dead. The reason I’m advocating for decentralized scraping is that if you spread it around enough, archival requests will be indistinguishable from regular browsing. It’s one thing to block traffic from a known data center IP, it’s another matter to play whack-a-mole with ten thousand residential addresses from around the world, crawling in a manner that is by all appearances uncoordinated.
I agree we should all be nodes in what you described. The trouble is, no one gives a shit.
The solution is to make them give a shit. How do you incentivize all the people to participate?
Maybe we pay them in a free subscription to a paid service. You want Netflix? You need to keep this 1x tarball online for 95% of days in a month and you get a month free. Oh, you want a Comcast ISP subscription? That's a 3x tarball for 95% uptime.
Find a way to monetize the tool sufficiently to be able to pay the users, while also being a registered not for profit?
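For what it's worth, the "95% of days in a month" rule is easy to pin down. A rough sketch, assuming uptime is counted in whole days; `earns_reward` and its parameters are hypothetical names, not part of any real service:

```python
def earns_reward(days_online: int, days_in_month: int, threshold_pct: int = 95) -> bool:
    """True if the node was online on at least threshold_pct percent of days."""
    # Integer comparison avoids floating-point rounding at the boundary.
    return days_online * 100 >= threshold_pct * days_in_month
```

In a 30-day month, 95% works out to 28.5 days, so 29 days online would qualify and 28 would not.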
Bluesky is the most archive-friendly user forum infrastructure I know about. Downloading the posts from each user's PDS (which is like a home directory that stores all their posts) is necessary to run the app, and anyone could do that, the same way Bluesky does it.
The PDS can store arbitrary data. It's technically possible to build a link-sharing app on Bluesky (here's one), but they're not very popular yet.
It's also inevitable that some AI companies will use Bluesky posts for training (if they aren't already), so being archive-friendly has downsides, too.
IPFS having that crypto-adjacent vibe feels to me like it hurts what could potentially be an interesting technology. A few weeks back, specifically when the Minecraft world torrent was posted, I was thinking that a debrid service that targets IPFS would be interesting as it would've made it trivial to post an http download link (using ipfs gateways) for anyone that wanted it without torrenting.
A few years back I tried out hosting a website with it and it was kind of interesting. For static sites it actually had some rather useful properties due to it being hash addressed and natively supporting http requests (via gateways). It's been quite a long time since I looked at it, but I'm pretty sure all I needed to do to update the site was pin the new version and update my DNS record to point at the new root address. Since I wasn't unpinning the old pages I also got to keep my whole history for the storage cost of only the changed files. I've built similar things before on more traditional stacks, but features like deduping unchanged files were much more of a chore.
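The "whole history for the storage cost of only the changed files" part falls out of hash addressing. Here is a toy sketch of the idea, not how IPFS actually does it (real IPFS chunks files and uses CIDs rather than a bare SHA-256 hex digest); `ContentStore` and its methods are made-up names:

```python
import hashlib


class ContentStore:
    """Toy hash-addressed store: identical bytes are stored only once."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself.
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)
        return digest

    def snapshot(self, files: dict[str, bytes]) -> dict[str, str]:
        # A site version is just a mapping from path to content address.
        return {path: self.put(body) for path, body in files.items()}


store = ContentStore()
v1 = store.snapshot({"index.html": b"hello", "about.html": b"about"})
v2 = store.snapshot({"index.html": b"hello v2", "about.html": b"about"})
```

Both versions stay retrievable, but the store holds three blobs rather than four, because the unchanged `about.html` hashes to the same address in `v1` and `v2`.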
A search engine called "YaCy" is basically what you described: it's a self-hosted search engine that connects to a swarm. It has a crawler that runs locally, crawls and indexes websites, and makes them searchable by others. I think you can even view the cached pages. It does work, but the problem is that it's so obscure and niche that there aren't many people hosting it. It's also very heavy in terms of system resource usage; I installed it once but didn't find it very useful for myself.
Edit: I just remembered one more search engine, called mwmbl, which is a bit different. You can install an extension which crawls websites and then sends the data to mwmbl servers; this way, the crawling is done by users. Here's the link: https://github.com/mwmbl/mwmbl?tab=readme-ov-file
I have mixed thoughts about this one. It feels relatively obvious that Reddit is doing this to stop scrapers from accessing the data via IA, so that companies will instead have to pay Reddit for access to the data. And while it seems to just be kind of a greedy move by Reddit, it does feel within their rights to do so. I think IA's mission is a noble goal, but it does end up just being used to get around paywalls, and the original creators/hosts do not get paid fairly for their work. If a small website reached out to IA to get them to stop archiving their website due to AI training concerns, I feel like the public perception would be in support of that website. However, when Reddit does it, the public perception is more negative. Maybe it is due to people not agreeing with how Reddit is handling AI at large, but I feel like it is valid for Reddit to push back and prevent archiving when it does not meet their standards (there were also comments by Reddit about IA archiving deleted posts, which Reddit did not appreciate).
I'd feel a lot more sympathy for most other entities than reddit.
I think IA's mission is a noble goal, but it does end up just being used to get around paywalls, and that the original creators/hosts do not get paid fairly for their work.
In this case "their work" is really the work of reddit's users. The only valuable thing reddit has is its users. The work that reddit.com actually does is actually near worthless. The platform isn't unique or interesting at all. There are no huge technical hurdles to making a new reddit, and it's actually pretty bad at most of the things it's supposed to do.
The value is in the content created by the users, so for reddit to say "no you're not allowed to have any of this content that we didn't make, we just allowed people to make for us without compensation" is, while within their rights, pretty shitty.
In this case "their work" is really the work of reddit's users. The only valuable thing reddit has is its users. The work that reddit.com actually does is actually near worthless.
I think that is part of the reason that reddit does not have public support in their AI approach. It feels to most people that all reddit does is collect the cheque, but does no work for it, and does not pass on part of the profits to the people who are actually providing the value. I don't think reddit will ever be able to share profits with users though, as then it gets messy on how to revenue share. Do you split profits based on karma? - Increase in bots and karma farming (which is already a major issue). Do you split profits based on word count? - Increase in fluff and using AI to pad out the responses. So I think that as long as reddit tries to monetize user content, they will not get public support.
Archive.org recently allowed me to recover a good chunk of my personal history in the form of my first blog which was deleted by blogspot for reasons I will never know. It was wonderful.
I don't care if archive.org removes the TV shows. It would still be a great thing if it was only for things like that.
Yep. I’m not a big fan of archive.is for the same reason. Good journalism costs money. Post-adpocalypse, subscriptions are the way. Is what it is.
Don’t people hate ads, anyway? Pay for the news. And if you don’t want to, that’s fine, don’t read it.
For me a big problem is that I usually go to a random news site from a link. Right now options usually range from a subscription paywall, to accessing the article somehow with difficulty, to not accessing the article. What I choose varies, but a recurring 5-10 dollar subscription for a site I am likely to see a handful of times that month at most is a lot, not to mention the possibility of forgetting to cancel. If they simply had the option to prepay that amount for, let's say, fifty articles (an entirely random number), personally I would be significantly more willing to pay.
It is that way with most subscriptions. They are an absolutely terrible monetization method for the end user for a large number of content services. Right up there with ads and selling data and attention. Though better than the currently favored method of having all of them.
I haven't followed it and I don't think it ended up working out, but I think this is part of why Brave tried creating BAT. The current system of per-platform accounts and payments incentivizes staying inside a few small spaces rather than branching out to look at a wider variety of sources. It presumably also helps reinforce echo chambers because opposing views are likely to be behind paywalls, so even if provided it's much easier to dismiss due to the barrier to access.
I don't have a real answer though. Maybe something similar to BAT could work if disconnected from crypto and Brave, but also maybe not. If I was to pitch something off the cuff it would be a bit like BAT in that it would pay out based on proportional usage, but not run based on crypto. Maybe then specific outlets paywall by category (ex. maybe a given news outlet says "you must have at least $10/mo allocated to news"). Probably incentivizes pumping out cheap low quality content though, but with a bit of creativity that could probably be solved.
I'm in the same boat in that I don't just go to news sites and browse around. I end up linked to them from places like Tildes. I don't see any reality in which I even make accounts with so many more services, much less subscribe to all of them. I already have far too many accounts.
Don’t people hate ads, anyway? Pay for the news. And if you don’t want to, that’s fine, don’t read it.
This is more or less what I believe, but I think a lot of people feel entitled to viewing media for free and chafe at the notion of paying. Same dynamic as why a lot of people pirate, even if it is often justified in other ways. There is something about the net that makes people really abhor paying money to access things!
Information is plentiful and you have finite minutes in your life. That means, by simple market economics, your attention is a valuable commodity and content is worthless. Nobody wants to pay for media, and they all must compete for viewership. No amount of trying to shove the genie back in the bottle is going to magically reverse the law of supply and demand...people will just find something else to look at.
This is the same reason raw ticket numbers for movies have been declining for over twenty years, despite rising prices keeping the box office figures up. There's more competition for attention, and the cost/value is slipping as ticket prices go up.
The inevitable conclusion of paywalls is more people just reading headlines, and getting their news from random vertical videos on Instagram.
And, personally, my goal for the past few years has been to reduce my consumption of news media. I'll wait until I hear about big news and read things then. But otherwise, it's just deleterious to mental health. It's definitely a bit much to ask for someone to drop Netflix money on morbid entertainment when they can pay for actual entertainment.
I'll apologize in advance because this is mostly not responding to you, but I'm going to nit-pick "attention is a valuable commodity" because it's a cliché that bugs me. It's a very zoomed-out, high-level way to put it that ignores important distinctions:
Although it's true that there is only so much attention someone can give others in a day, how much it's worth varies extremely depending on circumstances. There are many kinds of attention that have negative value. (Consider that often, the attention men give women is unwanted, and privacy is about avoiding attention.)
Even when someone seeks attention from strangers, it's usually not from just anyone. This is true even of marketing. That's why there are targeted ads. A purchase funnel is about quickly getting rid of unwanted attention (often most of it) and narrowing it down to just prospective customers.
To take the "commodity" metaphor seriously, copper is a valuable commodity and copper ore is just a lot of rock that contains copper.
When people hope to be paid for their attention, they often greatly overestimate how much it's worth. If you're not the droid they're looking for, if you get paid anything, it will be a pittance. Marketers are often paying for advertising to find someone else.
This is just depressing.
Really makes you appreciate the fact that Tildes is a non profit
Baby, you got a stew.
that's essentially what private torrent trackers do, and it works pretty well
IPFS seems worth mentioning here.
IPFS having that crypto-adjacent vibe feels to me like it hurts what could potentially be an interesting technology. A few weeks back, specifically when the Minecraft world torrent was posted, I was thinking that a debrid service that targets IPFS would be interesting as it would've made it trivial to post an http download link (using ipfs gateways) for anyone that wanted it without torrenting.
A few years back I tried out hosting a website with it and it was kind of interesting. For static sites it actually had some rather useful properties due to it being hash addressed and naively supporting http requests (via gateways). It's been quite a long time since I looked at it, but I'm pretty sure all I needed to do to update the site was pin the new version and update my DNS record to point at the new root address. Since I wasn't unpinning the old pages I also got to keep my whole history for the storage cost of only the changed files. I've built similar things before on more traditional stacks, but features like deduping unchanged files were much more of a chore.
A search engine called "YaCy" is basically what you described, it's a search engine but self hosted and connects to swarm. it has a crawler that runs locally and crawls websites and indexes and makes them searchable by others. I think you can even view the cached pages. it does work but the problem is that it's so obscure and niche that there aren't many people hosting it. it's also very heavy in terms of system resource usage, I installed it once but didn't find it very useful for myself.
Edit: I just remembered one more search engine, it's called mwmbl, it's a bit different. you can install an extension which crawls websites and then sends the data to mwmbl servers, this way, the crawling is done by users. here's the link: https://github.com/mwmbl/mwmbl?tab=readme-ov-file
I have mixed thoughts about this one. It feels relatively obvious that Reddit is doing this to stop scrapers from accessing the data via IA, and instead companies will have to pay Reddit for access to the data. And while it seems to just be kind of a greedy move by Reddit, it does feel within their rights to do so. I think IA's mission is a noble goal, but it does end up just being used to get around paywalls, and that the original creators/hosts do not get paid fairly for their work. If a small website reached out to IA to get them to stop archiving their website due to AI training concerns, I feel like the public perception would be in support of that website. However, when Reddit does it, the public perception is more negative. Maybe it is due to people not agreeing with how Reddit is handling AI at large, but I feel like it is valid for Reddit to push back and prevent archiving when it does not meet their standards (there was also comments by Reddit about IA archiving deleted posts, which Reddit did not appreciate).
I'd feel a lot more sympathy for most other entities than reddit.
In this case "their work" is really the work of reddit's users. The only valuable thing reddit has is its users. The work that reddit.com actually does is actually near worthless. The platform isn't unique or interesting at all. There's no huge technical hurdles to making a new reddit, and it's actually pretty bad at most of the things it's supposed to do.
The value is in the content created by the users, so for reddit to say "no you're not allowed to have any of this content that we didn't make, we just allowed people to make for us without compensation" is, while within their rights, pretty shitty.
I think that is part of the reason that reddit does not have public support in their AI approach. It feels to most people that all reddit does is collect the cheque, but does no work for it, and does not pass on part of the profits to the people who are actually providing the value. I don't think reddit will ever be able to share profits with users though, as it gets messy to decide how to split the revenue. Do you split profits based on karma? That means an increase in bots and karma farming (which is already a major issue). Do you split profits based on word count? That means an increase in fluff and in using AI to pad out responses. So I think that as long as reddit tries to monetize user content, they will not get public support.
Archive.org recently allowed me to recover a good chunk of my personal history in the form of my first blog which was deleted by blogspot for reasons I will never know. It was wonderful.
I don't care if archive.org removes the TV shows. It would still be a great thing if it was only for things like that.
Yep. I’m not a big fan of archive.is for the same reason. Good journalism costs money. Post-adpocalypse, subscriptions are the way. Is what it is.
Don’t people hate ads, anyway? Pay for the news. And if you don’t want to, that’s fine, don’t read it.
For me a big problem is that I usually go to a random news site from a link. Right now the options usually range from a subscription paywall, to accessing the article somehow with difficulty, to not accessing the article at all. What I choose varies, but a recurring 5-10 dollar subscription for a site I am likely to see a handful of times that month at most is a lot, not to mention the possibility of forgetting to cancel. If they simply had the option to prepay that amount for, let's say, fifty articles (an entirely random number), personally I would be significantly more willing to pay.
It is that way with most subscriptions. They are an absolutely terrible monetization method for the end user across a large number of content services. Right up there with ads and selling data and attention. Though still better than the currently favored method of doing all of them at once.
I haven't followed it and I don't think it ended up working out, but I think this is part of why Brave tried creating BAT. The current system of per-platform accounts and payments incentivizes staying inside a few small spaces rather than branching out to look at a wider variety of sources. It presumably also helps reinforce echo chambers, because opposing views are likely to be behind paywalls, so even when they are linked they are much easier to dismiss due to the barrier to access.
I don't have a real answer though. Maybe something similar to BAT could work if disconnected from crypto and Brave, but also maybe not. If I was to pitch something off the cuff it would be a bit like BAT in that it would pay out based on proportional usage, but not run based on crypto. Maybe then specific outlets paywall by category (ex. maybe a given news outlet says "you must have at least $10/mo allocated to news"). Probably incentivizes pumping out cheap low quality content though, but with a bit of creativity that could probably be solved.
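To make the proportional-usage idea a bit more concrete, here is a rough sketch in Python. Everything in it is made up for illustration: the function names `split_budget` and `passes_paywall`, the outlet names, and the choice to work in whole cents and hand any rounding remainder to the most-read outlet. It is not how BAT actually works, just one way the pitch above could be modeled.

```python
from typing import Dict


def split_budget(monthly_budget_cents: int, reads_per_outlet: Dict[str, int]) -> Dict[str, int]:
    """Split a reader's monthly budget across outlets in proportion to articles read.

    Hypothetical model: amounts are whole cents, shares are rounded down,
    and any leftover cents go to the most-read outlet.
    """
    total_reads = sum(reads_per_outlet.values())
    if total_reads == 0:
        # Nothing was read this month, so nothing is paid out.
        return {outlet: 0 for outlet in reads_per_outlet}

    payouts = {
        outlet: monthly_budget_cents * reads // total_reads
        for outlet, reads in reads_per_outlet.items()
    }

    # Integer division can leave a few cents unassigned; give them to the top outlet.
    leftover = monthly_budget_cents - sum(payouts.values())
    top_outlet = max(reads_per_outlet, key=reads_per_outlet.get)
    payouts[top_outlet] += leftover
    return payouts


def passes_paywall(category_allocation_cents: int, minimum_cents: int) -> bool:
    """Category paywall check: has the reader allocated at least the outlet's minimum?"""
    return category_allocation_cents >= minimum_cents


if __name__ == "__main__":
    # A reader with $10/mo allocated to news who read 3, 1, and 1 articles.
    reads = {"outlet_a": 3, "outlet_b": 1, "outlet_c": 1}
    print(split_budget(1000, reads))
    print(passes_paywall(1000, 1000))
```

Paying per article read is exactly the part that would reward cheap, high-volume content, so a real version would presumably need to weight reads by something other than raw count.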
I'm in the same boat in that I don't just go to news sites and browse around. I end up linked to them from places like Tildes. I don't see any reality in which I even make accounts with so many more services, much less subscribe to all of them. I already have far too many accounts.
This is more or less what I believe, but I think a lot of people feel entitled to viewing media for free and chafe at the notion of paying. Same dynamic as why a lot of people pirate, even if it is often justified in other ways. There is something about the net that makes people really abhor paying money to access things!
Information is plentiful and you have finite minutes in your life. That means, by simple market economics, your attention is a valuable commodity and content is worthless. Nobody wants to pay for media, and they all must compete for viewership. No amount of trying to shove the genie back in the bottle is going to magically reverse the law of supply and demand...people will just find something else to look at.
This is the same reason raw ticket numbers for movies have been declining for over twenty years, despite rising prices keeping the box office figures up. There's more competition for attention, and the value for the cost is slipping as ticket prices go up.
The inevitable conclusion of paywalls is more people just reading headlines, and getting their news from random vertical videos on Instagram.
And, personally, my goal for the past few years has been to reduce my consumption of news media. I'll wait until I hear about big news and read things then. But otherwise, it's just deleterious to mental health. It's definitely a bit much to ask for someone to drop Netflix money on morbid entertainment when they can pay for actual entertainment.
I'll apologize in advance because this is mostly not responding to you, but I'm going to nit-pick "attention is a valuable commodity" because it's a cliché that bugs me. It's a very zoomed-out, high-level way to put it that ignores important distinctions:
Although it's true that there is only so much attention someone can give others in a day, how much it's worth varies extremely depending on circumstances. There are many kinds of attention that have negative value. (Consider that often, the attention men give women is unwanted, and privacy is about avoiding attention.)
Even when someone seeks attention from strangers, it's usually not from just anyone. This is true even of marketing. That's why there are targeted ads. A purchase funnel is about quickly getting rid of unwanted attention (often most of it) and narrowing it down to just prospective customers.
To take the "commodity" metaphor seriously, copper is a valuable commodity and copper ore is just a lot of rock that contains copper.
When people hope to be paid for their attention, they often greatly overestimate how much it's worth. If you're not the droid they're looking for, if you get paid anything, it will be a pittance. Marketers are often paying for advertising to find someone else.