Is it me or are "news" articles on the web getting more and more irritating to read?
I've recently experienced something multiple times and wanted to see if others are seeing this. I'm seeing various news articles where the first few paragraphs basically say the exact same information over and over again 3 or 4 times in slightly different ways. My most recent experience was this article about some hackers selling information on billions of Facebook users.
The article starts off with the title "Personal Information of More Than 1.5 Billion Facebook Users Sold on Hacker Forum". Straightforward and to the point. Next we get this paragraph in bold:
The private and personal information of over 1.5 billion Facebook users is being sold on a popular hacking-related forum, potentially enabling cybercriminals and unscrupulous advertisers to target Internet users globally.
Next is a bullet list of the highlights of the incident:
Highlights:
- Data scrapers are selling sensitive personal data on 1.5 billion Facebook users.
- Data contains users’: name, email, phone number, location, gender, and user ID.
- Data appears to be authentic.
- Personal data obtained through web scraping.
- Data can be utilized for phishing and account takeover attacks.
- Sold data claimed to be new from 2021.
This rehashes the number (1.5 billion) and place (Facebook), but does contain new information like what was leaked, and some unsubstantiated claims about whether it's authentic and how it was obtained.
The next paragraph repeats the 1.5 billion number a fourth time, and repeats that the data is available on a hacker forum. Two paragraphs later, we get another bullet list that just restates the 2nd bullet point above:
According to the forum poster, the data provided contains the following personal information of Facebook users:
- Name
- Location
- Gender
- Phone number
- User ID
At this point I stop reading because I mistakenly think that I'm re-reading the same paragraph over and over again. It's an incredibly unpleasant experience.
Is anyone else seeing this? I've been seeing this not just on smaller sites like the one linked here, but on major news sites like CNBC and CNN, too. I know that news sites are having their budgets slashed, etc., but I literally can't read articles like this. I mean my brain just won't let me complete them because it thinks it's caught in a loop or something. It's hard to describe.
I ran into a similar problem recently with an article I had to read for a class.
I've noticed with even non-news searches that I'm getting more and more results that I believe are algorithmically generated. I'm in a book club and we were trying to figure out a name for the group, so I searched for suggestions. One of the top hits was this site, where the suggestions include "Barnes & Noble" and "Half Price Books" (both are popular bookstores in the US), as well as "Issaquah Church of Christ" and "East Shore Unitarian Church". It's like a bot just crawled different place names that came up in a search for "book" and ran with it. The latter half of the "Feminist Book Club" list on the site is literally just names of libraries.
Meanwhile, if I try to search for the "best [anything]", I get dozens of sites that are just lists of things pulled straight from Amazon with affiliate links (see here, here, or here). The first example at least looks like it has some effort put into it, and the writing is readable, but the latter two are clearly cookie-cutter nonsense.
The second link is generically written, with "bath sponges" inserted to match my search, though it could apply to literally any product someone is searching for:
Meanwhile the third one feels like a bot attempting to convince you it's human by conveying sexism that it clearly has no genuine understanding of or experience with:
I feel like I'm encountering increasing amounts of this -- like the internet is getting more and more polluted with these garbage sites that look exactly like what I want only because they're machine generated for every different topic under the sun. For a while I thought it was because I had switched to DuckDuckGo over Google for searching, but I'll occasionally pass the search to Google using DDG's Google bang, and it's seemingly just as bad there too.
Even bots and AI-generated sites aside -- Google also seems to incentivize plenty of badly designed human-made sites as well. A quick visit to literally any recipe blog results in you scrolling through either a life story or the recipe expanded into paragraph form before you actually get to an ingredient list and recipe steps. It's very aggravating and presumably done for SEO reasons, which to me implies Google is rewarding sites for how many ad hits they can generate rather than the content itself.
Yeah, I ended up purchasing Paprika Recipe Manager specifically to deal with this issue. It has a built-in browser (at least on iOS) where you can enter a URL and it extracts the recipe without you having to deal with paragraphs of unrelated text, pop-overs asking you to subscribe, etc. It's glorious. I will never go to a recipe page without it again.
Entirely off topic, but one Discord server I'm in has a culture of typing up fake life stories to start recipes as a joke, and I must say it's a source of regular joy.
My life story, like so many others, is just an allegory for great marinara sauce. You see….
Someone needs to hook these into GPT-2 and see what it spits out.
My dude, your night has been made:
https://www.aiweirdness.com/tag/recipes/
If you only have the recipe, your blog has nothing to offer over any aggregate recipe site, or even another blog. A blog is all about trying to gather a loyal following, so having a unique style is an intended part of that. Whether or not that is useful to recipe seekers is debatable though.
Recipes themselves also can't be copywriten, although I suspect most blogs don't think about that. By writing out recipes in prose, that's no longer true.
I think, at least in the past, SEO cared far more about being linked to/from, which is probably part of why Amazon (both the main site and the botspam) dominates a lot of search results.
Copyrighted, and they absolutely do. It's one of the most common pieces of advice given in the (surprisingly large) genre of "how to make a living publishing recipes/crafts" videos and tutorial posts on the 'net.
We’ve been through this before. Content farms were big before 2011 when Google started cracking down on them with its first “Panda” update.
Newer content farms seem to be getting ahead of the ranking algorithms again.
Peak Google was until about 2003, which is when those content farms and SEO reverse-engineered enough of Google's algorithms to reduce its usefulness.
I guess we need to fight fire with fire...train an AI on output from other AI to identify AI generated content and toss it in the bin.
Oh man, that is both hilarious and awful!
Yeah, it seems like a form of evolution. As consumers become more savvy and search engines learn to detect gaming of their metrics, they change. So the sites then change to compensate, etc. It's like we need some sort of Captcha before websites are allowed to be indexed by search engines or something. They should have to prove they're legitimate sites written by actual humans before we allow them to be presented to end users.
Here was an article I saw a couple of months ago about a spam site that seems to be ranking well for almost every search on Google in Norway: The mermaid is taking over Google search in Norway
Like you said, it's bad in English lately, but I expect it's even worse in a lot of other locations/languages like this because Google won't be paying nearly as much attention to them or putting the same level of resources into spam-fighting.
I don't know how useful this is to you now but I asked GPT-3 for names for a feminist book club and it suggested "RadicalReads". Not half bad IMO.
There's a very good chance that the reason you're seeing this is because a lot of news is actually written by AI. Companies exist which create AI 'skeleton' articles utilizing web scrapers and text generation such as GPT-2/3 and sell this as a feed to news companies which then take the skeleton article and have a human do some minor proofreading and reorganization of material.
I work for a company that helps people do exactly this. Coincidentally we had a dogfooding session this morning where I wrote a blog post on how to monetize a blog. It took about 30 minutes to write 1000 words. It came out better than this (not quite so repetitive) but to be honest it's similarly lacking in content.
That's certainly what it feels like! I guess once they get their click, they've achieved their objective so it doesn't matter if the content is any good.
I feel that even some of the biggest news outlets follow the general formula:
AP + Bigotry = Fox News
AP + Neoliberalism = CNN
Most content written to blogs and public news sources is garbage. They're trying to optimise for SEO. They want to get people onto the site from Google and hope that enough eyeballs will mean enough ad revenue. I would recommend getting your news from places of the calibre of the New York Times.
I mostly do, but I like to read one-off articles that I find on places like Tildes or Hacker News because they often are about subjects I wouldn't get elsewhere. This one will probably be reported on by more reputable sources, but I didn't think to look at the website it was linking to. I just clicked on it.
That said, I specifically avoid the New York Times because just about every other article linked to from other sites seems to be to them. I don't want to get all of my news from a single source. (Also, I really dislike their business. Even though their paywall doesn't affect me, the way they treat their paying users is abhorrent, requiring them to make a phone call to cancel their subscription, and just generally making it impossible to get through to them.)
I also have my issues with the NYT and I don't think it's perfect. But at this point I'm honestly paying more for their Cooking app with a side of news articles than the other way around.
The Washington Post is considerably cheaper, so that's what I read and why a lot of the articles I share are from there.
I'm with you on the pricing but the veeeery Amazon friendly articles put me off. 1 vote for the Guardian or the Atlantic.
I hadn’t heard that about canceling a NYT subscription. Thankfully I pay for mine with a privacy.com card so all I need to do is shut that off when I’m done.
I will echo what /u/teaearlgraycold has said. Paying for a news subscription today seems required if you want to read decent articles. I also like NYT because with their all access subscription you get access to NYT Cooking, which is actually quite good.
Seconding NYT and adding the small gripe that their crossword puzzle is under a separate subscription.
I have a subscription to Apple News which has a lot of good sources (once you block crap like Newsweek and People), but even some of them seem to have this problem.
Yeah, you would hope that Apple would curate that a bit more -- the onus should not be on you to blacklist or unsubscribe from sources with poor journalistic standards.