Is YouTube's use of AI upscaling for Shorts unethical?
Link information
- Title: YouTube Is Using AI to Alter Content (and not telling us)
- Author: Rhett Shull
- Duration: 13:57
- Published: Aug 14, 2025
This was uploaded back in August, and YouTube subsequently responded on Twitter to clarify that it was testing AI upscaling, not altering content. I've changed the title here in an attempt to avoid sensationalism while retaining valid concern.
I try my best to avoid YouTube Shorts for all sorts of reasons (reasons it shares with other social media video feeds like TikTok, Instagram, etc.), but one of them is the proliferation of what looks to me like AI-generated content. In my rare forays into Shorts (before coming across this video), I see a lot of them with the same kind of smeary effect Rhett mentions, though surely many are actually real videos that have had some sort of weird filter intentionally applied over them. Either way I don't like it, so even seeing videos that are merely upscaled and end up with an unintentionally similar filtered look is disgusting to me. Add that on top of mountains of clickbait and propaganda and you can argue that YouTube has, at minimum, been irresponsible with this.

It's easy to become paranoid, too - for example, try searching YouTube for any first-party reporting on Burkina Faso's president Ibrahim Traoré and see how much crap you have to trawl through. By the time I finally found some, I started wondering whether I was imagining that telltale AI character stiffness in him, even in regular, non-Shorts videos.
This gets to what I think may be the core question on the ethics of AI use, which is mentioned in the latter part of the video - the inevitable erosion of trust when humans cannot easily determine whether they are interfacing with AI-generated content in any format.
My hope is that we implement some sort of requirement that any published media disclose whether it uses AI, and maybe even how much, in the same way that food products containing GMO ingredients must be labeled accordingly. How this could be practically applied and/or enforced, I have absolutely no clue... we're heading down a road where everyone and everything will need a label. Still, there need to be guardrails around the fundamental assumptions we've relied on for millennia to navigate human society. I'd even go as far as making it illegal for any chatbot, AI agent, etc. to use first-person pronouns.
OK, to be fully honest: would anyone actually get upset about this issue if it weren't reported as "AI upscaling"? If it were just called "upscaling" or "enhanced transcoding" or something, people would probably either be unaware or think it's a cool feature. "AI" upscaling has existed in one form or another for quite a while, and has quietly been a powerful tool for a lot of purposes.
The question is valid, though: if YouTube is using this as an excuse to train AI on all YouTube Shorts, that definitely is a misleading and scummy thing to do. But I'd have to see some scrap of proof that it's actually happening, not just "what if they were doing this?"
I get that everyone is rightfully a little touchy about AI, since pretty much every major company has trawled the internet and stolen zettabytes of data to train their models on, but there isn't any real evidence that this is that. YouTube claims it's traditional machine-learning upscaling and explicitly not generative AI, which is neither new nor damaging to artists/creators. I think it's vitally important that we don't just call everything "AI" until the term completely loses its meaning (even though companies have been doing exactly that, using it as a marketing buzzword). It just muddies the already very muddy waters and confuses people who have no idea what's actually going on and are just following whatever the YouTuber says.
YouTube absolutely should make this an opt-in service though, simply because people are so wary of it. No reason to give people a service they don't want.
I agree with you on the broad context - companies have called anything and everything AI, and now plenty of perfectly useful tools are getting caught in the backlash when they really shouldn’t be.
That tweet is nonsensical though, and actually makes me a lot less confident in what they’re doing (assuming that’s an official account?):
Debatable, because everything gets an AI label stuck on it now, but I see what he’s saying.
If it’s not upscaling, then it’s explicitly doing something designed to alter the content of the frames in-place. I’d argue that’s significantly worse in terms of respecting creator intent, and significantly less likely to be a technical step to reduce transcoding artefacts or similar. Actual upscaling, i.e. just making it bigger and using machine learning to take an educated guess at how to fill in the extra pixels, would be a lot more defensible IMO.
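To make the distinction concrete, here's what upscaling in that narrow sense looks like - a minimal Python/Pillow sketch (the filename is hypothetical). New pixels are blended from their neighbours; nothing that wasn't already in the frame gets invented:

```python
from PIL import Image  # pip install pillow

frame = Image.open("short_frame.png")  # hypothetical input frame
w, h = frame.size

# Classical interpolation: each new pixel is a weighted blend of
# nearby source pixels, so no new structure is hallucinated.
upscaled = frame.resize((w * 2, h * 2), resample=Image.Resampling.BICUBIC)
upscaled.save("short_frame_2x.png")
```

A learned super-resolution model would make a smarter guess at those in-between pixels, but the key point is the same: the output is bigger, not rewritten in place at the original resolution.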
I’d be amazed if that’s not a transformer model of some kind, which he defined as GenAI a few lines above. If it happens not to be a transformer because it’s, say, a state space model or a U-Net instead, then they’re being intentionally obtuse, because that’s just an implementation detail.
And computational photography is important partly because it’s used as a way to interpret sensor data into a usable image - that’s the bit you can’t turn off. YouTube doesn’t have the original sensor data, they have the finished video.
When computational photography is done to alter the resulting image, we call that a filter, or facetune, or yassify, or image enhancement in general. And again, while I think the “AI or not AI” conversation is misleading, I definitely don’t think YouTube should be applying filters to people’s videos without the option to turn it off - whether it’s a transformer model, a handwritten algorithm, or a bunch of interns in the basement with rolls of film and airbrushes!
[Edit] Just to really hammer the point: the different approaches to computational photography, and/or the ability to store the raw format data and alter that processing in post, is one of the major selling points and areas of differentiation between the various phone brands and between higher end cameras.
Whether or not it’s “AI” (and I’d bet good money that it is, not by my own personal definition, but by the definition Google have decided to use across their other marketing), it’s unilaterally overruling a choice that the creator probably spent very good money on making when they selected their camera setup and editing workflow.
To play devil's advocate here: how is an upscaling filter that different from YouTube re-encoding videos with newer codecs, or letting users full-screen a video when their screen resolution differs from the video's? Those compression and scaling algorithms also encode biases and choices about what's most important to keep and what can be lost.
It's not like people never noticed that, either; there was lots of talk of e.g. WebP or AV1 compression smoothing out film grain, because film grain is high-frequency data that's expensive to encode, and the algorithm designers' definition of perceptual quality considered losing it an okay trade-off for file size.
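You can watch that trade-off happen in a toy example - a Python/Pillow sketch (assuming Pillow was built with WebP support; all the numbers are invented) that adds synthetic grain to a flat grey frame, runs it through one aggressive lossy encode, and measures how much of the grain survives:

```python
import numpy as np
from PIL import Image  # pip install pillow

rng = np.random.default_rng(0)
base = np.full((256, 256), 128, dtype=np.float32)
grain = rng.normal(0, 12, size=base.shape)  # synthetic high-frequency "film grain"
frame = Image.fromarray(np.clip(base + grain, 0, 255).astype(np.uint8))

frame.save("grainy.webp", quality=30)  # one aggressive lossy encode
# Pillow may hand WebP back as RGB, so convert to greyscale to compare.
decoded = np.asarray(Image.open("grainy.webp").convert("L"), dtype=np.float32)

# The encoder spends few bits on noise-like detail, so the grain's
# amplitude drops sharply after a single encode/decode round trip.
print("grain std before:", (np.asarray(frame, dtype=np.float32) - base).std())
print("grain std after: ", (decoded - base).std())
```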
They'll also re-encode high bitrate videos at lower bitrates to save on storage and bandwidth costs, which is a problem if you want to show details like snow or rain, for example.
And IIRC, YouTube's original recommended upload format was something like an FLV container with 360p H.263 video and MP3 audio. Those uploads are mostly converted to AV1 with Opus audio as the "primary" format, then potentially transcoded to H.264 and AAC depending on the receiving device's compatibility. Some of those older videos have ended up looking even worse, and further from what the creator envisioned, after all these transcoding steps. Sure, part of it is that we were all so amazed at online video being a thing, and everything on the internet had JPEG compression artifacts anyway, so we didn't notice some noisy line boundaries on Dr. Octagonapus or whatever meme videos we were watching in 2010. But for some of those older videos (maybe not quite going back to the 360p era), people have downloaded copies or kept old screenshots to compare against, and can see that the quality has deteriorated through YouTube's re-encodes over the years.
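For illustration, that kind of chain looks roughly like the following - a sketch driving ffmpeg from Python (filenames and quality settings are made up, and this is a guess at the general shape of the pipeline, not YouTube's actual one):

```python
import subprocess

# Old upload -> modern "primary" copy (AV1 video, Opus audio).
subprocess.run([
    "ffmpeg", "-i", "upload_2009.flv",
    "-c:v", "libaom-av1", "-crf", "32",
    "-c:a", "libopus",
    "primary.webm",
], check=True)

# Primary copy -> compatibility copy (H.264 video, AAC audio) for
# devices that can't decode AV1. Every lossy generation compounds
# the loss from the one before it.
subprocess.run([
    "ffmpeg", "-i", "primary.webm",
    "-c:v", "libx264", "-crf", "23",
    "-c:a", "aac",
    "compat.mp4",
], check=True)
```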
Like I said, I think upscaling would be a lot more defensible - they're explicitly saying it isn't upscaling, which means for whatever reason they're intentionally altering the actual visual content.
You're always going to be making technical tradeoffs with encoding, resolution, etc. when distributing, and those are potentially going to butt heads with the creative vision, but to some extent that's a necessity of the medium. It's the same even with cinema distribution, going pretty much all the way back to the start: technical and financial limitations put in place by the distributors existed, creators were sometimes frustrated by them, and on occasion the creators with more leverage actually managed to get the distributors to make improvements (THX is a great example, even though it's audio rather than visual).
The big difference here is that YouTube appear to be making changes with the intent of altering the creative work, not as a side effect of technical limitation. It could still be purely technical - I did mention that as a possibility below - but if it were I'd kind of expect them to come out and say that, and I'm skeptical that things we're seeing like face filtering are happening as a result of a process with purely technical intent.
Based on their statements and the type of changes we're seeing, it seems more likely to me that they're deliberately altering content to be more algorithm-friendly.
Yes, after seeing the results I agree this shouldn't have been done to people's videos without their permission. I don't think that for any ethical reasons, though; I just think it looks bad haha
Right with you there! For all that I think the deeper debates around consent, content use, and the rest are important, a big part of me is also screaming "you went to all that effort, spent all that money, pissed off all these people, and the results aren't even good?!"
Fully, yes. People's attention was drawn to this not by an announcement (it was an entirely silent launch), but by the actual visual effects it has. The technique isn't just upscaling; it's also adjusting contrast, vibrance, and sharpness, and it has a mild but noticeable impact that was never intended. On most content this shows up as an obnoxious oversharpening effect combined with a smeariness of unclear detail. On some content it highlights edges that were supposed to be subtle, in a way that actually reduces visual clarity or just hurts the aesthetics. And although I said it wasn't "just" upscaling, it doesn't really appear to be any kind of upscaling at all, since it's adding obnoxious generative-AI-esque smears, smudges, and nonsensical detail to already-high-resolution videos.
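For a rough sense of what that class of adjustment looks like, here's a toy Python/Pillow sketch (my own stand-in for the kind of operations being described, not YouTube's actual pipeline; the filename and strengths are invented). Note that nothing here changes the resolution - the existing pixels are just rewritten:

```python
from PIL import Image, ImageEnhance, ImageFilter  # pip install pillow

frame = Image.open("shorts_frame.png")  # hypothetical frame

out = ImageEnhance.Contrast(frame).enhance(1.25)  # push contrast
out = ImageEnhance.Color(out).enhance(1.2)  # push saturation ("vibrance")
out = out.filter(ImageFilter.UnsharpMask(radius=2, percent=180, threshold=2))
out.save("shorts_frame_enhanced.png")  # same size, altered content
```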
I follow one artist, SavannahXYZ, who posts claymation-style 3D models that often have subtle fingerprint details in the materials. This AI upscaling, particularly the oversharpening and edge contrast boosting, makes that subtle detail leap out at you. The effect is honestly (mildly) unsettling.
I've noticed that for some types of animation, particularly animation that's primarily line art, the edge highlighting ironically makes it quite difficult to even see the lines from a distance: the brightening effect around the darker line art averages out (again, from a distance) and blends right into the background color.
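That one is easy to reproduce with a toy example: dark line art on a grey background, a strong unsharp mask standing in for the edge highlighting, and a heavy downscale standing in for viewing from a distance (Python/Pillow, every parameter invented):

```python
import numpy as np
from PIL import Image, ImageDraw, ImageFilter  # pip install pillow

# Dark horizontal line on a mid-grey background.
img = Image.new("L", (256, 256), 180)
ImageDraw.Draw(img).line([(0, 128), (255, 128)], fill=20, width=3)

# Stand-in for the edge highlighting: a strong unsharp mask puts a
# bright halo on either side of the dark line.
haloed = img.filter(ImageFilter.UnsharpMask(radius=4, percent=300, threshold=0))

# "From a distance" ~= heavy downscale; box filtering averages the
# dark line together with its bright halo.
small = img.resize((32, 32), Image.Resampling.BOX)
small_haloed = haloed.resize((32, 32), Image.Resampling.BOX)

# How far the darkest pixel in a vertical slice dips below the background.
col = np.asarray(small)[:, 16].astype(int)
col_haloed = np.asarray(small_haloed)[:, 16].astype(int)
print("line visibility, plain: ", 180 - col.min())
print("line visibility, haloed:", 180 - col_haloed.min())
```

The haloed version's darkest downscaled pixel ends up much closer to the background value, i.e. the line has mostly blended in.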
So, yes, people noticed it in the first place because it was affecting their content in ways that no other video host does. In ways that no longer faithfully represent their artistic vision. Some videos, styles, and lighting scenarios are affected worse than others by this automatically-applied and impossible-to-remove technique. The AI is just the shit cherry on top for many of those artists.
I've been wondering a bit about why YouTube might be doing this, so here's what I've got so far, from least to most nefarious possibility:
Your most nefarious possibility is immediately what jumped to my mind. Intentionally making it very hard to tell what is generated content opens a lot of doors for malicious actors to put out intentionally harmful content (political actors making opposition look bad, etc...) and for YouTube specifically to start generating their own content that they can directly profit from without any creators to split revenue.
It seems like this is wrong for the same reason that applying an Instagram filter to a user’s photos without the user’s consent would be wrong. Authorial intent should be respected, and it’s disconcerting that someone at YouTube didn’t see anything wrong with it.
I wonder how much controversy it resulted in internally? More than a decade ago, when I worked for Google, there would be big internal controversies about such things, but perhaps nowadays people are more resigned.
I did a cursory news search and didn’t find any followup. Did they roll it back?