23 votes

Artists lose first copyright battle in the fight against AI-generated images

18 comments

  1. [14]
    redwall_hp
    (edited)
    The thing that I guess this is eventually going to hinge on is the technicality of originality. Copyright is concerned with duplication of an original work, or obvious derivatives. Whether the original was actually observed by a person or machine actually has surprisingly little to do with it.

    You can draw things that look stylistically the same as Disney all you want, and even study their work when doing so. It's not duplication and is not considered legally derivative. Stylistic, subjective things are not protected by copyright. Only specific works are. You could independently, by pure coincidence, draw something that looks nearly identical to a Disney character...and distributing it would still be infringement. Similarly, it wouldn't technically matter if a person or machine learned from existing examples and generated something novel.

    The same is true for source code, in the Copilot case. It doesn't really matter if a system trained on GPL code or a person does it: distributing code identical to the published code is infringement. But at the same time, if it generates novel code that is distinct from the existing examples, it's also not any more infringement than if you wrote it yourself with your knowledge of how an algorithm must function.

    Personally, I've been for dialing back copyright overreach for a long time, due to the chilling effect it has on creativity. It's dismaying to see popular support for incredibly draconian expansion. And considering how many artists make a living drawing private commissions for fans of various games, anime, etc (or freely distributing fan art in general), opening those various cans of worms is something they would probably regret. (It would also basically be a death warrant for music in general.)

    Edit: As an addendum to explain the last parenthetical, music is a creative area that is deeply impacted by copyright overreach. It has fundamental rules (stemming from mathematics/physics) that limit possibilities to create things that are aesthetically...music at all. It has concepts like genre, which boil down to stylistic similarity. Some genres not only use similarity of composition, but sample short segments of recordings. These have been the subjects of many spurious lawsuits over the decades, and it's slowly getting to the point where any song release is an invitation to litigation, no matter how much you check your creativity against the legal creep.

    We're at the point where there is litigation over chord progressions, which is ludicrous. (I highly recommend Adam Neely's YouTube videos covering the Ed Sheeran, Katy Perry and Dua Lipa lawsuits for in-depth examples. And those are big names. Smaller artists just give in.) We're already at the point where music copyright chills creativity and takes away artists' opportunity...all to serve a rentier class that is simply seeking more intellectual real estate to own. Rock music wouldn't exist if someone could own the 12-bar blues progression or the Bo Diddley beat. House music or Hip Hop wouldn't exist with modern case law over sampling.

    36 votes
    1. [2]
      stu2b50
      Personally, I've been for dialing back copyright overreach for a long time, due to the chilling effect it has on creativity. It's dismaying to see popular support for incredibly draconian expansion. And considering how many artists make a living drawing private commissions for fans of various games, anime, etc (or freely distributing fan art in general), opening those various cans of worms is something they would probably regret. (It would also basically be a death warrant for music in general.)

      It is interesting how the rise of generative ML has caused a 180 in attitude towards copyright in certain circles. I've seen people who vehemently opposed copyright in yesteryear become staunchly supportive of more stringent copyright protections within the span of not even a full year. I suppose it indicates that much of ideology is around personal benefit, in the end - if you were a small or indie artist, copyright before was mostly a cudgel used against you. Now, its role as potentially the only defense against being crowded out by generative machine learning overrides prior concerns.

      21 votes
      1. teaearlgraycold
        I suppose it indicates that much of ideology is around personal benefit

        Well, IP rights are entirely fabricated and exist solely to give different parties certain benefits. I think most people have been consistent in their belief that copyright is in dire need of legal updates.

        2 votes
    2. [10]
      legogizmo
      I think fundamentally copyright is the wrong way to deal with the issue. Like you said, copyright deals with the originality of a work, and AI certainly seems to fall under fair use in this context.

      Whether a work can or can't be used for machine learning is a new right that needs a new law to implement. But as of now, there are no protections.

      Additionally, whether the results of an AI process can be copyrighted is also something that needs to be addressed by new laws. (The common sense approach here would be that any generated work would have the same license as the most open work that was used to train it)

      12 votes
      1. [8]
        redwall_hp
        Exactly. I see them as two sides of the same "using copyright as a cudgel" coin. I'm all for specific laws regulating use, both to protect professionals' livelihoods (whether they're artists, musicians, writers, software engineers, doctors, etc) and to avoid unethical uses (e.g. mass surveillance).

        Copyright is entirely the wrong tool, and I can't help but think the current popular sentiment might be something driven by, say, record labels or publishers. The value of intellectual property and the ability to monopolize it, after all, are more desirable to them than the ability to churn things out cheaper and saturate the market.

        8 votes
        1. [8]
          Comment deleted by author
          1. [7]
            Grumble4681
            The copyright office did make that ruling, but I don't think that it can be considered set in stone at this point.

            Here's a good opinion article that I think does a good job explaining the pitfalls of that ruling. It seems infeasible that there will be able to be a determination on many works what is copyrightable and what isn't. That ruling will be something that they likely cannot uphold once more and more cases get tested.

            4 votes
            1. [7]
              Comment deleted by author
              1. [4]
                stu2b50
                I don't think it's that black or white. For instance, look at photography - does the photographer own the bitmap result of their camera? Most jurisdictions say yes. But the photographer only has a few inputs to the process - primarily the composition and the exposure settings.

                Would you say that the photographer only owns the copyright to the composition - say, GPS coordinates and a 3d angle tuple - and the A/SS/ISO settings? Or do they own the copyright for the entire result, even if the sensor and the software that processes the sensor's output does all of the work of producing the bitmap image itself?

                1. [4]
                  Comment deleted by author
                  1. [3]
                    stu2b50
                    I don’t think it’s out of the question then for the person prompting generative models to own the copyright for the bitmap it produces either, then, no?

                    1. [3]
                      Comment deleted by author
                      1. [2]
                        stu2b50
                        No, it would be like owning a bitmap image - a 3xWxH matrix of numbers. Which you can do. I own the copyright for the bitmaps of all the photos I took, for instance.
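
To make the "matrix of numbers" framing concrete, here is a minimal sketch in Python, assuming a toy 2x2 image stored as a 3 x H x W nested list. The names make_bitmap and same_bitmap are hypothetical, made up purely for illustration. The comparison only shows what "that specific sequence of numbers" means at the data level; it says nothing about how a court would decide what counts as a copy.

```python
# Hypothetical illustration: a tiny RGB bitmap as a 3 x H x W nested list.
# Nothing here models copyright law; it only compares sequences of numbers.

def make_bitmap(height, width, fill=0):
    """Return a 3 x height x width nested list (channels, rows, columns)."""
    return [[[fill for _ in range(width)] for _ in range(height)]
            for _ in range(3)]


def same_bitmap(a, b):
    """True only when every number matches in every channel, row and column."""
    return a == b


original = make_bitmap(2, 2)
original[0][0][0] = 255  # set one value in the first (red) channel

# Copy every row so the copies do not share memory with the original.
exact_copy = [[row[:] for row in channel] for channel in original]
altered = [[row[:] for row in channel] for channel in original]
altered[2][1][1] = 7  # change a single number in the third (blue) channel

print(same_bitmap(original, exact_copy))  # True
print(same_bitmap(original, altered))     # False
```

The exact copy matches number for number, while changing a single value is enough to make the two matrices differ, even though both images would look almost the same to a viewer.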

                        1. [2]
                          Comment deleted by author
                          1. stu2b50
                            Sure, you actually own a bit more than the matrix. But you also do own that specific matrix. If someone copied that specific sequence of numbers, they would be violating copyright. You don't own all numbers, or any component numbers, but that sequence of numbers together is something you own the copyright to.

                            Now, with AI the output is just math.

                            So is the output of a camera. It's raw sensor data that is processed by software - the actual results of the jpeg or raw file come from algorithms and math. I only supply some tangential input to the creation of the bitmap. I don't own the algorithms - sony does.

                            Or even if you want to say the reflection of physical, real world data from the sensor is what makes it copyrightable, what if you do the same with a model? If you seed it from physical sensor data?

                            Or what about pix2pix generative algorithms? If I take a photo I took, and run it through pix2pix, how is that any different than the algorithms that turn raw sensor data into a processed jpeg?

                            1 vote
              2. [2]
                Grumble4681
                (edited)
                The copyright office only takes works of human authorship. The argument is that AI output is not human authorship.

                Yes, that's the argument. The debate beyond this is to what degree human input alters the identity of authorship of AI output.

                I also find it strange you don't think they make good arguments, but don't cite a single argument and you completely disregard one of the primary examples it uses to drive its argument which is the history of photography with relation to copyright. It comes across as low effort contrarianism. I didn't post it with the intention of having a debate or to substantiate its arguments, but merely posted it to give some perspective to the person I replied to of how it isn't necessarily settled because there's other perspectives of the challenges that will be encountered with ruling copyrights that way. However I will follow it up some.

                The nation’s highest court acknowledged that “ordinary” photographs may not merit copyright protection because they may be a “mere mechanical reproduction” of some scene.

                By contrast, the court said the Wilde photograph reflected Sarony’s “original mental conception,” which he had brought to life by “posing Oscar Wilde in front of the camera, selecting and arranging the costume, draperies, and other various accessories in said photograph, arranging the subject so as to present graceful outlines, arranging and disposing the light and shade, suggesting and evoking the desired expression.”

                So even though a mechanical process captured the image, it nevertheless reflected creative choices by the photographer, and therefore deserved copyright protection.

                That is from the opinion article I linked. The court ruling using wording "original mental conception" is actually quite interesting considering the subject we're talking about, because that wording alone gives weight to human conception being the foundation of authorship rather than the mechanical process by which conception becomes reality. Now of course it actually refers to both in some ways, which all of this is discussed in the opinion provided in that article. The bottom line of this example is that all photography ends up being copyrightable regardless of whether someone posed their subject in front of the camera or selected the costume, draperies etc.

                With most AI, the algorithm does all the drawing. In theory, the artist could own the prompt, if it was suitably complex. But not the output.

                The output could be considered a weave of the prompt and the training data, possibly some of the code that facilitates the weaving. Now it is likely most people are not creating their own training data and certainly not contributing code to these AI softwares, but the prompt is perhaps the most critical component, because it reflects the "original mental conception" of a human.

                This is all well within the purview of the Copyright Office unlike what you are stating, because there's actually still a claim to be made that there is human authorship. You're acting as though that is a foregone conclusion, but the opinion article uses a strong example with photography to substantiate that the amount of human input in AI work could certainly be considered enough to be human authored. The Copyright Office does not have its hands tied to this ruling and could just as easily change it, it does not require the work of Congress to rewrite the laws to make this happen.

                1. [2]
                  Comment deleted by author
                  1. Grumble4681
                    I paraphrased their argument about Photoshop because that is more topical to the actual argument. Photography is kind of a non-sequester since it's "Solved". Which really my main point, the current legal doctrine on AI output is "solved" in the sense that the law is consistent with itself, and with current paradigms.

                    They only mention Photoshop once in that opinion, and how they mentioned it was not really addressed in your argument. It's certainly not clear that you were directly addressing what they had mentioned about Photoshop, and it was the briefest of mentions compared to the length of the opinion.

                    Then you say photography is a "non-sequester", which I don't know what that means, I assume you mean non-sequitur but I'm not sure because even that doesn't make sense. You then keep saying things are solved, but you refer to these things very indirectly or abstractly and it really just comes across as hand-waving. You can say whatever you want and sound confident saying it, while also using the words law and consistent, but it doesn't make what you're saying substantive or true. It's like people who go around shouting the word "science" and pretending like it legitimizes everything they have to say, but they didn't actually use any scientific analysis to substantiate their arguments.

                    The idea that the current legal doctrine on AI output is "solved" in any sense is nonsense, considering how it is an emerging technology that has barely been addressed given the totality of the work and scope at which it can potentially operate. Not only can this tech change rapidly within the next few years, our institutions don't operate fast enough to have "solved" any of this yet, let alone anticipated its near future. It's one thing to speak confidently as opinion, which I'm fine with, and another thing to make claims like this that are attempts at fact-cudgeling when it's clearly not fact yet. It could be fact eventually, but it's currently in the debate stages, which is why it's being openly debated (and no it's not enough to say that everything is debated because surely there are fact-skeptics out there for everything, so I'm drawing a line above that to require more than just a few kooks to say something is openly debated) nearly everywhere of substance, in courts, in Congress, in the executive branch, in business settings, various online discussion boards etc. Now if you want to say that you strongly believe that it will all be ruled this way, then fine, I don't have an issue with that. I strongly believe otherwise.

                    I mean, by posting you invite debate or discussion, don't you? Perhaps I misunderstand your intention. I merely saw a point I've seen brought up elsewhere that, from my perspective, has a clear answer.

                    I meant more so that it was my expectation that it was a rather simple introduction to the opinion that the Copyright Office made a mistake and that it can be changed to a different ruling. I didn't expect that it would garner such an odd response.

                    And you could in theory own the prompt that generated it, not the "AI brush stroke", i.e. mathematics of it. Things get a bit more complicated though, when you consider random noise is being added. It actually dilutes the human authorship claim further since it's not just the prompt generating the image.

                    The prompt itself is not really anything other than words, you can copyright the prompt to the degree you can copyright any sequence of words. The prompt inputted into various software with various training sets can produce completely different outcomes. The idea that you can copyright only the prompt as it relates to the output is nonsense because the prompt means nothing without the other factors. That's what you're seemingly referring to as "noise" that dilutes it, but because the AI is a tool, your choice of tool combined with the prompt is further expression of your "original mental conception".

                    Merely capturing a fact, i.e. that something happened is not copyrightable, though organization of facts can be in that the organization itself is copyrighted. A human is required to take the picture

                    To my understanding, you can set a timer on a camera and it can be considered human authored. The human chose the timer, along with the background etc. at the time the timer goes off. Of course no one owns copyright of the sun because they took a picture of the sun, but they can own copyright over very specific details of a photograph in which the sun is the primary subject, which ultimately ends up meaning just something that is a straight up copy of the exact pixels of the photograph, because a million people can take a million pictures of the sun at any given time and come up with tiny minuscule differences between each of the pictures that would effectively render them as different in terms of copyright.

                    Just like photography, the degree to which someone needs to express "artistic intent" will become relatively meaningless because the vast majority can qualify by claiming artistic intent in otherwise menial decisions. Even if someone does not handpick every piece of training data that goes into any particular generative AI, the fact that there will be some people who are able to modify what training data they use is one subset of people who can make a case that not only do they have their prompt as one component of their work, but they can have the training data as the other. Even if that's only a small minority of people who are doing that, choosing different software with different sets of training data can also be considered part of human input since that combined with the prompt produces a different outcome. There may even be easy tweaks to the coding in terms of how it weights certain factors of the data that become user tunable.

                    That is the point in the opinion of that article, eventually humans can push the line so far that the copyright office will have no choice but to view that all these little pieces represent human choice and input and that it is significant enough to be considered human authorship.

      2. vektor
        (The common sense approach here would be that any generated work would have the same license as the most open work that was used to train it)

        But considering that different rights holders might maintain parts of their rights to things that went into the model, this might end up being a complete mess. Currently, technically, the strictest interpretation of copyright (which is ultimately what you're proposing) would hold that a lot of people have rights to the created output: the author of the software, and every author of training data. They all contributed. Maybe some of them waived their rights as part of this licensing agreement or that copyleft notice, but broadly they have rights. If I'm not mistaken, only if they all agree on it can they monetize or publish or do anything else that is covered by copyright. In short: that work would be in complete limbo, unless the training dataset was absolutely sterile, in which case the software author would be the only one with a horse in the race. Well, and maybe the person who prompted the model; though whether a prompt would be sufficiently creative as to be copyright-protected (thus making the output a derivative work of the prompt) is a difficult can of worms.

        I don't think that's a viable approach, we're seeing the cracks now. Almost all of these authors have such a minuscule impact on the work as to be meaningless. I think in the interest of making AI more accessible to everyone, not just the big companies, it's in our common best interest to make training data a big old free-for-all. "Everyone" can write the code for training the damn thing. Some people even have sufficient compute. Everyone can prompt these things. The only thing us commoners don't have is the vast datasets that the big companies have, and the legal/financial firepower to defend their (legally shaky) interest in these datasets. If we make the datasets a free-for-all, meaning under some clearly defined circumstances (meant to eliminate overfitting-based copyright trolling), a model is no longer beholden to the copyright of its training data, then everyone can have access to the tools we're currently developing.

        2 votes
  2. [2]
    raccoona_nongrata
    There's too much money at stake with AI, unfortunately artists are going to lose this battle. Getting the labor of artists for free to fuel your paid AI project? No way every major company isn't going to fight tooth and nail to wrestle that value away from the artists.

    9 votes
    1. Eji1700
      Yeah it's going to be ugly. There's 0 way corps are going to not find a way to make AI use legal for them, but using it to make anything that looks like mickey mouse gets you sued into dust.

      We really need to do a total rework of the system, but that's not happening anytime soon.

      5 votes
  3. [2]
    Habituallytired
    This ruling bothers me. These people didn't register their copyrights? That's not how copyright works in art in California, as far as I'm aware (I'm no expert, but I have spent a lot of time in the knitting circle and copywank was a term in dear usage there).

    Also see page 5 in this ppt (😂) from a ca.gov link: https://olsip.apps.dgs.ca.gov/Content/IPTrainingV6.pdf

    6 votes
    1. stu2b50
      You have copyright automatically but you must register with the copyright office in order to bring a copyright suit to trial.

      15 votes