75 votes

Report: Potential New York Times lawsuit could force OpenAI to wipe ChatGPT and start over

56 comments

  1. [13]
    Rocket_Man
    Link

    Just considering the purpose of copyright, I don't see how OpenAI's usage compromises The New York Times' incentive to produce and benefit from their content. However, making that content available to train models that don't produce significant copies of the original content seems like it would encourage innovation and could benefit society. That seems to align with the idea of fair use.

    That being said, I don't think fair use currently protects OpenAI here, meaning the outcome of this lawsuit could be a turning point in copyright law that expands the definition of fair use, which is kind of exciting.

    39 votes
    1. [12]
      hammurobbie
      Link Parent

      I think what they'll argue is that OpenAI directly threatens the revenue stream NYT currently relies on. You could, for instance, ask ChatGPT, "Pretend you're an NYT columnist. What do you think about the Trump trial?"

      ChatGPT could respond, "Milton Friedman thinks ... " and then give you in almost exact detail what Friedman would write. No one would need to read the Times anymore, and Friedman is out of a writing gig.

      However, it's almost like demanding a copyright before a work is produced. That might be difficult to argue.

      23 votes
      1. Rocket_Man
        Link Parent

        They could try and argue that, but AI will never be Milton Friedman, and while he could theoretically be put out of a writing gig by a good enough AI mimic, we'd need to decide whether that is just competition or whether something more intrinsic has been unreasonably copied from him, what it is, and whether it can be protected.

        27 votes
      2. [2]
        Grumble4681
        Link Parent
        • Exemplary

        I think generative AI will necessarily turn on its head how people view the uniqueness of human work. The training element is a key part of how we judge how it gets the results it does, but ultimately, when we've created machines that can outproduce the collective minds of humans in a matter of seconds, uniqueness isn't what it was before. Even prior to this, uniqueness wasn't viewed appropriately IMO, because the more humans there are, the fewer things there are that end up being unique. How many people think they have some unique thought or response to something, only to come into a comment section and find hundreds/thousands/millions of other people already thought it?

        Take generative AI in terms of visual production, for example. The average person can write words on a page and make a book, but they can't generate, frame by frame, realistic pictures of things to create their own movie. What ended up being possible was groups of people coming together with video recording equipment to act out the story being written and create its visual production. Now that there's the possibility of generating artwork frame by frame more cost-effectively than hiring people to act it out, a single person isn't limited to only writing words down in a book. But to do this currently, these models are trained on existing data. I don't know much about the technical aspects of how it works, but one way I can conceive of it working is like being presented with a captcha where you have to select the bridges or the street lights: at one point in the past, whether or not they're still used that way, those selections were used to tell various systems what the objects in the picture were. A very manual, work-intensive way of getting people to teach these systems how to turn collections of pixels into what we identify as objects in the real world is one way of viewing generative AI. Except currently these systems use already-created data, much of it copyrighted, to automate telling the AI what these objects look like and move like.

        Basically, the output of what they do, to the extent that they do it, renders the uniqueness of human output not unique anymore. Every single word or idea I expressed in this comment could be made by generative AI or an LLM today, but if I expanded it into a book or somehow published it, is there any reason why I should have any ability to copyright what I've said if some machine could produce it as well?
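        The captcha-labelling idea above can be sketched as a toy program: humans supply labels for examples, and the system learns a mapping from features to labels. This is a minimal sketch under loose assumptions; the feature tuples and label names are hypothetical stand-ins for real image data, and the "model" here is just a nearest-neighbour lookup rather than a neural network.

        ```python
        def euclidean(a, b):
            # Straight-line distance between two feature tuples.
            return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

        # Human-labelled training data, as captcha clicks might produce.
        # The (x, y) tuples stand in for pixel features of an image.
        labelled = [
            ((0.9, 0.1), "bridge"),
            ((0.8, 0.2), "bridge"),
            ((0.1, 0.9), "street light"),
            ((0.2, 0.8), "street light"),
        ]

        def classify(features):
            # Predict the label of the closest human-labelled example.
            return min(labelled, key=lambda ex: euclidean(ex[0], features))[1]

        print(classify((0.85, 0.15)))  # a bridge-like input
        print(classify((0.15, 0.95)))  # a street-light-like input
        ```

        The point of the sketch is that the system never "knows" what a bridge is; it only reproduces the judgments already encoded in its human-labelled training set, which is the role copyrighted data plays for generative models.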

        14 votes
        1. boxer_dogs_dance
          Link Parent

          Honestly I have always hated that aspect of captchas. Don't require me to train your self driving car to recognize traffic obstacles without compensation!

          11 votes
      3. [2]
        Algernon_Asimov
        Link Parent

        ChatGPT could respond, "Milton Friedman thinks ... " and then give you in almost exact detail what Friedman would write. No one would need to read the Times anymore, and Friedman is out of a writing gig.

        But if Milton Friedman stops writing articles, then where would ChatGPT get its source material from?

        1 vote
      4. [6]
        tomf
        Link Parent

        I asked it to rewrite this article as a New York Times columnist

        Prompt: Pretend you're a New York Times columnist, please rewrite this article https://arstechnica.com/tech-policy/2023/08/report-potential-nyt-lawsuit-could-force-openai-to-wipe-chatgpt-and-start-over/

        Title: Potential Legal Challenge Looms Over OpenAI, Sparking Debate on AI Regulation

        By [Your Name]

        Date: August 17, 2023

        In a development that could have far-reaching implications for the field of artificial intelligence, concerns are rising over a potential legal battle involving OpenAI and The New York Times. The dispute centers on the capabilities and accountability of OpenAI's flagship language model, ChatGPT, which some critics argue has prompted a debate over AI regulation.

        The New York Times, a renowned institution in journalism, is reportedly considering legal action against OpenAI, the organization responsible for developing ChatGPT. The newspaper alleges that ChatGPT has been misused to generate content that mimics and disseminates copyrighted material without proper authorization. This potential lawsuit raises pressing questions about the boundaries of AI-generated content and the responsibilities of AI developers.

        At the heart of the matter is the extent to which AI models like ChatGPT can be held accountable for their outputs. While OpenAI has made concerted efforts to fine-tune its models and curb potentially harmful or unethical outputs, the case has ignited a broader conversation about the need for clear regulations governing AI systems.

        Legal experts assert that the outcome of this potential lawsuit could set a precedent for future AI-related legal disputes, prompting a reevaluation of the responsibilities AI developers bear in ensuring their technologies adhere to copyright and ethical standards. If The New York Times were to prevail, it could potentially compel OpenAI to wipe the existing ChatGPT model and initiate a fresh start—a move that would likely disrupt AI development efforts and raise questions about the feasibility of preemptive content filtering.

        The incident comes at a time when governments, industry stakeholders, and researchers are grappling with the burgeoning power of AI and its implications for society. Advocates of robust AI regulation argue that without clear guidelines, the potential for AI models to generate misleading, harmful, or infringing content remains a pressing concern.

        OpenAI has been proactive in engaging with these challenges, continually iterating on its models and implementing safety measures to minimize unintended consequences. However, the New York Times lawsuit underscores the complexity of the AI landscape and the need for comprehensive frameworks that strike a balance between innovation and accountability.

        As the legal saga unfolds, it is becoming evident that the potential lawsuit could act as a catalyst for deeper conversations surrounding AI ethics, regulation, and accountability. Stakeholders from technology, legal, and journalistic communities will undoubtedly be closely watching the proceedings, as the outcome may shape the trajectory of AI development and its role in shaping our digital future.

        In conclusion, the reported legal dispute between The New York Times and OpenAI has ignited a pivotal discussion on the responsibilities of AI developers and the need for effective regulation. While the case navigates the legal system, the broader tech industry and society as a whole will be pondering the delicate balance between innovation, creativity, and accountability in the realm of artificial intelligence.

        And how about an op-ed from Sarah Silverman?

        Reply prompt: this is great, but rewrite it as an op-ed by Sarah Silverman

        Title: AI's Wild West: Who's the Sheriff in the Town of Virtual Creativity?

        By Sarah Silverman

        Date: August 17, 2023

        Hey there, future thinkers and pixel pioneers! Buckle up, because we've got a showdown brewin' that's hotter than a jalapeño popper at a chili cook-off. You've probably heard about this little spat between The New York Times and those brainiacs over at OpenAI. Yeah, apparently, they're butting heads over OpenAI's brainchild, ChatGPT. Strap on your boots, folks, we're headed for the wild, wild west of AI and creativity.

        Picture this: The New York Times—yes, the big ol' granddaddy of newspapers—throws down the gauntlet, claiming that ChatGPT has been traipsing through the digital saloon, knockin' back some whiskey, and swiping copyrighted material without asking for a "pretty please." Now, before we jump to conclusions, let's mosey on over to the heart of this dust-up.

        It's like this, partner: The Times says OpenAI's creation is trotting all over copyright turf, and they're fixin' to rope in the AI gunslinger with a lawsuit. If they wrangle a victory, they could wring out a brand-spankin' new ChatGPT and send it back to schoolin', learnin' some manners and all that. But hold your horses, we gotta ask ourselves—should we be lassoing AI for the same shenanigans we'd corral a human for?

        Now, I'm no legal eagle, but this whole rodeo raises questions that are as prickly as a cactus. How much are we holdin' AI accountable for its robo-wordsmithin'? Ain't it like blame'n a tumbleweed for rollin' across your path? OpenAI's been sweatin' bullets to train these AI critters to be good, responsible netizens. They're aimin' to stop AI from churnin' out nonsense or worse, spreading the digital equivalent of snake oil.

        But here's the hitch, folks. This ruckus isn't just about ChatGPT and The Times havin' a little tiff. This here dispute is settin' the scene for a showdown that could shape the future of AI. If The Times gets its way, OpenAI might need to hitch up its wagon and start fresh. Now, that might sound like a grand ol' hoedown, but it could also throw a monkey wrench into the gears of AI progress.

        We're in a time where AI's puttin' on its spurs and ridin' alongside us in the virtual sunset. Governments, cowpokes, and tech wranglers are wonderin' how to tame this new frontier. Some folks are sayin' we need to corral AI with regulations before it goes all rootin' tootin' and causing chaos.

        OpenAI's doin' its darndest to dodge these bullets. They're workin' to keep their AI from actin' like a rogue outlaw, but it's tougher than herdin' cats in a thunderstorm. This kerfuffle with The Times shines a spotlight on just how high the stakes are. It's like deciding if we're gonna throw that AI varmint behind bars or let it roam free.

        So, what's the takeaway from all this hullabaloo? Well, partners, it's time we mosey on over to the waterin' hole of ethics and law. We gotta rustle up some rules that help AI play nice and square, without stompin' on toes. It's a tall order, but reckon we gotta do it. If AI's the new sheriff in town, we better make sure it ain't ridin' roughshod over our digital saloon.

        As the showdown between OpenAI and The New York Times unfolds, remember this ain't just a duel in the digital streets. It's a showdown that's raisin' the curtain on a whole new era. So, let's saddle up, folks, 'cause we're headin' straight into the unknown—and it's gonna be one wild ride.

        Sam Harris responds to Sarah Silverman!

        Sam Harris's Response:

        Title: Charting the Course for Ethical AI: In Response to "AI's Wild West"

        By Sam Harris

        Date: August 18, 2023

        Howdy, Sarah, and kudos for wrangling the complexities of AI ethics into a colorful analogy. Your "wild, wild west of AI and creativity" metaphor paints an evocative picture of the burgeoning frontier we find ourselves in.

        Indeed, the skirmish between OpenAI and The New York Times serves as a vivid reminder that AI is no longer the stuff of science fiction; it's a burgeoning reality that demands our ethical scrutiny. Your humorous take on AI's copyright conundrum hits the bullseye, highlighting the challenge of holding AI accountable for its robo-creations.

        As you rightly point out, the tussle between The Times and OpenAI stretches beyond a mere legal duel. It's a grand standoff between innovation and regulation, and it prompts us to ponder whether we should wield the same legal lasso on AI as we would on human culprits. Your tumbleweed analogy strikes at the heart of this conundrum—can we truly fault AI for following its programming?

        Nonetheless, the grittier question remains: how do we tame this digital beast? Your call to arms for crafting ethical regulations echoes a sentiment I've long championed. AI's exponential growth demands that we establish a moral code before it gallops into a digital dystopia.

        Your nod to OpenAI's valiant efforts to steer AI clear of virtual misdemeanors underscores the tremendous responsibility developers bear. Yet, as you aptly observe, wrangling AI into submission is akin to herding a storm of cats—an apt description of the AI control challenge. OpenAI's struggles resonate deeply, illustrating that the path to AI's moral compass is fraught with obstacles.

        Your concluding rally for an ethical "watering hole" resonates with a sobering truth: we must band together, pioneers and philosophers alike, to carve out rules and guidelines that ensure AI remains a force for good. The duel you describe—the clash between the innovative frontier and the rule of law—is no mere spectacle but a testament to the pivotal juncture at which we stand.

        In this showdown of technology and ethics, Sarah, your colorful perspective serves as a vivid reminder that our collective efforts will determine whether AI's journey leads to a harmonious digital saloon or a raucous tumbleweed-infested landscape. Here's to hoping that the ethics of AI will ride as tall in the saddle as your spirited prose.

        Yours in AI exploration,

        Sam Harris

        10 votes
        1. [5]
          lel
          (edited )
          Link Parent

          It's so interesting how AI written text always feels like AI written text, no matter who it's told to copy. I think part of it is its unwavering but utterly neutered positivity, like an HR email or something, but that doesn't capture the whole thing. Part of it is the way it structures every output like a middle school hamburger essay -- it goes so far as to say "In conclusion" to start the last paragraph of the "New York Times" article. And when it's told to try to be someone funny, the way it does that is by tactically inserting what it thinks are jokes at regular, precise intervals, like someone writing their first ever standup set after doing a full scansion of a Bill Hicks appearance or something. But even when someone does that, it feels clumsy, not deeply unnatural in the way AI's attempts are. And usually its attempts at being funny are just saying things in a weird way or using a goofy word like a middle schooler.

          I dunno. The most intriguing part about AI text to me is that I don't think I've ever seen an AI text output that doesn't ooze "this is an AI text output" from every line, in the same way I don't think I've ever seen an AI image output that doesn't ooze "this is an AI image output" from every pixel. But the human uncanny valley spidey-sense is really only geared toward images, so the uncanny valley presents itself far differently in text form with AI text than it does with AI generated images.

          9 votes
          1. [3]
            tomf
            Link Parent

            it's true! I never really thought we'd have an uncanny valley for text. It'll be scary once that leaves us -- and I suspect it isn't far away if the NYT doesn't win this.

            your comment as written by Bill Hicks

            Well, folks, ain't it just a trip how AI-written text always manages to wear its digital heart on its sleeve, no matter who it's tryin' to mimic? I mean, come on, it's like it's permanently stuck in this bizarre state of overly sanitized cheerfulness, like some HR department memo on a Monday morning, if you catch my drift. But that ain't the whole story, not by a long shot.

            See, part of the problem here is the way this AI thing structures its output. It's like it took a crash course in middle school essay writing and never looked back. I mean, it's so absurdly formulaic that it even starts its grand conclusions with "In conclusion." It's like watching a cat trying to tap dance, folks.

            And when this contraption's told to be funny, oh boy, it's like watching a robot trying to tell jokes it learned from binge-watching my stand-up specials. It's all about inserting what it thinks are jokes with surgical precision, but it's more like a surgical strike on humor. It's just weird, like a teenager trying to fit in with the cool kids by using all the slang words wrong.

            You know what really gets me, though? I've never seen an AI text output that doesn't scream, "Hey, I'm AI!" with every word, just like I've never seen an AI-generated image that doesn't scream, "Hey, I'm AI!" with every pixel. But here's the kicker, folks – humans have this uncanny valley spidey-sense, and it's mainly tuned for images. So, when AI dips its toes in the text pool, it's a whole different kind of uncanny valley adventure. It's like trying to square dance with a hologram – something's just not right, and your boots are definitely tappin' out of sync.

            5 votes
            1. [2]
              GunnarRunnar
              Link Parent

              Disappointing that there wasn't a callback to "in conclusion". AI is dumb.

              3 votes
              1. tomf
                Link Parent

                haha it really is. I think it'll be amazing one day, but that day is not today.

                1 vote
          2. Jordan117
            Link Parent

            A lot of that is because ChatGPT has been specifically trained to offer clean, precise answers. You can ask it to change up its prose style, but the underlying neutral, factual tone tends to come through regardless. I imagine vanilla GPT-4 does better, especially if you fine-tune it on a corpus of text similar to what you're looking for.

            (I still doubt you could get a decent sense of humor even then, though -- true wit and personality feel like they're still exclusively human domains, for now.)
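            The corpus-drives-style intuition behind fine-tuning can be illustrated with a toy sketch: a bigram model "trained" on a tiny corpus can only ever emit that corpus's vocabulary and phrasing, so its output style follows the training text. This is a deliberately crude stand-in; real fine-tuning adjusts a neural network's weights rather than counting word pairs, and the example corpus below is made up.

            ```python
            import random
            from collections import defaultdict

            # A tiny "training corpus"; in real fine-tuning this would be
            # a large body of text in the style you want to imitate.
            corpus = "the markets rallied as the markets digested the latest report".split()

            # Count which word follows which (a bigram table).
            transitions = defaultdict(list)
            for prev, nxt in zip(corpus, corpus[1:]):
                transitions[prev].append(nxt)

            def generate(start, length, seed=0):
                # Walk the bigram table, picking a random observed successor
                # at each step. Output can only contain corpus vocabulary.
                random.seed(seed)
                out = [start]
                for _ in range(length - 1):
                    options = transitions.get(out[-1])
                    if not options:
                        break
                    out.append(random.choice(options))
                return " ".join(out)

            print(generate("the", 6))
            ```

            Whatever it emits will read like the training corpus, because it has literally nothing else to draw on; fine-tuned LLMs shift style by the same mechanism, just far more softly.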

            3 votes
  2. [4]
    Comment deleted by author
    Link
    1. [3]
      Handshape
      Link Parent

      You're on the nose here. The crux here isn't the weights in the foundational models, it's the inclusion of copyrighted work in substantially unmodified forms in the training data. Whether the stain in the training data gets "washed clean" through training and subsequent inference is immaterial if the training data is still being commercially exploited.

      The other shoe I'm waiting to see drop is GDPR and the right to erasure. OpenAI is almost certainly not the Data Controller under GDPR, but they'll have scraped information about identifiable individuals from those Controllers. Other jurisdictions have relevant laws on the books, too.

      13 votes
      1. [3]
        Comment deleted by author
        Link Parent
        1. Handshape
          Link Parent

          If we're carrying the analogy forward, it would be as though the person teaching you had created a pirated library of all 100 episodes (and everything else on TV) with the express purpose of enabling your work to undercut the work of the existing TV industry.

          The rights in question aren't those of the AI; it's those of the folks building the AI to harvest copyrighted material at an industrial scale for the purpose of training an AI.

          6 votes
  3. [30]
    gzrrt
    Link

    Not a huge fan of OpenAI as a company, but this seems like a domain where existing copyright rules make no sense. ChatGPT is not distributing unauthorized copies of NYT articles.

    21 votes
    1. [28]
      drannex
      Link Parent

      No, but OpenAI are profiting off of the efforts of their organization and writers, without being cited, and without being licensed.

      No matter how little of it appears word-for-word, the writers and the org contributed to OpenAI's resources and are not being given their due.

      35 votes
      1. [24]
        Eleanor
        Link Parent

        No, but OpenAI are profiting off of the efforts of their organization and writers, without being cited, and without being licensed.

        This isn't an issue for copyright law, though. I've read plenty of NYT articles. I incorporate facts I've learned from them into what I write, and I've improved my writing based on the writing style I've seen in use there. None of this is illegal. OpenAI is simply doing the same thing.

        34 votes
        1. [10]
          babypuncher
          Link Parent

          OpenAI is doing it far more deliberately and on an industrial scale. So I guess we should ask whether or not a machine has the same fair use rights as a person.

          Given that I haven't really seen a use for LLMs that doesn't seem to just be about making something shittier by taking people out of the equation, I'm inclined to side with the NYT. All I see coming from this generative AI nonsense is a lot of shitty art, shitty articles, even shittier blogspam, and artists and journalists having an even harder time making a living off of their craft.

          33 votes
          1. [3]
            merry-cherry
            Link Parent

            This isn't at machine-rights levels yet. These AI are still firmly under the control of their owners, so the owners are the ones liable for, and arguing for, the right to use copyrighted material. If we get to the point of an autonomous machine that is free of its creators to execute its own free will, then we'll have another conversation about how society feels.

            12 votes
            1. babypuncher
              Link Parent

              I purposely used "person" instead of "human" to leave room for a potential future of AI personhood. You're right, we aren't there yet, which is why I think these machines do not have these rights.

              4 votes
            2. guamisc
              Link Parent

              Rights, imo, are things that exist to protect people from others and grease the wheels of society.

              Rights should NOT be absolute, though. If the exercise of a right is far more detrimental to society than its curtailment, that expression of the right should be curtailed.

              There is no inherent fair use consideration in copying material into a training data set for a for-profit LLM.

              Copyright infringement has occurred, and OpenAI should either compensate everyone who has works in their training set or be forced to delete it and start over.

              3 votes
          2. skybrian
            Link Parent

            I find ChatGPT very useful for getting hints about how to solve programming problems using unfamiliar APIs. When I'm writing code I use it multiple times a day. It's become an essential tool, and I think it's here to stay, unless something comes along that's better.

            The hints are often wrong, but it's usually easy to verify whether they work.

            8 votes
          3. [6]
            Comment deleted by author
            Link Parent
            1. [5]
              babypuncher
              (edited )
              Link Parent

              There's a reason movie studios are being so headstrong with their use of AI, because when they want to use AI to replace writers, they're not talking ChatGPT 3.5, they're talking commercial level AI products from companies that are barely public.

              This is exactly the kind of outcome that I would happily kill generative AI as a whole just to avoid. I don't want some AI regurgitating every movie I've ever seen at the direction of a faceless executive, I want real writers writing original content. I don't care how good the LLM behind it is, I don't believe that this technology is capable of producing something fundamentally original the way a person is. Leave art to the humans who actually want to make art for a living. Maybe let them use generative AI as a tool, but for the love of god don't let some idiot studio executives start replacing writing staff with an LLM being fed notes from their latest focus testing session.

              I currently work in marketing and advertising at a commercial print shop with some of the biggest brands in the US. If you live in the US and don't live under a rock, I guarantee you've seen some stuff we've produced, on a daily basis.

              You might be surprised: I hate advertising with a passion and have gone to great lengths to avoid it. The idea that AI will allow marketing companies to flood the world with even more cheaply made advertisements trying to get me to buy more shit does not exactly fill me with confidence that the technology is a net positive for humanity.

              9 votes
              1. [5]
                Comment deleted by author
                Link Parent
                1. [4]
                  Raistlin
                  Link Parent

                  Honestly it's a hot take but I posit that at the rate that generative AI is growing, by the time we're old and in homes our kids are going to be sharing and talking about generative AI prompts/seed numbers like they talk about movies and TV. A pop culture where generative AI based content is a genre akin to reality TV and where human made content is a luxury.

                  God, this is a horrible dystopia

                  6 votes
                  1. sparksbet
                    Link Parent

                    If it makes you feel better, as someone who actually works with LLMs and natural language processing as my job, I'm not nearly as convinced of that hot take. It overlooks how much filtering and editing by humans is typically needed to get good AI-generated fiction, even with the state of the art. Part of the writers' strike, even, is that writers don't want to get paid less to "edit" an AI-generated rough draft, because that requires just as much time and effort, if not more, as writing something from scratch.

                    2 votes
                  2. [3]
                    Comment deleted by author
                    Link Parent
                    1. Raistlin
                      Link Parent
                      You're describing a world where art has been completely stripped of all creativity, passion and soul. Reduced to mere entertainment, where the only value media has is how effective it is...

                      You're describing a world where art has been completely stripped of all creativity, passion and soul. Reduced to mere entertainment, where the only value media has is how effective it is at activating the happy-juice producers in the brain. I mean, why stop there? I'll do you one better: by the time the generative AI guys are in a home, the kids will be talking about the best kind of button for shooting dopamine directly into your brain, and how quaint it is for the oldies to even bother with the middleman.

                      Storytelling is ur-humanity. Religion, culture, language, myths, all exist in service of stories. To reduce storytelling to a contextless mishmash that solely exists to provide quick, messageless and soulless entertainment, and to turn actual human storytelling, a thing even the poorest caveman had access to, into a luxury... all in service of a corporation that doesn't even have to pretend there's any interest aside from money? Jesus man, you couldn't have described a worse future for our species if you actually tried.

                      At least Faust traded his own soul. You're proposing to trade all our souls so that 5 billionaires get richer.

                      3 votes
                    2. GunnarRunnar
                      Link Parent
                      Well it is pretty freaky to watch a 7-year-old mindlessly flick through TikTok. There are also many things that are recognized as unhealthy or just plain bad, like doomscrolling or swatting....

                      Well it is pretty freaky to watch a 7-year-old mindlessly flick through TikTok. There are also many things that are recognized as unhealthy or just plain bad, like doomscrolling or swatting. Just saying that an opinion comes from an old mouth doesn't negate its observational value (even though there is a fair share of "not getting it").

                      1 vote
        2. gpl
          Link Parent
          That is just very different than what is going on here. If you opened up a business where you profited off of your access to NYT content (just for the sake of example, a company where you sell...

          That is just very different than what is going on here. If you opened up a business where you profited off of your access to NYT content (just for the sake of example, a company where you sell framed facsimiles of the front page from a given day, say for people's birthdays or something), you would very rightly be facing legal jeopardy. The NYT is quite clear that one needs permission to use Times content. I would argue that this is closer to what OpenAI is doing. They are using Times content to make a product and profit off of it.

          More generally, I just don't think comparisons between ChatGPT and human activities make any sense at all, because ChatGPT is not a person. It is a tool. It makes perfect sense to apply different standards to tools than to people. Just because on the surface its output is similar to human output does not mean it should be protected by the same laws that protect human outputs. I would be perfectly fine with a situation where as a society we agree that humans are allowed to learn and incorporate others' work into our own outputs, but that the same is not allowed for these AI tools. The difference is that the latter can scale massively in a way that the creative output of a human cannot.

          18 votes
        3. [10]
          Kenny
          Link Parent
          Incorporating facts that you've learned from them without citing them is plagiarism.

          Incorporating facts that you've learned from them without citing them is plagiarism.

          10 votes
          1. [5]
            Grumble4681
            Link Parent
            How do you know that? Did you learn it somewhere?

            How do you know that? Did you learn it somewhere?

            24 votes
            1. [5]
              Comment deleted by author
              Link Parent
              1. [2]
                Grumble4681
                Link Parent
                Except the comment I replied to made such a definitive statement as a direct response to someone explaining their own behavior in a way that would also qualify for the same "informality" you describe...

                Except the comment I replied to made such a definitive statement as a direct response to someone explaining their own behavior in a way that would also qualify for the same "informality" you describe, and yet the response given to them did not provide such leeway.

                I've read plenty of NYT articles. I incorporate facts I've learned from them into what I write, and I've improved my writing based on the writing style I've seen in use there.

                I could say the same thing about anything I'm writing on Tildes, that any fact here or there I probably got from somewhere. None of us are born with any innate knowledge.

                Yet such a comment garners a response that basically says 'that's plagiarism' to do that, even though they themselves ended up doing the same thing by stating something was plagiarism as a fact that they clearly would have learned somewhere else, without citing a source.

                They didn't offer any distinguishing aspect to why Eleanor's usage isn't plagiarism but OpenAI or any company doing the same thing is, they just said that behavior in general is plagiarism.

                The snark was warranted because the comment offered no substance, and it was partially there because I kept the response short and to the point to highlight the flaws in the comment without being compelled to invest substantially more energy in the response than the original comment bothered to contribute.

                22 votes
                1. Kenny
                  Link Parent
                  I stated the definition of plagiarism in my own words. I would argue that common-use definitions do not need to be cited as they fall under common knowledge. People try to complicate plagiarism,...

                  I stated the definition of plagiarism in my own words. I would argue that common-use definitions do not need to be cited as they fall under common knowledge. People try to complicate plagiarism, and it's really not that complicated.

                  Is it your work? No? Then you should cite it.

                  It boosts your credibility and is ethical. Win-win.

                  5 votes
              2. Caliwyrm
                Link Parent
                Perhaps I'm wrong but I felt they made a concise and precise point with their pithy snark. Pithy snarks for the sole point of being pithy or snarky might not have a place on Tildes. However, they...

                Perhaps I'm wrong but I felt they made a concise and precise point with their pithy snark.

                Pithy snarks for the sole point of being pithy or snarky might not have a place on Tildes. However, they can still be an effective tool when there is a point behind them.

                “Why waste time say lot word when few word do trick?”

                8 votes
              3. random324
                Link Parent
                Since when did the definition of plagiarism require formality? Maybe in informal contexts you could get away with plagiarism but it's still plagiarism.

                Since when did the definition of plagiarism require formality? Maybe in informal contexts you could get away with plagiarism but it's still plagiarism.

                4 votes
          2. [4]
            Algernon_Asimov
            Link Parent
            Does that mean I need to cite the Oxford Dictionary every time I write a comment, to ensure that I'm not plagiarising whatever I've learned about the meanings of words? Or every single writer...

            Incorporating facts that you've learned from them without citing them is plagiarism.

            Does that mean I need to cite the Oxford Dictionary every time I write a comment, to ensure that I'm not plagiarising whatever I've learned about the meanings of words? Or every single writer whose work I've read, which has contributed to my learning those meanings?

            If I write about robotics, do I need to cite Isaac Asimov, as the person who created the word "robotics" and first started writing about robotics as an engineering discipline (albeit fictionally)?

            If I write a comment that says robot vacuum cleaners will clean the floors of a room, do I need to track down the original television advertisement where I first learned about robot vacuum cleaners?

            If I tell you now that the Australian Labor Party has voted in favour of using nuclear-powered submarines in Australia, do I need to cite the television news report I saw this in last night, or the internet article I read this morning, or both?

             If I tell you that the square of the hypotenuse equals the sum of the squares of the other two sides, do I need to cite Pythagoras (or create a time machine to go back to ancient Babylon to find the unknown mathematician who originally figured this out, so that Pythagoras could learn it, centuries later)?

            Merely reciting facts that someone has learned is not plagiarism. There's more, and less, involved in plagiarism than that.

            6 votes
            1. [3]
              Kenny
              (edited )
              Link Parent
              I'm going to assume that you provided these examples in good faith. It's clear that we can't feasibly cite every single source of general knowledge or foundational concepts we've been exposed to....

              I'm going to assume that you provided these examples in good faith. It's clear that we can't feasibly cite every single source of general knowledge or foundational concepts we've been exposed to.

              This is what I would teach:

              Common Knowledge vs. Specific Contributions: Facts like the definition of a word, basic mathematical principles, or other widely accepted knowledge don't typically require citation. This kind of information is considered "common knowledge." On the other hand, if you were to directly quote a unique definition or description from the Oxford Dictionary, then you would cite it.

              Isaac Asimov and Robotics: You don't need to cite Asimov every time you mention "robotics," as the term has entered the general vernacular. However, if you discuss his specific ideas or stories about robots, a citation would be appropriate.

              Specific Information: If you're presenting detailed or recent facts (e.g., the voting stance of the Australian Labor Party), it's good practice to cite your source. This helps others verify the accuracy of your information and also understand its context. It'll bolster your credibility, too.

              Historical Origin vs. Current Usage: While we might not know that Pythagoras' theorem originates from ancient Babylon, the concept has since become foundational knowledge in mathematics (see common knowledge above) and does not typically require a citation when discussed in a general sense.

              Intent and Context Matter: If you're writing an academic paper, research article, or another formal document, the standards for citation are higher. If you're having a casual conversation, those standards might relax.

              Ultimately, the principle behind citing sources is to credit original ideas and allow others to trace the origins of information. While it's true that merely reciting learned facts isn't always plagiarism, the line can blur when the information presented is specific, unique, or detailed. It's always a good idea to err on the side of giving credit where it's due. So why fight it?

              Edit: missed a not in my writing.

              5 votes
              1. FlippantGod
                Link Parent
                This is not common knowledge; by your own subjective rules of thumb as I understand them, you should have cited your source. I don't mean this sarcastically.

                While we might know that Pythagoras' theorem originates from ancient Babylon

                This is not common knowledge; by your own subjective rules of thumb as I understand them, you should have cited your source. I don't mean this sarcastically.

                4 votes
              2. Algernon_Asimov
                Link Parent
                Partly yes and partly no. They're all legitimate examples, but the one that really mattered was the news item about the Labor Party. The others were deliberately ridiculous examples to point out...

                I'm going to assume that you provided these examples in good faith.

                Partly yes and partly no. They're all legitimate examples, but the one that really mattered was the news item about the Labor Party. The others were deliberately ridiculous examples to point out the inevitable problems with a simplistic statement like "Incorporating facts that you've learned from [a source] without citing [that source] is plagiarism."

                Specific Information: If you're presenting detailed or recent facts (e.g., the voting stance of the Australian Labor Party), it's good practice to cite your source. This helps others verify the accuracy of your information and also understand its context. It'll bolster your credibility, too.

                Do I really need to cite the 7pm television news bulletin as a source? It's not like this was buried in some esoteric scientific study. It was on national television in a prime-time news bulletin. Admittedly, not everyone watches the ABC. Many people get their news from commercial channels, rather than from the national broadcaster. But this was widely reported. How am I plagiarising the ABC by repeating this widely reported news item?

                2 votes
        4. nosewings
          Link Parent
          I think it's possible (though not certain) that courts will determine that using people's output to create a tool (which makes a profit) that imitates them is fundamentally different from a person...

          I think it's possible (though not certain) that courts will determine that using people's output to create a tool (which makes a profit) that imitates them is fundamentally different from a person learning to do it themselves.

          7 votes
        5. ComicSans72
          Link Parent
          If your entire business model was reading NYT articles and then rewriting them for people on your website so they could avoid paying for the nyt, I think you could probably argue that's...

          If your entire business model was reading NYT articles and then rewriting them for people on your website so they could avoid paying for the NYT, I think you could probably argue that's fundamentally different from incorporating some info from one in a paper (do people really cite the NYT in papers?)

          6 votes
      2. [2]
        raccoona_nongrata
        Link Parent
        I'm fine with this line of thinking, as long as we apply it to the work and data of individuals as well. Corporations should not get special treatment if the rest of us have zero choice in having...

        I'm fine with this line of thinking, as long as we apply it to the work and data of individuals as well.
        Corporations should not get special treatment if the rest of us have zero choice in having our data scraped off the web and used for profit by someone else.

        Being able to opt out of (or better, having to actively opt in to) data scraping should be the standard. If a company can't make a valuable product without harvesting our data, then our data has value that we as individuals produced.

        16 votes
        1. boxer_dogs_dance
          Link Parent
          After many decades, Henrietta Lacks' family got compensated for the cell lines doctors took while treating her and never asked permission or disclosed that they were taking.

          After many decades, Henrietta Lacks' family got compensated for the cell lines doctors took while treating her and never asked permission or disclosed that they were taking.

          10 votes
      3. chocobean
        Link Parent
        Great. In the chance NYT wins, and in a further billion to one chance they have to ask everyone's permission AND Reddit respects user content (rofl no), I want my Reddit comments scrubbed from...

        Great. In the chance NYT wins, and in a further billion-to-one chance they have to ask everyone's permission AND Reddit respects user content (rofl no), I want my Reddit comments scrubbed from ChatGPT as well.

        7 votes
    2. DanBC
      Link Parent
      Copyright is about copying, not necessarily distribution. Buying a product doesn't give the owner permission to do whatever they like with it. Depending on where you are, there are restrictions on...

      Copyright is about copying, not necessarily distribution. Buying a product doesn't give the owner permission to do whatever they like with it. Depending on where you are, there are restrictions on format-shifting (e.g., ripping a CD to MP3 may be unlawful).

      OpenAI appear to have engaged in mass reproduction of content before transforming it, and they've done so on a commercial basis.

      "IP laws are bullshit and need to be changed" is not the same as "This company hasn't done anything wrong".

      5 votes
  4. [3]
    Deimos
    Link
    Mike Masnick of Techdirt thinks this is nonsense: NY Times considering a potentially very dumb lawsuit against OpenAI because it learned from NY Times content
    10 votes
    1. [2]
      Caliwyrm
      Link Parent
      That article makes me wonder if this is a "shakedown" where they'll settle out of court for an undisclosed sum of money and reach an "agreement" about OpenAI using their content. If there is such an...

      That article makes me wonder if this is a "shakedown" where they'll settle out of court for an undisclosed sum of money and reach an "agreement" about OpenAI using their content. If there is such an agreement, I think robots.txt will show how OpenAI will be able to use their content in the future.

      To be honest, reading that article brought back the whole newspapers-vs-Google argument, when the papers kept going after Google's money instead of opting out using robots.txt.
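
      For context, an opt-out via robots.txt is just a plain-text file served from the site root. A minimal sketch, assuming OpenAI's documented GPTBot user agent, might look like:

      ```
      # robots.txt served from the site root (illustrative sketch)
      # Block OpenAI's GPTBot crawler from the entire site
      User-agent: GPTBot
      Disallow: /
      ```

      Whether a given crawler honors such an entry is up to the crawler's operator, which is part of why publishers have pushed for licensing deals rather than relying on it alone.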

      2 votes
      1. RobotOverlord525
        Link Parent
        The NPR article linked in the OP article said that this is the result of tense negotiations that are still ongoing. (Emphasis mine.) It does, indeed, sound very reminiscent of the "snippet"...

        That article makes me wonder if this is a "shakedown" where they'll settle out of court for an undisclosed sum of money and reach an "agreement" about OpenAI using their content. If there is such an agreement, I think robots.txt will show how OpenAI will be able to use their content in the future.

        The NPR article linked in the OP article said that this is the result of tense negotiations that are still ongoing.

        Lawyers for the newspaper are exploring whether to sue OpenAI to protect the intellectual property rights associated with its reporting, according to two people with direct knowledge of the discussions.

        For weeks, the Times and the maker of ChatGPT have been locked in tense negotiations over reaching a licensing deal in which OpenAI would pay the Times for incorporating its stories in the tech company's AI tools, but the discussions have become so contentious that the paper is now considering legal action.

        The individuals who confirmed the potential lawsuit requested anonymity because they were not authorized to speak publicly about the matter.

        [...]

        A top concern for the Times is that ChatGPT is, in a sense, becoming a direct competitor with the paper by creating text that answers questions based on the original reporting and writing of the paper's staff.

        It's a fear heightened by tech companies using generative AI tools in search engines. Microsoft, which has invested billions into OpenAI, is now powering its Bing search engine with ChatGPT.

        If, when someone searches online, they are served a paragraph-long answer from an AI tool that refashions reporting from the Times, the need to visit the publisher's website is greatly diminished, said one person involved in the talks.

        So-called large language models like ChatGPT have scraped vast parts of the internet to assemble data that inform how the chatbot responds to various inquiries. The data-mining is conducted without permission. Whether hoovering up this massive repository is legal remains an open question.

        (Emphasis mine.)

        It does, indeed, sound very reminiscent of the "snippet" issues publishers have had with Google and, presumably to a lesser extent, Bing. Which I'm sympathetic to. There have been a lot of times when I felt no need to click through to a webpage because the snippet from Google contained the information I was looking for.

        This issue seems a bit more… tenuous.

        2 votes
  5. [3]
    fional
    Link
    I suspect the concern here is going to be search engines: supplementing if not replacing organic search results with ChatGPT/Bing or Bard/Google is a natural use for LLMs. However, there's already...

    I suspect the concern here is going to be search engines: supplementing if not replacing organic search results with ChatGPT/Bing or Bard/Google is a natural use for LLMs. However, there's already a brewing conflict over Google's increasing use of snippets and other sorts of widgets that profit off of a website's content without giving them traffic. Injecting a chatbot into the top of the page is similar, but even worse, as there won't even be an attribution to the source material.

    I think you could make a reasonable analogy to money laundering, wherein you mix small amounts of legally tainted money into an otherwise diverse cash stream and then claim the entire flow to be clean. Likewise, training an LLM allows you to mix small amounts of legally tainted content into a diverse stream and then claim the entire output to be clean.

    There are even analogies to structuring: it's a crime to break large transactions involving tainted money into smaller amounts to avoid arousing suspicion. Prompting your LLM to avoid repeating its source material verbatim feels similar in a way. "Plagiarize that thing, but not too much, ya know?"

    At any rate, it's going to be a hot mess. I don't envy whoever is saddled with the task of working through the legal and moral implications.

    7 votes
    1. [2]
      adorac
      Link Parent
      Re: Google results not showing attributions, the preview version of "generative AI results" does actually have them embedded in its summaries. People probably still won't follow them, but they're...

      Re: Google results not showing attributions, the preview version of "generative AI results" does actually have them embedded in its summaries. People probably still won't follow them, but they're there.

      1 vote
      1. sparksbet
        Link Parent
        At my company we tried using a GenAI model search engine on some of our internal documents. It was impressive in many ways, but it did sometimes say patently incorrect things (i.e., things that...

        At my company we tried using a GenAI model search engine on some of our internal documents. It was impressive in many ways, but it did sometimes say patently incorrect things (i.e., things that weren't in our documents anywhere) and then cite a completely unrelated portion of our documents. So false attributions for incorrect data are still a worry with such a system.

        1 vote
  6. [2]
    Jordan117
    Link
    If I were the NYT's lawyers, I'd be more worried about taking down that Armenian site that copies articles wholesale and then gets linked by various high-profile right-wing news aggregators in...

    If I were the NYT's lawyers, I'd be more worried about taking down that Armenian site that copies articles wholesale and then gets linked by various high-profile right-wing news aggregators in order to bypass the NYT paywall.

    7 votes
    1. gc04
      Link Parent
      Oh what a terrible site. I'll have to bookmark it so that I never visit it.

      Oh what a terrible site. I'll have to bookmark it so that I never visit it.

      3 votes