41 votes

The problem with ChatGPT is that all of these websites like W3Schools and TutorialsPoint will go bankrupt

ChatGPT got all of its information from these websites, but these websites still rely on advertising for revenue. When a user asks ChatGPT a question instead of going to the site, ChatGPT uses the information stored on the site without giving the site any revenue.

That's why they're being sued. (Also why Reddit is doing what it's doing with the API)

What do we do? How can we keep these sites alive and still make use of ChatGPT? I can now write code and solve problems days faster than I used to, but it seems kind of morally bankrupt of me to use a service that is so clearly putting the foundations it was built on out of business.

67 comments

  1. [18]
    hxii
    Link

    I honestly don't mind if W3Schools disappears. Maybe then actual documentation websites, official resources and proper code are going to appear on the first results page.

    90 votes
    1. [11]
      Farshief
      Link Parent

      You bring up an excellent point: W3Schools could be accused of monetizing information that is freely available in the documentation.

      26 votes
      1. [10]
        UP8
        (edited )
        Link Parent

        It’s worse than that: if A.I. is trained on trashy sites like that, it will learn mistakes.

        Look at the horror of StackOverflow, where you usually have to scroll past at least one code example that doesn’t work before, if you are lucky, you get one that does.

        I feel so much more secure working in a programming language or with libraries where I know how to look up answers in the real manual as opposed to programming splogs, StackOverflow and such.

        25 votes
        1. [6]
          skybrian
          (edited )
          Link Parent

          I think it's helpful to look at this from an evolutionary perspective. There are two extremes: memes people share without checking, and things people actually check all the time.

          Spelling is an example of something that gets checked fairly often, due to spell checkers and people having some idea how to spell. Certainly, common misspellings exist, LLMs will imitate them if given the right context, and LLMs do make lots of mistakes. But the existence of spellcheckers is still likely to make correctly-spelled words more common. I wouldn't expect LLMs to make spelling mistakes all that often, unless they're imitating a particular writing style.

          I think source code is more like spelling than it is like memes. There's relatively strong evolutionary pressure towards not having bugs that break things too badly in production code. We have lots of ways to check source code, like compilers and testing, both formal and informal. Yes, there are bugs everywhere, but at a low rate, and they tend to be in code that's not that important. If they break anything important, they get fixed. When using static analysis to search for bugs in Google's codebase, we would expect most bugs we find to be in code that isn't called for one reason or another, or to somehow not matter. (It's still worth doing to find the less-common bugs that do matter.)

          Stack Overflow has lots of examples of code that isn't actually used, only read. As code goes, it's closer to the meme end of things.

          There does seem to be something like a "doom loop" within a particular chat session. If an LLM starts making mistakes, it learns the pattern and generates more mistakes. But that code will likely be thrown away, and you can start a new chat. That's another evolutionary pressure. Typically there is a programmer watching over what the LLM generates, and the programmer will usually (but not always) notice if it doesn't work.

          Memes have their own evolutionary pressures: they evolve to be funny or outrageous or something else that gets people to share them. Sometimes people notice memes are wrong, but the mistakes that spread are the ones people don't see or don't care about, often due to ideological bias.

          7 votes
          1. snake_case
            Link Parent

            Your description of code reminds me of "conserved genes", a name for genes within an organism's DNA that are highly conserved because they are important; think the gene for how to make hemoglobin. We use the conservation of genes pretty often when determining evolutionary origin, because typically when they change, it's for a good reason.

            Also, a doom loop is a little bit like cancer.

            5 votes
          2. [4]
            public
            Link Parent

            There does seem to be something like a "doom loop" within a particular chat session. If an LLM starts making mistakes, it learns the pattern and generates more mistakes. But that code will likely be thrown away, and you can start a new chat.

            I hope this isn't too Reddity of a reply, but this reminds me of the concept of the ant mill, in which a foraging party of army ants loses the scent of the main trail and follows itself until it dies of exhaustion.

            2 votes
            1. [3]
              skybrian
              (edited )
              Link Parent

              Well, sort of. Here’s what I mean by “learning the pattern:” an LLM is always imitating some writing style, based on context. Once it starts making mistakes, the writing style it’s imitating becomes that of someone who makes mistakes.

              When writing code with GPT4 and Code Interpreter, I find it’s better to change a prompt to help it avoid the mistake and regenerate, instead of asking for a correction and having it remember that it made a mistake. Though, I might debug a little before going back and changing an earlier prompt.

              (Even if it learned anything useful from a mistake, it would forget it when I start a new chat session anyway. The real learning going on is me learning to write better prompts.)

              2 votes
              1. [2]
                majromax
                Link Parent

                Well, sort of. Here’s what I mean by “learning the pattern:” an LLM is always imitating some writing style, based on context. Once it starts making mistakes, the writing style it’s imitating becomes that of someone who makes mistakes.

                This has been postulated as the Waluigi effect. The basic idea is that any simulated persona is a superposition of two opposite states:

                • The persona as intended, and
                • More or less the exact opposite, pretending to be the persona specified.

                In this theory, if the LLM ever 'slips up' and responds according to the anti-persona, then the superposition collapses and you're left with just that one.

                Specializing this to the case of a 'smart, helpful code generator', the personas would be:

                • The smart, helpful, and correct code generator, or
                • A persona that acts like it's smart and helpful but really knows little.

                If the LLM begins making mistakes, then it's increasingly more likely to settle on the "incompetent at code but good at bluster and ego" version. Once that happens, the "incompetent at code" persona is unlikely to rediscover technical aptitude.

                This hypothesis is not without its critics (see the comments of the above link for more discussion), but I consider it a neat way to break our human-based intuition† and grapple with LLM simulators on their own terms.

                † — Namely, that humans do have a consistent persona, such that it's possible to learn about the 'real you' with increasing fidelity.

                1 vote
                1. skybrian
                  (edited )
                  Link Parent

                  Yeah, I was and am one of those critics. I appreciate that you called it a hypothesis. Cleo Nardo has apparently had some unusual experiences using LLMs and has some interesting things to say about them, but it’s annoying that a blog post with so little evidence behind it got so much attention, so now people will talk about the Waluigi Effect like it’s a thing.

                  They also persist in calling chatting with a bot a simulation. Although I did manage to talk GPT4 through building a very simple physics simulation recently, I think it’s misleading to think of a chat as a simulation. It’s collaboratively writing a story and there’s a difference. Stories aren’t mathematical, don’t have clear rules, and don’t require consistent world-building. (What’s the world model of Alice in Wonderland?)

                  Within a story, plot twists can do anything. For example, you can write a story where everything changes based on a pun, or where astrology is real and learning someone’s sign is important. Since story tropes are a pattern that can be learned, LLMs understand plot twists well enough to sometimes go along with them and change character appropriately, even if they appear out of nowhere. If I told GPT4 that a witch cast a healing spell on it, maybe that would break it out of its doom loop and it would write better code? I should try that.

                  Maybe the Waluigi Effect could be a thing, in the sense of being a story trope that you can use to change how a chat goes? It’s as real as the witch I just made up. Similarly, telling ChatGPT that it’s living in a simulation might have interesting effects on its performance, even though it’s not true?

                  Storytelling with a willing partner is fun and sometimes useful, but when we’re experimenting with an LLM from the outside and want to understand it scientifically, I think we need to be pretty behavioralist about what we observe and cautious about making up mental states for it. “Persona” itself is a metaphor we use to describe how chatbots behave; the behavior is real but how they work is a mystery. They are trained on the writing of authors from all over the Internet and clearly can imitate different writing styles, but we simply don’t know whether they have coherent representations that correspond to what we call a persona, or what levels of abstraction they use for it. Sophisticated thinking can be implied by simple tricks, so it’s easy to guess wrong about it.

                  I look forward to learning what mechanistic interpretability researchers figure out about how LLMs imitate people.

                  (I kept editing. Sorry about that, I’m done now.)

                  1 vote
        2. [2]
          public
          Link Parent

          where you usually have to scroll past at least one code example that doesn’t work before, if you are lucky, you get one that works

          IME, this varies by language. It's notably bad in Python (with Py2 answers sticking around) and JS (same but with jQuery). Not sure in which languages it's less bad, but I'm certain they exist.

          4 votes
          1. UP8
            Link Parent

            Python 2 is one problem; people will make so many stupid excuses about why they can't replace "print x" with "print(x)".

            More fundamentally, there is the problem that the question frequently contains an incorrect code example. That fits the Q & A "game" (people answering the question might need to understand what the original poster is doing wrong), but it is usually of no value to the person searching for answers, who only needs enough of a problem statement for the search engine to find and a single correct answer.

            Maybe it is my neurodivergence but I find it really annoying to ignore things, and StackOverflow makes me do a lot of ignoring.

            2 votes
        3. ButteredToast
          Link Parent

          What I see a lot of on Stack Overflow for mobile stuff (native iOS/Android, especially Android) are answers that only kinda accomplish what the original poster is trying to do and/or do it in a very hacky/brittle way.

          This wasn't obvious to me when I first started using SO many years ago, but scrolling through it after getting some experience under my belt, it's plain as day. Starting maybe 4 or 5 years ago my usage of SO fell off a cliff because the answers there were so consistently low quality, making it only useful on occasion, usually by way of tangential information in a thread that's pertinent to a weird behavior I'm chasing down.

          4 votes
    2. [6]
      Akir
      Link Parent

      I honestly think the same but just couldn’t think of a way to say it that wasn’t overly negative.

      Every time I try to find info on an obscure feature I rarely use, their websites pop up and irritate me because I know they don’t have the information I need. You’d think I would know to just look it up on MDN by now.

      11 votes
      1. sqew
        Link Parent

        Anytime I'm writing web stuff, I just keep an MDN tab open or use the DuckDuckGo !mdn bang. It's such a pain to deal with trying to parse out whatever any of the top search results are saying.

        Speaking of MDN, it's gotta be near the top of my list for "design of a spec documentation site". I'm not sure what I'd say is my favorite for like a programming language or a framework, though.

        3 votes
      2. pesus
        Link Parent

        I do that a lot too. Every once in a blue moon I’ll find it has a good example of something I can’t find easily elsewhere and then I get more annoyed.

        1 vote
      3. snake_case
        Link Parent

        W3 hasn't done me wrong yet, but I could see how it might if you were asking it the best way to do something, rather than just how to do something.

      4. [2]
        Protected
        Link Parent

        DuckDuckGo bangs really help you build those good habits.

        !mdn Promise.resolve()

        (I have DuckDuckGo set as my default search engine, so I can just type that sort of thing directly in my web browser input bars.)

        1. Akir
          Link Parent

          I really have no idea why I haven't set DDG as my primary search engine yet. Laziness is a curse.

          2 votes
  2. [12]
    Liru
    Link

    How can we keep these sites alive, and still make use of ChatGPT?

    Why would we want to keep them alive?

    If the rise of ChatGPT means that TutorialsPoint and similar sites (GeeksForGeeks, TutorialsTeacher, etc.) will finally stop appearing in my search results, I would say that that's one of the few outright positive effects it would bring. Those sites are awful. I have never found them to actually contain any useful information; 95% of their content seems to be ripped from official or better documentation, which wouldn't be so bad if they didn't then remove everything except for the most basic use cases, which makes it worthless.

    Even that wouldn't be so bad if a good chunk of those basic examples weren't horribly flawed or opaque, though.

    52 votes
    1. [7]
      EgoEimi
      Link Parent
      • Exemplary

      Agreed. I find it difficult to sympathize with their demise. I always found their content to be poorly written and shallow.

      I think the new state of things—ChatGPT for novice and exploratory content, officials docs for advanced content—is a great improvement at least in this aspect.

      On the other hand, @snake_case does broach the broader topic that the production, curation, and distribution of knowledge on the internet is not free, which opens onto a larger issue.

      Knowledge always has some indeterminable cost, value, and price.

      In the past, the bespoke nature of knowledge made the resolution of the matter not particularly pressing because the market for knowledge was vast and competitive. And there were enough eyeballs and ad money to go around.

      But now that ChatGPT threatens to centralize vast swaths of the internet, which is virtually the entirety of human knowledge space, the matter has emerged quite explicitly.

      Anyway, to get a bit more philosophical, I think it goes beyond OpenAI and its business practices: we simply don't have the economic and social tools to address this phenomenon. It's sorta IP theft... but... also fundamentally not IP theft, any more than you or I steal IP from a book by reading it, learning from it, and then being inspired by it. Never before in human history have we had an 'artificial superhuman' capable of reading and learning everything and then teaching it to others at a massive scale. How do we even begin to approach licensing?

      24 votes
      1. [2]
        RodneyRodnesson
        Link Parent

        we simply don't have the economic and social tools to address this phenomenon.

        So well put.

        Also applies to the wider issues (as you said, philosophical) such as unemployment due to AI.

        I've thought a fair bit about the wider impact of AI over the years, but things I never imagined seem to pop up all the time.

        Certainly interesting times ahead.

        6 votes
        1. flowerdance
          Link Parent

          we simply don't have the economic and social tools to address this phenomenon.

          Yeah... I think that's by design.

          2 votes
      2. [2]
        CosmicDefect
        Link Parent

        This is a great comment. Also, this destruction or centralization of the internet might well be self-defeating for entities like ChatGPT themselves. How will these tools learn and advance further if they kill off the parts of the web they needed to train on?

        6 votes
        1. Very_Bad_Janet
          Link Parent

          Also, how will LLMs grow and learn more when the parts of the web they don't kill off will be filled with LLM generated material, leading to weird distortions in the output?

          One thing that I've gotten from this very thought provoking post and thread - I'm no longer going to feel bad about streaming music and videos with ad blockers. Why should I, when Big Tech is racing to gobble up the world's IP and put most of the web out of business? Fewer and fewer people will be going to the (ad-filled) sources because of these LLM tools. It's nuclear strength piracy.

          2 votes
      3. snake_case
        Link Parent

        Yeah, you really hit the nail on the head of what I was getting at here. It's not really about those sites in particular.

        The only solution I've come up with so far is forcing tech companies who hire software developers to subsidize some of these programming help sites, like, some kind of recognition that the existence of these sites is actually fueling their progress.

        Right now everyone is kind of just passing the buck; everyone benefits from the existence of this knowledge, but no one wants to pay for it because it seems to just continue to exist without any financial support. I've learned that sites in crisis can fundraise pretty quickly (archive.org), but what if all of these separate entities need funds at the same time? Will archive.org just absorb them? Then what? Where will new knowledge be posted?

        5 votes
      4. skybrian
        Link Parent

        If they run out of good free content, maybe AI companies will start paying people to make more? It’s not like Netflix gets videos for free. (Though YouTube mostly does.)

        There’s been some research suggesting that using smaller amounts of higher-quality content (like textbooks) has its advantages.

        It doesn’t seem like we need any fundamental innovation in economics for that to happen.

        3 votes
    2. Aurimus
      Link Parent

      I agree here. There are many conversations to be had around Generative AI, but the death of shitty sites like those isn’t a thing to bemoan.

      2 votes
    3. SteeeveTheSteve
      Link Parent

      I have a plugin for Firefox called "uBlacklist" which lets you block sites from results in Google (there's an option to enable other search engines, including Bing and DuckDuckGo). I originally got it because I was sick of getting Pinterest in my results.

      I haven't tried it, but it looks like you can subscribe to lists similar to uBlock too.
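
      If it helps anyone set it up, a blocking rule in uBlacklist is a match pattern; the Pinterest example below is just an illustration of the general shape (worth double-checking the exact syntax against the extension's docs):

      *://*.pinterest.com/*

      As far as I know it's one pattern per line in the extension's options, and matching results simply stop showing up.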

      2 votes
    4. [2]
      Hobbykitjr
      Link Parent

      New info?

      Ask GPT about .NET Core or .NET 7... it'll parse its Stack Overflow cache and give you an answer... Stack Overflow goes under... now where do we discuss .NET Super 3 (that ChatGPT could parse, but it doesn't exist)?

      2 votes
      1. skybrian
        Link Parent

        Hopefully it’s some web forum accessible by search engines. If it’s some Discord server then the information is lost to most of us.

        There wouldn’t be any immediate effect, though. GPT4 is pretty far behind. The cutoff date is September 2021. Other chatbots have newer info.

  3. [2]
    devalexwhite
    Link

    I don’t use LLMs in my coding, first because they love to be confidently wrong, and second because you have no idea where they stole the code from. That said, I also don’t use the sites you listed, and I don’t see sites like MDN going anywhere.

    23 votes
    1. flowerdance
      Link Parent

      As a cultured individual, I only use LLMs for roleplaying.

      3 votes
  4. [3]
    Macha
    Link

    I'm surprised someone is worrying about Tutorials Point. It's one of those sites (like Baeldung, w3 schools, etc.) that seems to just exist to clog up my search results with barely relevant blog spam when I'd rather just get the technical documentation for the thing I'm looking for in my search results.

    18 votes
    1. [2]
      underdog
      Link Parent

      Baeldung actually has excellent documents though, unlike the others.

      9 votes
      1. meff
        Link Parent

        Baeldung has a lot of original exploration on his blog, aside from the SEO optimization.

        3 votes
  5. [6]
    simplify
    Link

    It’s a complex problem and I don’t have an answer. I don’t really use ChatGPT but I do use and love Copilot. I guess the only thing these sites can do is evolve or die. The internet is changing in a big way and you can’t expect old revenue models to work until the end of time.

    17 votes
    1. [2]
      SnakeJess
      Link Parent

      I'm the same way, for mostly the same reasons. Lots of workplaces bar the use of ChatGPT for work. There's a reason for that.

      5 votes
      1. stu2b50
        Link Parent

        Well, it’s mainly because they don’t have contracts specifying corporate data security for the public version and haven’t yet made one with OpenAI. I work at a well-known tech company and we have access to ChatGPT now via a bespoke contract with OpenAI.

        Additionally, ChatGPT is coming to Microsoft Office 365, which will be covered by the 365 data policies.

        15 votes
    2. [3]
      GunnarRunnar
      Link Parent

      Or governments could recognize a problem and step in.

      2 votes
      1. [3]
        Comment deleted by author
        Link Parent
        1. [2]
          godzilla_lives
          Link Parent

          I'm reminded of that hearing where a congressman asked the CEO of Google why he had difficulties with his iPhone, and the baffled dismay I felt.

          6 votes
          1. public
            Link Parent

            Or the social media censorship hearing where the CEOs of Microsoft, Apple, and Google were pummeled with questions about moderation decisions on Facebook and Twitter.

            5 votes
  6. [6]
    CharlieBeans
    Link

    I see quite a lot of negative feedback here towards these sites. While I agree that they are not an ideal source for seasoned tech people, we should never forget how many of us got here. These sites are very valuable to people who are trying to understand new concepts and get into tech; they don't need complete information, or even information that is always absolutely correct. It needs to be good enough, and engaging enough, to help people progress in their journey. I can't remember the last time I checked SO and similar sites, but I can really remember how many years ago I looked into these sites and how much they helped me stay motivated! These sites are stepping stones for, I believe, quite a lot of people.

    And these sites offer some kind of map, guidance, and links to new unknowns. So I would argue that they are more useful than a plain AI response.

    14 votes
    1. [3]
      Aurimus
      Link Parent

      The biggest problem is that these sites teach new devs incorrect or out-of-date things.

      11 votes
      1. [2]
        Deely
        Link Parent

        And the solution is...? I'm really curious.

        1. skybrian
          Link Parent

          One solution is finding a better website. MDN is good for web development.

          2 votes
    2. feanne
      (edited )
      Link Parent

      I agree, W3Schools was a major stepping stone for me when I knew nothing about coding. It's very beginner-friendly. I definitely would not have gotten the same results from just looking at official code documentation.

      4 votes
    3. tmax
      Link Parent

      W3Schools really helped me back in the day, so yeah, I agree they may be helpful for some to get into tech.

      2 votes
  7. [2]
    hobbes64
    Link

    A few commenters here said they don't like the sites you listed so they don't care if they disappear. But I think your question is related to a broader question of "where does original information come from?".

    Fifteen or twenty years ago you would buy a computer book from O'Reilly or Wrox or Microsoft Press and have a physical copy at your desk. Those books had authors who would do research and get paid for it. I don't know if any of them got very much, but it was part of their livelihood for sure.

    Since then almost all knowledge is supplied on the web, where most of the information is from some unpaid dude somewhere, and this is linked and copied over and over and over again. This is fine for the commons as long as the info is correct, but there are forces at play (clickbait, corporate propaganda, etc) that may cause junk to rise to the surface.

    If the info is getting pulled by an AI, this is just a variation of the same problem. The AI doesn't understand the context so you might be getting a lot of junk, and there isn't going to be a passionate and responsible person behind the info.

    By the way, this closely parallels the problem with news in the last 20 years. There used to be local newspapers that funded themselves with personal ads and newspaper sales. Personal ads have been gone since before the internet; they were destroyed by Craigslist. And now newspaper sales are gone; everyone expects the news to be free. But this means there is nobody to pay the reporter who could be investigating some small-town political corruption or whatever, and all the news you see is run through a corporate filter, probably funded by someone who would really like to steer your opinion in some way that gives them money or power.

    11 votes
    1. feanne
      Link Parent

      I agree. Generative AI is exacerbating the already enormous problem of loss of context/provenance in the information age.

      3 votes
  8. [5]
    SnakeJess
    Link

    I simply don't use ChatGPT. Problem solved. Maybe one day I will use a better version of an AI tool designed specifically for use as a programming aid, but for now I feel at no disadvantage not using it. The stuff you run into problems with is much larger in scale than ChatGPT can handle anyway.

    And if you rely on ChatGPT too much to do the easy stuff early on in your learning, you will hamper yourself.

    5 votes
    1. [3]
      snake_case
      Link Parent

      I find that it's really great for massive questions where it's hard to even figure out where to begin, the sort of thing that in the past I would bug a senior developer about.

      I'm asking it things like "In Python, what's the best way to implement a shared sessions object in a test suite with many classes and objects?"

      The answer it gave me was just phenomenal. It changed my whole world.

      (fixtures are the answer, btw, which I kind of knew, but fixtures are a hard concept to grasp and that's specifically what ChatGPT's answer helped with)
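
      For anyone curious what that looks like, here's a minimal sketch of the pytest pattern it pointed me at (not ChatGPT's literal output; the fixture name, the requests.Session example, and the header are all just illustrative):

      import pytest
      import requests

      @pytest.fixture(scope="session")
      def shared_session():
          # Built once for the whole test run; every test that asks for this
          # fixture by name gets the same Session object injected.
          session = requests.Session()
          session.headers["X-Test-Suite"] = "demo"  # hypothetical shared setup
          yield session
          session.close()  # teardown runs once, after the final test

      def test_header_is_shared(shared_session):
          assert shared_session.headers["X-Test-Suite"] == "demo"

      def test_same_instance(shared_session):
          # scope="session" means this is the same object as in the other test,
          # not a fresh Session built per test.
          assert isinstance(shared_session, requests.Session)

      The part that's hard to grasp until you see it is that pytest wires the argument name to the fixture automatically; nothing imports or instantiates shared_session by hand.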

      9 votes
      1. [2]
        JCAPER
        Link Parent

        GPT 4 was a game changer for my work. For quick wins, where I just need a small and simple algorithm for something, I just ask it and it gives me something that I can quickly adapt.

        For more complex problems I don't trust the code it outputs, however it works great to debate ideas, troubleshoot and deconstruct the problems I'm trying to solve. It's like my rubber duck, but it talks back

        8 votes
        1. snake_case
          Link Parent

          I've been using it the same way. It's really excellent at explaining how to do something complex, and I find that the code it does generate makes for really amazing examples, better than I find on any of the previously mentioned websites.

          It basically summarizes entire chapters of documentation, and provides better examples, all in one go. Multiple times now it's taken an issue that would have had me researching for days and turned it into a one day snag.

          1 vote
    2. skybrian
      Link Parent

      I’m a hobbyist now and I’m not sure how useful it will be for production code, but I think it’s pretty good for writing fun code as a learning experience. For example, I hadn’t used Pillow or numpy before and learned about them from how ChatGPT used them. I might even try Advent of Code this year.

      I don’t think you hamper yourself by working with a partner sometimes, even if they make mistakes. Pair programming has its advantages. GPT4 isn’t a bad partner. (As a Python environment, though, it sucks compared to a Jupyter notebook. Still beta.)

      2 votes
  9. [2]
    skybrian
    (edited )
    Link

    W3Schools used to have a poor reputation but I haven't tried it in years. Is it actually good now? I always avoid it and use MDN.

    As to your actual question, I don't know what sort of answer you're hoping for? Nobody knows the future. I expect people will still read websites and support the ones they like.

    But I wouldn't expect copyright violations to result in any AI slowdown. The web is built on copyright violations. (See YouTube, user behavior in most Internet forums, and so on.)

    Also, even if lawsuits did result in some companies having to nerf their services, they have competitors that are being more careful about licensing their data and are almost as good. (For example, if Midjourney got shut down, Adobe has products that can generate images too, and they have legal agreements.)

    5 votes
    1. Minori
      Link Parent

      Having worked with a lot of intro level JS stuff recently, MDN is still the best resource out there. GeeksForGeeks is decent sometimes, but W3Schools is very dicey. And of course Stack Overflow is only good if you want jQuery...

      5 votes
  10. [4]
    vektor
    Link

    I for one am hopeful that one day soon, someone will figure out how to let an AI model explore tech systems within a sandbox environment. Imagine if GPT could interact with a Linux machine to validate its answers. There are a few ways I could see this being done, and if done properly they could really elevate the degree to which LLMs understand technology. Then, instead of GPT regurgitating shitty web tutorials or cobbling together code and screwing it up in sneaky ways, maybe we could get a lot more sensible answers out of it.

    2 votes
    1. [3]
      skybrian
      Link Parent

      I expect people collaborating with AI to write real code will be a good source of training data. In the short term, though, the bot doesn’t learn, you do.

      1 vote
      1. [2]
        vektor
        Link Parent

        Oh, as far as ChatGPT as it exists now goes, sure.

        But I also expect that one (i.e. AI researchers) could come up with some way of putting e.g. ChatGPT into a reinforcement learning loop of, say, programming something or doing sysadmin tasks, using a computer environment to train the model. For example, you could ask ChatGPT to write test cases or invariants for a target program, then to implement the program, and reward it for producing code that satisfies the test cases. Of course you're not guaranteed to have good test cases there, so that's where the research comes in (and/or a larger data annotation effort), but I believe such a setup would be very interesting as far as increasing LLM understanding of technical systems goes.

        As a very basic example, you could have the LLM play compiler/interpreter. Prompt: some code snippet. Expected answer: the output of the compiler and/or the output of the code when executed. Cost to implement? Relatively low; all you need is a GitHub crawl and a sandbox within which you can safely generate the output of a given piece of code. There's your training data. Maybe for free, maybe with some careful cross-training, you'd train the LLM to produce less faulty code and predict better how a piece of code would evaluate.
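
        As a toy illustration of that compiler-play idea (everything here is a stand-in: the hardcoded snippet list replaces the GitHub crawl, and a plain subprocess with a timeout replaces a real sandbox):

        import json
        import subprocess
        import sys

        # Stand-in for a crawl of code snippets pulled from repositories.
        snippets = [
            "print(sum(range(10)))",
            "print('hello' * 3)",
        ]

        with open("code_execution_pairs.jsonl", "w") as out:
            for code in snippets:
                # Run the snippet and capture what it prints; a real pipeline
                # would do this inside a locked-down sandbox.
                result = subprocess.run(
                    [sys.executable, "-c", code],
                    capture_output=True, text=True, timeout=5,
                )
                pair = {
                    "prompt": "What does this Python code print?\n" + code,
                    "target": result.stdout + result.stderr,
                }
                out.write(json.dumps(pair) + "\n")

        Each line of the resulting file is a (code, observed output) pair of exactly the kind you'd need for that "predict the output" training objective.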

        1 vote
        1. skybrian
          Link Parent

          Similarly, I suspect that the logs from people using GPT4 with Code Interpreter will be used this way. It has the advantage of training on things people actually want to do with it, but is limited to Python. If they add other languages, though, they could get training data for them, too.

          (This is similar to how one of Google’s sources of info about the Internet is people Googling things.)

          Commit logs from git repos combined with the results of continuous builds seems pretty good, too. All those unit test results are training data. The commit logs alone would be almost as good, assuming people fix or disable broken tests. Tests that don’t get disabled are probably passing.

          If they still want continuous build results, one idea would be to just take source code from git repos and run the tests in the sandbox. Getting a project compiling and tests working in your own environment is often a challenge, but they could automate that using GPT4 itself.

          This itself would be a useful product. Imagine a continuous build service that you can give some crappy code to and it automatically configures itself, sends pull requests fixing any broken tests it sees, and offers to write more tests? The human feedback from people either accepting or rejecting the pull requests is itself training data about what human programmers like to see.

          In this way, I expect collecting more training data about code will not only be easy, but profitable for AI companies. Assuming intellectual property issues can be worked out? Maybe they would only collect training data from your company’s open source code?

          Google is already doing something like this internally. I suspect they won’t want to directly release it as a product due to not wanting to reveal their own intellectual property, but the experience of building it would still be valuable, and there may be indirect ways of bootstrapping a different system.

          An AI bot that trains on and sends pull requests for all public Go packages (that opt in) would be pretty cool.

  11. [2]
    JackA
    Link

    *Off topic

    (Also why Reddit is doing what it's doing with the API)

    Agree with everything else, but I'm not going to roll over and let Reddit get away with their blatantly false messaging that closing off 3rd party apps had anything to do with AI API access. They lied when they said that; otherwise, 3rd party apps would have been given an incredibly easy exemption, and spez's obvious vitriol towards them in all communications wouldn't have caused all of that drama.

    All of the historical training data (the valuable stuff) that would come off of Reddit is already easily downloadable in one copy without using any live connection to Reddit, and any future data can just as easily be captured via web scraper, since scrapers were already breaking the API TOS regardless.

    You may have just been unaware, but please don't help spread their PR bs.

    2 votes
    1. snake_case
      Link Parent

      So, no.

      I still use Alien Blue on an old iPhone 4s and it works fine. I eventually get rate limited, but if I stop doomscrolling and come back a couple of minutes later, it works. There have been no negotiations between Reddit and Alien Blue, because Alien Blue no longer exists; it's an old, outdated, unsupported 3rd party Reddit app.

      The fastest way to get new content off Reddit is to call the API. If you have a working Reddit model, and you just want to update it, you call the API. If you're a ChatGPT bot and posting shitposts everywhere, you POST to the API. I really think the initial intention was to target these consumers, and not us.

      That being said, Spez is an absolute buffoon. He should have given existing third party apps preferred pricing and he did not. He knew that we were not a part of the issue, and he still did not. He's upset the very people who make his website something that people even want to scrape data from and influence, and that was a bad move, as always. What an idiot.

  12. [3]
    BusAlderaan
    Link

    I think there's legitimacy in questioning the ethics of how ChatGPT obtained its data; they probably shouldn't be allowed to scrape all of the internet. But the argument "We have to stop X, because if we don't then it will kill Y" has existed for all of civilized humanity, right?

    “We can’t allow civilians to own cars, it will kill the horse industry.”

    “We can’t allow electricity to be used in every home, it will kill the candle industry.”

    Etc, etc.

    Yes we need to throttle the methods used to create AI, but no I don’t think we need to harbor industries for their safety. I think they will adapt and/or die.

    1 vote
    1. skybrian
      Link Parent

      I agree that the economy changes due to technological improvements, trying to stop it from changing often fails, and often, we shouldn’t want to stop it because the improvements are useful.

      But we should be careful not to conclude too much from the patterns we see in history. Yes, they exist. Knowing about them is useful for brainstorming about what might happen. They are pretty weak evidence for ruling scenarios out. Things change enough that we don’t know whether it will happen the same way again. This time, it could be different?

      The future is uncertain and preparing for disasters can still be important, even when it didn’t happen that way before.

      2 votes
    2. snake_case
      Link Parent

      People keep saying that, but really take a look at what computers did to us. One pre-programmed computer can do the job of an entire department of paper pushers. What are those paper pushers doing now? Working min wage service jobs.

      Technology is the reason why "the good jobs" are so difficult to get now; it's the reason why the middle class is sinking farther and farther into poverty. The people who own the technology are profiting off of it, and the people whose jobs the technology took are left out in the cold.

      It wouldn't be so bad if we didn't let Bangladesh make all of our textiles, but it would still be pretty bad.

  13. [2]
    Hobbykitjr
    Link

    I think recipes will be a big one.

    When I want to make a new dish I'll google 4 or 5 recipes and combine/merge them. I could see ChatGPT being useful for "what's a [low carb/GF/Vegan] [generic dish] without [ingredient] to [cook outside/in 1 pot/to reheat later]"

    1. snake_case
      Link Parent

      That's something I can almost see being subsidized by everyone's taxes, everyone benefits from having ways to prepare food at home, so the production and storage of recipe knowledge makes sense to subsidize like that.

      Programming websites don't, though; not everyone uses them and benefits from them, so I'm kinda struggling to see how we could enforce subsidies for the production and storage of this information. Probably some kind of Wikipedia model? Not-for-profit? Is having a giant Wikipedia of all programming information good for the health of the information, or do we need lots of separate competing entities? Would software be more reliable if everyone was taught how to code in the same way and had access to exactly the same resource?

  14. Removed by admin: 2 comments by 2 users
    Link