55 votes

Let us show you how GPT works

27 comments

  1. [2]
    unga
    Link
    I find myself consistently disappointed with how mainstream media presents topics in tech, but the NYTimes does a great job here.

    16 votes
    1. sparksbet
      Link Parent
      yeah ngl I went into this expecting to be disappointed but it's a surprisingly decent introduction to how these models work under the hood and even mentions the safety issues at the end. Anthropomorphizes a bit much for my taste, but that's so common it's unavoidable at this point I think.

      6 votes
  2. [7]
    Jdtunn
    (edited )
    Link
    Paywall bypass

    Note: This is a rather interactive article and most of those interactions are broken by the archiving process.

    See below comment

    5 votes
    1. [6]
      WittyPat
      Link Parent
      Here's a gift of the article, should be unlocked for anyone who clicks it (expires 2 weeks from now): Let Us Show You How GPT Works — Using Jane Austen

      30 votes
      1. [2]
        Comment deleted by author
        Link Parent
        1. WittyPat
          Link Parent
          I have a NYT subscription and am allowed to gift 10 articles a month. The gift articles only work for 2 weeks.

          10 votes
      2. [4]
        cfabbro
        Link Parent
        Do you mind if I edit the original link to include that gift code, or would that go against the NYT ToS to openly share it with so many people like that? I wouldn't want to get you in any trouble over it.

        4 votes
        1. Tharrulous
          Link Parent
          I believe the Gift Article feature on news websites is intentionally designed to be shared on social media. When you click the button on NYT, you get a pop-up that allows you to share not only via URL link, but also via FB, Twitter, WhatsApp, Reddit, and other socials.

          So feel free to use gifted article links when needed in the future.

          1 vote
  3. [2]
    the_man
    Link
    Thanks so much.
    English is not one of my first languages, so apologies in advance.
    I teach undergrad college and I want to require the use of ChatGPT in my Fall environmental health class. This article is an excellent tool to let students understand that maybe ChatGPT is a very powerful and fast parrot. Maybe that is what we are modeling our students to be: capture the pattern and repeat it!
    I am old enough to remember that when handheld calculators became widespread, many thought that learning math would become obsolete. What actually happened was that deeper math became easier to teach.
    I expect, optimistically, that by using ChatGPT my students will be able to focus on how to use knowledge for the best outcome in a given situation that needs solutions. I am planning to ask them to write good prompts that produce the best simple and complete 150-word explanations of complex problems, like overgrazing, desertification, CO2 quotas and the global CO2 market, etc.
    Currently, I see ChatGPT as a much better version of a search engine. Eventually, we will not be able to capture all the complexity of what it can offer. Sometimes that happens to me when I use a search engine for, as an example, changes in fruit trees' roots due to poor environmental conditions. I just do not understand 80% of the results. With ChatGPT, I can ask the system to provide answers of increasing complexity until I reach a good level of understanding and then move on to applying that knowledge and onto the next inquiry.
    Will there be a time in which we will just trust AI to the point of not questioning its answers to our prompts? We do not argue with a calculator when we use it well, but we could if a manufacturer used the wrong programming. Will we be able to detect those AI mistakes and have conversations about them?
    One of my goals is to help my students discover how tedious tasks (like writing a report using their parroting skills) can be optimized with ChatGPT, and how they could focus on applying the knowledge contained in their parroted reports to a specific situation. Maybe keeping on using AI until they fully understand what they are doing and are able to defend/explain their final opinion is one way of conceptualizing what I would like to do.
    I would love to know what you think.

    4 votes
    1. userexec
      Link Parent
      I think that's an interesting idea that will help students develop critical skills for the future, and I'm surprised to see it used in this type of class already.

      I bought a subscription to GPT4 yesterday because I like using ChatGPT to help my language learning. I've been taking Japanese classes for about 3 years and am in that intermediate stage where I know much of the basic grammar, but need routine interaction and uncertainty to improve. ChatGPT can speak Japanese, so I use it as a sort of instant, asynchronous pen pal. Reading Japanese media is great, but having a directable conversation immediately at any hour with instant feedback and the ability to ask what was meant by something is such a valuable experience for language learning.

      I hadn't thought to use ChatGPT in a general learning context like this to teach students how to express what they'd like to know, discover their own knowledge gaps, and mine for new knowledge. I could see this becoming a critical research skill with AI specifically, but also very helpful to their development of critical thinking in general.

      1 vote
  4. [3]
    Bipolar
    Link
    That’s a pretty good article, kinda surprised that it takes so little to become that good. I mean, the sentences were still nonsensical, but they were at least real words.

    I need to start playing with these things on my setup.

    3 votes
    1. unga
      Link Parent
      Indeed. Now just imagine how much data the tech giants use, and we begin to understand just how powerful these models are.

      3 votes
    2. Glissy
      Link Parent
      The first time I asked ChatGPT to write me some code and it produced a perfectly functioning thing, I was more than a little impressed. It's pretty astounding when it gets things right.

      2 votes
  5. [6]
    MdPhoenix
    Link
    Forgive me, but this doesn't really tell me anything. They gave it all of Jane Austen's works and then presented it with a sentence. What was its goal? To finish the paragraph?

    1 vote
    1. [5]
      onyxleopard
      Link Parent
      The goal, in this toy model, is to predict the next character, repeatedly, until a special “STOP” character is predicted. It’s a bit more complicated in some fancier models as they will be trained to predict special things called tokens that are more like words than characters, but the principle remains the same. That’s why these models are called generative—they are trained to generate sequences based on previous sequences.
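
      That predict-until-STOP loop can be sketched in a few lines of Python. This is only an illustration: a toy bigram counter stands in for the trained network (a real GPT would replace `random.choice` over bigram counts with a transformer's predicted distribution), but the generation loop is the part that matches the description above.

      ```python
      import random

      # Toy character-level generator: count which characters follow which
      # in a tiny corpus, then repeatedly sample the next character until
      # the special STOP symbol is drawn or a length cap is hit.
      STOP = "\x00"
      corpus = "it is a truth universally acknowledged" + STOP

      # Bigram table: for each character, the characters that followed it.
      counts = {}
      for prev, nxt in zip(corpus, corpus[1:]):
          counts.setdefault(prev, []).append(nxt)

      def generate(prompt, max_len=80):
          out = list(prompt)
          while len(out) < max_len:
              choices = counts.get(out[-1])
              if not choices:
                  break
              nxt = random.choice(choices)  # "predict" the next character
              if nxt == STOP:
                  break                     # the model decided to stop
              out.append(nxt)
          return "".join(out)

      print(generate("it"))
      ```

      Swapping the bigram table for a network with more context is what turns this babble generator into a GPT; the outer loop stays the same.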

      7 votes
      1. [4]
        PantsEnvy
        Link Parent
        I was a little surprised they didn't call tokens out as another difference between BabyGPT and the larger GPT models.

        BabyGPT predicted the next character. Larger GPTs, I thought, predict the next token, which could be a letter, a word, or a symbol.
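
        To make the difference concrete, here's a toy greedy longest-match tokenizer with a made-up vocabulary. Real GPTs use a trained byte-pair encoding (e.g. via OpenAI's `tiktoken` library), not this scheme; the sketch just shows why predicting tokens takes far fewer steps than predicting characters.

        ```python
        # Hypothetical mini-vocabulary: whole words, fragments, and single
        # characters as a fallback. Real BPE vocabularies hold ~50k entries.
        vocab = ["pride", "prejudice", "and", " ", "pre", "ju", "dice",
                 "p", "r", "i", "d", "e", "a", "n"]

        def tokenize(text):
            tokens = []
            while text:
                # Greedily take the longest vocab entry that prefixes the text.
                match = max((v for v in vocab if text.startswith(v)), key=len)
                tokens.append(match)
                text = text[len(match):]
            return tokens

        chars = list("pride and prejudice")
        tokens = tokenize("pride and prejudice")
        print(len(chars), len(tokens))  # 19 characters vs. 5 tokens
        ```

        A character-level model needs 19 prediction steps for that phrase; a token-level one needs 5, at the cost of carrying a much larger output vocabulary.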

        2 votes
        1. [3]
          onyxleopard
          (edited )
          Link Parent
          LLMs can predict any unit you care to train them on. Here’s Karpathy offering a short argument as to why tokenization isn’t necessarily desirable. Right now, given compute budgets, training tokenizers with large, but not too large, vocabularies is likely still desirable. I think you can get away with byte-level prediction, but I haven’t seen any state-of-the-art model architectures trained on pure bytes. You could go as far as predicting bits (so, essentially, three vocabulary items: 0, 1, and STOP). But, I suppose at that point you’re really asking the model to learn text-encoding in addition to natural language (and might have to do something to clean up illegally encoded predictions).
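
          A quick illustration of the byte-level trade-off (standard Python, nothing model-specific): the vocabulary shrinks to 256 byte values plus STOP, but sequences grow for non-ASCII text, and the model can emit byte strings that aren't valid UTF-8 at all.

          ```python
          # Plain ASCII: one byte per character, so sequence length is unchanged.
          text = "Jane Austen"
          print(len(text), len(text.encode("utf-8")))  # 11 11

          # Non-ASCII: "ï" is two bytes in UTF-8, so the byte sequence is longer.
          multi = "naïve"
          print(len(multi), len(multi.encode("utf-8")))  # 5 6

          # An "illegally encoded" prediction: a lone UTF-8 lead byte with no
          # continuation byte. A byte-level model could emit this, and the
          # output would need cleanup (here, the replacement character).
          bad = bytes([0xC3])
          print(bad.decode("utf-8", errors="replace"))
          ```

          So byte-level prediction trades tokenizer training away for longer sequences plus the burden of learning the text encoding itself, as noted above.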

          2 votes
          1. [2]
            DataWraith
            Link Parent
            > I think you can get away with byte-level prediction, but I haven’t seen any state-of-the-art model architectures trained on pure bytes.

            There's ByT5, which is a variant of Google's T5 architecture. They find that not using tokens is beneficial with regards to robustness (e.g. spelling errors), but of course the sequences get longer, so you may have to spend more compute or, alternatively, live with a shorter context window.

            3 votes
            1. onyxleopard
              Link Parent
              Right! I forgot about ByT5. It shares its heritage with Google’s mT5 base models, but I haven’t seen anyone clamoring to build RLHF-based chat bots on top of either ByT5 or mT5. Is this because these models still use the encoder/decoder model architecture?

              2 votes
  6. [7]
    cfabbro
    Link
    I have changed the title back to the original, since it wasn't very descriptive of the actual contents. If you want to say it's the best you've ever read in order to encourage people to read it, please just make a comment for that instead, @unga, since having that directly in the title is pretty clickbaity.

    p.s. Please label this comment Offtopic, so it doesn't distract from any on-topic discussions here.

    10 votes
    1. [2]
      unga
      Link Parent
      Thanks. First post. Trying my best to navigate the site appropriately.

      12 votes
      1. cfabbro
        Link Parent
        No worries. It's all good. Thanks for submitting it in the first place. I look forward to reading it. And welcome to Tildes. :)

        7 votes
    2. [4]
      mycketforvirrad
      Link Parent
      Ha! This is hilarious. The headline keeps changing on the NYT site. What a bamboozler!

      5 votes
      1. [3]
        cfabbro
        Link Parent
        LOL. A lot of papers also use A/B testing to see which titles maximize their views, which makes trying to nail down a title here difficult. But we'll just have to do the best we can with what we got. :P

        6 votes
        1. [2]
          mycketforvirrad
          Link Parent
          But this is wackier than A/B tested headlines. This is an interactive headline rapidly scrolling through a changing sequence of subjects. How do you even get that across in a static headline on Tildes?!

          6 votes
          1. cfabbro
            Link Parent
            Yeah, I just realized that. Hah. I have edited the title to remove the author name, since that keeps changing anyways. :)

            5 votes