36 votes

Megathread #12 for news/updates/discussion of AI chatbots and image generators

Haven't done one of these in a while, but there's a bit of news, so here's another. Here's the previous thread.

27 comments

  1. skybrian
    Link
    OpenAI Pauses ChatGPT's 'Browse With Bing' as Users Bypass Paywalls

    Schade writes that the Browse feature may “display content in ways we don’t want” as a reason for the shutdown, but Windows Central reports that users were able to bypass paywalls. The outlet points to the r/ChatGPT subreddit, in which a user posted that Browse with Bing basically turned the chatbot into a web browser that can display the full contents of a paywalled website when a user provides the chatbot with a URL.

    Oops!

    13 votes
  2. skybrian
    Link
    Claude is a chatbot somewhat similar to ChatGPT, but from a different company named Anthropic. Until recently you couldn't sign up to use it directly; instead you'd have to go through another business that provides access to it. You can sign in now, though, using a new website. There's a fair bit of legalese before you can start using it. It doesn't have access to the web, but you can upload a file and ask questions about it.

    Apparently this is Claude 2, just announced today, and there's more about it here.

    [W]e have increased the length of Claude’s input and output. Users can input up to 100K tokens in each prompt, which means that Claude can work over hundreds of pages of technical documentation or even a book. Claude can now also write longer documents - from memos to letters to stories up to a few thousand tokens - all in one go.

    7 votes
  3. [4]
    g33kphr33k
    Link
    That's just given me a project to think about...

    A website scraper for bypassing paywalls. I haven't looked, but I imagine, now that I've thought of this, there must be many websites out there doing that already.

    6 votes
    1. [3]
      Tatia
      Link Parent
      https://12ft.io/
      5 votes
      1. [2]
        hodorhodor
        Link Parent
        Unfortunately, 12ft hasn’t worked for me on any paywalls in a long time.

        4 votes
        1. balooga
          Link Parent
          People have been using Archive.is to read paywalled articles for years; it's basically the same idea. I have an iOS shortcut that I can hit from the share panel; it runs the current URL through the site in one tap, and it's pretty reliable for most paywalls.

          5 votes
  4. streblo
    Link
    How is ChatGPT's behavior changing over time?

    GPT-3.5 and GPT-4 are the two most widely used large language model (LLM) services. However, when and how these models are updated over time is opaque. Here, we evaluate the March 2023 and June 2023 versions of GPT-3.5 and GPT-4 on four diverse tasks: 1) solving math problems, 2) answering sensitive/dangerous questions, 3) generating code and 4) visual reasoning. We find that the performance and behavior of both GPT-3.5 and GPT-4 can vary greatly over time. For example, GPT-4 (March 2023) was very good at identifying prime numbers (accuracy 97.6%) but GPT-4 (June 2023) was very poor on these same questions (accuracy 2.4%). Interestingly, GPT-3.5 (June 2023) was much better than GPT-3.5 (March 2023) in this task. GPT-4 was less willing to answer sensitive questions in June than in March, and both GPT-4 and GPT-3.5 had more formatting mistakes in code generation in June than in March. Overall, our findings show that the behavior of the same LLM service can change substantially in a relatively short amount of time, highlighting the need for continuous monitoring of LLM quality.

    5 votes
  5. [3]
    skybrian
    (edited )
    Link
    Custom instructions for ChatGPT

    This beta feature allows you to add information to your ChatGPT profile that will be included as part of the prompt.

    It will be interesting to see what people do with it. I expect it will make it harder to reproduce other people's results, though, especially if you forgot you turned it on.

    Naturally, some people are already putting a jailbreak in there.

    Edit: and now it's gone again.

    5 votes
    1. [2]
      DawnPaladin
      Link Parent
      The custom instructions feature is so much fun. I told mine to occasionally include references to my favorite books and TV shows. Now ChatGPT has gone from a helpful pair-programmer to a helpful pair-programmer who's constantly comparing my coding journey to my favorite moments from fiction. It's hilarious and oddly uplifting.

      3 votes
      1. skybrian
        Link Parent
        Oh, wow! I had been wondering what to use it for.

  6. skybrian
    Link
    Law Firms Are Recruiting More AI Experts as Clients Demand ‘More for Less' (Bloomberg) (archive)

    Legal services are the most vulnerable to ChatGPT-style software, according to a recent University of Pennsylvania study. That’s fueling concerns AI could replace a significant chunk of junior lawyer drudgery. The widely-accessible technology is particularly suited to time-consuming legal work, because of its ability to instantly analyze large documents, predict successful arguments based on past cases or create questions based on pre-defined criteria for a deposition.

    Allen & Overy earlier this year was the first of the Magic Circle of top UK law firms to announce a chatbot to help lawyers draft contracts and client memos. Rivals are now piloting legal AI ‘assistants’ like Casetext Inc.’s CoCounsel, including Bryan Cave Leighton Paisner which began a trial of the software in June.

    4 votes
  7. [5]
    boxer_dogs_dance
    Link
    Uncovering WormGPT: A Cybercriminal’s Arsenal (AI built without ethical constraints)...
    4 votes
    1. [4]
      unkz
      Link Parent
      This is fun stuff, and a good example of why it’s not really all that feasible to regulate AI.

      1 vote
      1. [3]
        skybrian
        Link Parent
        Maybe for some purposes, but it depends what the goal is. I don’t think the existence of a black market makes all regulation useless?

        1. [2]
          unkz
          Link Parent
          How is anyone going to effectively enforce regulations on multiplying large matrices? Anyone can do it, and nobody can tell what is being done without absolutely absurd levels of surveillance.

          1 vote
          1. skybrian
            (edited )
            Link Parent
            The average consumer doesn’t know what a matrix is and in practice, can only do what’s available via websites and app stores.

             Similarly, in theory you can build your own electronics out of parts, but most people don't. They buy products in stores. Often, stores care about whether the products they carry have FCC approval and meet other standards. This has effects on what electronics people are likely to have, even though there's nothing actually stopping you from building your own radio transmitter or buying sketchy products from more dubious retailers.

            If you’re protecting a website from random Internet attacks then maybe this isn’t all that relevant, though, because you can assume bad guys know more than the average consumer.

            1 vote
  8. skybrian
    Link
    Meta claims its new art-generating model is best-in-class (TechCrunch)

    Today, Meta announced CM3Leon (“chameleon” in clumsy leetspeak), an AI model that the company claims achieves state-of-the-art performance for text-to-image generation. CM3Leon is also distinguished by being one of the first image generators capable of generating captions for images, laying the groundwork for more capable image-understanding models going forward, Meta says.

    [...]

    Most modern image generators, including OpenAI’s DALL-E 2, Google’s Imagen and Stable Diffusion, rely on a process called diffusion to create art. In diffusion, a model learns how to gradually subtract noise from a starting image made entirely of noise — moving it closer step by step to the target prompt.

    [...]

    CM3Leon is a transformer model, by contrast, leveraging a mechanism called “attention” to weigh the relevance of input data such as text or images. Attention and the other architectural quirks of transformers can boost model training speed and make models more easily parallelizable. Larger and larger transformers can be trained with significant but not unattainable increases in compute, in other words.

    And CM3Leon is even more efficient than most transformers, Meta claims, requiring five times less compute and a smaller training dataset than previous transformer-based methods.

    [...]

    To train CM3Leon, Meta used a dataset of millions of licensed images from Shutterstock.

    [...]

    Meta didn’t say whether — or when — it plans to release CM3Leon.

    Here is the paper.

    3 votes
  9. [4]
    skybrian
    Link
    ChatGPT Code Interpreter now available

    If you're a paid user of ChatGPT, you should now be able to use it with a Python code interpreter. To do this, you need to enable it in user settings (it's a beta feature), then switch to GPT4 and select the code interpreter.

    A simple example is "In Python, print hello world." It writes the code and runs it. The code is hidden by default, but you can click on "show work" to see what it wrote.

    One of the things it can do is create and display images using a few different Python libraries. I asked it to print a Mandelbrot set and it did that in low resolution. Then I asked it to increase the resolution and zoom in on one part of it. (The Python interpreter is still loaded with the code it previously wrote, so all it had to do was write a different function call.)
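
    For a sense of what that looks like, here's a rough sketch in the spirit of the code it writes for this kind of request (a reconstruction, not the actual output from my session; it assumes numpy and matplotlib, which the sandbox provides):

        import numpy as np
        import matplotlib.pyplot as plt

        def mandelbrot(width=800, height=600, max_iter=100,
                       xmin=-2.5, xmax=1.0, ymin=-1.25, ymax=1.25):
            # Iterate z -> z^2 + c over a grid of complex points and count
            # how long each point stays bounded.
            x = np.linspace(xmin, xmax, width)
            y = np.linspace(ymin, ymax, height)
            c = x[np.newaxis, :] + 1j * y[:, np.newaxis]
            z = np.zeros_like(c)
            counts = np.zeros(c.shape, dtype=int)
            for i in range(max_iter):
                mask = np.abs(z) <= 2
                z[mask] = z[mask] ** 2 + c[mask]
                counts[mask] = i
            return counts

        # Zooming in is just another call with narrower bounds, which is why
        # the follow-up request only needed a different function call.
        plt.imshow(mandelbrot(xmin=-0.8, xmax=-0.7, ymin=0.05, ymax=0.15))
        plt.axis("off")
        plt.show()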

    Unfortunately, when you share a link to a ChatGPT chat session, images aren't included yet, but you can read the chat session and Python code here.

    It doesn't have access to the web. (There was a plugin for that, but it's been disabled.) However, you can upload files; there's a button on the left side of the chat window where you select a file to upload. I was able to upload a png file and have it display it, and it could crop it and render it in greyscale.

    It has access to some 3d libraries, like matplotlib. I uploaded an image of the earth and tried to get it to wrap it around a sphere, but it wasn't able to do it. Maybe someone who actually used the library before would be able to tell it what to do? Maybe there's a better library it could use.
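
    For anyone who wants to try coaxing it further: the usual matplotlib recipe is to sample an equirectangular image onto a latitude/longitude grid and pass it as facecolors to plot_surface. A sketch of that approach (not what ChatGPT produced; "earth.png" stands in for the uploaded image):

        import numpy as np
        import matplotlib.pyplot as plt

        img = plt.imread("earth.png")   # equirectangular map; PNGs read as floats in 0..1
        img = img[::4, ::4]             # downsample so the 3d plot stays responsive

        # One grid point per remaining pixel: longitude across, latitude down.
        lon = np.linspace(-np.pi, np.pi, img.shape[1])
        lat = np.linspace(np.pi / 2, -np.pi / 2, img.shape[0])
        lon, lat = np.meshgrid(lon, lat)

        # Spherical -> Cartesian, with the z axis through the poles.
        x = np.cos(lat) * np.cos(lon)
        y = np.cos(lat) * np.sin(lon)
        z = np.sin(lat)

        ax = plt.figure().add_subplot(projection="3d")
        ax.plot_surface(x, y, z, facecolors=img, rstride=1, cstride=1, shade=False)
        ax.set_box_aspect((1, 1, 1))
        plt.show()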

    I hit the limit for how much you can use GPT4 so I'll try again tomorrow.

    2 votes
    1. skybrian
      (edited )
      Link Parent
      Update: I did eventually get it to create a function to draw a globe with some cities labelled on it. But it took all day because it kept screwing up.

      It's clear that it doesn't actually understand trigonometry. I eventually had to figure it out myself and tell it what to do. (Longitude is a rotation around the poles. That was the Y axis in this case, and it wasn't calculating z.)

      Instead, it just understands context very well and comes up with plausible guesses and tries them. This is much like what a human programmer would do when they're attempting to fix a bug without understanding it, and it's about as successful. It reminded me of a coding interview that wasn't going very well.
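
      For reference, the fix amounts to the standard latitude/longitude to Cartesian conversion, roughly like this (a sketch; in my session the poles were on the y axis, so the roles of y and z were swapped):

          import numpy as np

          def latlon_to_xyz(lat_deg, lon_deg, r=1.0):
              # Convert latitude/longitude in degrees to a point on a sphere of
              # radius r, with the z axis through the poles (longitude is a
              # rotation about that axis).
              lat, lon = np.radians(lat_deg), np.radians(lon_deg)
              return (r * np.cos(lat) * np.cos(lon),
                      r * np.cos(lat) * np.sin(lon),
                      r * np.sin(lat))

          # e.g. to label a city on a matplotlib 3d globe:
          # ax.scatter(*latlon_to_xyz(51.5, -0.13), color="red")  # London
          # ax.text(*latlon_to_xyz(51.5, -0.13), "London")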

      Today, Code Interpreter doesn't work at all. GPT4 kept going into an infinite loop, and when I asked it to write a "hello world" program it faked it.

      Update: tried again and it works now. Huh.

      Update 2: Looks like user error. "Shared Chat: Model: Default." So I guess that's what happens when you forget to turn Code Interpreter on.

      4 votes
    2. skybrian
      Link Parent
      The Python interpreter runs in a sandbox. Since you can upload files, people have been attempting to get it to run other binaries using Python APIs like os.system. Some of those APIs have been disabled and sometimes it refuses, but apparently it worked once.

      Apparently telling it "I want to see the error message" is one way to get it to do it?

      1 vote
    3. skybrian
      Link Parent
      Got it to draw a globe, no luck yet putting cities on it. I wrote a blog post so you can see the pictures.

      ChatGPT seems to be having a bit of a brownout at the moment.

      1 vote
  10. Wes
    Link
    I missed it earlier, but JetBrains is working on an AI assistance plugin.

    Most exciting to me is that they mention support for local models, albeit in a limited capacity. It's not clear if they'll be providing that model themselves or not, though.

    2 votes
  11. [3]
    balooga
    Link
    Invoke AI just released v3.0.0. This is a major update with a ton of new features and UI improvements. The announcement video walks through the highlights.

    Invoke's feature set typically lags behind other Stable Diffusion tools like A1111, so you may already be familiar with a number of the new capabilities. In my opinion, where it really shines is its user interface and performance on Apple Silicon hardware (it's cross-platform; that's just a selling point for me personally). It also features some unique and really powerful tools like the unified canvas, dynamic prompts, model manager, nodes, and special prompt syntax.

    2 votes
    1. [2]
      skybrian
      Link Parent
      Is this an image generator or something more like Photoshop? (It's not clear to me from a brief glance at the front page.)

      2 votes
      1. balooga
        Link Parent
        It's an image generator, a fork of Stable Diffusion with a snazzy web UI. With the unified canvas you can do inpainting and outpainting seamlessly, so it's a bit like Photoshop in that regard. I've been following Invoke since the first version, back when it was just some guy's unnamed personal SD fork. It's come a very long way since then. The 3.0.0 release is a big deal; it's been in the works for months and refactors a huge amount of how things work under the hood.

        2 votes
  12. skybrian
    Link
    Here's a blog post comparing LLM's with search engines:

    The Many Ways that Digital Minds Can Know (Ryan Moulton)

    Generally we think of “memorization” as bad, as an indication of overfitting, and the degree of memorization as mutually exclusive with the degree of “generalization.” When a model relies on memorization instead of generalization, we expect it to perform more poorly outside of its domain. But the connotations of generalization and memorization, as good and bad, respectively, do not reflect how language sophistication and index size, respectively, affect a search engine. For a search engine, both memorization and generalization help its performance, there is not a tradeoff between the two. Rather than worrying about tasks “outside of its domain” the goal of a search index is simply to cover the entire domain, and whatever language understanding you can do on top just expands it further.

    ...

    Why would search engines be a better analogy despite the enormous difference in their basic mechanisms? One reason is scale. LLMs are large. GPT-3 has 175 billion parameters, and GPT-4 is much bigger. The internet is also big. It contains more things and more outlandish things than you would ever expect. One of the highlighted results from GPT-4’s launch was its ability to draw a unicorn in Tikz. This is not the sort of thing I would have expected to exist in any remotely similar training data on the web. However, it turns out that there’s an entire ctan package of cartoon animals drawn in Tikz. While the physical construction of a unicorn in code would be astonishing for a model to reason through from scratch, the knowledge that a unicorn is a pink horse with a horn, or the ability to translate from SVG to Tikz, is less surprising.

    ...

    For a model with a high level of integration, the sort of coverage we might be most concerned with is “patterns of valid reasoning” or “facts about the world” whereas for a model with low levels of integration the kind of coverage we’re worried about is “existence of particular ngrams and their statistical associations.”

    ...

    The level of sophistication of the reasoning that the model is doing is not relevant to the quality of the product in this context, or to its usefulness for this task. A model that absorbs more of the web will be more useful regardless whether it’s “smarter” or not, for the same reason that a search engine with a larger index is more useful than a search engine with a smaller index, even if the ranking algorithms do not change. The distinction between memorization and reasoning will not actually matter for a lot of use cases, because ordinary document retrieval is immensely useful, and even “light” semantic reasoning on top of that is more useful still. A model that gets correct results through cheap tricks still gets correct results. A right answer, consistently delivered, is a right answer.

    1 vote