7 votes

MDN’s AI Help and lucid lies

2 comments

  1. blindmikey
    (edited)

    Completely agree with this piece. To make substantial and crucial progress, we need more than what LLMs alone can provide. We need something that can continue to learn and correct itself. Personally, I think that will come with further research and development; it's simply a matter of time. But we certainly don't have that now, and to act like we do, while good for marketing, is harmful to users.

    3 votes
  2. creesch
    (edited)

    I think I agree with the article, although it is lacking some context on the MDN chat feature for those not familiar with it. Quickly glancing at the linked GitHub issue, it seems the chat feature is GPT-3.5 based.

    Knowing that, and knowing the limitations of GPT-3.5 or even GPT-4, it is somewhat surprising that Mozilla went ahead and implemented it on MDN. Although, on reflection, maybe not that surprising. One of the treacherous things about the most capable current-generation LLMs is that they are very convincing at first glance. So when developing proofs of concept based on them, they can "fool" people for much longer than you'd often see with traditional software.
    I put "fool" in quotations there because for a lot of information they give they are right. It is just that, like demonstrated in the article, they will also miss the mark. But when that happens, they are still doing so with 100% confidence.

    This means it is really easy to promise management the moon, but it also means you only start noticing the problems further down the development process, if you notice them at all. Developers implementing an LLM around a domain they are not familiar with might not notice at all. In that case, it depends highly on the QA processes in place whether the problems get flagged. And even if they do get flagged, you have by then spent a considerable amount of time on the product, and I can fully see some management layer pressing for release anyway.

    I still think LLMs are really great tools. I use ChatGPT and the OpenAI API on a daily basis, but mostly as a tool in my tool belt where I am fully aware of the limitations and make sure to only ask for things where I have enough domain knowledge to validate the outcome myself. If I do implement some automation around it, I only do so in a way that ensures people with the right knowledge validate the outcome.
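
    A minimal sketch of what I mean by that last part, assuming the official OpenAI Python client (openai >= 1.0); the function names, the model choice, and the console approval flow are just illustrative, not how you'd actually ship it:

    ```python
    # Human-in-the-loop gate around an LLM call: the model only drafts,
    # and a person with domain knowledge decides whether anything is published.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment


    def draft_answer(question: str) -> str:
        """Get an unverified draft from the model; never publish this directly."""
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",
            messages=[
                {"role": "system", "content": "Answer the documentation question concisely."},
                {"role": "user", "content": question},
            ],
        )
        return response.choices[0].message.content


    def publish_with_review(question: str) -> None:
        draft = draft_answer(question)
        print(f"--- DRAFT (unverified LLM output) ---\n{draft}\n")
        # The gate: nothing reaches users without explicit human approval.
        if input("Approve for publication? [y/N] ").strip().lower() == "y":
            print("Published.")
        else:
            print("Rejected; nothing was published.")


    if __name__ == "__main__":
        publish_with_review("What does Array.prototype.at(-1) return?")
    ```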

    If I had to implement something to be used by a wider audience, I would plaster warnings all around it to make people aware of the limitations.

    Having now checked out the MDN AI thing, I'd say they failed miserably there. The only thing I see is some very tiny gray text under the input:

    Results based on MDN's most recent documentation and powered by GPT-3.5, an LLM by OpenAI. Please verify information independently as LLM responses may not be 100% accurate. Read our full guidance for more details.

    Which, in the mind of some management type, probably covered their asses, but I think it is not nearly enough.

    2 votes