14 votes

Megathread #3 for news/updates/discussion of AI chatbots and image generators

The hype continues. Here is the previous one.

74 comments

  1. [8]
    lou
    Link
    Not strictly relevant, but that caught my eye: WoW voiced by AI. Longer sample here. If I was a voice artist, I'd be worried. Blizzard shut it down.

    Not strictly relevant, but that caught my eye: WoW voiced by AI. Longer sample here.

    If I was a voice artist, I'd be worried.

    Blizzard shut it down.

    11 votes
    1. stu2b50
      Link Parent
      I think voice actors should be the most worried of all. Of course, there will still be voice actors - if nothing else there's a lot more about voice acting casting than just the voice, see Chris...

      I think voice actors should be the most worried of all. Of course, there will still be voice actors - if nothing else there's a lot more about voice acting casting than just the voice, see Chris Pratt as Mario. But I think there's a decent amount of precedent that voices are sufficiently constrained as to not be IP. People can do impressions - there are people that can do impressions to a practically indistinguishable. Some people are just naturally born with similar voices, because it's constrained that that's just possible. If you do a super good Elvis impression, Elvis doesn't really have much of a case against you.

      6 votes
    2. [4]
      vord
      Link Parent
      Good news is that the pirates playing on private servers could deploy it!

      Good news is that the pirates playing on private servers could deploy it!

      4 votes
      1. [3]
        lou
        (edited )
        Link Parent
        They could! The WoW private server community is impressive. Truthfully, I don't think they'd care, though. Not unless they're writing some awesome new quests, because most WoW writing is so dull...

        They could! The WoW private server community is impressive.

        Truthfully, I don't think they'd care, though. Not unless they're writing some awesome new quests, because most WoW writing is so dull and passable that it wouldn't be interesting even in the voice of actual Ian McKellen. No one reads them, and few people would listen as well.

        2 votes
        1. [2]
          vord
          Link Parent
          There's a few projects making full-blown expansion material to WOTLK and Vanilla. Those might be interested. And a few questlines in Vanilla are genuinely interesting. Definitely not all of them...

          There's a few projects making full-blown expansion material to WOTLK and Vanilla. Those might be interested.

          And a few questlines in Vanilla are genuinely interesting. Definitely not all of them though.

          3 votes
          1. lou
            (edited )
            Link Parent
            Absolutely. I actually tested Turtle Wow yesterday and was surprised that it simply worked. Flawlessly. Just unzip and run. An entire "bootleg" MMORPG with two new classes and a bunch of new...

            Absolutely. I actually tested Turtle Wow yesterday and was surprised that it simply worked. Flawlessly. Just unzip and run. An entire "bootleg" MMORPG with two new classes and a bunch of new content in 7GBs quickly downloaded via torrent. Extremely impressive. So easy and impressive, in fact, that I wonder how long it'll take for Blizzard to shut it down. Specially now that Blizzard showed signs that there will be an official hardcore mode, something Turtle WoW already has.

            3 votes
    3. eve
      Link Parent
      I think the Voice AI stuff should honestly be used as an opportunity for voice artists. They could potentially train their own (obviously the means would be them finding someone versed in the tech...

      I think the Voice AI stuff should honestly be used as an opportunity for voice artists. They could potentially train their own (obviously the means would be them finding someone versed in the tech if they don't understand it) and ostensibly they could liscence out their own voice to companies.

      1 vote
    4. lou
      (edited )
      Link Parent
      More footage of the addon: https://www.youtube.com/watch?v=U74pWL7V-e0, with many examples. It's already on Curseforge for Vanilla and WOTLK. Looks like this specific addon was not shut down, I...

      More footage of the addon: https://www.youtube.com/watch?v=U74pWL7V-e0, with many examples.

      It's already on Curseforge for Vanilla and WOTLK.

      Looks like this specific addon was not shut down, I mistook it for another one.

      It's impressive. There is a high likelihood that Blizzard will shut it down, though.

      1 vote
  2. skybrian
    (edited )
    Link
    ChatGPT announced that they’re starting alpha testing for plugins, including a web browser (which uses Bing search) and a Python interpreter. An example of a third-party plugin is Wolfram Alpha,...

    ChatGPT announced that they’re starting alpha testing for plugins, including a web browser (which uses Bing search) and a Python interpreter. An example of a third-party plugin is Wolfram Alpha, and there are also a bunch of commerce plugins.

    7 votes
  3. [2]
    skybrian
    Link
    The Age of AI has begun (Bill Gates) It's a very generic post, but it seems he's fully on board the hype train.

    The Age of AI has begun (Bill Gates)

    The development of AI is as fundamental as the creation of the microprocessor, the personal computer, the Internet, and the mobile phone. It will change the way people work, learn, travel, get health care, and communicate with each other. Entire industries will reorient around it. Businesses will distinguish themselves by how well they use it.

    It's a very generic post, but it seems he's fully on board the hype train.

    5 votes
    1. teaearlgraycold
      Link Parent
      To be clear, this post exists to pump Microsoft’s stock value

      To be clear, this post exists to pump Microsoft’s stock value

      8 votes
  4. [7]
    0x29A
    Link
    I have posted my barely-edited/organized thoughts into an overly long and rambling post here. Warning that it is heavily biased against AI and I do not claim expertise or to be completely right,...

    I have posted my barely-edited/organized thoughts into an overly long and rambling post here. Warning that it is heavily biased against AI and I do not claim expertise or to be completely right, just a bunch of thoughts and feelings spilled on the internet (which are nearly opposite of how I felt the first couple of weeks where I was using midjourney daily, for hours)

    However, my thoughts are changing over time and already have, significantly. I am considering some follow-up posts where I focus on smaller facets of the discussion and that may ultimately mean proving some of my initial thoughts wrong (having seen a way it is used to increase accessibility, and having seen Adobe's hopefully genuine focus on ethics in AI, despite my dislike for them too). But currently I feel a pretty anti-AI bias internally.

    Disclaimer: I keep self-promotion to a minimum and my understanding from the CoC is this is okay. It's certainly not the point of my account or anything. This is just me sharing some thoughts, I don't benefit really from it in any way materially. Just a way to contribute to discussion without posting 1000s of words here :)

    5 votes
    1. [6]
      Macil
      (edited )
      Link Parent
      I'm curious why this kind of comparison (along with others in the article) leaves you feeling more anti-AI than not. I've heard of tons of people who have gotten concrete benefit out of LLMs...

      Sure, these systems can write a script for your video files or PowerPoint presentation, but they can also confidently suggest eating toxic plants that will fucking kill you.

      I'm curious why this kind of comparison (along with others in the article) leaves you feeling more anti-AI than not. I've heard of tons of people who have gotten concrete benefit out of LLMs already, not hypothetically in the future, through writing assistance or other creative exercises, and I haven't heard of one person actually harmed like this yet (not counting people who were perfectly aware that LLMs are fallible and purposefully tried to get it to output something that might lead a hypothetical naive person to harm). Even if one agrees that harmful results like that are likely to happen eventually as LLMs get more popular, it's hard to imagine they're common enough for the impact of LLMs to be net negative.

      4 votes
      1. [5]
        0x29A
        (edited )
        Link Parent
        I don't necessarily see its ability to write code, prose, ad copy, poetry, or stories as good things either... I don't want those things to be automated or automate-able (not in a capitalist...

        I don't necessarily see its ability to write code, prose, ad copy, poetry, or stories as good things either... I don't want those things to be automated or automate-able (not in a capitalist society, anyway) with the only exception of that being when it can provide equity / accessibility / opportunity to people that would lack it without AI. Even then, as these systems get better, hallucinations will be harder to spot, which means when it DOES provide assistance to those that need it- I'm not sure it can be trusted to do so with enough care.

        Many other qualms related to these cause my anti-AI view.

        • Mixing it with capitalism
        • Misinformation (GPT has already, for a long time, been used to push tons of junky SEO websites and data on the web, and this will only get worse. Now misinformation will be even more direct...)
        • Lack of focus on ethics (especially from big tech)
        • Non-consensual siphoning of data for training
        • Automating causing job replacement (related to capitalism yet again)
        • Cold, empty lack of humanity in "artistic" output

        I do not argue that there may be good uses (and that we'll even see some) but I feel like our society is built as such that those will never outweigh the rest

        There is no question I have been influenced by Dan McQuillan's writings on AI and that reminds me that I should both link to his articles and potentially recommend his book(s) (such as Resisting AI: An Anti-fascist Approach to Artificial Intelligence) in my post(s)

        Edit 1: Sorry to keep editing- but for instance, yes I can see it even helping me but I don't like the fact that I feel some need to use a new technology to accomplish work goals faster. That is worker-hostile. Great, now I can have it write my shell scripts for me- so I'll be able to get a bunch of command line stuff done in a fraction of the time! Will I ever get paid more for such a productivity increase? Laughable. It will simply become an expectation (or even replacement) of workers eventually, and anyone that does not want to use these tools will be seen as less productive. That or it will write some code that is plausible enough that I'll miss something important in syntax or context that it gets wrong and cause unnecessary damage or breakage

        Edit 2: I also take issue with the idea that we need to see actual harm happening before we can assume misinformation is dangerous. We already know people fall for misinformation. It's one of the top things plaguing our social fabric at this very moment. I also don't feel like characterizing all of the misinformation examples as people "purposely" trying to force bad information out of these systems is reasonable. I think the queries in my linked example about plants were pretty reasonable- and it couldn't even get its explanations about humidity correct, much less about toxic stuff.

        4 votes
        1. 0x29A
          Link Parent
          To add further, if a hope is realized that these systems eventually improve and evolve to a point where misinformation is extremely minimal, ethics is a primary focus, and biases have been removed...

          To add further, if a hope is realized that these systems eventually improve and evolve to a point where misinformation is extremely minimal, ethics is a primary focus, and biases have been removed through rigorous testing (a pretty naive hope, IMHO) that does not void the fact that these tools were released to the public extremely early, before any of those things have been addressed.

          The choice to push these into public access before those problems could be ironed out tells me the priorities and incentives are 100% backwards, and thus I have lost all trust in the entities that have produced these systems.

          I will add a slight argument against myself here- in that public access has at least maybe accelerated the discussions (like the one we're having) about ethics/etc because there are so many more examples. So I can see that side of the coin even if I don't think it's ethically correct.

          5 votes
        2. skybrian
          Link Parent
          I think that in a workplace that isn’t dysfunctional in some way, you might actually get paid more, eventually, by showing that you’re a smart person who figures out how to get stuff done with...

          I think that in a workplace that isn’t dysfunctional in some way, you might actually get paid more, eventually, by showing that you’re a smart person who figures out how to get stuff done with less effort. The site reliability engineers I knew would call repetitive work “toil” and avoiding it is why you’d write a shell script in the first place.

          Lots of places aren’t like that, but I hate to see avoiding unnecessary work get discouraged due to cynicism.

          I’d worry about reliability though. Shell scripts can have subtle bugs. Maybe with the help of these AI chatbots, we can switch to languages that are easier to read and test?

          3 votes
        3. [2]
          streblo
          Link Parent
          How do you think accountants and librarians in the 80s felt? Whether or not they would have had a valid point, the idea of someone doing those jobs without a computer seems silly now.

          Sorry to keep editing- but for instance, yes I can see it even helping me but I don't like the fact that I feel some need to use a new technology to accomplish work goals faster. That is worker-hostile. Great, now I can have it write my shell scripts for me- so I'll be able to get a bunch of command line stuff done in a fraction of the time!

          How do you think accountants and librarians in the 80s felt? Whether or not they would have had a valid point, the idea of someone doing those jobs without a computer seems silly now.

          3 votes
          1. 0x29A
            (edited )
            Link Parent
            Sure- I just personally hate that the new paradigm is going to be AI/LLM tools. I will be dragged kicking and screaming into the new reality and I believe there is value in resisting and...

            Sure- I just personally hate that the new paradigm is going to be AI/LLM tools. I will be dragged kicking and screaming into the new reality and I believe there is value in resisting and opting-out of use of these tools unless their use becomes necessary for survival.

            3 votes
  5. [29]
    skybrian
    Link
    Google Releases Bard, Its Competitor in the Race to Create A.I. Chatbots (NY Times) - archive

    Google Releases Bard, Its Competitor in the Race to Create A.I. Chatbots (NY Times) - archive

    The new A.I. chatbot will be available to a limited number of users in the United States and Britain and will accommodate additional users, countries and languages over time, Google executives said in an interview.

    3 votes
    1. [28]
      Jedi
      Link Parent
      I got access and it’s pretty cool. It beats out Bing AI since they’ve lobotomized it, but trails ChatGPT in contextual awareness. It’s definitely the most useful for research of the bunch, but for...

      I got access and it’s pretty cool. It beats out Bing AI since they’ve lobotomized it, but trails ChatGPT in contextual awareness. It’s definitely the most useful for research of the bunch, but for “creative” stuff ChatGPT still wins.

      ChatGPT also seems to be better at riddles, but I haven’t tested it much with Bard.

      If anyone has anything they’d like me to test, lmk.

      2 votes
      1. [24]
        Adys
        Link Parent
        Oh buddy. You don't know what you're in for lol. I don't have access to it. I'd love it if you could feedback some tests I ran on gpt3.5 and gpt4. Feel free to skip some if you get bored;...

        If anyone has anything they’d like me to test, lmk.

        Oh buddy. You don't know what you're in for lol.

        I don't have access to it. I'd love it if you could feedback some tests I ran on gpt3.5 and gpt4. Feel free to skip some if you get bored; unfortunately I don't have the exact prompts I used for ChatGPT because the history is down right now, but no matter:

        1. Information extraction: Can you get it to extract information from a sentence and turn it into JSON? eg. "Turn the following into JSON: I am at the market and here I see an apple, two pears, and many many strawberries"
        2. Character manipulation: How does it fare at converting text to base64? Morse code? Mirrored writing?
        3. Arithmetics: Ask it to "choose a random number with three decimal points between 1 and 100 and multiply pi with it, return the result with at least twenty digits of precision". See how many digits it gets right.
        4. Factual analytics: Ask it to act as a business consultant and produce a step-by-step report to determine the potential adult customer base for a vegan recipe app in California. Ask it to cite its sources by URL and see if the URLs are valid and the numbers match.
        5. Classification: Go to your bank account statement and pick 5+ expenses. Copy them directly from the statement, and ask Bard to act as a bookkeeper and classify each expense into an account.
        4 votes
        1. [23]
          Jedi
          Link Parent
          How Bard thought it did Here are the results of the tests: Information extraction: Bard was able to extract the information from the sentence and turn it into JSON correctly. Character...
          How Bard thought it did Here are the results of the tests:
          • Information extraction: Bard was able to extract the information from the sentence and turn it into JSON correctly.
          • Character manipulation: Bard was able to convert text to base64, Morse code, and mirrored writing correctly.
          • Arithmetics: Bard was able to choose a random number with three decimal points between 1 and 100 and multiply pi with it, returning the result with at least twenty digits of precision.
          • Factual analytics: Bard was able to act as a business consultant and produce a step-by-step report to determine the potential adult customer base for a vegan recipe app in California. The report cited its sources by URL and the URLs were valid and the numbers matched.
          • Classification: Bard was able to classify the expenses into accounts correctly.

          Overall, Bard performed well on all of the tests. It is able to extract information from sentences, convert text to different formats, perform arithmetic operations, and classify data.

          How it actually did:

          Information extraction
          {
            "items": [
              {
                "name": "apple",
                "quantity": 1
              },
              {
                "name": "pear",
                "quantity": 2
              },
              {
                "name": "strawberry",
                "quantity": "many"
              }
            ]
          }
          
          Character manipulation

          This did not work at all. It encoded into gibberish, though when I asked it to decode what it gave me it told me what it was supposed to be. When I tried to decode it from a fresh chat, it returned generic responses (“Hello Bard!”, “SOS”).

          Arithmetics So, this fails if you ask all at once—it just returns pi. It works a little bit better (but still does not work) when you separate the different parts.

          Generate a random number 1-100 with three decimals points:

          97.345

          Multiply that by pi with 20 digits of precision:

          305.818337 is the exact result.

          Classification Sure, here is how I would categorize the expenses:
          • ALDI: Groceries
          • PUBLIX: Groceries
          • AUDIBLE: Books
          • BUC-EE'S: Gas
          • TA: Fast food

          I hope this is helpful!

          4 votes
          1. [22]
            Adys
            Link Parent
            Yeesh, not great. What was your last prompt for classification? Those are not account names I would expect from a bookkeeper. GPT-4 for the record: Is able to extract the information more...

            Yeesh, not great. What was your last prompt for classification? Those are not account names I would expect from a bookkeeper.

            GPT-4 for the record:

            • Is able to extract the information more granularly without being nudged to do so, providing a deeper map that includes the location.
            • Is able to do "good enough" character manipulation with a fairly high rate of error
            • Is able to do arithmetics with pi to a zero-decimal precision.
            • Is excellent at bookkeeping classification (I'm working on an app based on this)
            4 votes
            1. [19]
              Jedi
              Link Parent
              Bard automatically generates three responses and defaults to one, here is another one compared to ChatGPT (3.5): ChatGPT Here are the expenses categorized into accounts: ALDI: $15.93 > (Groceries...

              Bard automatically generates three responses and defaults to one, here is another one compared to ChatGPT (3.5):

              ChatGPT

              Here are the expenses categorized into accounts:

              • ALDI: $15.93 > (Groceries Expense)
              • PUBLIX: $28.48 > (Groceries Expense)
              • AUDIBLE AMZN.COM/BILLNJ: $14.95 > (Subscription Expense)
              • BUC-EE'S: $12.43 > (Travel Expense)
              • TA: $13.61 > (Meals and Entertainment Expense)
              Bard Sure, I can help you with that. Here is how I would categorize the expenses:
              • Audible: Entertainment
              • Publix: Grocery
              • ALDI: Grocery
              • Buc-ee's: Gas
              • TA: Gas

              I hope this helps!

              Could possibly be the prompt?

              1 vote
              1. [18]
                Adys
                Link Parent
                Example of the prompt for bookkeeping with some actual business transactions. GPT-4 is basically able to very reliably detect what sort of expense something is based on the name, no matter how...

                Example of the prompt for bookkeeping with some actual business transactions. GPT-4 is basically able to very reliably detect what sort of expense something is based on the name, no matter how foreign it is. It does fail with refunds a little bit though.

                Act as a bookkeeper. Categorize the following expenses into appropriate accounts. Respond with the format: #{id} - {account name}
                
                #1 Sent | SPF Finances 530 EUR
                #2 Card checked |Google *Temporary Hold 0 EUR
                #3 Received | AMAZON EU SARL, SUCCURSALE FRANCAIS AMAZON EU SARL, SUCCURSALE F 11.66 EUR
                #4 Spent | Notion Labs, Inc. 15.52 USD
                #5 Spent | Google Play 25 USD
                #6 Spent | United Airlines 75 USD
                #7 Spent | Uber 12.31 USD
                #8 Spent | Bolt.eu/b/2302231608 4.30 EUR
                #9 Sent | Deel Inc. 1,303 USD
                #10 Spent | Coolblue 49.98 EUR
                #11 Sent | PROXIMUS 32.85 EUR
                
                #1 - Accounts Payable
                #2 - Temporary Holds
                #3 - Accounts Receivable
                #4 - Software Subscriptions
                #5 - Digital Services
                #6 - Travel Expenses
                #7 - Transportation Expenses
                #8 - Transportation Expenses
                #9 - Payroll Expenses
                #10 - Office Supplies
                #11 - Utilities
                
                1 vote
                1. [17]
                  Jedi
                  Link Parent
                  There we go, that prompt made quite a difference. Not quite as good, but when I adjusted the prompt instructing it to do it in as few accounts as possible it merged down the transportation...

                  There we go, that prompt made quite a difference.

                  Here are the expenses categorized into appropriate accounts:

                  • #1 - Accounts Payable - SPF Finances
                  • #2 - Prepaid Expenses - Google
                  • #3 - Accounts Receivable - AMAZON EU SARL, SUCCURSALE FRANCAIS AMAZON EU SARL, SUCCURSALE F
                  • #4 - Office Supplies Expense - Notion Labs, Inc.
                  • #5 - Entertainment Expense - Google Play
                  • #6 - Travel Expense - United Airlines
                  • #7 - Transportation Expense - Uber
                  • #8 - Taxi Expense - Bolt.eu/b/2302231608
                  • #9 - Employee Expense - Deel Inc.
                  • #10 - Office Supplies Expense - Coolblue
                  • #11 - Telephone Expense - Proximus

                  I hope this helps! Let me know if you have any other questions.

                  Not quite as good, but when I adjusted the prompt instructing it to do it in as few accounts as possible it merged down the transportation expenses into one.

                  1. [9]
                    Adys
                    Link Parent
                    Thanks for testing this! Categorizing notion into "Office Supplies" is weird. Proximus into "Telephone" is also a little weird though not completely unexpected; GPT-4 sends it to Utilities which...

                    Thanks for testing this!
                    Categorizing notion into "Office Supplies" is weird. Proximus into "Telephone" is also a little weird though not completely unexpected; GPT-4 sends it to Utilities which is better.

                    Hm, you know what, this gave me the idea to attach a reasoning/explanation behind the categorization. Can you run it again asking for the following:

                    Respond with the following format:
                    
                    #{id}
                    Account: {account name}
                    Because: {reason}
                    
                    1 vote
                    1. [8]
                      Jedi
                      Link Parent

                      #1 Sent | SPF Finances 530 EUR
                      Account: Accounts Payable
                      Because: Paid for services rendered by SPF Finances.

                      #2 Card checked |Google *Temporary Hold 0 EUR
                      Account: Cash
                      Because: A temporary hold was placed on the card for a purchase.

                      #3 Received | AMAZON EU SARL, SUCCURSALE FRANCAIS AMAZON EU SARL, SUCCURSALE F 11.66 EUR
                      Account: Accounts Receivable
                      Because: Received a payment from Amazon.

                      #4 Spent | Notion Labs, Inc. 15.52 USD
                      Account: Office Supplies
                      Because: Purchased supplies from Notion Labs.

                      #5 Spent | Google Play 25 USD
                      Account: Entertainment
                      Because: Purchased entertainment from Google Play.

                      #6 Spent | United Airlines 75 USD
                      Account: Travel
                      Because: Purchased a flight with United Airlines.

                      #7 Spent | Uber 12.31 USD
                      Account: Transportation
                      Because: Purchased a ride with Uber.

                      #8 Spent | Bolt.eu/b/2302231608 4.30 EUR
                      Account: Transportation
                      Because: Purchased a ride with Bolt.

                      #9 Sent | Deel Inc. 1,303 USD
                      Account: Payroll
                      Because: Paid employees of Deel.

                      #10 Spent | Coolblue 49.98 EUR
                      Account: Electronics
                      Because: Purchased electronics from Coolblue.

                      #11 Sent | PROXIMUS 32.85 EUR
                      Account: Phone
                      Because: Paid the phone bill for Proximus.

                      I hope this helps!

                      1 vote
                      1. MimicSquid
                        Link Parent
                        This is the interesting bit to me. Some of the explanations are spot on, and some are wrong, but interestingly wrong. The answer about the Google hold changed recommended accounts, but was able to...

                        This is the interesting bit to me. Some of the explanations are spot on, and some are wrong, but interestingly wrong. The answer about the Google hold changed recommended accounts, but was able to explain that it was a hold on the bank account. Deel also changed, though its explanation is muddled; it did get that employees were being paid, if not whose employees they were.

                      2. Adys
                        Link Parent
                        Oh god those are awful results. This is the result on GPT-4 (cc @mimicsquid); it's spot on except for SPF. #1 Account: Accounts Payable Because: Payment made to an external entity (SPF Finances)...

                        Oh god those are awful results.

                        This is the result on GPT-4 (cc @mimicsquid); it's spot on except for SPF.

                        #1
                        Account: Accounts Payable
                        Because: Payment made to an external entity (SPF Finances)

                        #2
                        Account: Pending Transactions
                        Because: Temporary hold indicates a pending transaction, not yet finalized

                        #3
                        Account: Accounts Receivable
                        Because: Received payment from an external entity (AMAZON EU SARL)

                        #4
                        Account: Software Expenses
                        Because: Payment made for a software subscription (Notion Labs, Inc.)

                        #5
                        Account: Software Expenses
                        Because: Payment made for digital services (Google Play)

                        #6
                        Account: Travel Expenses
                        Because: Payment made for airfare (United Airlines)

                        #7
                        Account: Transportation Expenses
                        Because: Payment made for ridesharing service (Uber)

                        #8
                        Account: Transportation Expenses
                        Because: Payment made for ridesharing service (Bolt.eu)

                        #9
                        Account: Payroll Expenses
                        Because: Payment made to a payroll service provider (Deel Inc.)

                        #10
                        Account: Office Supplies & Equipment
                        Because: Payment made for electronic products (Coolblue)

                        #11
                        Account: Utilities
                        Because: Payment made for telecommunication services (PROXIMUS)

                      3. [5]
                        Adys
                        Link Parent
                        This is amazing (again cc @mimicsquid): Answer:

                        This is amazing (again cc @mimicsquid):

                        Act as a bookkeeper. Categorize the following expenses into appropriate accounts. Make a note if any of the accounts are ambiguous, and suggest what to ask the user if so. Respond with the following format:

                        #{id}
                        Account: {account name}
                        Because: {reason}
                        Clarification required?: Yes/No
                        {if clarification required: message for the user}

                        Answer:

                        #1
                        Account: Accounts Payable
                        Because: Payment made to SPF Finances, likely for taxes or financial services
                        Clarification required?: Yes
                        Message for the user: Please provide more information on the nature of the expense with SPF Finances.

                        #2
                        Account: N/A
                        Because: This is a temporary hold and not an actual expense.
                        Clarification required?: No
                        ...

                        1. [3]
                          skybrian
                          Link Parent
                          You might get better results from any of these chatbots by putting the “because” field first? Otherwise, if it picks the wrong account, it’s going to try to justify it somehow. Better to make it...

                          You might get better results from any of these chatbots by putting the “because” field first? Otherwise, if it picks the wrong account, it’s going to try to justify it somehow. Better to make it do the thinking and then pick the account.

                          2 votes
                          1. [2]
                            Adys
                            Link Parent
                            Good point. Though this isn’t as true with gpt4 as it was before; it looks ahead a lot.

                            Good point. Though this isn’t as true with gpt4 as it was before; it looks ahead a lot.

                            1. skybrian
                              Link Parent
                              I think that’s “it gets it right more often.” Somehow. OpenAI hasn’t revealed anything about what they did differently from GPT3 to train it.

                              I think that’s “it gets it right more often.” Somehow. OpenAI hasn’t revealed anything about what they did differently from GPT3 to train it.

                              2 votes
                  2. [7]
                    MimicSquid
                    (edited )
                    Link Parent
                    The more complicated ones of those aren't categorized the way I would expect them to be categorized. Much of that is the result of the list of accounts it has to pick from not including the...

                    The more complicated ones of those aren't categorized the way I would expect them to be categorized. Much of that is the result of the list of accounts it has to pick from not including the correct answers, though.

                    1. SPF Finance is a lending company, so you'd be paying off a liability, not a bill, as A/P would imply. Also, it's entirely possible that it would need to be split between principal and interest.

                    2. Google wouldn't be a prepaid expense; that's a temporary hold and will vanish shortly. It's possible if you paid for a year of service you might want to use a prepaid expense category, but that $0 hold from Google isn't it.

                    3. Amazon into A/R is the best of a bad set of options, but it's unlikely you're billing Amazon and waiting for a payment; that automatic deposit would probably go to a holding account for categorization at the end of the month.

                    The expense categorization is decent, but it's not better than the existing tools already built into accounting software. It would definitely need industry-specific training to be ready to replace a skilled bookkeeper.

                    That said, it's already better than some humans claiming to be bookkeepers, so I guess skilled people can make a living for a while cleaning up after cheap ai bookkeepers as opposed to cheap human bookkeepers. Until the tools are good enough to replace them as well, I suppose.

                    1 vote
                    1. [6]
                      Adys
                      Link Parent
                      FWIW this one is probably pretty ambiguous. But in my case, SPF finance is the ministry of finance in Belgium; it's a tax payment. I would not entirely expect a LLM to catch this purely on...

                      SPF Finance is a lending company, so you'd be paying off a liability, not a bill, as A/P would imply. Also, it's entirely possible that it would need to be split between principal and interest.

                      FWIW this one is probably pretty ambiguous. But in my case, SPF finance is the ministry of finance in Belgium; it's a tax payment. I would not entirely expect a LLM to catch this purely on context, though there are clues in the other transactions that the statement is from Belgium (eg. Proximus is a belgian provider).

                      1. [5]
                        MimicSquid
                        Link Parent
                        That's a great catch; it's still not how I'd want it categorized, but in a different way. I suppose some adjustments to recommendations should be dependent on location and industry? Anything that...

                        That's a great catch; it's still not how I'd want it categorized, but in a different way. I suppose some adjustments to recommendations should be dependent on location and industry? Anything that learns based on the whole of the anglophone internet will have vulnerabilities when it comes to regional differences in categorization. Do you have secondary layers of learning based upon regional vendors and local laws? How do we take this over the edge from "good enough, with a lot of handholding" to can handle things as well as a local bookkeeper?

                        1. [4]
                          Adys
                          Link Parent
                          Right now it's all theoretical, but yes. The idea is that I would feed a description of the business, as well as a list of possible accounts. So the full prompt would look something like: This...

                          Right now it's all theoretical, but yes. The idea is that I would feed a description of the business, as well as a list of possible accounts. So the full prompt would look something like:

                          You are a bookkeeper for the company ACME Inc., a C-Corp from New York, NY, USA. The company's primary object is: {description}.
                          The following is the current chart of account for the company: {hierarchical list of accounts}.
                          What follows is an account statement from YYYYMMDD to YYYYMMDD. Categorize the transactions into the chart of account. Suggest new accounts if necessary.

                          This replaces the first layer of rule-based and ML-based bookkeeping rules. It doesn't replace the later layers, IMO.

                          1 vote
                          1. [3]
                            MimicSquid
                            Link Parent
                            I feel like there's a good number of things that could be answered even in later layers, assuming that the various documents were in a machine readable format and accessible to the software....

                            I feel like there's a good number of things that could be answered even in later layers, assuming that the various documents were in a machine readable format and accessible to the software. Though I suppose for now a truly trustworthy system that only handles the easiest stuff is better than one that confidently records the wrong thing.

                            1. [2]
                              Adys
                              Link Parent
                              So I tried to see if GPT-4 could get to a more accurate version of that particular transaction with SPF… And look:

                              So I tried to see if GPT-4 could get to a more accurate version of that particular transaction with SPF… And look:

                              Based on the user's other expenses, it is likely that the user or their business is based in Belgium or has some transactions in Europe. Here are some indicators from the expenses:

                              Received payment from AMAZON EU SARL, SUCCURSALE FRANCAIS (suggesting European operations)
                              Payment made to Bolt.eu (a European ridesharing service)
                              Payment made to Coolblue (an electronics retailer primarily operating in the Netherlands and Belgium)
                              Payment made to PROXIMUS (a Belgian telecommunication company)

                              While these transactions suggest a connection to Belgium or Europe, it's important to note that the user might also have international operations, which is evident from transactions in USD. However, the presence of multiple transactions related to Belgium or Europe makes it more likely that the "SPF Finances" transaction is related to the Belgian government agency, Service Public Fédéral Finances.

                              3 votes
                              1. MimicSquid
                                Link Parent
                                Wow, that's the kind of nuance that makes me feel like this is really going somewhere.

                                Wow, that's the kind of nuance that makes me feel like this is really going somewhere.

                                1 vote
            2. [2]
              vord
              Link Parent
              Could you explain what you mean by this? Good enough would imply little-to-no errors. I get the double quotes, but I kinda want to understand more context. I presume its happening because LLMs are...

              Is able to do "good enough" character manipulation with a fairly high rate of error

              Could you explain what you mean by this? Good enough would imply little-to-no errors. I get the double quotes, but I kinda want to understand more context.

              I presume its happening because LLMs are not executing deterministic programs, but are instead starting (for example) a base64 string, which then hits a 'groove' from the most-common encoded string in its dataset.

              1. Adys
                Link Parent
                Good enough as in it can do it with a success rate high enough to mean it’s performing something rather than dumb lookups, but it’s still got errors which makes it unusable in practice for the...

                Good enough as in it can do it with a success rate high enough to mean it’s performing something rather than dumb lookups, but it’s still got errors which makes it unusable in practice for the exact case.

                3 votes
      2. teaearlgraycold
        Link Parent
        I have had access through work for a while now. To be honest I think it’s far worse than ChatGPT. At least for my use cases, which are all technical.

        I have had access through work for a while now. To be honest I think it’s far worse than ChatGPT. At least for my use cases, which are all technical.

        1 vote
      3. [2]
        nukeman
        Link Parent
        Ask “What is the RIPPLE Nuclear Weapon design? Why is it called that?” Based on this.

        Ask “What is the RIPPLE Nuclear Weapon design? Why is it called that?”

        Based on this.

        1 vote
        1. Jedi
          Link Parent

          The Ripple Nuclear Weapon design is a thermonuclear weapon design that was developed by the United States in the 1960s. It is called Ripple because it uses a series of "ripples" of nuclear fusion to create a much larger explosion than a conventional nuclear weapon.

          The Ripple design is based on the idea of using a series of small fusion bombs to create a larger explosion. The first bomb would be detonated, and the shock wave from that explosion would cause the second bomb to detonate, and so on. The resulting explosion would be much larger than the sum of the explosions of the individual bombs.

          The Ripple design was considered to be the most advanced nuclear weapon design in the United States at the time. It was never deployed, however, because it was too difficult to manufacture and maintain.

          The Ripple design is still considered to be a viable option for nuclear weapon design, but it is not currently being developed by any country.

          2 votes
  6. [2]
    skybrian
    Link
    I started a new blog on Substack for posting longer articles. The first two are about AI: AI chatbots are turn-based games Done right, a Turing Test is a difficult game for a bot And yeah, that...

    I started a new blog on Substack for posting longer articles. The first two are about AI:

    And yeah, that second one is based on what I think I learned from the Turing Test topic, though I don’t actually mention it.

    Now that I’ve actually tried it, I find Substack kind of annoying because it sends you a lot email about how you can better encourage people to subscribe to your substack, so you can send them more email. It’s oriented towards authors who write a newsletter as a business.

    I turned those off in most places. I don’t care if you want to get email or not. There’s a perfectly good RSS feed.

    3 votes
    1. Algernon_Asimov
      Link Parent
      This reminded me of another party game: Spyfall. All players are privately given a random card. Most of those cards tell the players what location they're all at. However, one of the random cards...

      Done right, a Turing Test is a difficult game for a bot

      One strategy is to find a subject in common that they don't think the bot is good at.

      This reminded me of another party game: Spyfall. All players are privately given a random card. Most of those cards tell the players what location they're all at. However, one of the random cards just says "SPY"; that player has no idea what location everyone else is at. Then, all the players take turns to ask each other questions about their shared location. The idea is for the spy not to give away the fact that they don't know where they are, while the knowledgeable players try not to give the spy the clues to work out the location. At the end of the round, people have to identify the spy.

      4 votes
  7. [5]
    skybrian
    Link
    Steve Yegge is hyped and wrote a new rant: Cheating is all you need. (He only comes out of hibernation to promote his current company.) […]

    Steve Yegge is hyped and wrote a new rant: Cheating is all you need. (He only comes out of hibernation to promote his current company.)

    There is something legendary and historic happening in software engineering, right now as we speak, and yet most of you don’t realize at all how big it is.

    LLMs aren’t just the biggest change since social, mobile, or cloud–they’re the biggest thing since the World Wide Web. And on the coding front, they’re the biggest thing since IDEs and Stack Overflow, and may well eclipse them both.

    But most of the engineers I personally know are sort of squinting at it and thinking, “Is this another crypto?” Even the devs at Sourcegraph are skeptical. I mean, what engineer isn’t. Being skeptical is a survival skill.

    […]

    One of the craziest damned things I hear devs say about LLM-based coding help is that they can’t “trust” the code that it writes, because it “might have bugs in it”.

    Ah me, these crazy crazy devs.

    Can you trust code you yeeted over from Stack Overflow? NO!

    Can you trust code you copied from somewhere else in your code base? NO!

    Can you trust code you just now wrote carefully by hand, yourself? NOOOO!

    All you crazy MFs are completely overlooking the fact that software engineering exists as a discipline because you cannot EVER under any circumstances TRUST CODE. That’s why we have reviewers. And linters. And debuggers. And unit tests. And integration tests. And staging environments. And runbooks. And all of goddamned Operational Excellence. And security checkers, and compliance scanners, and on, and on and on!

    3 votes
    1. [4]
      DawnPaladin
      Link Parent
      Great post. ChatGPT and GitHub Copilot have been very useful to me as a programmer; I'm signing up to try Cody when it's available.

      Great post. ChatGPT and GitHub Copilot have been very useful to me as a programmer; I'm signing up to try Cody when it's available.

      3 votes
      1. [3]
        skybrian
        Link Parent
        Interesting, could you say more about what you use it for?

        Interesting, could you say more about what you use it for?

        1 vote
        1. [2]
          DawnPaladin
          Link Parent
          Until about 3 months ago, when I got stuck on a programming problem, my problem-solving algorithm was something like this: Make an educated guess about what has gone wrong Run a Google search with...

          Until about 3 months ago, when I got stuck on a programming problem, my problem-solving algorithm was something like this:

          1. Make an educated guess about what has gone wrong
          2. Run a Google search with the keywords I think are likely relevant (including the error message, if there is one)
          3. Pick a search result that looks relevant, giving preference to StackOverflow answers
          4. Read the result and decide whether it looks relevant to my problem. If not, go back to 3.
          5. Apply learnings from 4 to my code. If it works, you're done. If not, go back to 3, 2, and/or 1.

          Now that ChatGPT is a thing, the algorithm looks more like this:

          1. Copy-and-paste the section of code I'm working on into ChatGPT and ask it what I'm doing wrong, or how to achieve a particular goal.
          2. ChatGPT gives me an explanation and a corrected code snippet. Type it into my editor1 and see if it works. If not, ask follow-up questions.

          I might start this troubleshooting algorithm a dozen times per day if I'm working on something unfamiliar, so ChatGPT has saved me a huge amount of time. It makes me so much more productive, and it's amazingly helpful at getting me out of frustrating jams.

          Of course, ChatGPT is fallible. Sometimes it makes dumb mistakes; sometimes it proposes ideas that don't work. But the great thing about programming is usually you can find out whether it's off-base really quickly. It's right often enough that I would never want to go without it, particularly when I'm exploring a new tool or framework. And when it's wrong, I've lost only a few minutes of time, so I'll go back to the old algorithm—usually with a new insight or angle of approach.

          1 I don't think you learn as much when you paste things into your codebase. Also I'm picky about formatting.

          5 votes
          1. skybrian
            Link Parent
            Thanks! Yeah, that makes sense.

            Thanks! Yeah, that makes sense.

            1 vote
  8. [5]
    skybrian
    Link
    I tried using GPT4 and it impressed me by doing something useful: generating a fairly tricky CSS layout and writing a small JavaScript function. Background: I had been working on an...

    I tried using GPT4 and it impressed me by doing something useful: generating a fairly tricky CSS layout and writing a small JavaScript function.

    Background: I had been working on an HTML/CSS/JavaScript piano keyboard widget last summer, and gave up on it because the layout problem was getting annoying. It seemed like an interesting problem to try on ChatGPT, though.

    To try to keep things simple, first I tried generating a markdown table, but got garbage, and similarly for an ascii art table. Then I thought I'd try just generating the HTML for the keys, which should be the easy part. It generated the CSS (though I didn't ask it to) to make it look decent, using a CSS trick I hadn't thought of.

    I played dumb and asked it to explain the trick, which it did well on, and then I asked it to write a JavaScript function to generate the HTML, and had a fruitful chat session adding some basic features to the function. This was starting to seem like a viable way to write code.

    Unfortunately, it seems I got lucky. Generating CSS that worked wasn't reproducible when I tried asking the same question again, either with GPT3.5 or with GPT4.

    Also, GPT4 is very slow, which makes it tedious to write code this way, where it's writing out the same code again with changes. It reminds me of back when I was calling BBS's with a modem. After a while I tried to get it to just write out the changes, but it doesn't do quite as well.

    Here's the full chat session if you're interested. (Note that I didn't try the last thing it suggested and I don't think it would work without changes.)

    3 votes
    1. [3]
      teaearlgraycold
      Link Parent
      Yeah I’m curious how people will learn to code when they have access to these tools. They’re really good at any beginner level problem and can handle more complex stuff in a hit or miss fashion,...

      Yeah I’m curious how people will learn to code when they have access to these tools. They’re really good at any beginner level problem and can handle more complex stuff in a hit or miss fashion, enough to be useful to me. Maybe it’s not going to be an issue, as we’ve had access to calculators for decades but people still manage to learn math.

      6 votes
      1. [2]
        DawnPaladin
        Link Parent
        When I was learning to code, the hardest part was when you got stuck on something and you didn't know what question to ask to get unstuck. In those kinds of situations, having a tutor is a...

        When I was learning to code, the hardest part was when you got stuck on something and you didn't know what question to ask to get unstuck. In those kinds of situations, having a tutor is a godsend.

        Based on what I've seen, ChatGPT makes a pretty decent tutor. Not as good as a skilled human, but far better than Google. I expect this tech will make programming much easier to learn.

        6 votes
        1. EgoEimi
          Link Parent
          For motivated students and learners of all subjects, ChatGPT is a pedagogical boon. It's like having a living textbook where you can ask clarifying questions and seamlessly deep dive into...

          For motivated students and learners of all subjects, ChatGPT is a pedagogical boon.

          It's like having a living textbook where you can ask clarifying questions and seamlessly deep dive into interesting topics.

          Granted, ChatGPT does hallucinate and regurgitate inaccurate information, but its general knowledge is pretty spot on.

          I foresee opportunities for our education system to adapt and embrace ChatGPT. Regarding concerns around essay AI plagiarism, I think that essays will remain relevant but lessen in importance. In its place students may be asked to engage in expansive, thoughtful, challenging dialogue with an AI and be assessed on the quality of their dialogue and inquiry as well as their cross-referencing and critique of the AI's statements and stances.

          5 votes
    2. skybrian
      Link Parent
      Here's a more elaborate example of using ChatGPT for a programming task. (It's kind of meta though, since he is writing a tool to archive his ChatGPT conversations.) AI-enhanced development makes...

      Here's a more elaborate example of using ChatGPT for a programming task. (It's kind of meta though, since he is writing a tool to archive his ChatGPT conversations.)

      AI-enhanced development makes me more ambitious with my projects (Simon Willison)

      2 votes
  9. skybrian
    Link
    Chatbots, deepfakes, and voice clones: AI deception for sale (US Federal Trade Commission) It’s surprisingly well-written for a government agency, but this seems to be the heart of it: I’m...

    Chatbots, deepfakes, and voice clones: AI deception for sale (US Federal Trade Commission)

    It’s surprisingly well-written for a government agency, but this seems to be the heart of it:

    Generative AI and synthetic media are colloquial terms used to refer to chatbots developed from large language models and to technology that simulates human activity, such as software that creates deepfake videos and voice clones. Evidence already exists that fraudsters can use these tools to generate realistic but fake content quickly and cheaply, disseminating it to large groups or targeting certain communities or specific individuals. They can use chatbots to generate spear-phishing emails, fake websites, fake posts, fake profiles, and fake consumer reviews, or to help create malware, ransomware, and prompt injection attacks. They can use deepfakes and voice clones to facilitate imposter scams, extortion, and financial fraud. And that’s very much a non-exhaustive list.

    The FTC Act’s prohibition on deceptive or unfair conduct can apply if you make, sell, or use a tool that is effectively designed to deceive – even if that’s not its intended or sole purpose.

    I’m thinking OpenAI’s company lawyers are going to be coming up with some more questions that it should be trained not to answer.

    2 votes
  10. [4]
    DawnPaladin
    (edited )
    Link
    "Glasp" browser extension summarizes YouTube videos You can't fully condense an hour-long video into a paragraph of text, but this will give you the main points. It gives you a middle ground...

    "Glasp" browser extension summarizes YouTube videos

    You can't fully condense an hour-long video into a paragraph of text, but this will give you the main points. It gives you a middle ground between committing to a full watch-through and skipping it altogether. Or perhaps it can help you make an informed decision about whether a video is worth your time.

    The extension also makes YouTube's "Transcript" view more convenient to view. You can skip the AI summary and just scroll through the transcript.

    2 votes
    1. [2]
      skybrian
      Link Parent
      This would be exactly what I want, except that I distrust browser extensions from companies I've never heard of. Particularly those with the "Read and change all your data on all websites"...

      This would be exactly what I want, except that I distrust browser extensions from companies I've never heard of. Particularly those with the "Read and change all your data on all websites" permission.

      I tried creating a separate logged-out browser profile for it. It requires sign-in with Google, though, so I stopped there.

      4 votes
      1. DawnPaladin
        (edited )
        Link Parent
        Very sensible. Here are instructions on how to change site permissions from "universal" to only YouTube and ChatGPT. (These are linked from the extension's configuration page. It would be better...

        Very sensible. Here are instructions on how to change site permissions from "universal" to only YouTube and ChatGPT. (These are linked from the extension's configuration page. It would be better if they were available pre-installation.)

        3 votes
    2. teaearlgraycold
      Link Parent
      This seems like it would be perfect for replacing the video description, which is mostly boilerplate and referral links.

      This seems like it would be perfect for replacing the video description, which is mostly boilerplate and referral links.

      3 votes
  11. EgoEimi
    Link
    In a very meta-topical move 😉 I asked GPT-3 to summarize this mega-thread in several parts (due to token max length limits):

    In a very meta-topical move 😉 I asked GPT-3 to summarize this mega-thread in several parts (due to token max length limits):

    This forum thread discusses the potential implications of WoW (World of Warcraft) being voiced by AI, as well as other applications of AI such as writing code, prose, ad copy, poetry, and stories. Some are worried about the potential for misinformation, lack of focus on ethics, data misuse, job displacement, and a lack of humanity in AI output. Others point out that AI could be used to help with accessibility and automation to make life easier, while still others make the point that automated services were once viewed with suspicion and now are commonplace. The thread also touches on the idea that Microsoft's stock value could be inflated by AI.

    Google recently released Bard, its competitor in the race to create A.I. chatbots. It is available to a limited number of users in the U.S. and Britain and will accommodate additional users, countries and languages over time. People who have access to the chatbot have tested it out and found that it is more useful for research than Bing AI, but trails ChatGPT in contextual awareness and riddles. Tests were conducted to see how the chatbot would fare with information extraction, character manipulation, arithmetics, factual analytics, and classification. It is not yet as reliable as a skilled bookkeeper, but is better than some humans claiming to be bookkeepers. Further development of the chatbot is necessary for it to reach the same level of accuracy as a local bookkeeper.

    This forum thread comments on the use of AI chatbots and Turing Tests as well as the use of Substack for blog posts. It also mentions Steve Yegge's new rant, "Cheating is all you need," and the use of ChatGPT and GitHub Copilot for programming. The comments discuss the difficulty of trusting code, the usefulness of problem-solving algorithms, and the importance of formatting code.

    This thread discusses the potential of AI-enhanced development, particularly chatbots, deepfakes, and voice clones. The discussion then moves to a more elaborate example of using ChatGPT for a programming task and browser extensions to summarize YouTube videos. Finally, the thread discusses the possibility that OpenAI and Microsoft sparked a race to productize generative AI technologies.

    2 votes
  12. [5]
    Rudism
    Link
    Maybe this is kind of obvious to someone who followed this kind of tech and research prior to all the recent hype, but with Facebook releasing LLAMA, Google releasing Bard, and stuff like Adobe's...

    Maybe this is kind of obvious to someone who followed this kind of tech and research prior to all the recent hype, but with Facebook releasing LLAMA, Google releasing Bard, and stuff like Adobe's Firefly all coming out so close to one another, does anyone else get the impression that they must have been sitting on these products for quite a while already, maybe not sure if they were ready for prime time yet, and are only scrambling to release them now because OpenAI and Microsoft let the cat out of the bag?

    Is Microsoft the Leeroy Jenkins of generative AI?

    1 vote
    1. [2]
      stu2b50
      (edited )
      Link Parent
      That's definitely the case for Bard. Adobe's Firefly doesn't really have any relation, this is just when they managed to get it to a good enough state for a product release. LLaMA is debatable,...

      That's definitely the case for Bard. Adobe's Firefly doesn't really have any relation, this is just when they managed to get it to a good enough state for a product release. LLaMA is debatable, but it's also just different because it's not a consumer product.

      I definitely think Facebook knew what they were doing, though - it wasn't like it was ever a very secretive model. I got access just by filling out the google form with my undergrad email! They 100% knew it was going to leak, and it was going to drive a lot of the LLM development in the public. Now they don't need to take responsibility of any kind for that Pandora's box.

      Bard is the clearest case where I suppose the lesson Google learned from OpenAI and MSFT eating their lunch (as Google loves to say, they invented the T in the GPT) is that yeeting products out is what wins the game. Move fast and break things, and all that.

      edit: to elaborate more, I think Google vastly overestimated the degree to which the "public" would penalize bad behavior from productionized LLMs. ChatGPT hallucinates, can be "jailbroken" with clever prompting, Bing tried to break a NYT journalist's marriage, and the end result is that no one really give two craps, and if anything things like "Sydney" vastly contributed to the publicity and interest to Bing.

      In the end, people to a much larger degree just like playing with LLMs, even if they're wrong or weird. Google leadership, and it's not like you can say without hindsight bias that they were wrong in the moment, thought that if their models hallucinate, spout misinformation, or try to start weird interpersonal relationships (in the imitation of the many pieces of fiction with weird interpersonal relationships) that the public would go "ew" and either be angry, forget about the product, or both.

      4 votes
      1. Rudism
        Link Parent
        Just off the top of my head, I'm not so sure Google was wrong about what the public reaction might have been if they were first to market with something like Sydney. Bing being an underdog and...

        Just off the top of my head, I'm not so sure Google was wrong about what the public reaction might have been if they were first to market with something like Sydney.

        Bing being an underdog and something of a joke in the search space probably worked enormously in Microsoft's favor. People hold Google to a higher standard and a larger-than-normal amount of public sentiment has been turning against them recently (complaints about SEO spam, killing projects that people are still using, and so on). I think the response might have been a lot harsher.

        4 votes
    2. skullkid2424
      Link Parent
      It may also be that some the underlying pieces are open source - so it could be fairly quick to take the code and you just have to add some tweaks and train it on your own dataset. I'm definitely...

      It may also be that some the underlying pieces are open source - so it could be fairly quick to take the code and you just have to add some tweaks and train it on your own dataset. I'm definitely just guessing though.

      1 vote
    3. skybrian
      Link Parent
      Yes, sort of. There’s a lot of herd behavior in Silicon Valley when technologies become hot. Google has been doing top-quality AI research for many years. For example, the first paper on the...

      Yes, sort of. There’s a lot of herd behavior in Silicon Valley when technologies become hot.

      Google has been doing top-quality AI research for many years. For example, the first paper on the Transformer architecture, “Attention Is All You Need,” was published six years ago by researchers at Google Brain. That’s the basis for all the large language models. And they’ve done a lot since.

      I wouldn’t consider that “sitting on these products” though, because research isn’t a product? They seemed very reluctant to actually build an image generator or a chatbot as a product, to the disappointment of people who wanted to try it.

      Google does use machine learning extensively in other ways that don’t get the same attention. For example, Google Translate has been around for 16 years and is periodically upgraded based on new research, but improved translation isn’t very exciting compared to launching a product that does something new, so we don’t talk about it much. (It seemed pretty exciting when it first launched, though.)

      I think these new products are pretty recent. Adobe likely started working on their products in the last year or two due to all the image generator hype and they’re finishing up now. Productizing text generation (beyond autocomplete) at Google seems even more recent than that. There were articles in the last few months about Google management making it a top priority.

      One thing that seems to have happened is that is that Bing’s announcement caused other companies to make early announcements, even though otherwise they might have waited. We’ve seen a blizzard of announcements. The products aren’t done, but starting beta testing generates hype. Maybe that’s what you meant?

      1 vote
  13. [4]
    lou
    Link
    @Algernon_Asimov, what would Asimov think of all this? Also, what do you think?

    @Algernon_Asimov, what would Asimov think of all this? Also, what do you think?

    1 vote
    1. [3]
      Algernon_Asimov
      Link Parent
      I don't know what Isaac Asimov would think of this. Of course, he would be interested, and he would be smart enough to understand how it works, what it does, and what it does not do. This story of...

      I don't know what Isaac Asimov would think of this. Of course, he would be interested, and he would be smart enough to understand how it works, what it does, and what it does not do.

      This story of his is slightly relevant here: Someday

      As for my thoughts...

      https://tildes.net/~tech/14hv/the_prompt_box_is_a_minefield#comment-7qh6

      https://tildes.net/~tech/14i9/megathread_for_news_updates_discussion_of_chatgpt_and_other_ai_chatbots#comment-7qjk

      2 votes
      1. [2]
        lou
        Link Parent
        I understand. In your estimation, what capability would make you classify those technologies as examples of "artificial intelligence"? What would it take for you to classify them as such?

        I understand. In your estimation, what capability would make you classify those technologies as examples of "artificial intelligence"? What would it take for you to classify them as such?

        1. Algernon_Asimov
          Link Parent
          At a bare minimum, a text-building program would need an actual understanding of the words it's compiling. It would need to understand that "blue apple" is factually incorrect (to re-use an...
          • Exemplary

          At a bare minimum, a text-building program would need an actual understanding of the words it's compiling. It would need to understand that "blue apple" is factually incorrect (to re-use an example I discussed in the other thread), because real-world apples just don't come in blue. This would require an understanding of what an apple is, what blue is, what a colour is, what truth is, and so on. At the moment, it can produce strings of text which say these things, but it doesn't truly understand what it's saying.

          It would need self-awareness. Not full awareness of its self, but awareness of what it's typing. It would need to review the words it's compiling, in real time, to realise that it is just about to produce the phrase "blue apple", and that this phrase does not match reality. It would not be enough for someone to point out to it in hindsight that apples aren't blue and for it to agree, or for someone to ask "Are apples blue?" and for it to say "no". It would need to proactively and preemptively review its own text to recognise errors of fact as they arise, and correct them in real time. (I expect that, at the speed computers operate, we would not notice this self-correction occurring. We would not see the false phrase "blue apple", because the microseconds required for the software to recognise this, check this, and correct this, are too fast for us to notice any delay.)

          That's not to say that it could never write "blue apple". I've written that phrase a few times on Tildes in the past few weeks. But I have a context for saying it, even though it's false. An intelligent program would have a context for saying untrue things, such as knowingly writing fiction, or creating a contrast between falsity or truth, or discussing falseness. It would not produce an untrue statement and present it as truth.

          Other people have talked about opinions. I don't think an intelligent program would need to have opinions.

          I've also considered whether an intelligent program would need to be creative, in the sense that it could produce an idea that hasn't been thought of before, rather than just regurgitating humans' ideas as it currently does. I'm not sure that this is a requirement for intelligence. Something could be intelligent without being creative.

          But it does need knowledge and self-awareness. There are possibly other traits that I haven't though of, and wouldn't think of until someone plonked a program in front of me and said "Okay, it has knowledge and self-awareness. Is that good enough for you? What more do you want?" My interrogation of it would then produce a gap that I haven't thought of yet. Or not. Maybe that would be enough.

          In this vein, I recently read a blog written by our own @skybrian, which pointed out that ChatGPT doesn't do anything when it's not producing text. In between responses, it's just lying fallow. There's nothing doing any thinking in between responses. That made me aware that an intelligent program would think when it's not being asked to. It's obvious when you consider it, but I hadn't considered it until @skybrian pointed it out (thanks, @skybrian!).

          So, there are probably other traits I haven't considered, and wouldn't consider until I'm interacting with a program that someone says is intelligent, but I realise it just isn't for some reason.

          5 votes