• jsomae@lemmy.ml · 5 months ago (edited)

    I’d just like to point out that, from the perspective of somebody watching AI develop for the past 10 years, completing 30% of automated tasks successfully is pretty good! Ten years ago they could not do this at all. Setting aside all the other issues with AI, I think we are all irritated with the AI hype people for saying things like they can be right 100% of the time – Amazon’s new CEO actually said they would be able to achieve 100% accuracy this year, lmao. But being able to do 30% of tasks successfully is already useful.

      • jsomae@lemmy.ml · 5 months ago

        I’m not claiming that the use of AI is ethical. But if you want to fight back, you have to take it seriously.

        • outhouseperilous@lemmy.dbzer0.com · 5 months ago

          It can’t do 30% of tasks correctly. It can do tasks correctly as much as 30% of the time, and since it’s LLM shit you know those numbers have been more massaged than any human in history has ever been.

          • jsomae@lemmy.ml · 5 months ago

            I meant the latter, not “it can do 30% of tasks correctly 100% of the time.”

              • jsomae@lemmy.ml · 5 months ago

                Yes, that’s generally useless. It should not be shoved down people’s throats. But 30% accuracy still has its uses, especially if the result can be programmatically verified.
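
                For example, something like this (just a sketch: generate_candidate() and verify() are hypothetical stand-ins for an LLM call and a conventional checker such as a test suite):

                ```python
                import random

                def generate_candidate(task: str) -> str:
                    # Stand-in for an LLM call.
                    return f"candidate answer #{random.randint(0, 9999)} for {task!r}"

                def verify(candidate: str) -> bool:
                    # Stand-in for a programmatic check (tests, a parser, a constraint solver).
                    return random.random() < 0.3   # pretend ~30% of candidates pass

                def solve(task: str, max_attempts: int = 10):
                    for attempt in range(1, max_attempts + 1):
                        candidate = generate_candidate(task)
                        if verify(candidate):
                            return candidate, attempt
                    return None, max_attempts

                print(solve("convert this CSV to JSON"))
                ```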

                • Knock_Knock_Lemmy_In@lemmy.world · 5 months ago

                  Run something with a 70% failure rate 10x and you get to a cumulative ~97% pass rate. LLMs don’t get tired and they can be run in parallel.
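
                  The arithmetic, assuming each attempt is an independent coin flip:

                  ```python
                  # Chance that at least one of 10 independent attempts succeeds,
                  # given a 70% failure rate per attempt.
                  p_fail, attempts = 0.7, 10
                  print(f"{1 - p_fail ** attempts:.1%}")  # 97.2%
                  ```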

                    • MangoCats@feddit.it · 5 months ago

                    I have actually been doing this lately: iteratively prompting AI to write software and fix its errors until something useful comes out. It’s a lot like machine translation. I speak fluent C++ but I don’t speak Rust; still, I can hammer away at the AI (with English-language prompts) until it produces passable Rust for something I could have written myself in C++ in half the time and effort.
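
                    Roughly, the loop looks like this (a sketch, not my exact setup: ask_llm() is a placeholder for whatever model you’re prompting, and rustc itself is the first verifier):

                    ```python
                    import pathlib
                    import subprocess
                    import tempfile

                    def ask_llm(prompt: str) -> str:
                        # Placeholder for the actual model call -- not a real API.
                        raise NotImplementedError("plug in your LLM client here")

                    def rust_compiles(source: str) -> tuple[bool, str]:
                        # The compiler is the cheapest verifier: if rustc rejects it, feed the errors back.
                        with tempfile.TemporaryDirectory() as tmp:
                            src = pathlib.Path(tmp, "candidate.rs")
                            src.write_text(source)
                            out = pathlib.Path(tmp, "candidate")
                            proc = subprocess.run(
                                ["rustc", "--edition", "2021", str(src), "-o", str(out)],
                                capture_output=True,
                                text=True,
                            )
                            return proc.returncode == 0, proc.stderr

                    def iterate(task: str, max_rounds: int = 5) -> str | None:
                        prompt = f"Write a Rust program that {task}. Output only code."
                        for _ in range(max_rounds):
                            code = ask_llm(prompt)
                            ok, errors = rust_compiles(code)
                            if ok:
                                return code  # next step: hammer on tests the same way
                            prompt = (
                                "This Rust code fails to compile:\n"
                                f"{code}\n\nCompiler errors:\n{errors}\n"
                                "Fix it. Output only code."
                            )
                        return None
                    ```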

                    I also don’t speak Finnish, but Google Translate can take what I say in English and put it into at least somewhat comprehensible Finnish without egregious translation errors most of the time.

                    Is this useful? When C++ is getting banned for “security concerns” and Rust is the required language, it’s at least a little helpful.

                  • jsomae@lemmy.ml · 5 months ago

                    The problem is they are not i.i.d. (the attempts aren’t independent), so this doesn’t really work. It works a bit, which is in my opinion why chain-of-thought is effective (it gives the LLM a chance to posit a couple answers first). However, we’re already looking at “agents,” so they’re probably already doing chain-of-thought.
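
                    Toy numbers to illustrate (made up, just to show the shape of the problem): if failures are correlated per task, i.e. the model either “gets” a task or it doesn’t, then retries mostly rescue tasks it could already do.

                    ```python
                    import random

                    random.seed(0)
                    TASKS, ATTEMPTS = 10_000, 10

                    # i.i.d. assumption: every attempt on every task succeeds with probability 0.3.
                    iid = sum(
                        any(random.random() < 0.3 for _ in range(ATTEMPTS))
                        for _ in range(TASKS)
                    )

                    # Correlated failures: 60% of tasks are out of reach, so retries never help;
                    # the rest succeed 75% per attempt. Single-attempt success is still 0.4 * 0.75 = 30%.
                    corr = sum(
                        random.random() < 0.4 and any(random.random() < 0.75 for _ in range(ATTEMPTS))
                        for _ in range(TASKS)
                    )

                    print(f"i.i.d. retries:     {iid / TASKS:.1%}")   # roughly 97%
                    print(f"correlated retries: {corr / TASKS:.1%}")  # roughly 40%
                    ```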

    • Shayeta@feddit.org · 5 months ago

      It doesn’t matter if you need a human to review. AI has no way of distinguishing between success and failure, so either way a human will have to review 100% of those tasks.

      • jsomae@lemmy.ml · 5 months ago

        Right, so this is really only useful in cases where it’s either vastly easier to verify an answer than to posit one, or where a conventional program can verify the result of the AI’s output.
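
        The textbook example of that asymmetry (numbers picked arbitrarily for illustration):

        ```python
        # Checking a proposed factorization is one multiplication;
        # producing one from scratch is a search.
        def verify(n: int, p: int, q: int) -> bool:
            return 1 < p < n and 1 < q < n and p * q == n

        def posit(n: int):
            d = 2
            while d * d <= n:        # brute-force trial division
                if n % d == 0:
                    return d, n // d
                d += 1
            return None              # n is prime

        n = 999_983 * 1_000_003      # two primes near a million
        print(verify(n, 999_983, 1_000_003))  # instant
        print(posit(n))                       # far slower than the check
        ```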

        • MangoCats@feddit.it · 5 months ago

          It’s usually vastly easier to verify an answer than posit one, if you have the patience to do so.

          I’m envisioning a world where multiple AI engines create and check each other’s work… the first thing they need to make work to support that scenario is probably fusion power.

          • zbyte64@awful.systems · 5 months ago

            > It’s usually vastly easier to verify an answer than posit one, if you have the patience to do so.

            I usually write 3x the code to test the code itself. Verification is often harder than implementation.

            • MangoCats@feddit.it · 5 months ago

              Yes, but the test code “writes itself”: the path is clear, you just have to fill in the blanks.

              Writing the proper product code in the first place, that’s the valuable challenge.

              • zbyte64@awful.systems · 5 months ago

                Maybe it is because I started out in QA, but I have to strongly disagree. You should assume the code doesn’t work until proven otherwise, AI or not.