• NeoNachtwaechter@lemmy.world
    3 months ago

    Turing test? LMAO.

    I simply asked it to recommend a supermarket in the nearest larger city.

    It came up with a name and listed a few of its qualities. Easy, I thought. Then I found out that no supermarket by that name exists. It was all made up.

    You could argue that humans lie, too. But only when they have a reason to lie.

    • Lmaydev@programming.dev
      3 months ago

      That’s not what LLMs are for. That’s like hammering a screw and being irritated it didn’t twist in nicely.

      The Turing test is designed to see if an AI can pass for human in a conversation.

      • NeoNachtwaechter@lemmy.world
        3 months ago

        > The Turing test is designed to see if an AI can pass for human in a conversation.

        I’m pretty sure that I could ask a human that question in a normal conversation.

        The idea of the Turing test was to have a way of telling humans and computers apart. It is NOT meant for putting some kind of ‘certified’ badge on that computer, and …

        > That’s not what LLMs are for.

        …and you can’t cry ‘foul’ if I decide to use a question for which your computer was not programmed :-)

  • NutWrench@lemmy.world
    3 months ago

    > Each conversation lasted a total of five minutes. According to the paper, which was published in May, the participants judged GPT-4 to be human a shocking 54 percent of the time. Because of this, the researchers claim that the large language model has indeed passed the Turing test.

    That’s no better than flipping a coin, and we have no idea what the questions were. This is clickbait.
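
For anyone who wants to check the coin-flip comparison: a minimal sketch of an exact two-sided binomial test against a 50% baseline. The quoted passage does not say how many judgments were made, so the trial counts below (100 and 1000) are purely hypothetical, and `binomial_two_sided_p` is just an illustrative helper name, not anything from the paper.

```python
import math


def binomial_two_sided_p(successes: int, trials: int, p0: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely
    than the observed one under the null hypothesis p = p0.
    """
    def pmf(k: int) -> float:
        return math.comb(trials, k) * p0**k * (1 - p0) ** (trials - k)

    observed = pmf(successes)
    # Small relative tolerance so ties are not lost to float rounding.
    total = sum(
        pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-9)
    )
    return min(1.0, total)


# HYPOTHETICAL sample sizes: the thread does not state the real one.
for trials in (100, 1000):
    successes = round(0.54 * trials)  # 54% judged "human"
    p = binomial_two_sided_p(successes, trials)
    print(f"{successes}/{trials} judged human: p = {p:.3f}")
```

With 100 hypothetical judgments, 54% is not distinguishable from a coin flip at the usual 0.05 level; with 1000 it would be. So whether "54 percent" means anything depends on a sample size the quote leaves out.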

    • SkyeStarfall@lemmy.blahaj.zone
      3 months ago

      While I agree it’s a relatively low percentage, not being sure and having people pick effectively randomly is still an interesting result.

      The alternative would be for them to never say that GPT-4 is a human, rather than saying so about 50% of the time.

          • Hackworth@lemmy.world
            3 months ago

            Aye, I’d wager Claude would be closer to 58-60%. And with the model-probing work Anthropic’s publishing, we could get to ~63% on average in the next couple of years? Those last few % will be difficult for an indeterminate amount of time, I imagine. But who knows. We’ve already blown by a ton of “limitations” that I thought I might not live long enough to see.