Ponder.cat
    Lugh@futurology.today (M) to Futurology@futurology.today · English · 5 months ago

    When AI is tested on questions it can't model from pre-existing answers on the internet, it only scores 10% in the test.

    qz.com

    Researchers just stumped AI with their most difficult test — but for how long?
    A new AI benchmark called "Humanity's Last Exam" stumped top models
    • NuraShiny [any]@hexbear.net · 5 months ago

      No, because this test will now be discussed and invalidated for that purpose.

      • Lugh@futurology.today (OP, M) · 5 months ago

        They say their answer to this issue is that they’ve released public sample questions while keeping the real questions private.

        https://agi.safe.ai/
