I love to show that kind of shit to AI boosters. (In case you’re wondering, the numbers were chosen randomly and the answer is incorrect).

They go “waaa waaa, it’s not a calculator”, and then I can point out that it got the leading 6 digits and the last digit correct, which is a lot better than it did on the “softer” parts of the test.
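For anyone who wants to score an answer the same way, here is a minimal sketch of the "leading digits and last digit" check. The function name `digit_agreement` and both the operands and the wrong answer in the usage lines are made up for illustration; they are not the numbers from the test above.

```python
def digit_agreement(claimed: int, a: int, b: int) -> tuple[int, int, int]:
    """Return (matching leading digits, matching trailing digits, digits in exact product).

    Hypothetical helper: compares a claimed product against the exact a * b.
    """
    exact = str(a * b)      # Python ints are arbitrary precision, so this is exact
    guess = str(claimed)

    # Count digits that agree, scanning from the left until the first mismatch.
    leading = 0
    for x, y in zip(exact, guess):
        if x != y:
            break
        leading += 1

    # Same scan from the right.
    trailing = 0
    for x, y in zip(reversed(exact), reversed(guess)):
        if x != y:
            break
        trailing += 1

    return leading, trailing, len(exact)


# Illustrative operands and an illustrative wrong answer (assumptions, not real model output).
a, b = 123456, 654321
wrong_answer = 80779841286
print(a * b)                                # 80779853376
print(digit_agreement(wrong_answer, a, b))  # (6, 1, 11)
```

A correct answer returns the full length in all three positions, so anything less means the middle digits were made up.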

  • Architeuthis@awful.systems
    1 day ago

    Here’s the exact text in the prompt that I had in mind (found here); it’s in the function specification for the JS REPL:

    […] The analysis tool (also known as the REPL) can be used to execute code in a JavaScript environment in the browser.

    What is the analysis tool?

    The analysis tool is a JavaScript REPL. You can use it just like you would use a REPL. But from here on out, we will call it the analysis tool.

    When to use the analysis tool

    Use the analysis tool for:

    • Complex math problems that require a high level of accuracy and cannot easily be done with “mental math”
    • To give you the idea, 4-digit multiplication is within your capabilities, 5-digit multiplication is borderline, and 6-digit multiplication would necessitate using the tool.
    • […]

    What if this isn’t a case of being terminally AI pilled? What if this is the absolute pinnacle of what billions and billions of dollars in research will buy you when the requirement is that your lake-drying, sea-boiling LLM-as-a-service not look dumb next to a pocket calculator?

    • diz@awful.systemsOP
      13 hours ago

      Still seems terminally AI pilled to me, an iteration or two later. “5 digit multiplication is borderline”, how is that useful?

      I think there’s a combination of it being the pinnacle of billions and billions of dollars, and them probably firing people for the slightest signs of AI skepticism. There’s another data point: “reasoning math & code” was released as stable by Google without anyone checking whether it can do any kind of math.

      edit: imagine that a calculator manufacturer in the 1970s is so excited about microprocessors that they release an advanced scientific calculator that can’t multiply two 6-digit numbers (while their earlier discrete-component model could). Outside the crypto sphere, that sort of insanity is new.