You know how Google’s new feature called AI Overviews is prone to spitting out wildly incorrect answers to search queries? In one instance, AI Overviews told a user to use glue on pizza to make sure the cheese won’t slide off (pssst…please don’t do this.)
Well, according to an interview at The Verge with Google CEO Sundar Pichai published earlier this week, just before criticism of the outputs really took off, these “hallucinations” are an “inherent feature” of AI large language models (LLMs), which are what drive AI Overviews, and this feature “is still an unsolved problem.”
They keep saying it’s impossible, when the truth is it’s just expensive.
That’s why they won’t do it.
You could only train AI with good sources (scientific literature, not social media) and then pay experts to talk with the AI for long periods of time, giving feedback directly to the AI.
Essentially, if you want a smart AI you need to send it to college, not drop it off at the mall unsupervised for 22 years and hope for the best when you pick it back up.
No, he’s right that it’s unsolved. Humans aren’t great at reliably knowing truth from fiction either. If you’ve ever been in a highly active comment section you’ll notice certain “hallucinations” developing, usually because someone came along and sounded confident and everyone just believed them.
We don’t even know how to get full people to do this, so how does a fancy Markov chain do it? It can’t. I don’t think you solve this problem without AGI, and that’s something AI evangelists don’t want to think about because then the conversation changes significantly. They’re in this for the hype bubble, not the ethical implications.
We do know. It’s called critical thinking education. This is why we send people to college. Of course there are highly educated morons, but we are hedging our bets. This is why the dismantling or co-opting of education is the first thing every single authoritarian does. It makes it easier to manipulate the masses.
It’s called critical thinking education.
Yeah, I mean, we have that, and parents are constantly trying to dismantle it. No amount of “critical thinking education” can undo decades of brainwashing from parents and local culture.
You could only train AI with good sources
I mean yes, but also no. If you only train it with “good sources” then you miss out on a whole bunch of other valuable information.
Just like scholar.google.com only has “good sources” but generally it’s not going to have the information that 90% of your search queries will be about.
I’ll let you in on a secret: scientific literature has its fair share of bullshit too. The issue is, it is much harder to figure out its bullshit. Unless it’s the most blatant horseshit you’ve scientifically ever seen. So while it absolutely makes sense to say, let’s just train these on good sources, there is no source that is just that. Of course it is still better to do it like that than as they do it now.
The issue is, it is much harder to figure out its bullshit.
Google AI suggested you put glue on your pizza because a troll said it on Reddit once…
Not all scientific literature is perfect. Which is one of the many factors that will make my plan expensive and time consuming.
You can’t throw a toddler in a library and expect them to come out knowing everything in all the books.
AI needs that guided teaching too.
Google AI suggested you put glue on your pizza because a troll said it on Reddit once…
Genuine question: do you know that’s what happened? This type of implementation can suggest things like this without it having to be in the training data in that format.
In this case, it seems pretty likely. We know Google paid Reddit to train on their data, and the result used the exact same measurement from this comment suggesting putting Elmer’s glue in the pizza:
https://old.reddit.com/r/Pizza/comments/1a19s0/my_cheese_slides_off_the_pizza_too_easily/
And their deal with Reddit: https://www.cbsnews.com/news/google-reddit-60-million-deal-ai-training/
It’s going to be hilarious to see these companies eventually abandon Reddit because it’s giving them awful results, and then they’re completely fucked
You’re wrong. Anyone who has ever used Google knows Reddit is an absolute goldmine of valuable information. The problem is it’s also full of jokes and puns and bad information, and AI isn’t able to sort one from the other (yet).
This doesn’t mean that there are Reddit comments suggesting putting glue on pizza or even eating glue. It just means that the implementation of Google’s LLM is half-baked and built its model in a weird way.
these hallucinations are an “inherent feature” of AI large language models (LLM), which is what drives AI Overviews, and this feature “is still an unsolved problem.”
Then what made you think it’s a good idea to include that in your product now?!
“We do AI now”. Shareholders creaming themselves, stocks going to the moon. New yacht for PichAI.
Since when has feeding us misinformation been a problem for capitalist parasites like Pichai?
Misinformation is literally the first line of defense for them.
But this is not misinformation, it is uncontrolled nonsense. It directly devalues their offering of being able to provide you with an accurate answer to something you look for. And if their overall offering becomes less valuable, so does their ability to steer you using their results.
So while the incorrectness is not a problem in itself for them (as you can see from his answer), the degradation of their ability to influence results is.
But this is not misinformation, it is uncontrolled nonsense.
The strategy is to get you to keep feeding Google new prompts in order to feed you more ads.
The AI response is just a gimmick. It gives Google something to tell their investors, when they get asked “What are you doing with AI right now? We hear that’s big.”
But the real money is getting unique user interactions for the purpose of serving up more ad content. In that model, bad answers are actually better than no answers, because they force the end user to keep refining the query and searching through the site backlog.
If you don’t know the answer is bad, which confident idiots spouting off on Reddit and being upvoted into infinity have proven is common, then you won’t refine your search. You’ll just accept the bad answer and move on.
Your logic doesn’t follow. If someone doesn’t know the answer and is searching for it, they likely won’t be able to tell if the answer is correct. We literally already have that problem with misinformation. And what sounds more confident than an AI?