Key Takeaways
- Google may pull the plug on its AI Overview feature soon due to hilarious yet inaccurate suggestions like glue on pizza cheese.
- Reddit data likely fuels Google’s AI Overviews, possibly leading to questionable results such as using glue in food preparation.
- AI unpredictability is a challenge for Google, with potentially dangerous results prompting the need for a reevaluation of the feature.
Unless you’ve been living under a rock, you’ve probably seen how Google’s new “AI Overview” feature has claimed that you should use Nutella to cool your PC instead of thermal paste, use glue to thicken your pizza cheese, or make mustard gas to clean your washing machine. It’s been a mess of hilarious proportions, and the craziest thing about it is that it’s hard to even say what Google can do about it.
Google’s “AI Overview” feature isn’t available in the EU, but with the way things are going, I could see Google pulling the plug on the feature in the near future for further tuning. A similar situation happened with Google’s Gemini image generator in the past, where the company eventually pulled it back and re-tuned it before releasing it again. Google says “Generative AI is experimental,” and that warning is extremely apt.
Google’s AI Overviews appear to be powered in part by Reddit
According to a report from Reuters back in February, Google struck a deal with Reddit to use the site’s content in its AI training sets. While Google and Reddit both declined to comment, one of the funny results that has come up seems to lend credence to Reddit being partly to blame. One user on X (formerly Twitter) shared a screenshot of Google telling them to thicken pizza cheese… with glue. Later, someone found a Reddit comment that was similar enough to raise eyebrows, though that comment, originally posted 11 years ago, has since been removed.
Of course, there are long-standing assertions that in order to make food look appealing in advertisements, companies use non-edible substances (like glue) in promotional imagery. It’s entirely possible that Google picked up on some of these comments and the AI managed to link up two entirely separate concepts in its training, but given that Gemini itself doesn’t give results like these, something weird is definitely happening with Google’s results that isn’t happening in its other models.
What we can assume about the data powering AI Overviews
It’s all educated guesswork
Firstly, we’re going to make a number of assumptions about how Google is operating its AI Overviews feature. This is entirely guesswork, based on what we know and what other outlets have reported. We’re assuming that Google is operating off a corpus of Reddit data along with its own data to generate these results.
What I suspect is happening is that Google is either using a version of its Gemini LLM trained with a higher weighting given to Reddit data, or it’s using Retrieval Augmented Generation (RAG) to pull data directly from that corpus of Reddit data. Google’s Gemma model supports RAG, so we know the company is working on it in the background. As we’ll explain, it’s most likely that if Google is using Reddit data, it’s using RAG to fetch that data in responses.
As a basic explainer, RAG improves prediction accuracy by using an external dataset during inference, lacing responses with relevant information from the documents in that dataset. Scale this up to a dump of Reddit data on top of the model’s own knowledge, and you would get results exactly like the ones we’ve been seeing. How often have you Googled something and then stuck “Reddit” at the end of it to get better results? Quite a lot, I imagine (I’m guilty of it too), and that’s exactly what Google would have been hoping to tap into.
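To make that concrete, here’s a minimal, hypothetical sketch of the retrieve-augment-generate loop. Everything in it (the toy similarity function, the corpus, the prompt format) is an assumption for illustration, not Google’s actual pipeline:

```python
# Minimal, illustrative RAG loop. These are toy stand-ins: real systems
# use learned embeddings and a vector database, and we don't know what
# Google's actual setup looks like.

def embed(text: str) -> set[str]:
    # Toy "embedding": a bag of lowercase words. Real systems use
    # dense vectors from a trained embedding model.
    return set(text.lower().split())

def similarity(a: set[str], b: set[str]) -> float:
    # Jaccard overlap as a stand-in for cosine similarity.
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical corpus of scraped Reddit comments.
corpus = [
    "Add about 1/8 cup of non-toxic glue to the sauce for more tackiness.",
    "Let the dough rest overnight in the fridge for better flavor.",
]

def answer(query: str, llm) -> str:
    # 1. Retrieve: rank documents by similarity to the query.
    q = embed(query)
    best = max(corpus, key=lambda doc: similarity(q, embed(doc)))
    # 2. Augment: splice the retrieved document into the prompt.
    prompt = f"Context: {best}\n\nQuestion: {query}\nAnswer:"
    # 3. Generate: the LLM answers conditioned on the retrieved text,
    #    so a joke comment in the corpus flows straight into the answer.
    return llm(prompt)
```

The key point is step 3: the model is encouraged to trust whatever was retrieved, so a joke comment that happens to score well on similarity ends up shaping the answer.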
Chat with RTX, a tool Nvidia launched earlier this year, uses RAG, so you can get a sense of how it works if you have an Nvidia graphics card and want to try it out. Whatever Google is doing clearly isn’t working, but it’s hard to say what exactly would work either. Assuming this is the setup that Google is using, this is a ridiculously hard problem to solve without major human intervention.
AI is inherently unpredictable, and that’s the problem
It’s both a blessing and a curse
AI is autonomous in the sense that it can be run and forgotten; aside from the training data and Reinforcement Learning from Human Feedback (RLHF), it’s pretty hands-off once deployed. Companies like OpenAI have struggled with “jailbreaks” and the like over time, but nothing truly damaging has come out of it. For context, though, Google’s AI Overview results show whether you like it or not, on one of the most visited web pages in the world, a page that the elderly and vulnerable use as well. In contrast, the damage that ChatGPT or other LLMs could do was limited to those who sought them out.
That unpredictability leads to a game of whack-a-mole, and Google can’t account for every bad search result that its AI Overview gives. LLMs hallucinate, and an LLM that uses RAG over Reddit responses will produce Reddit-powered hallucinations. Doesn’t that sound terrible? I think it sounds terrible.
Even if Google brings in blanket restrictions on the topics that its AI Overviews are allowed to cover, you’ll always find loopholes. From my own testing, I suspect it’s blocked on anything relating to politics, for example, but you could probably trick it into answering a question about the upcoming election if you were really trying. Google could run a sentiment analysis tool over the Reddit data to remove perceived “joke” responses, but the sarcastic, dry nature of some comments on Reddit means that some would slip through. In fact, I would be surprised if Google hadn’t already tried sentiment analysis to weed out comments like that.
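If Google did try that, it might look something like the sketch below: run an irony or sarcasm classifier over the dump and drop anything flagged as a joke. The specific model and threshold here are placeholders I picked for illustration, not anything Google has confirmed using:

```python
# Hypothetical pre-filter for a Reddit dump: drop comments an irony
# classifier flags as jokes before they enter the retrieval corpus.
from transformers import pipeline

# cardiffnlp/twitter-roberta-base-irony is one publicly available irony
# detector; any similar text-classification model could slot in here.
detector = pipeline("text-classification",
                    model="cardiffnlp/twitter-roberta-base-irony")

comments = [
    "Add about 1/8 cup of non-toxic glue to the sauce for extra tackiness.",
    "Knead the dough for 10 minutes, then let it rest for an hour.",
]

kept = []
for comment in comments:
    result = detector(comment)[0]
    # Label names vary by model; here "irony" marks a likely joke.
    if result["label"] == "irony" and result["score"] > 0.5:
        continue  # drop the suspected joke comment
    kept.append(comment)
```

The problem, of course, is the one above: deadpan Reddit sarcasm reads as perfectly sincere text, so the glue comment could plausibly sail through with a confident “not ironic” score.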
The same applies even if Google trained a specific LLM on Reddit data rather than using RAG, as there’s still the problem of figuring out what data needs to be removed from the training set. That’s why I suspect there’s a RAG component: it would allow Google to quickly and easily remove problematic responses from the database as they arise, rather than needing to modify an entire model.
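That operational difference is easy to picture. With RAG, fixing a reported result can be as simple as deleting the offending document from the retrieval store; a model trained on that comment would instead need its training data scrubbed and the model retrained. A hypothetical sketch, with the store layout and IDs invented for illustration:

```python
# Hypothetical moderation hook for a RAG corpus: when a bad AI Overview
# gets reported, remove the source document so it can never be retrieved.
retrieval_store: dict[str, str] = {
    "reddit-comment-001": "Add about 1/8 cup of non-toxic glue to the sauce...",
    "reddit-comment-002": "Let the dough rest overnight for better flavor.",
}

def remove_reported_source(doc_id: str) -> None:
    # One delete and the document is gone from every future retrieval.
    # The equivalent fix for a model *trained* on this comment would mean
    # filtering the training set and retraining or fine-tuning the model.
    retrieval_store.pop(doc_id, None)

remove_reported_source("reddit-comment-001")  # the glue comment, in this toy store
```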
What Google can do next
I suspect Google will pull back on AI Overviews for now
For now, I don’t see how Google keeps AI Overviews enabled in search. Some of the results users have seen have been outright dangerous, while others are just… incorrect. Massively so. It’s providing no value to users in that state, but it’s clear that Google wants it to work eventually. How exactly that will happen is anyone’s guess, but there’s a lot of work to be done.
For now, you’ll have to put up with Google’s AI Overview results. Don’t put glue in your pizza, don’t cool your PC with Nutella, and definitely do not make mustard gas to clean your washing machine.