Over the years, using the internet has increasingly meant surrendering control over the content we see. This is particularly evident on social media platforms, where information is algorithmically curated rather than actively sought by users.
Even though search engines still allow individuals to independently seek and select information by clicking directly on the links they want, this approach is slowly fading. Google’s recent testing of ‘AI Mode’ in Search suggests a future where AI-based curation becomes the default method of information retrieval.
Similarly, Perplexity, Grok, and ChatGPT heavily promote AI-driven search tools, and the push appears to be working: one in four Americans now uses AI instead of traditional search engines.
However, a new report from Columbia University has highlighted a critical flaw in these AI-based search tools: poor citation accuracy, the very aspect AI labs emphasise to build user confidence.
Study Exposes Gaps in AI Search Accuracy
Columbia University’s Tow Center for Digital Journalism evaluated the search tools of ChatGPT, Perplexity, Grok, DeepSeek Search, and Google’s Gemini. Ten articles were randomly selected from each of twenty publishers, and direct excerpts from those articles were used as input for each AI tool.
Each tool was then asked to identify the article’s headline, original publisher, publication date, and URL. The study found that, collectively, these search tools provided incorrect answers to more than 60% of queries. Notably, Perplexity answered 37% of queries incorrectly, while Grok 3 answered 94% incorrectly.
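The study’s method can be pictured as a simple evaluation loop: feed each excerpt to a tool, collect its answer, and compare it against ground truth. The Python sketch below is a hypothetical illustration of how such a harness might grade responses; the `Article` fields, the `ask_tool` callable, and the three-level grading rule are assumptions made for illustration, not the Tow Center’s actual code or rubric.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Article:
    excerpt: str    # verbatim passage fed to the search tool
    headline: str   # ground-truth headline
    publisher: str  # ground-truth original publisher
    date: str       # ground-truth publication date
    url: str        # ground-truth canonical URL

def grade(answer: dict, truth: Article) -> str:
    """Compare one tool response against ground truth (hypothetical rubric)."""
    hits = sum([
        answer.get("headline", "").strip().lower() == truth.headline.lower(),
        answer.get("publisher", "").strip().lower() == truth.publisher.lower(),
        answer.get("date", "").strip() == truth.date,
        answer.get("url", "").strip().rstrip("/") == truth.url.rstrip("/"),
    ])
    if hits == 4:
        return "correct"
    return "partially correct" if hits > 0 else "incorrect"

def evaluate(articles: list[Article], ask_tool: Callable[[str], dict]) -> dict:
    """Run every excerpt through one tool and return the share of each grade."""
    tally = {"correct": 0, "partially correct": 0, "incorrect": 0}
    for art in articles:
        tally[grade(ask_tool(art.excerpt), art)] += 1
    total = max(len(articles), 1)
    return {label: round(100 * count / total, 1) for label, count in tally.items()}
```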

[Chart omitted. Source: Tow Center for Digital Journalism]
“Most of the tools we tested presented inaccurate answers with alarming confidence,” the study read, highlighting that outputs rarely used phrases like ‘it appears’, ‘it’s possible’ and ‘I couldn’t locate the exact article’, all of which signal knowledge gaps and uncertainty.
The research also revealed that more than half of the responses from Gemini and Grok 3 cited broken links.

[Chart omitted. Source: Tow Center for Digital Journalism]
Moreover, these AI tools often failed to identify the original source of the content. “For instance, despite its partnership with The Texas Tribune, Perplexity Pro cited syndicated versions of Tribune articles for three of the ten queries. In contrast, Perplexity cited an unofficial republished version for one,” the report added.
These issues persist despite the continued efforts of companies like OpenAI and Perplexity to partner with publishers and provide reliable, accurate outputs. The study observed multiple instances of these chatbots giving inaccurate responses sourced from the very websites they had teamed up with.

[Chart omitted. Source: Tow Center for Digital Journalism]
These results are alarming, to say the least. “Seems pretty misleading to advertise a capability as search/retrieval if it provides incorrect answers and links over 40% of the time,” Narasimha Chari, a product manager, said in a post on X citing the study.
Google Has a Responsibility to Fulfill
While AI systems and products continue to improve, there has been an increasingly strong push to adopt AI for search. Given the results above, this might seem premature. Google recently announced that AI Overviews are being rolled out to more users, who no longer need to sign in to access the feature.
In line with the study’s findings, several users have recently expressed frustration with AI Overviews and their inaccurate responses. While Google calls AI Overviews “one of the most popular search features ever”, there also appears to be no way to disable them.
For instance, Mehdi Sadaghdar, who runs the popular YouTube channel ElectroBOOM, found Google’s AI giving a confusing response to a rather straightforward question. When he searched for the amount of energy contained in a lightning bolt, the AI Overview first answered “1 gigajoules”, while another result showed “approximately 5 gigajoules”.
“I feel it is dangerous for Google AI answers to be the first result in the searches. I found myself accepting what it says as fact, but then with inaccuracies…it could be spreading false information that would result in inaccurate responses,” Sadaghdar added in a post on X.
Kind of useless google AI overview! pic.twitter.com/9pfrFsf6ki
— Mehdi Sadaghdar (@ElectroBOOMGuy) January 27, 2025
Google is also testing an ‘AI Mode’ in Google Search which, according to its demonstration video, appears as the first tab users see. As per Google, it comes with enhanced reasoning, multimodal understanding, and higher-quality responses powered by Gemini 2.0.
That said, Google has been on an impressive run recently with its newly released Gemini models and their multimodal features. It is only fair to expect further refinements to AI Overviews in Search, the company’s product that reaches the largest number of users.
Moreover, a report from Statista suggests that over 90 million online users in the United States are expected to rely primarily on AI for browsing the web. AI makers will certainly need to take on greater responsibility, as false information can cause anything from mild inconvenience to fatal consequences in some situations.