In this article, we look at Google AI search results, which often appear at the top of the results page when you search on Google. The question we attempt to answer is: is the AI answer better than the standard search result?
To answer this question in a (mostly) objective way, we break it into two questions:
- How often does Google provide an AI answer that is correct?
- Of those correct answers, how often is the first non-AI search result either incomplete or incorrect?
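As a rough illustration of how these two questions turn into two percentages, here is a minimal Python sketch. The `Judgment` record, its field names, and the toy data are hypothetical and are not taken from the study. The point it shows is that the second percentage is conditional: only questions with a correct AI answer count in its denominator.

```python
from dataclasses import dataclass


@dataclass
class Judgment:
    """One graded question (hypothetical record layout, not from the study)."""
    question: str
    ai_correct: bool          # was the AI answer judged correct?
    top_result_lacking: bool  # was the first non-AI result incomplete or incorrect?


def summarize(judgments):
    """Return (share of AI answers judged correct,
    share of those correct answers where the first non-AI result fell short)."""
    if not judgments:
        raise ValueError("no judgments to summarize")
    correct = [j for j in judgments if j.ai_correct]
    ai_correct_rate = len(correct) / len(judgments)
    # The second metric only considers questions the AI answered correctly.
    if correct:
        lacking_rate = sum(j.top_result_lacking for j in correct) / len(correct)
    else:
        lacking_rate = 0.0
    return ai_correct_rate, lacking_rate


# Toy data for illustration only; these are not the study's questions or grades.
sample = [
    Judgment("q1", True, True),
    Judgment("q2", True, False),
    Judgment("q3", False, True),
    Judgment("q4", True, True),
]
ai_rate, lacking = summarize(sample)
print(f"AI correct: {ai_rate:.0%}; top result lacking among those: {lacking:.0%}")
```

In the toy data, q3 does not count toward the second percentage even though its top result was lacking, because the AI answer itself was wrong.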
Methodology
To perform this study, we identified 100 questions in many different fields, including programming, biology, history, and geography. Each of these questions has a complicated answer. We purposely didn’t include questions that have simple answers, like “how tall is Mt. Everest?”
Some examples of questions we asked are:
- How to get DocumentFragment children in CKEditor5?
- How is mRNA transported from the nucleus to cytoplasm?
In building our list of questions, we kept only questions for which Google provided an AI response. Questions without an AI response were omitted.
Results
We found that Google’s AI was correct on an excellent 97% of our questions. Of those correct answers, however, the AI answer was superior to the first non-AI search result only 48% of the time.
However, we found that Google’s AI was so often correct because it chose to answer only a small percentage of questions. That is, of the original questions we came up with for this study, fewer than 5% were actually usable. For the rest, Google provided no AI answer.
So while Google’s AI did an excellent job of correctly answering the questions it chose to answer, the actual number of tough questions it answered was small.
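To see how the two figures combine, here is a small Python sketch of the arithmetic. It assumes that "usable" means Google returned an AI answer, and it treats 5% as an upper bound on that coverage. The function name `overall_correct_share` is ours, for illustration only.

```python
def overall_correct_share(coverage, accuracy_when_answered):
    """Share of all candidate questions that received a correct AI answer."""
    if not (0.0 <= coverage <= 1.0 and 0.0 <= accuracy_when_answered <= 1.0):
        raise ValueError("both inputs must be fractions between 0 and 1")
    return coverage * accuracy_when_answered


# Figures from this article: under 5% coverage, 97% accuracy on answered questions.
upper_bound = overall_correct_share(0.05, 0.97)
print(f"At most about {upper_bound:.2%} of candidate questions got a correct AI answer")
```

Under these assumptions, fewer than 5 in 100 of the tough questions we originally wrote ended up with a correct AI answer, even though accuracy on the answered questions was 97%.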
Notably, Google’s AI provided its best answers to programming and tech questions. For these questions, it was able to provide in-depth answers specific to each question. In contrast, the top search results were more general or incomplete. This may be because (among other things) programming and tech topics have the largest set of training data.
Google’s AI search results differ from GPT in that they tend not to answer questions where the model is not confident of the response. As a result, the percentage of correct responses is high.
Conclusion
Google AI tends to provide correct responses to search queries. What is notable, however, is the number of queries that the AI doesn’t try to answer. Google is doing a good job of limiting the answers to questions that the model is confident about, instead of trying to answer everything.