Google recently unveiled Gemini, its latest suite of powerful AI models, but the tech giant is already under fire for allegedly misleading claims about its performance.
In an opinion piece for Bloomberg, columnist Parmy Olson raised concerns about the accuracy of a video demonstration Google presented during the Gemini announcement. The hands-on video showcased Gemini’s multimodal capabilities, combining spoken conversational prompts with image recognition. Olson, however, argued that the capabilities portrayed in the video may have been exaggerated.
The six-minute demonstration showed Gemini quickly recognizing images, responding within seconds, and tracking objects in real time, as in a cup-and-ball game. Despite the seemingly remarkable performance, a disclaimer in the YouTube video description caught the attention of critics: “For the purposes of this demo, latency has been reduced, and Gemini outputs have been shortened for brevity.”
According to Olson, Google admitted when asked for comment that the video demo was not conducted in real time with spoken prompts. Instead, it used still image frames from raw footage, with text prompts added afterward for Gemini to respond to. This admission raised questions about the accuracy of Google’s portrayal of Gemini’s real-time conversational capabilities.
While it’s not uncommon for companies to edit demo videos to avoid technical glitches during live presentations, Google has faced skepticism in the past over the authenticity of its AI demos. Notably, questions arose about the legitimacy of Google’s Duplex demo, where an AI voice assistant made restaurant reservations, due to a lack of ambient noise and overly helpful interactions.
In response to the criticism, Google defended the authenticity of the Gemini demo. Oriol Vinyals, vice president of research and deep learning lead at Google DeepMind and a co-lead on Gemini, explained in a post that the user prompts and outputs in the video were real but shortened for brevity. He clarified that the video was meant to illustrate what multimodal user experiences built with Gemini could look like and to inspire developers.
Despite Google’s explanation, some argue that the focus should shift toward letting journalists and developers experience Gemini firsthand. Critics suggest that a more transparent approach, such as a public beta, would better showcase Gemini’s true capabilities and dispel doubts about how it performs relative to competitors like OpenAI’s GPT models.