Google introduces lightweight Gemini AI model, video generation AI & more

15-05-2024 Wed 08:15 | Technology | IANS

New Delhi, May 14: Google on Tuesday introduced several updates across the Gemini family of artificial intelligence (AI) models, including the new 1.5 Flash, a lightweight model built for speed and efficiency.

Both Gemini 1.5 Pro and 1.5 Flash AI models are available in public preview with a 1 million token context window on Google AI Studio and Vertex AI.

“Today, more than 1.5 million developers use Gemini models across our tools. You’re using it to debug code, get new insights, and build the next generation of AI applications,” said Sundar Pichai, CEO of Google and Alphabet.

“Still, we are in the early days of the AI platform shift. We see so much opportunity ahead, for creators, for developers, for startups, for everyone,” he added.

The company also said that it is integrating Gemini 1.5 Pro into Google products, including Gemini Advanced and Workspace apps.

“We’re also announcing our next generation of open models, 'Gemma 2', and sharing a glimpse of ‘Project Astra’, a look at the future of universal AI agents,” said the company.

These AI agents are designed to process information faster by continuously encoding video frames, combining the video and speech input into a timeline of events, and caching this information for efficient recall.

“These agents can better understand the context they’re being used in, and respond quickly, in conversation,” said Google.
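As a rough illustration of the approach Google describes for these agents (merging video and speech input into a single timeline of events and caching it for recall), the idea can be sketched in a few lines of Python. This is only a toy sketch and not Google's implementation: the names Event, merge_streams and TimelineCache are hypothetical, and plain strings stand in for encoded video frames and transcribed speech.

```python
import heapq
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since the session started
    modality: str     # "video" or "speech"
    payload: str      # stand-in for an encoded frame or transcribed speech


def merge_streams(video, speech):
    """Interleave two timestamp-sorted event streams into one timeline."""
    return list(heapq.merge(video, speech, key=lambda e: e.timestamp))


class TimelineCache:
    """Hypothetical bounded cache: keeps only the most recent events."""

    def __init__(self, max_events):
        # A deque with maxlen silently drops the oldest entry when full.
        self._events = deque(maxlen=max_events)

    def add(self, event):
        self._events.append(event)

    def recall(self, since):
        """Return cached events at or after the given timestamp."""
        return [e for e in self._events if e.timestamp >= since]


video = [
    Event(0.0, "video", "frame: desk"),
    Event(2.0, "video", "frame: glasses on desk"),
]
speech = [Event(1.0, "speech", "where are my glasses?")]

cache = TimelineCache(max_events=100)
for event in merge_streams(video, speech):
    cache.add(event)

recent = cache.recall(since=1.0)
```

Here the spoken question and the later video frame end up next to each other in the timeline, which is the kind of context an agent would need to answer the question. A production system would store model embeddings rather than strings and would need a far more sophisticated recall strategy.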

Meanwhile, the Gemini 1.5 Flash is optimised for high-volume, high-frequency tasks at scale and is more cost-efficient to serve.

Demis Hassabis, CEO of Google DeepMind, said that Gemini 1.5 Flash excels at summarisation, chat applications, image and video captioning, data extraction from long documents and tables, and more.

Google also launched Veo, its latest and most advanced video generation model, and Imagen 3, its highest quality text-to-image model yet.

“Veo is our most capable video generation model to date. It generates high-quality 1080p resolution videos that can go beyond a minute, in a wide range of cinematic and visual styles,” said Eli Collins, Vice President, Product Management.

The model also understands cinematic terms like “timelapse” or “aerial shots of a landscape”, providing an unprecedented level of creative control.

The company also brought Gemini 1.5 Pro to Gemini Advanced subscribers in over 35 languages, along with a 1 million token context window, a new conversational experience and tools that “let Gemini take action on your behalf”.

To make Google Search even better, the company announced a new Gemini model customised for Google Search.

“It brings together Gemini’s advanced capabilities — including multi-step reasoning, planning and multimodality — with our best-in-class Search systems,” said the tech giant.

Starting today, “we’re making ‘AI Overviews’ available to everyone in the U.S., with more countries coming soon,” said Liz Reid, VP, Head of Google Search.

People have already used ‘AI Overviews’ billions of times through Google’s experiment in Search Labs.