Google Perfects Gemini’s AI for Realistic Human Generation

by Namitha Serah on Thu, 08/29/2024 - 06:53

Google's Gemini AI restores people generation with Imagen 3 for premium subscribers

Google has reintroduced the people-generating capability in its Gemini AI-powered chatbot after a significant pause that began in February. The feature was initially disabled following complaints of historical inaccuracies and racial stereotypes in the generated images.

Google’s CEO Sundar Pichai and DeepMind co-founder Demis Hassabis promised a swift fix, but it ultimately took several months of rigorous testing and development to address the issues. Now, Google claims to have implemented crucial fixes through its latest model, Imagen 3, which powers image generation within Gemini.

What’s New in Imagen 3?

Imagen 3, the updated image generation model, is designed to produce more accurate and fair representations of people, correcting the shortcomings of its predecessor. According to Google, the model has been trained on a diverse set of AI-generated captions to enhance the variety and inclusivity of concepts in its output.

Furthermore, the training data underwent safety and fairness reviews, with extensive internal and external testing to minimize undesirable results. Google has been tight-lipped about the specifics of Imagen 3’s training data, only revealing that it comprises a large dataset of images, text, and annotations.

Limited Access and Future Plans

Initially, only users subscribed to Google's premium Gemini plans (Gemini Advanced, Business, and Enterprise) will regain access to the people-generating feature. This limited release is part of an early access test, available only in English. Google has not yet disclosed when or if the feature will be expanded to free-tier users or other languages.

For now, the company is focused on gathering feedback from its premium users to refine the feature further. In a related development, Google is rolling out Imagen 3 to all Gemini users, albeit without the people-generating function for non-premium subscribers.

The new model is touted as more creative and detailed than its predecessor, with better text understanding and fewer errors. To curb misuse such as deepfakes, images generated with Imagen 3 carry SynthID, an imperceptible digital watermark developed by DeepMind that allows them to be identified as AI-generated.

Alongside the improvements in image generation, Google is also introducing Gems to its Gemini platform. These custom-tailored AI experts can assist users with specific tasks, such as brainstorming, career guidance, or even social media content creation. Initially available only to premium users, Gems allow for a higher level of personalization in AI interactions, further enhancing the utility and appeal of the Gemini chatbot.

By reintroducing the people-generating feature with significant improvements and expanding its AI capabilities through tools like Imagen 3 and Gems, Google is setting a new standard in AI-driven digital experiences. These advancements not only correct previous shortcomings but also provide users with more powerful, versatile tools for creative and professional applications.
