gemini Archives - SD Times
https://sdtimes.com/tag/gemini/

Gemini responses can now be grounded with Google Search results
https://sdtimes.com/ai/gemini-responses-can-now-be-grounded-with-google-search-results/
Thu, 31 Oct 2024

Google is announcing that the Gemini API and Google AI Studio now both offer the ability to ground models using Google Search, which will improve the accuracy and reliability of Gemini’s responses. 

By grounding responses with Google Search results, Gemini can produce fewer hallucinations and more up-to-date, richer information. Grounded responses also include links to the sources they draw on.

“By providing supporting links, grounding brings transparency to AI applications, making them more trustworthy and encouraging users to click on the underlying sources to find out more,” Google wrote in a blog post.

This new capability supports dynamic retrieval, meaning that Gemini will assess whether grounding is necessary, since not all queries need the extra assistance and grounding adds cost and latency. The model generates a prediction score for every prompt, which is a measure of how beneficial grounding would be, and developers can adjust the prediction score threshold to whatever works best for their application.

Currently, grounding only supports text prompts and does not support multimodal prompts, like text-and-image or text-and-audio. It is available in all of the languages Gemini currently supports. 

Google’s documentation on grounding provides instructions on how to configure Gemini models to use this new capability. 
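As a rough illustration, configuring grounding with a dynamic retrieval threshold through the Python SDK might look like the minimal sketch below. The model version string, the example prompt, and the 0.7 threshold are all illustrative, and the exact tool configuration keys may differ across SDK versions, so check the grounding documentation for the current names.

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-1.5-flash-002")

# Ask for grounding with Google Search, but only when the model's
# prediction score for this prompt clears the dynamic threshold.
response = model.generate_content(
    "Who won the most recent Formula 1 Grand Prix?",
    tools={
        "google_search_retrieval": {
            "dynamic_retrieval_config": {
                "mode": "MODE_DYNAMIC",
                "dynamic_threshold": 0.7,  # illustrative; tune per application
            }
        }
    },
)

print(response.text)
# When grounding fires, source links are attached as grounding metadata.
print(response.candidates[0].grounding_metadata)
```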

Google AI Studio’s new Compare Mode helps users select the best Gemini model for their use case
https://sdtimes.com/ai/google-ai-studios-new-compare-mode-helps-users-select-the-best-gemini-model-for-their-use-case/
Fri, 18 Oct 2024

Gemini users will now be able to more easily select the model that fits their requirements by using Google AI Studio’s new Compare Mode.

“As a developer, you understand the critical tradeoffs involved in model selection, such as cost, latency, token limits, and response quality. Compare Mode simplifies this process by allowing you to evaluate responses across the various Gemini and Gemma models available in AI Studio, side-by-side,” Kat Kampf, product manager of Google AI Studio, wrote in a blog post.

Users can select two different models, write a prompt, and see how long each model takes and the quality of the response. They can also experiment with different system instructions and gain insights into how those influence the output in different models. 

Google offers a number of different Gemini models optimized for different use cases, including Flash, which balances performance and cost, and Pro, which offers greater performance but may take longer and be more expensive. Figuring out which model best suits their use case enables users to get the most out of their Gemini experience.

“With Compare Mode, it’s easier than ever to assess different models and make the right choice for your project,” Kampf wrote.  

Compare Mode is now available for all users by clicking the “Compare” button in the top right of a prompt in AI Studio.

Google begins rolling out voice capabilities in Gemini with Gemini Live
https://sdtimes.com/ai/google-begins-rolling-out-voice-capabilities-in-gemini-with-gemini-live/
Wed, 14 Aug 2024

Google is trying to make its AI assistant Gemini more useful by adding a conversation mode called Gemini Live, similar to how conversations in ChatGPT work.

Gemini Live has a voice mode, so that users can speak their questions out loud rather than typing. This voice mode works even when the app is in the background or the phone is locked, which allows conversations to happen even when the user isn’t directly interacting with the Gemini app. 

According to Google, users can also interrupt Gemini while it is reading out its response in order to ask follow-up questions.

“For years, we’ve relied on digital assistants to set timers, play music or control our smart homes. This technology has made it easier to get things done and saved valuable minutes each day. Now with generative AI, we can provide a whole new type of help for complex tasks that can save you hours. With Gemini, we’re reimagining what it means for a personal assistant to be truly helpful. Gemini is evolving to provide AI-powered mobile assistance that will offer a new level of help — all while being more natural, conversational and intuitive,” Sissie Hsiao, vice president and general manager of Gemini experiences and Google Assistant, wrote in a blog post.

Users can select from 10 different voices with different styles and tones, such as calm, bright, or engaged. 

Gemini Live has begun rolling out in English to Gemini Advanced subscribers on Android. Gemini Advanced is a subscription that costs $19.99 per month, though Google does offer a one-month trial. The company said that within the next few weeks it will roll out to other languages and to iOS as well.

In addition, Google said that Gemini will be the default assistant on Pixel 9 phones, which were also announced yesterday. “While AI unlocks powerful new capabilities, it also presents new challenges,” Hsiao wrote. “Ironically, using large language models that can better interpret natural language and handle complex tasks often means simple tasks take a moment longer to complete. And while generative AI is flexible enough to complete a wide array of tasks, it can sometimes behave in unexpected ways or provide inaccurate information … Today, we’ve arrived at an inflection point where we believe the helpfulness of an AI-powered assistant far outweighs its challenges.” 

Google also revealed that in the next couple of weeks it will introduce new Gemini extensions for Keep, Tasks, Utilities, and advanced YouTube Music features.

“Let’s say you’re hosting a dinner party: Have Gemini dig out that lasagna recipe Jenny sent you in your Gmail, and ask it to add the ingredients to your shopping list in Keep. And since your guests are your college friends, ask Gemini to ‘make a playlist of songs that remind me of the late ‘90s.’ Without needing too many details, Gemini gets the gist of what you want and delivers,” Hsiao wrote. 


Google launches 2 million context window for Gemini 1.5 Pro
https://sdtimes.com/ai/google-launches-2-million-context-window-for-gemini-1-5-pro/
Thu, 27 Jun 2024

Google has announced that developers now have access to a 2 million token context window for Gemini 1.5 Pro. For comparison, GPT-4o has a 128k token context window.

This context window length was first announced at Google I/O and was accessible only through a waitlist, but now everyone has access.

Longer context windows can lead to higher costs, so Google also announced support for context caching in the Gemini API for Gemini 1.5 Pro and 1.5 Flash. This allows context to be stored for use in later queries, which reduces costs for tasks that reuse tokens across prompts. 
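As a rough sketch, context caching through the Python SDK might look like this. The model version string, the 30-minute TTL, and the report file are illustrative assumptions, and the caching module’s exact surface may vary by SDK release.

```python
import datetime

import google.generativeai as genai
from google.generativeai import caching

genai.configure(api_key="YOUR_API_KEY")

long_report_text = open("report.txt").read()  # hypothetical large document

# Cache the large context once so later prompts reuse its tokens
# instead of paying to resend them with every request.
cache = caching.CachedContent.create(
    model="models/gemini-1.5-flash-001",
    system_instruction="You answer questions about the attached report.",
    contents=[long_report_text],
    ttl=datetime.timedelta(minutes=30),
)

# Subsequent queries run against the cached context.
model = genai.GenerativeModel.from_cached_content(cached_content=cache)
response = model.generate_content("Summarize the report's key findings.")
print(response.text)
```

This pattern pays off mainly for large bodies of context that are queried repeatedly within the cache’s lifetime.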

Additionally, Google has announced that code execution is now enabled for both Gemini 1.5 Pro and 1.5 Flash. This feature allows the model to generate and run Python code and then iterate on it until the desired result is achieved.

According to Google, the execution sandbox isn’t connected to the internet, comes with a few numerical libraries pre-installed, and bills developers based on the output tokens from the model.
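Enabling code execution is a small change to how the model is constructed in the Python SDK; the sketch below is a minimal illustration, and the prompt is made up.

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Turning on the code execution tool lets the model write and run Python
# in Google's sandbox, then iterate on the result before answering.
model = genai.GenerativeModel("gemini-1.5-pro", tools="code_execution")

response = model.generate_content(
    "What is the sum of the first 50 prime numbers? "
    "Generate and run code for the calculation."
)
print(response.text)
```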

And finally, Gemma 2 is now available in Google AI Studio, and Gemini 1.5 Flash tuning will be available via the Gemini API or Google AI Studio sometime next month.


Gemini improvements unveiled at Google Cloud Next
https://sdtimes.com/gemini/gemini-improvements-unveiled-at-google-cloud-next/
Wed, 10 Apr 2024

Google Cloud Next was this week, and the company unveiled a number of AI innovations, including two new Gemma models for code generation and inference.

Google announced that Gemini 1.5 Pro will be entering public preview for Google Cloud customers, available through Vertex AI. This version of the model benefits from a breakthrough in long-context understanding that allows it to consistently process 1 million tokens of information. This opens up new use cases, such as enabling a gaming company to provide gamers with a video analysis of their performance and tips to improve.

Gemini Code Assist has also been upgraded with Gemini 1.5 Pro and the larger context window improves its ability to provide code suggestions and enables deeper insights. 

Google’s Threat Intelligence is also bolstered by the improvement and can now analyze larger samples of malicious code. 

Additionally, Gemini for Google Workspace is getting a new feature called Google Vids that allows users to create videos for work. It also added Gemini improvements across Gmail, Meet, and Chat. 

“Just as cloud computing changed how businesses worked a decade ago, AI is going to drive incredible opportunity and progress all over again. Google Cloud is how we’ll continue to help organizations everywhere do transformational things, and we can’t wait to see what’s next,” Sundar Pichai, CEO of Google, wrote in a blog post.

Google Cloud integrates Gemini into Stack Overflow
https://sdtimes.com/stack-overflow/google-cloud-integrates-gemini-into-stack-overflow/
Thu, 29 Feb 2024

Stack Overflow and Google Cloud have announced a new partnership aimed at better serving information to developers.

Google Cloud will be integrating the AI model Gemini into Stack Overflow to surface relevant content in response to searches, and Google Cloud will also begin pulling in information directly from Stack Overflow so that developers don’t have to leave the platform to find answers to their questions. 

These integrations will help improve the way developers seek out and obtain technical knowledge from the community.

RELATED CONTENT: Google unveils Gemini, a new multimodal AI model

“In the AI era, Stack Overflow has maintained that the foundation of trusted and accurate data will be central to how technology solutions are built, with millions of the world’s developers coming to our platform as one of the few high-quality sources of information with community attribution at its core,” said Prashanth Chandrasekar, CEO of Stack Overflow. “This landmark, multi-dimensional AI-focused partnership, which includes Stack Overflow adopting the latest AI technology from Google Cloud, and Google Cloud integrating Stack Overflow knowledge into its AI tools, underscores our joint commitment to unleash developer creativity, unlock productivity without sacrificing accuracy, and deliver on socially responsible AI. By bringing together the strengths of our two companies, we can accelerate innovation across a variety of industries.” 

Thomas Kurian, CEO at Google Cloud, added: “This partnership brings our enterprise AI platform together with the most in-depth and popular developer knowledge platform available today. Google Cloud and Stack Overflow will help developers more effectively use AI in the platforms they prefer, combining the vast knowledge from the Stack Overflow community and new AI capabilities, powered by Vertex AI and Google Cloud’s trusted, secure infrastructure.”

Google’s Gemini Pro now available to developers via Google AI Studio and Vertex AI
https://sdtimes.com/ai/googles-gemini-pro-now-available-to-developers-via-google-ai-studio-and-vertex-ai/
Wed, 13 Dec 2023

After announcing its new multimodal AI model Gemini last week, Google is making several announcements today to enable developers to build with it. 

When first announced, Google said that Gemini will come in three different versions, each tailored to a different size or complexity requirement. In order from largest to smallest, Gemini is available in Ultra, Pro, and Nano versions. Gemini Nano has already seen use in Android on the Pixel 8 Pro, and Google Bard is already using a specialized version of Gemini Pro.

RELATED CONTENT: Google’s Duet AI for Developers is now generally available

Today, Google is announcing that developers can use Gemini Pro through the Gemini API. Initial features that developers can leverage include function calling, embeddings, semantic retrieval, custom knowledge grounding, and chat functionality, the company explained. 

There are two main ways to work with Gemini Pro: Google AI Studio and Vertex AI on Google Cloud. Google AI Studio is a web-based developer tool that is easy to get started with. It has a free quota that allows up to 60 requests per minute and offers quickstart templates to enable developers to get started.
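A first request through the Python SDK might look like the sketch below; the API key placeholder and the prompt are illustrative.

```python
import google.generativeai as genai

# An API key can be created in Google AI Studio.
genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel("gemini-pro")
response = model.generate_content("Explain recursion to a new programmer.")
print(response.text)
```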

Vertex AI on Google Cloud is a machine learning platform that Google says is a step up from Google AI Studio in terms of complexity, where developers can fully customize Gemini and access benefits like full data control and integration with other Google Cloud features to support security, safety, privacy, governance, and compliance.

Currently, it is free to use Gemini in Vertex AI at the same rate limit as the free quota of Google AI Studio, until it reaches general availability next year. Once generally available, inputs will cost $0.00025 per 1,000 characters and $0.0025 per image.

According to Google, some of the more complex capabilities enabled by working in Vertex AI include the ability to augment Gemini with company data and build search and conversational agents in a low-code environment.

Currently, Gemini Pro accepts text as input and also outputs text, but for developers wanting to experiment with images, there is a dedicated Gemini Pro Vision endpoint that also accepts images along with text in inputs, and outputs text.  
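A hedged sketch of calling the vision endpoint from the Python SDK follows; the image file name is hypothetical.

```python
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")

# The dedicated vision endpoint accepts images alongside text in inputs
# and returns text.
model = genai.GenerativeModel("gemini-pro-vision")

image = Image.open("architecture-diagram.png")  # hypothetical local image
response = model.generate_content([image, "Describe what this diagram shows."])
print(response.text)
```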

Looking ahead, developers can expect Google to launch Gemini Ultra, a larger model suited for complex tasks, early next year. The company is also working to bring Gemini to the Chrome and Firebase developer platforms.

The company also announced today the release of the next generation of Google’s image-generation model, Imagen 2. It is now available to Vertex AI customers on Google’s allowlist.

Imagen 2 enables the creation of “high-quality, photorealistic, high-resolution, aesthetically pleasing” images using natural language prompts. New features in this iteration include text rendering to create text overlays on images, logo generation, and visual question answering for caption generation.
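For allowlisted Vertex AI customers, generating an image with the Vertex AI Python SDK might look roughly like the sketch below. The project ID, region, model identifier, and prompt are all illustrative assumptions, so consult the Vertex AI documentation for current values.

```python
import vertexai
from vertexai.preview.vision_models import ImageGenerationModel

# Hypothetical project and region; replace with your own.
vertexai.init(project="my-project", location="us-central1")

# Illustrative Imagen 2 model identifier; check the docs for the current one.
model = ImageGenerationModel.from_pretrained("imagegeneration@005")

result = model.generate_images(
    prompt="A photorealistic mountain lake at sunrise",
    number_of_images=1,
)
result.images[0].save(location="lake.png")
```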

Google unveils Gemini, a new multimodal AI model
https://sdtimes.com/ai/google-unveils-gemini-a-new-multimodal-ai-model/
Wed, 06 Dec 2023

Google has announced its latest AI model, Gemini, which was built from the start to be multimodal so that it could interpret information in multiple formats, spanning text, code, audio, image, and video.

According to Google, the typical approach for creating a multimodal model involves training components for different information formats separately and then combining them. What sets Gemini apart is that it was trained from the start on different formats and then fine-tuned with additional multimodal data.

“This helps Gemini seamlessly understand and reason about all kinds of inputs from the ground up, far better than existing multimodal models — and its capabilities are state of the art in nearly every domain,” Sundar Pichai, CEO of Google and Alphabet, and Demis Hassabis, CEO and co-founder of Google DeepMind, wrote in a blog post.

Google also explained that the new model has sophisticated reasoning capabilities that allow it to understand complex written and visual information, making it “uniquely skilled at uncovering knowledge that can be difficult to discern amid vast amounts of data.”

For example, it can read through hundreds of thousands of documents and extract insights that lead to new breakthroughs in certain fields. 

Its multimodal nature also makes it particularly suited to understanding and answering questions in complex fields like math and physics.

Gemini 1.0 comes in three different versions, each tailored to a different size requirement. In order from largest to smallest, Gemini is available in Ultra, Pro, and Nano versions. 

According to Google, in initial benchmarking, Gemini Ultra exceeded state-of-the-art results on 30 of the 32 popular academic benchmarks often used in model development and research. Gemini Ultra is also the first model to outperform human experts on massive multitask language understanding (MMLU), which combines 57 subjects, including math, physics, history, law, medicine, and ethics.

Gemini Pro is now integrated into Bard, making it the biggest update to Bard since its initial release. The Pixel 8 Pro has also been engineered to make use of Gemini Nano to power features like Summarize in the Recorder app and Smart Reply in Google’s keyboard. 

In the next few months Gemini will also be added to more Google products, such as Search, Ads, Chrome, and Duet AI. 

Developers will be able to access Gemini Pro via the Gemini API in Google AI Studio or Google Cloud Vertex AI starting on December 13.

The first release of Gemini understands many popular programming languages, including Python, Java, C++, and Go. “Its ability to work across languages and reason about complex information makes it one of the leading foundation models for coding in the world,” Pichai and Hassabis wrote.

The company also used Gemini to create an advanced code generation system called AlphaCode 2 (an evolution of the first version Google released two years ago). It can solve competitive programming problems that involve complex math and theoretical computer science. 

Along with the announcement of Gemini, Google is also announcing a new TPU system called Cloud TPU v5p, which is designed for “training cutting-edge AI models.” 

“This next generation TPU will accelerate Gemini’s development and help developers and enterprise customers train large-scale generative AI models faster, allowing new products and capabilities to reach customers sooner,” Pichai and Hassabis wrote. 

Google also highlighted how it followed its responsible AI Principles when developing Gemini. It says it conducted new research into areas of potential risk, including cyber-offense, persuasion, and autonomy. The company also built safety classifiers for identifying, labeling, and sorting out content containing violence or negative stereotypes.

“This is a significant milestone in the development of AI, and the start of a new era for us at Google as we continue to rapidly innovate and responsibly advance the capabilities of our models. We’ve made great progress on Gemini so far and we’re working hard to further extend its capabilities for future versions, including advances in planning and memory, and increasing the context window for processing even more information to give better responses,” Pichai and Hassabis wrote. 
