Updated production-ready Gemini models, reduced 1.5 Pro pricing, increased rate limits, and more

Today, we are releasing two updated, production-ready Gemini models: Gemini-1.5-Pro-002 and Gemini-1.5-Flash-002, along with:

  • >50% reduced price on 1.5 Pro (both input and output tokens, for prompts under 128K)
  • 2x higher rate limits on 1.5 Flash and ~3x higher on 1.5 Pro
  • 2x faster output and 3x lower latency
  • Updated default filter settings

These new models build on our latest experimental model releases and include meaningful improvements to the Gemini 1.5 models released at Google I/O in May. Developers can access our latest models for free via Google AI Studio and the Gemini API. For larger organizations and Google Cloud customers, the models are also available on Vertex AI.


Improved overall quality, with larger gains in math, long context, and vision

The Gemini 1.5 series comprises models designed for general performance across a wide range of text, code, and multimodal tasks. For example, Gemini models can be used to synthesize information from 1,000-page PDFs, answer questions about repos containing more than 10,000 lines of code, take in hour-long videos and create useful content from them, and more.

With the latest updates, 1.5 Pro and Flash are now better, faster, and more cost-efficient to build with in production. We see a ~7% increase on MMLU-Pro, a more challenging version of the popular MMLU benchmark. On MATH and HiddenMath (an internal holdout set of competition math problems), both models made a considerable ~20% improvement. For vision and code use cases, both models also perform better (ranging from ~2-7%) across evals measuring visual understanding and Python code generation.

We also improved the overall helpfulness of model responses, while continuing to uphold our content safety policies and standards. This means fewer refusals and more helpful responses across many topics.

Both models now have a more concise style in response to developer feedback, which is intended to make these models easier to use and reduce costs. For use cases like summarization, question answering, and extraction, the default output length of the updated models is 5-20% shorter than that of previous models. For chat-based products where users may prefer longer responses by default, you can read our prompting strategies guide to learn more about how to make the models more verbose and conversational.
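If you do want longer responses from the updated models, one approach is to pair a system instruction with a higher output-token ceiling in the request. The sketch below uses the Gemini API's REST-style request shape as best understood; field names and values are assumptions to verify against the current API reference.

```python
import json

# Hypothetical Gemini API REST request body nudging the model toward
# longer answers. Field names mirror the public REST format as understood
# here -- confirm against the official API reference before relying on them.
request_body = {
    "system_instruction": {
        "parts": [{"text": "Answer in full detail, with thorough explanations."}]
    },
    "contents": [
        {"role": "user", "parts": [{"text": "Summarize this report."}]}
    ],
    "generationConfig": {
        # Raise the ceiling so verbose answers are not truncated.
        "maxOutputTokens": 2048,
        "temperature": 0.7,
    },
}

print(json.dumps(request_body, indent=2))
```

The system instruction steers style, while `maxOutputTokens` only sets an upper bound; both are needed if the shorter defaults are not what your product wants.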

For more details on migrating to the latest versions of Gemini 1.5 Pro and 1.5 Flash, check out the Gemini API models page.


Gemini 1.5 Pro

We continue to be amazed by the range of creative and useful applications built on Gemini 1.5 Pro's 2-million-token long context window and multimodal capabilities. From video understanding to processing 1,000-page PDFs, there are still many new use cases yet to be built. Today we are announcing a 64% price reduction on input tokens, a 52% price reduction on output tokens, and a 64% price reduction on incremental cached tokens for our strongest 1.5 series model, Gemini 1.5 Pro, effective October 1, 2024, on prompts under 128K tokens. Coupled with context caching, this continues to drive the cost of building with Gemini down.
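To make the percentages concrete, here is a small cost calculation applying the stated 64% input and 52% output reductions. The pre-reduction per-million-token prices are placeholders for illustration only; check the official pricing page for real numbers.

```python
# Illustrative cost comparison for Gemini 1.5 Pro on prompts under 128K tokens.
# The prior prices below are assumed placeholders, not official figures.
OLD_INPUT_PER_M = 3.50    # assumed prior $/1M input tokens (placeholder)
OLD_OUTPUT_PER_M = 10.50  # assumed prior $/1M output tokens (placeholder)

new_input_per_m = OLD_INPUT_PER_M * (1 - 0.64)    # 64% input price reduction
new_output_per_m = OLD_OUTPUT_PER_M * (1 - 0.52)  # 52% output price reduction

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one request at the reduced prices."""
    return (input_tokens / 1e6) * new_input_per_m \
         + (output_tokens / 1e6) * new_output_per_m

# e.g. a 100K-token prompt with a 2K-token answer:
print(round(request_cost(100_000, 2_000), 4))
```

Under these placeholder prices, input drops from $3.50 to $1.26 and output from $10.50 to $5.04 per million tokens, which is where the ">50% decrease" headline comes from.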

Increased rate limits

To make it even easier for developers to build with Gemini, we are increasing the paid tier rate limits for 1.5 Flash to 2,000 RPM and for 1.5 Pro to 1,000 RPM, up from 1,000 and 360, respectively. In the coming weeks, we expect to continue to increase Gemini API rate limits so developers can build even more with Gemini.
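A client that wants to stay under a requests-per-minute cap can pace itself with a simple sliding-window limiter. This is a minimal stdlib-only sketch, not part of any Gemini SDK; the API still enforces its own limits server-side.

```python
import time
from collections import deque

class RpmLimiter:
    """Client-side sliding-window limiter for a requests-per-minute cap.

    A minimal sketch to help a client pace itself under paid-tier limits
    such as 2,000 RPM (1.5 Flash) or 1,000 RPM (1.5 Pro).
    """

    def __init__(self, rpm: int, clock=time.monotonic):
        self.rpm = rpm
        self.clock = clock   # injectable clock, handy for testing
        self.sent = deque()  # timestamps of requests in the last 60 seconds

    def try_acquire(self) -> bool:
        """Return True and record a slot if a request may be sent now."""
        now = self.clock()
        # Evict timestamps that have aged out of the 60-second window.
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) < self.rpm:
            self.sent.append(now)
            return True
        return False
```

Typical usage wraps each API call: if `try_acquire()` returns False, sleep briefly and retry rather than letting the server reject the request.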


2x faster output and 3x lower latency

Along with the core improvements to our latest models, over the past few weeks we have driven down the latency of 1.5 Flash and significantly increased the output tokens per second, enabling new use cases with our most powerful models.

Updated filter settings

Since the first launch of Gemini in December 2023, building a safe and reliable model has been a key focus. With the latest versions of Gemini (-002), we have improved the model's ability to follow user instructions while balancing safety. We will continue to offer a suite of safety filters that developers may apply to Google's models. For the models released today, the filters will not be applied by default, so developers can determine the configuration best suited to their use case.
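Since filters are now off by default, requests that need them should set them explicitly. Below is a hedged sketch of per-request safety settings in the Gemini API's REST shape; the category and threshold names mirror the public API as understood here and should be confirmed against the current safety-settings reference.

```python
# Sketch of per-request safety filter configuration for the Gemini API.
# Category/threshold strings follow the public REST format as understood
# here -- verify against the official safety-settings documentation.
safety_settings = [
    {"category": "HARM_CATEGORY_HARASSMENT",        "threshold": "BLOCK_ONLY_HIGH"},
    {"category": "HARM_CATEGORY_HATE_SPEECH",       "threshold": "BLOCK_MEDIUM_AND_ABOVE"},
    {"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", "threshold": "BLOCK_LOW_AND_ABOVE"},
    {"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_NONE"},
]

request_body = {
    "contents": [{"role": "user", "parts": [{"text": "Hello"}]}],
    # With the -002 models, no filters apply unless you set them here.
    "safetySettings": safety_settings,
}
```

Each category can be tuned independently, so an application can, for example, filter aggressively on one harm category while leaving another unfiltered.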


Gemini 1.5 Flash-8B Experimental Updates

We are releasing a further improved version of the Gemini 1.5 Flash-8B model we announced in August, called "Gemini-1.5-Flash-8B-Exp-0924." This improved version includes significant performance increases across both text and multimodal use cases. It is available now via Google AI Studio and the Gemini API.

The overwhelmingly positive feedback developers have shared about 1.5 Flash-8B has been incredible to see, and we will continue to shape our experimental-to-production release pipeline based on developer feedback.

We are excited about these updates and can't wait to see what you will build with the new Gemini models! For Gemini Advanced users, you will soon be able to access a chat-optimized version of Gemini 1.5 Pro-002.

2024-09-24 16:03:00
