AI

Tencent releases versatile open-source Hunyuan AI models

Tencent has expanded its family of open-source Hunyuan AI models with a set of releases built for broad use. The new model family is designed to deliver strong performance across computational environments, from small edge devices to demanding, high-concurrency production systems.

The release includes a comprehensive set of pre-trained and instruction-tuned models, available on the developer platform Hugging Face. The models come in several sizes, with parameter scales of 0.5B, 1.8B, 4B, and 7B, providing great flexibility for developers and enterprises.

Tencent said the models were trained with strategies similar to those used for its more powerful Hunyuan-A13B model, allowing them to inherit its performance characteristics. This approach lets users pick the optimal model for their needs, whether a smaller variant for resource-constrained edge computing or a larger model for high-throughput production workloads, all while retaining strong capabilities.

One of the most prominent features of the Hunyuan series is its native support for an ultra-long 256K context window. This allows the models to maintain stable performance on long-text tasks, a vital capability for analysing complex documents, holding extended conversations, and generating in-depth content. The models also support what Tencent calls "hybrid reasoning", offering both fast and slow thinking modes that users can choose between depending on their specific requirements.

The company has also placed a strong emphasis on agentic capabilities. The models have been optimised for agent-based tasks and have posted leading results on established benchmarks such as BFCL-v3, τ-Bench, and C3-Bench, indicating a high degree of proficiency in complex, multi-step problem-solving. For example, on the C3-Bench benchmark, the Hunyuan-7B-Instruct model scores 68.5, while Hunyuan-4B-Instruct scores 64.3.

On the performance side, the series is built for efficient inference. Tencent's Hunyuan models use Grouped Query Attention (GQA), a well-known technique for improving processing speed and reducing computational overhead. This efficiency is further enhanced by support for advanced quantisation formats, a key component of the Hunyuan architecture designed to lower deployment barriers.
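The idea behind GQA can be sketched in a few lines: several query heads share each key/value head, which shrinks the KV cache that dominates memory use at long context lengths. The following is a minimal NumPy illustration of the mechanism, not Tencent's implementation; all shapes and names here are our own.

```python
import numpy as np

def gqa_attention(q, k, v, num_kv_groups):
    """Grouped-Query Attention: many query heads share fewer KV heads.

    q: (num_q_heads, seq, d)    k, v: (num_kv_heads, seq, d)
    num_kv_groups = num_q_heads // num_kv_heads
    """
    # Repeat each KV head so every group of query heads sees its shared KV.
    k = np.repeat(k, num_kv_groups, axis=0)
    v = np.repeat(v, num_kv_groups, axis=0)
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)        # (H, S, S)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)              # softmax
    return weights @ v                                     # (H, S, d)

# 8 query heads share 2 KV heads, so the KV cache is 4x smaller.
rng = np.random.default_rng(0)
q = rng.standard_normal((8, 16, 64))
k = rng.standard_normal((2, 16, 64))
v = rng.standard_normal((2, 16, 64))
out = gqa_attention(q, k, v, num_kv_groups=4)
print(out.shape)  # (8, 16, 64)
```

Only `k` and `v` need to be cached during generation, so cutting the number of KV heads directly cuts the memory cost of a 256K-token context.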

Tencent has developed its own compression toolset, AngelSlim, to provide a more usable and effective model-compression solution. Using this tool, the company offers two main types of quantisation for the Hunyuan series.

The first is FP8 static quantisation, which uses an 8-bit floating-point format. This method uses a small amount of calibration data to determine the quantisation scale in advance, without the need for full retraining, converting model weights and activation values into the FP8 format to boost inference efficiency.

The second method is INT4 quantisation, which achieves W4A16 quantisation through the GPTQ and AWQ algorithms:

  • The GPTQ approach processes the model weights layer by layer, using calibration data to minimise the error introduced by the quantised weights. This process avoids retraining the model and improves inference speed.
  • The AWQ algorithm works by statistically analysing the activation values from a small set of calibration data. A scaling coefficient is then calculated for each weight channel, expanding the numerical range of the important weights so that more information is preserved during compression.
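The per-channel scaling trick behind AWQ can be sketched like this. It is a simplified illustration, not the actual AWQ algorithm: the `alpha` exponent and all helper names are our own assumptions, and real AWQ searches for the scaling rather than fixing it.

```python
import numpy as np

def awq_style_scales(activations, alpha=0.5):
    """Per-input-channel scales from activation magnitudes: channels
    that multiply large activations get more numeric range."""
    mean_mag = np.abs(activations).mean(axis=0)   # one stat per channel
    return np.power(mean_mag, alpha)              # soften raw magnitudes

rng = np.random.default_rng(0)
X = np.abs(rng.standard_normal((256, 64))) + 0.1  # calibration activations
W = rng.standard_normal((64, 32))                 # weights: 64 in, 32 out

s = awq_style_scales(X)
# Scale weights up and activations down: the layer output is unchanged,
# but the salient weight channels now occupy more of the INT4 grid.
W_scaled = W * s[:, None]
X_scaled = X / s[None, :]
print(np.allclose(X @ W, X_scaled @ W_scaled))    # True
```

Quantising `W_scaled` instead of `W` is what protects the weights attached to large activations; the inverse scaling is folded into the preceding layer at deployment time.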

Developers can either run the AngelSlim tool themselves or download the pre-quantised models directly.

Performance benchmarks underline the strong capabilities of the Tencent Hunyuan models across a range of tasks. For example, the pre-trained Hunyuan-7B model achieves a score of 79.82 on the MMLU benchmark, 88.25 on GSM8K, and 74.85 on the MATH benchmark, indicating solid reasoning and mathematical skills.

The instruction-tuned variants show impressive results in specialised areas. In mathematics, the Hunyuan-7B-Instruct model achieves a score of 81.1 on the AIME 2024 benchmark, while the 4B version scores 78.3. In science, the 7B model reaches 76.5 on OlympiadBench, and in coding it scores 42 on Livecodebench.

The quantisation benchmarks show minimal performance degradation. On the DROP benchmark, the Hunyuan-7B-Instruct model scores 85.9 in its base BF16 format, 86.0 with FP8, and 85.7 with Int4 GPTQ, indicating that the efficiency gains come at little cost in accuracy.

For deployment, Tencent recommends using established frameworks such as TensorRT-LLM, vLLM, or SGLang to serve the Hunyuan models and create OpenAI-compatible API endpoints, ensuring they slot smoothly into existing development workflows. This combination of performance, efficiency, and deployment flexibility positions the Hunyuan series as a strong contender in open-source AI.
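Once served this way, calling a Hunyuan model looks like any OpenAI-style chat request. The sketch below only builds the request body; the endpoint URL and the served model name are assumptions that depend on how you launch the server.

```python
import json

# Hypothetical local endpoint; vLLM and SGLang both expose an
# OpenAI-compatible chat completions route when serving a model.
BASE_URL = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "tencent/Hunyuan-7B-Instruct",  # adjust to your deployment
    "messages": [
        {"role": "user", "content": "Summarise grouped query attention."}
    ],
    "max_tokens": 256,
    "temperature": 0.7,
}

body = json.dumps(payload)
print(json.loads(body)["model"])  # tencent/Hunyuan-7B-Instruct
# Sending it is a single POST with any HTTP client, e.g.:
#   requests.post(BASE_URL, json=payload, timeout=60)
```

Because the route and schema match OpenAI's API, existing client libraries work against the local server by simply pointing their base URL at it.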

See also: Deep Cogito v2: Open-source AI that hones its reasoning skills

Want to learn more about AI and big data from industry leaders? Check out AI & Big Data Expo, taking place in Amsterdam, California, and London. The comprehensive event is co-located with other leading events including the Intelligent Automation Conference, BlockX, Digital Transformation Week, and Cyber Security & Cloud Expo.

Explore other upcoming enterprise technology events and webinars powered by TechForge here.


2025-08-04 14:58:00
