
[2509.01322] LongCat-Flash Technical Report

Authors: Meituan LongCat Team: Bayan, Bei Li, Bingye Lei, Bo Wang, Bolin Rong, Chao Wang, Chao Zhang, Chen Gao, Chen Zhang, Cheng Sun, Chengcheng Han, Chenguang Xi, Chi Zhang, Chong Peng, Chuan Qin, Chuyu Zhang, Defei Bu, Dengchang Zhao, Deyang Kong, Dishan Liu, Feiye Huo, Fengcun Li, Fubao Zhang, Gan Dong, Gang Liu, Gang Xu, Ge Li, Guoqiang Tan, Guoyuan Lin, Haihang Jing, Hainan Fu, Haonan Yan, Haoxing Wen, Hongyan Hao, Hongyin Tang, Hui Su, […], Jian Yang, Jianchao Tan, […], Liang Gao, Liang Shi, […], Ling Kong, […], Manyuan Zhang, Meng Zou, Mengxia Shen, […], Mingyang Zhu, Peiguang Li, Peng Pei, Peng Zhao, Pengcheng Jia, Pingwei Sun, Qi Gu, Qianyun Li, Qingyuan Li, Qiong Huang, Qiyuan Duan, Rumei Li, Shizhe Wu, Shuai Liang

et al. (82 additional authors not shown)

View a PDF of the paper titled LongCat-Flash Technical Report, by Meituan LongCat Team and 181 other authors

View PDF | HTML (experimental)

Abstract: We introduce LongCat-Flash, a 560-billion-parameter Mixture-of-Experts (MoE) language model designed for both computational efficiency and advanced agentic capabilities. Stemming from the need for scalable efficiency, LongCat-Flash adopts two novel designs: (a) Zero-computation Experts, which enable dynamic computational budget allocation by activating 18.6B-31.3B parameters (roughly 27B on average) per token depending on contextual demands, optimizing resource usage; and (b) Shortcut-connected MoE, which enlarges the computation-communication overlap window, yielding notable gains in inference efficiency and throughput compared with models of a similar scale. We develop a comprehensive scaling framework for large models that combines hyperparameter transfer, model-growth initialization, a multi-pronged stability suite, and deterministic computation to achieve stable and reproducible training. Notably, leveraging the synergy between scalable architectural design and infrastructure efforts, we complete model training on more than 20 trillion tokens within 30 days, while achieving over 100 tokens per second (TPS) for inference at a cost of $0.70 per million output tokens. To cultivate LongCat-Flash towards agentic intelligence, we conduct large-scale pre-training on optimized mixtures, followed by targeted mid- and post-training on reasoning, code, and instruction following, with further augmentation from synthetic data and tool-use tasks. Comprehensive evaluations demonstrate that, as a non-thinking foundation model, LongCat-Flash delivers highly competitive performance among leading models, with exceptional strengths in agentic tasks. The model checkpoint of LongCat-Flash is open-sourced to foster community research.
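The zero-computation expert idea can be made concrete with a small sketch: alongside ordinary FFN experts, the router also scores a few parameter-free identity slots, so a token routed to them skips FFN compute entirely and the number of activated parameters varies per token. The PyTorch sketch below is illustrative only; the class name, layer sizes, expert counts, and top-k value are assumptions, not the paper's actual configuration.

```python
# Minimal sketch of an MoE layer with zero-computation (identity) experts.
# All sizes are illustrative, not LongCat-Flash's real configuration.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ZeroComputeMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_ffn_experts=8,
                 n_zero_experts=2, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.n_ffn = n_ffn_experts
        # Real FFN experts carry parameters; zero-computation experts do not.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_ffn_experts)
        )
        # The router scores FFN experts plus parameter-free identity slots.
        self.router = nn.Linear(d_model, n_ffn_experts + n_zero_experts)

    def forward(self, x):  # x: (tokens, d_model)
        weights = F.softmax(self.router(x), dim=-1)
        topw, topi = weights.topk(self.top_k, dim=-1)
        topw = topw / topw.sum(dim=-1, keepdim=True)  # renormalize over chosen experts
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            idx, w = topi[:, slot], topw[:, slot:slot + 1]
            for e in range(self.n_ffn):
                mask = idx == e
                if mask.any():
                    out[mask] += w[mask] * self.experts[e](x[mask])
            # Indices >= n_ffn are zero-computation experts: identity pass-through,
            # so tokens routed here spend no FFN parameters in this slot.
            zmask = idx >= self.n_ffn
            if zmask.any():
                out[zmask] += w[zmask] * x[zmask]
        return out

tokens = torch.randn(16, 512)
layer = ZeroComputeMoE()
y = layer(tokens)   # easy tokens may land on identity slots
print(y.shape)      # torch.Size([16, 512])
```

The design point the sketch tries to capture is that identity experts reuse the standard top-k routing machinery unchanged: easy tokens spend fewer activated parameters and hard tokens more, which is how a per-token activated-parameter range such as the abstract's 18.6B-31.3B arises.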

LongCat Chat: this https URL

Hugging Face: this https URL

GitHub: this https URL

Submission history

From: Jiaqi Zhang [view email]
[v1] Mon, 1 Sep 2025 10:05:45 UTC (7,133 KB)
[v2] Fri, 19 Sep 2025 13:34:47 UTC (7,127 KB)

