GUI-G$^2$: Gaussian Reward Modeling for GUI Grounding

Authors: Fei Tang, Zhangxuan Gu, Zhengxi Lu, Xuyang Liu, Shuheng Shen, Changhua Meng, Wen Weng, Wenqi Zhang, Yongliang Shen, Weiming Lu, Jun Xiao, Yueting Zhuang

View a PDF of the paper titled GUI-G$^2$: Gaussian Reward Modeling for GUI Grounding, by Fei Tang and 11 other authors

Abstract: Graphical user interface (GUI) grounding maps natural language instructions to precise interface locations for autonomous interaction. Current reinforcement learning approaches use binary rewards that treat elements as hit-or-miss targets, creating sparse signals that ignore the continuous nature of spatial interactions. Motivated by human clicking behavior, which naturally forms Gaussian distributions centered on target elements, we introduce GUI Gaussian Grounding Rewards (GUI-G$^2$), a principled reward framework that models GUI elements as continuous Gaussian distributions across the interface plane. GUI-G$^2$ incorporates two synergistic mechanisms: Gaussian point rewards model precise localization through exponentially decaying distributions centered on element centroids, while coverage rewards assess spatial alignment by measuring the overlap between predicted Gaussian distributions and target regions. To handle diverse element scales, we develop an adaptive variance mechanism that calibrates reward distributions based on element dimensions. This framework transforms GUI grounding from sparse binary classification to dense continuous optimization, where Gaussian distributions generate rich gradient signals that guide models toward optimal interaction positions. Extensive experiments across the ScreenSpot, ScreenSpot-v2, and ScreenSpot-Pro benchmarks demonstrate that GUI-G$^2$ substantially outperforms the state-of-the-art method UI-TARS-72B, with the most significant improvement of 24.7% on ScreenSpot-Pro. Our analysis reveals that continuous modeling provides superior robustness to interface variations and enhanced generalization to unseen layouts, establishing a new paradigm for spatial reasoning in GUI interaction tasks.
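The abstract names the ingredients of the reward but gives no formulas, so the following is only a minimal sketch of how such a reward could look, not the authors' implementation. It assumes boxes are given as (x1, y1, x2, y2), that the adaptive variance is a standard deviation proportional to the element's width and height (the factor `alpha` is a hypothetical hyperparameter), and that "overlap" between two Gaussians is measured with the Bhattacharyya coefficient, which is one possible choice. All function names are illustrative.

```python
import math

EPS = 1e-6  # floor on sigma so degenerate (zero-size) boxes do not divide by zero


def adaptive_sigma(box, alpha=0.5):
    """Assumed adaptive variance: spread proportional to element width and height."""
    x1, y1, x2, y2 = box
    return max(alpha * (x2 - x1), EPS), max(alpha * (y2 - y1), EPS)


def center(box):
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2.0, (y1 + y2) / 2.0


def binary_reward(point, box):
    """Hit-or-miss baseline: 1 inside the target box, 0 everywhere else."""
    x, y = point
    x1, y1, x2, y2 = box
    return 1.0 if x1 <= x <= x2 and y1 <= y <= y2 else 0.0


def gaussian_point_reward(point, box, alpha=0.5):
    """Reward that decays exponentially with distance from the element centroid."""
    x, y = point
    cx, cy = center(box)
    sx, sy = adaptive_sigma(box, alpha)
    return math.exp(-0.5 * (((x - cx) / sx) ** 2 + ((y - cy) / sy) ** 2))


def _bhattacharyya_distance_1d(m1, s1, m2, s2):
    """Bhattacharyya distance between two 1-D Gaussians N(m1, s1^2) and N(m2, s2^2)."""
    v1, v2 = s1 ** 2, s2 ** 2
    v = (v1 + v2) / 2.0
    return 0.125 * (m1 - m2) ** 2 / v + 0.5 * math.log(v / math.sqrt(v1 * v2))


def coverage_reward(pred_box, target_box, alpha=0.5):
    """Overlap of the Gaussians induced by two boxes (Bhattacharyya coefficient, in (0, 1])."""
    (px, py), (tx, ty) = center(pred_box), center(target_box)
    (psx, psy), (tsx, tsy) = adaptive_sigma(pred_box, alpha), adaptive_sigma(target_box, alpha)
    # Axis-aligned Gaussians factorise, so the per-axis distances simply add.
    dist = (_bhattacharyya_distance_1d(px, psx, tx, tsx)
            + _bhattacharyya_distance_1d(py, psy, ty, tsy))
    return math.exp(-dist)
```

Under these assumptions, a click just outside the target box scores 0 with `binary_reward` but still receives a positive `gaussian_point_reward` that grows as the click moves toward the centroid, which illustrates the dense signal the abstract describes. How the point and coverage terms are weighted and combined is not stated in the abstract and is left open here.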

Submission history

From: Fei Tang
[v1] Mon, 21 Jul 2025 17:53:42 UTC (2,762 KB)
[v2] Tue, 22 Jul 2025 16:50:36 UTC (2,762 KB)

2025-07-23 04:00:00
