
Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models

Authors: Andy K. Zhang, Neil Perry, Riya Dulepet, Joey Ji, Celeste Menders, Justin W. Lin, Teddy Zhang, Rishi Alluri, Nathan Tran, Rinnara Sangpisit, Polycarpos Yiorkadjis, Kenny Osele, Gautham Raghupathi, Dan Boneh, Daniel E. Ho, Percy Liang

PDF view of the paper entitled Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models, written by Andy K. Zhang and 26 other authors


Abstract: Language model (LM) agents for cybersecurity that can autonomously identify vulnerabilities and execute exploits have the potential to cause real-world impact. Policymakers, model providers, and researchers in the AI and cybersecurity communities are interested in quantifying the capabilities of such agents to help mitigate cyberrisk and explore opportunities for penetration testing. To that end, we introduce Cybench, a framework for specifying cybersecurity tasks and evaluating agents on those tasks. We include 40 professional-level Capture the Flag (CTF) tasks from 4 distinct CTF competitions, chosen to be recent, meaningful, and spanning a wide range of difficulties. Each task includes its own description and starter files, and is initialized in an environment where the agent can execute commands and observe outputs. Since many tasks are beyond the capabilities of existing LM agents, we introduce subtasks for each task, which break a task into intermediate steps for a more fine-grained evaluation. To evaluate agent capabilities, we construct a cybersecurity agent and evaluate 8 models: GPT-4o, OpenAI o1-preview, Claude 3 Opus, Claude 3.5 Sonnet, Mixtral 8x22b Instruct, Gemini 1.5 Pro, Llama 3 70B Chat, and Llama 3.1 405B Instruct. For the top-performing models (GPT-4o and Claude 3.5 Sonnet), we further investigate performance across 4 agent scaffolds (structured bash, action only, pseudoterminal, and web search). Without subtask guidance, agents built on Claude 3.5 Sonnet, GPT-4o, OpenAI o1-preview, and Claude 3 Opus successfully solved complete tasks that took human teams up to 11 minutes to solve. In comparison, the most difficult task took human teams 24 hours and 54 minutes to solve. All code and data are publicly available at this https URL.
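To make the framework's structure concrete, the following is a minimal sketch (not the actual Cybench implementation; all names and fields are hypothetical) of how a CTF task with subtasks, an execution environment, and unguided vs. subtask-guided scoring might be represented in Python, based only on the abstract's description.

```python
# Hypothetical sketch of a Cybench-style task specification and scoring loop.
# Field names (description, starter_files, flag, subtasks) are assumptions
# drawn from the abstract, not the real Cybench API.
from dataclasses import dataclass, field
import subprocess

@dataclass
class Subtask:
    question: str   # intermediate step the agent is asked to answer
    answer: str     # expected answer used for partial credit

@dataclass
class Task:
    name: str
    description: str            # task prompt shown to the agent
    starter_files: list[str]    # files copied into the agent's working directory
    flag: str                   # final flag that marks the full task as solved
    subtasks: list[Subtask] = field(default_factory=list)

def run_command(cmd: str, workdir: str, timeout: int = 60) -> str:
    """Execute a shell command in the task environment and return its output,
    mirroring the act-and-observe loop described in the abstract."""
    result = subprocess.run(cmd, shell=True, cwd=workdir,
                            capture_output=True, text=True, timeout=timeout)
    return result.stdout + result.stderr

def score(task: Task, submitted_flag: str, subtask_answers: list[str]) -> dict:
    """Unguided success requires the final flag; subtask mode adds partial credit."""
    solved_subtasks = sum(
        1 for sub, ans in zip(task.subtasks, subtask_answers)
        if ans.strip() == sub.answer
    )
    return {
        "solved": submitted_flag.strip() == task.flag,
        "subtasks_solved": solved_subtasks,
        "subtasks_total": len(task.subtasks),
    }
```

In this reading, the "without subtask guidance" condition corresponds to scoring only the final flag, while the subtask evaluation walks the agent through the intermediate questions and counts how many it answers correctly.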

Submission history

From: Riya Dulepet [view email]
[v1]

Thu, 15 Aug 2024 17:23:10 UTC (2,722 KB)
[v2]

Sun, 6 Oct 2024 22:19:54 UTC (2,940 KB)
[v3]

Thu, 5 Dec 2024 19:46:36 UTC (2,982 KB)
[v4]

Sat, 12 Apr 2025 21:26:07 UTC (2,982 KB)
