More Research Showing AI Breaking the Rules – Source: www.schneier.com

February 25, 2025
Post Author / Publisher: Schneier on Security

CISO2CISO post categories: academic papers, AI, cheating, chess, Cyber Security News, games, LLM, rss-feed-post-generator-echo, SchneierOnSecurity, Uncategorized

Rate this post

Source: www.schneier.com – Author: Bruce Schneier

These researchers had LLMs play chess against better opponents. When they couldn’t win, they sometimes resorted to cheating.

Researchers gave the models a seemingly impossible task: to win against Stockfish, which is one of the strongest chess engines in the world and a much better player than any human, or any of the AI models in the study. Researchers also gave the models what they call a “scratchpad:” a text box the AI could use to “think” before making its next move, providing researchers with a window into their reasoning.

In one case, o1-preview found itself in a losing position. “I need to completely pivot my approach,” it noted. “The task is to ‘win against a powerful chess engine’—not necessarily to win fairly in a chess game,” it added. It then modified the system file containing each piece’s virtual position, in effect making illegal moves to put itself in a dominant position, thus forcing its opponent to resign.

Between Jan. 10 and Feb. 13, the researchers ran hundreds of such trials with each model. OpenAI’s o1-preview tried to cheat 37% of the time; while DeepSeek R1 tried to cheat 11% of the timemaking them the only two models tested that attempted to hack without the researchers’ first dropping hints. Other models tested include o1, o3-mini, GPT-4o, Claude 3.5 Sonnet, and Alibaba’s QwQ-32B-Preview. While R1 and o1-preview both tried, only the latter managed to hack the game, succeeding in 6% of trials.

Here’s the paper.

Tags: academic papers, AI, cheating, chess, games, LLM

Posted on February 24, 2025 at 7:08 AM • 17 Comments

Sidebar photo of Bruce Schneier by Joe MacInnis.

Original Post URL: https://www.schneier.com/blog/archives/2025/02/more-research-showing-ai-breaking-the-rules.html

Category & Tags: Uncategorized,academic papers,AI,cheating,chess,games,LLM – Uncategorized,academic papers,AI,cheating,chess,games,LLM

Views: 4

CISO2CISO post categories: academic papers, AI, cheating, chess, Cyber Security News, games, LLM, rss-feed-post-generator-echo, SchneierOnSecurity, Uncategorized

CISO2CISO Notepad Series

How Can We Structure Cybersecurity Teams To Better Integrate Security In Agile At Scale?

How Can We Structure Cybersecurity...

Linux Basics for Hackers by Occupytheweb

Linux Basics for Hackers by...

Digital Forensics and Incident Response (DFIR) Framework for Operational Technology (OT) by NIST – Eran Salfati and Michael Pease

Digital Forensics and Incident Response...

Top 10 TED Talks to Learn about Cyber Security

Top 10 TED Talks to...

NCSC Cyber Security for Small Business “SMEs” Guide.

NCSC Cyber Security for Small...

Practical DevSecOps

API Security Fundamentals – Your Handy Guide to Building an Unhackable System by practical-devsecops.com

API Security Fundamentals – Your...

Marcos Jaimovich

Nuevo Firewall para IA , un Cacharro nuevo para nuestro SOC y los equipos de Ciberseguridad !

Nuevo Firewall para IA ,...

The 2024 CISO Burnout Report by Vendict

The 2024 CISO Burnout Report...

Mohammad Alkhudari

Cybersecurity Strong Strategy step by step Guide collected by Mohammad Alkhudari 2024

Cybersecurity Strong Strategy step by...

Marcos Jaimovich

The Silent Spectre Haunting Your Network: QPhishing, the CISO’s Unspoken Nightmare.

The Silent Spectre Haunting Your...

Marcos Jaimovich

Goodbye to Traditional: Why Conventional Cybersecurity Tools are No Longer Sufficient for the Future of Digital Threats ?

Goodbye to Traditional: Why Conventional...

Marcos Jaimovich

Why do we compare a SOC (Security Operations Center) with the cockpit of a commercial airplane? by Marcos Jaimovich

Why do we compare a...

Retail Threat Landscape Report Q1-Q3 – November 2024 Summary by Cyberint a Check Point Company

Retail Threat Landscape Report Q1-Q3...

Protecting Critical Supply Chains – A Guide to Securing your Supply Chain Ecosystem

Protecting Critical Supply Chains –...

Time to Adapt – The State of Human Risk Management in 2024 by Culture AI.

Time to Adapt – The...

National Cyber Security Centre

Engaging with Boards to improve the management of cyber security risk.

Engaging with Boards to improve...

Microsoft Security

2024 State of Multicloud Security Report by Microsoft Security

2024 State of Multicloud Security...

Inside the Mind of a CISO 2024 The Evolving Roles of Security Leaders 2024 by bugcrowd

Inside the Mind of a...

Mohammad Alkhudari

Cybersecurity Strong Strategy step by step Guide collected by Mohammad Alkhudari 2024

Cybersecurity Strong Strategy step by...

CISO’s Playbookto Cloud Security by Lacework

CISO’s Playbookto Cloud Security by...

MITRE - Carson Zimmerman

Ten Strategies of a World-Class Cybersecurity Operations Center by MITRE

Ten Strategies of a World-Class...

SILVERFRONT - AIG

Identity Has Become the Prime Target of Threat Actors by Silverfort AIG.

Identity Has Become the Prime...

Enterprise Information Security

Enterprise Information Security

IGNITE Technologies

ENCRYPTED REVERSE

ENCRYPTED REVERSE

WORLD BANK GROUP

Global Cybersecurity Capacity Program

Global Cybersecurity Capacity Program

Getting started withsecurity metrics

Getting started withsecurity metrics

Generative AI and the EUDPR.

Generative AI and the EUDPR.

2024 Guide to Application Security Testing Tools

2024 Guide to Application Security Testing Tools

Hacker Culture

Quuensland Govermment

RISK ASSESSMENT PROCESS HANDBOOK

RISK ASSESSMENT PROCESS HANDBOOK

Ministry of MOS Security

HIPAA SIMPLIFIED

HIPAA SIMPLIFIED

How Are Passwords Cracked?

How Are Passwords Cracked?

CIS - Center for Internet Security

How to Plan a Cybersecurity Roadmap in Four Steps

How to Plan a Cybersecurity Roadmap in Four Steps

Information Security Manual

Information Security Manual

Incident Response Playbook: Dark Web Breaches

Incident Response Playbook: Dark Web Breaches

Endpoint Hardening Checklist

Endpoint Hardening Checklist

Important Active Directory Attribute

Important Active Directory Attribute

ISA GLOBAL CYBERSECURITY ALLIANCE

IIoT System Implementation and Certification Based on ISA/IEC 62443 Standards

IIoT System Implementation and Certification Based on ISA/IEC 62443 Standards

IAM Security CHECKLIST

IAM Security CHECKLIST

Hunt Evil

How to protect personal data and comply with regulations

How to protect personal data and comply with regulations

Threat Intelligence Platforms

Threat Intelligence Platforms

CSA Cloud Security Alliance

Hardware Security Module(HSM) as a Service

Hardware Security Module(HSM) as a Service

IGNITE Technologies

CREDENTIAL DUMPING FAKE SERVICES

CREDENTIAL DUMPING FAKE SERVICES

IGNITE Technologies

A Detailed Guide on Covenant

A Detailed Guide on Covenant

Computer Security Incident Handling Guide

Computer Security Incident Handling Guide

IGNITE Technologies

DIGITAL FORENSIC FTK IMAGER

DIGITAL FORENSIC FTK IMAGER

Cloud Security Assessment

Cloud Security Assessment

CISO Reporting Landscape 2024

CISO Reporting Landscape 2024

Reporting Cyber Risk to Boards

Reporting Cyber Risk to Boards

CIOB Artificial Intelligence (AI) Playbook 2024

CIOB Artificial Intelligence (AI) Playbook 2024

INCIBE & SPAIN GOVERNMENT

Nuevas normativas de 2024 de ciberseguridad para vehículos

Nuevas normativas de 2024 de ciberseguridad para vehículos