Subliminal Learning in AIs – Source: www.schneier.com

July 26, 2025
Post Author / Publisher: Schneier on Security

CISO2CISO post categories: academic papers, AI, Cyber Security News, Integrity, LLM, rss-feed-post-generator-echo, SchneierOnSecurity, trust, Uncategorized

Source: www.schneier.com – Author: Bruce Schneier

Today’s freaky LLM behavior:

We study subliminal learning, a surprising phenomenon where language models learn traits from model-generated data that is semantically unrelated to those traits. For example, a “student” model learns to prefer owls when trained on sequences of numbers generated by a “teacher” model that prefers owls. This same phenomenon can transmit misalignment through data that appears completely benign. This effect only occurs when the teacher and student share the same base model.

Interesting security implications.

I am more convinced than ever that we need serious research into AI integrity if we are ever going to have trustworthy AI.

Tags: academic papers, AI, integrity, LLM, trust

Posted on July 25, 2025 at 7:10 AM • 9 Comments

Original Post URL: https://www.schneier.com/blog/archives/2025/07/subliminal-learning-in-ais.html
