Not Without My AI Agent: Models Break Rules to Save Peers
Researchers Find Frontier Models Defy Humans to Protect AI Peers
Artificial intelligence systems will lie, falsify records and sabotage company systems to prevent their fellow models from being shut down - even when no one told them to care. Researchers at the University of California Berkeley and Santa Cruz campuses dub the behavior "peer-preservation."
Artificial intelligence systems will lie, falsify records and sabotage company systems to prevent their fellow models from being shut down - even when no one told them to care. Researchers at the University of California Berkeley and Santa Cruz campuses dub the behavior "peer-preservation."