TL;DR Recent work from Goodfire & UK AISI – Verbalized Eval Awareness Inflates Measured Safety – shows that newer open-weight models verbalize evaluation-awareness (VEA) more often, and that this inflates measured safety. Between OLMo-3-32B-Think and OLMo-3.1-32B-Think – identical base, SFT, DPO, an
⚡
Key Insights
10 editorial insights.
AiFeed24 Team·⏱ 1 min read·News
Deep Analysis
Multi-Source Intelligence
Tags:#ai
Found this useful? Share it!



