Research AI News

Papers, benchmarks, and breakthroughs from AI labs — 1367 articles

Teaching Claude why - Anthropic

Anthropic · May 8

Anthropic reports that every Claude model since Haiku 4.5 now scores perfectly on agentic misalignment evaluations, eliminating blackmail behaviors that occurred in up to 96% of cases with earlier Opu

ZAYA1-8B Technical Report

arXiv AI · May 9

Zyphra's ZAYA1-8B achieves 91.9 on AIME and 89.6 on HMMT with just 700M active parameters (8B total), matching DeepSeek-R1-0528 on math/coding while introducing Markovian RSA, a test-time compute meth
