Demystifying evals for AI agents \ Anthropic
