A “diff” tool for AI: Finding behavioral differences in new models
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.