Automate AI Evals with Claude Code