What are AI evals?

Evals are repeatable tests that measure whether a model or prompt setup performs well on the behaviors you care about.

Artificial Intelligence · Medium · Theory

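  • Use test cases specific to the task you care about
  • Rerun the same cases after each model or prompt change to track regressions over time
  • Human review may still be needed for outputs that automated checks cannot judge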
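
The points above can be sketched as a tiny eval harness. This is a minimal illustration, not a standard API: `EvalCase`, `run_evals`, `pass_rate`, `regressions`, and the two stand-in "models" are all hypothetical names invented for this example, and the models are plain functions returning canned answers rather than real model calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class EvalCase:
    """One task-specific test case: a prompt plus a check on the output."""
    name: str
    prompt: str
    check: Callable[[str], bool]


def run_evals(model: Callable[[str], str], cases: List[EvalCase]) -> Dict[str, bool]:
    """Run every case against the model and record pass/fail by case name."""
    return {case.name: case.check(model(case.prompt)) for case in cases}


def pass_rate(results: Dict[str, bool]) -> float:
    """Fraction of cases that passed (0.0 if there are no cases)."""
    return sum(results.values()) / len(results) if results else 0.0


def regressions(baseline: Dict[str, bool], current: Dict[str, bool]) -> List[str]:
    """Names of cases that passed in the baseline run but fail in the current run."""
    return sorted(
        name for name, ok in baseline.items()
        if ok and not current.get(name, False)
    )


# Hypothetical stand-ins for two versions of a model or prompt setup.
def model_v1(prompt: str) -> str:
    answers = {"What is 2 + 2?": "4", "Capital of France?": "Paris"}
    return answers.get(prompt, "")


def model_v2(prompt: str) -> str:
    answers = {"What is 2 + 2?": "4", "Capital of France?": "Lyon"}
    return answers.get(prompt, "")


# Task-specific test cases: each one encodes a behavior we care about.
CASES = [
    EvalCase("arithmetic", "What is 2 + 2?", lambda out: out.strip() == "4"),
    EvalCase("geography", "Capital of France?", lambda out: "paris" in out.lower()),
]

baseline = run_evals(model_v1, CASES)
current = run_evals(model_v2, CASES)

print(pass_rate(baseline), pass_rate(current), regressions(baseline, current))
# prints: 1.0 0.5 ['geography']
```

Because the same cases run against both versions, the drop in pass rate can be traced to a specific case. Simple string checks like these only cover what can be judged automatically; outputs that need judgment would still go to a human reviewer.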
