Runner Module

The runner module executes evaluations using Inspect AI.

It supports two modes:

  • JSON mode (--json): reads an eval_set.json manifest emitted by the Dart CLI

  • Direct args mode (--task, --model, etc.): runs a single task specified by command-line flags
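
The choice between the two modes can be illustrated with a minimal sketch. It is not the runner's actual implementation: only the flags --json, --task, and --model come from the description above, while the function names (build_parser, resolve_mode), the "runner" program name, the returned structure, and the rule that the two modes are mutually exclusive are assumptions made for illustration. The sketch stops at deciding the mode and does not call Inspect AI.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser covering both modes (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="runner")
    # JSON mode: path to an eval_set.json manifest.
    parser.add_argument("--json", dest="manifest", metavar="PATH")
    # Direct args mode: a single task and, optionally, a model.
    parser.add_argument("--task")
    parser.add_argument("--model")
    return parser


def resolve_mode(argv):
    """Return (mode, options) for the given argument list.

    Assumption: the two modes are mutually exclusive and one is required.
    """
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.manifest and args.task:
        parser.error("--json and --task cannot be combined")
    if args.manifest:
        return "json", {"manifest": args.manifest}
    if args.task:
        return "direct", {"task": args.task, "model": args.model}
    # parser.error prints a usage message and exits with status 2.
    parser.error("provide either --json or --task")
```

For example, resolve_mode(["--json", "eval_set.json"]) selects JSON mode, while resolve_mode(["--task", "my_task", "--model", "some/model"]) selects direct args mode; the task and model values here are placeholders.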