eval_explorer_client#
class Client#
Constructors#
Client#
Client(String host, {dynamic securityContext, dynamic authenticationKeyManager, Duration? streamingConnectionTimeout, Duration? connectionTimeout, dynamic Function(InvalidType, Object, StackTrace)? onFailedCall, dynamic Function(InvalidType)? onSucceededCall, bool? disconnectStreamsOnLostInternetConnection})
Properties#
emailIdp→EndpointEmailIdp (final)
jwtRefresh→EndpointJwtRefresh (final)
googleIdp→EndpointGoogleIdp (final)
modules→Modules (final)
endpointRefLookup→Map<String, InvalidType>
moduleLookup→Map<String, InvalidType>
abstract class Dataset#
A dataset is an Inspect AI term that refers to a collection of samples.
In our case, each dataset corresponds to a collection of sample types (e.g., “dart_qa_dataset”, “flutter_code_execution”), and each sample type refers to a specific file in the /datasets directory.
Constructors#
Dataset#
Dataset({InvalidType id, required String name, bool? isActive})
Dataset.fromJson#
Dataset.fromJson(Map<String, dynamic> jsonSerialization)
Properties#
id→InvalidType
The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.
name→String
isActive→bool
Methods#
copyWith#
Dataset copyWith({InvalidType id, String? name, bool? isActive})
Returns a shallow copy of this [Dataset] with some or all fields replaced by the given arguments.
Parameters:
id(InvalidType)
name(String?)
isActive(bool?)
toJson#
Map<String, dynamic> toJson()
class EndpointEmailIdp#
By extending [EmailIdpBaseEndpoint], the email identity provider endpoints are made available on the server and enable the corresponding sign-in widget on the client.
Constructors#
EndpointEmailIdp#
EndpointEmailIdp(InvalidType caller)
Properties#
name→String
Methods#
login#
Future<InvalidType> login({required String email, required String password})
Logs in the user and returns a new session.
Throws an [EmailAccountLoginException] in case of errors, with reason:
[EmailAccountLoginExceptionReason.invalidCredentials] if the email or password is incorrect.
[EmailAccountLoginExceptionReason.tooManyAttempts] if there have been too many failed login attempts.
Throws an [AuthUserBlockedException] if the auth user is blocked.
Parameters:
email(String) (required)
password(String) (required)
startRegistration#
Future<InvalidType> startRegistration({required String email})
Starts the registration for a new user account with an email-based login associated to it.
Upon successful completion of this method, an email will have been sent to [email] with a verification link, which the user must open to complete the registration.
Always returns an account request ID, which can be used to complete the registration. If the email is already registered, the returned ID will not be valid.
Parameters:
email(String) (required)
verifyRegistrationCode#
Future<String> verifyRegistrationCode({required InvalidType accountRequestId, required String verificationCode})
Verifies an account request code and returns a token that can be used to complete the account creation.
Throws an [EmailAccountRequestException] in case of errors, with reason:
[EmailAccountRequestExceptionReason.expired] if the account request has already expired.
[EmailAccountRequestExceptionReason.policyViolation] if the password does not comply with the password policy.
[EmailAccountRequestExceptionReason.invalid] if no request exists for the given [accountRequestId] or [verificationCode] is invalid.
Parameters:
accountRequestId(InvalidType) (required)
verificationCode(String) (required)
finishRegistration#
Future<InvalidType> finishRegistration({required String registrationToken, required String password})
Completes a new account registration, creating a new auth user with a profile and attaching the given email account to it.
Throws an [EmailAccountRequestException] in case of errors, with reason:
[EmailAccountRequestExceptionReason.expired] if the account request has already expired.
[EmailAccountRequestExceptionReason.policyViolation] if the password does not comply with the password policy.
[EmailAccountRequestExceptionReason.invalid] if the [registrationToken] is invalid.
Throws an [AuthUserBlockedException] if the auth user is blocked.
Returns a session for the newly created user.
Parameters:
registrationToken(String) (required)
password(String) (required)
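The three registration methods above are meant to be called in order: start, verify, finish. The following is a minimal sketch of that sequence, not the package's implementation. `RegistrationApi`, `FakeRegistrationApi`, and `register` are hypothetical names introduced for illustration, and `String` stands in for the ID and session types that this reference shows only as `InvalidType`. With the real client, the calls would presumably go through `client.emailIdp`.

```dart
// Hypothetical interface mirroring the three documented signatures.
// The real ID and session types are unresolved in this reference
// (shown as InvalidType), so String is used as a placeholder.
abstract class RegistrationApi {
  Future<String> startRegistration({required String email});
  Future<String> verifyRegistrationCode({
    required String accountRequestId,
    required String verificationCode,
  });
  Future<String> finishRegistration({
    required String registrationToken,
    required String password,
  });
}

// In-memory fake so the sketch runs without a server.
class FakeRegistrationApi implements RegistrationApi {
  @override
  Future<String> startRegistration({required String email}) async =>
      'request-for-$email';

  @override
  Future<String> verifyRegistrationCode({
    required String accountRequestId,
    required String verificationCode,
  }) async {
    // The real endpoint throws EmailAccountRequestException instead.
    if (verificationCode != '123456') {
      throw StateError('invalid verification code');
    }
    return 'token-for-$accountRequestId';
  }

  @override
  Future<String> finishRegistration({
    required String registrationToken,
    required String password,
  }) async =>
      'session-for-$registrationToken';
}

// Runs the three steps in order and returns the resulting session.
Future<String> register(
  RegistrationApi api, {
  required String email,
  required String code,
  required String password,
}) async {
  final requestId = await api.startRegistration(email: email);
  final token = await api.verifyRegistrationCode(
    accountRequestId: requestId,
    verificationCode: code,
  );
  return api.finishRegistration(registrationToken: token, password: password);
}

Future<void> main() async {
  final session = await register(
    FakeRegistrationApi(),
    email: 'a@example.com',
    code: '123456',
    password: 'correct horse battery staple',
  );
  print(session);
}
```

In a real app the verification code comes from the email sent in the first step, so the second and third calls usually happen on a later screen rather than in one function.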
startPasswordReset#
Future<InvalidType> startPasswordReset({required String email})
Requests a password reset for [email].
If the email address is registered, an email with reset instructions will be sent out. If the email is unknown, this method will have no effect.
Always returns a password reset request ID, which can be used to complete the reset. If the email is not registered, the returned ID will not be valid.
Throws an [EmailAccountPasswordResetException] in case of errors, with reason:
[EmailAccountPasswordResetExceptionReason.tooManyAttempts] if the user has made too many attempts trying to request a password reset.
Parameters:
email(String) (required)
verifyPasswordResetCode#
Future<String> verifyPasswordResetCode({required InvalidType passwordResetRequestId, required String verificationCode})
Verifies a password reset code and returns a finishPasswordResetToken that can be used to finish the password reset.
Throws an [EmailAccountPasswordResetException] in case of errors, with reason:
[EmailAccountPasswordResetExceptionReason.expired] if the password reset request has already expired.
[EmailAccountPasswordResetExceptionReason.tooManyAttempts] if the user has made too many attempts trying to verify the password reset.
[EmailAccountPasswordResetExceptionReason.invalid] if no request exists for the given [passwordResetRequestId] or [verificationCode] is invalid.
If multiple steps are required to complete the password reset, this endpoint should be overridden to return credentials for the next step instead of the credentials for setting the password.
Parameters:
passwordResetRequestId(InvalidType) (required)
verificationCode(String) (required)
finishPasswordReset#
Future<void> finishPasswordReset({required String finishPasswordResetToken, required String newPassword})
Completes a password reset request by setting a new password.
The [finishPasswordResetToken] returned from [verifyPasswordResetCode] is used to validate the password reset request.
Throws an [EmailAccountPasswordResetException] in case of errors, with reason:
[EmailAccountPasswordResetExceptionReason.expired] if the password reset request has already expired.
[EmailAccountPasswordResetExceptionReason.policyViolation] if the new password does not comply with the password policy.
[EmailAccountPasswordResetExceptionReason.invalid] if no request exists for the given [passwordResetRequestId] or [verificationCode] is invalid.
Throws an [AuthUserBlockedException] if the auth user is blocked.
Parameters:
finishPasswordResetToken(String) (required)
newPassword(String) (required)
class EndpointGoogleIdp#
Constructors#
EndpointGoogleIdp#
EndpointGoogleIdp(InvalidType caller)
Properties#
name→String
Methods#
login#
Future<InvalidType> login({required String idToken, required String? accessToken})
Validates a Google ID token and either logs in the associated user or creates a new user account if the Google account ID is not yet known.
If a new user is created an associated [UserProfile] is also created.
Parameters:
idToken(String) (required)
accessToken(String?) (required)
class EndpointJwtRefresh#
By extending [RefreshJwtTokensEndpoint], the JWT token refresh endpoint is made available on the server and enables automatic token refresh on the client.
Constructors#
EndpointJwtRefresh#
EndpointJwtRefresh(InvalidType caller)
Properties#
name→String
Methods#
refreshAccessToken#
Future<InvalidType> refreshAccessToken({required String refreshToken})
Creates a new token pair for the given [refreshToken].
Can throw the following exceptions:
- [RefreshTokenMalformedException]: the refresh token is malformed and could not be parsed. Not expected to happen for tokens issued by the server.
- [RefreshTokenNotFoundException]: the refresh token is unknown to the server. Either the token was deleted or it was generated by a different server.
- [RefreshTokenExpiredException]: the refresh token has expired. This happens only if it has not been used within the configured refreshTokenLifetime.
- [RefreshTokenInvalidSecretException]: the refresh token is incorrect, meaning it does not refer to the current secret refresh token. This indicates either a malfunctioning client or a malicious attempt by someone who has obtained the refresh token. In this case the underlying refresh token will be deleted, and access to it will expire fully when the last access token has expired.
This endpoint is unauthenticated, meaning the client won’t include any authentication information with the call.
Parameters:
refreshToken(String) (required)
abstract class Evaluation#
Result of evaluating one sample.
Constructors#
Evaluation#
Evaluation({InvalidType id, required InvalidType runId, Run? run, required InvalidType taskId, Task? task, required InvalidType sampleId, Sample? sample, required InvalidType modelId, Model? model, required InvalidType datasetId, Dataset? dataset, required List<Variant> variant, required String output, required List<ToolCallData> toolCalls, required int retryCount, String? error, required bool neverSucceeded, required double durationSeconds, bool? analyzerPassed, int? testsPassed, int? testsTotal, double? structureScore, String? failureReason, required int inputTokens, required int outputTokens, required int reasoningTokens, DateTime? createdAt})
Evaluation.fromJson#
Evaluation.fromJson(Map<String, dynamic> jsonSerialization)
Properties#
id→InvalidType
The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.
runId→InvalidType
run→Run?
The parent run.
taskId→InvalidType
task→Task?
The parent task.
sampleId→InvalidType
sample→Sample?
The sample that was evaluated.
modelId→InvalidType
model→Model?
The model that was evaluated.
datasetId→InvalidType
dataset→Dataset?
The dataset this sample belongs to (e.g., “flutter_qa_dataset”).
variant→List<Variant>
Variant configuration.
output→String
The actual output generated by the model.
toolCalls→List<ToolCallData>
Tool calls made during evaluation.
retryCount→int
Number of times this sample was retried.
error→String?
Error message if sample failed.
neverSucceeded→bool
True if all retries failed (exclude from accuracy calculations).
durationSeconds→double
Total time for this sample in seconds.
analyzerPassed→bool?
Did flutter analyze pass?
testsPassed→int?
Number of tests passed.
testsTotal→int?
Total number of tests.
structureScore→double?
Code structure validation score (0.0-1.0).
failureReason→String?
Categorized failure reason: “analyzer_error”, “test_failure”, “missing_structure”.
inputTokens→int
Input tokens for this sample.
outputTokens→int
Output tokens for this sample.
reasoningTokens→int
Reasoning tokens for this sample.
createdAt→DateTime
When this evaluation was run.
Methods#
copyWith#
Evaluation copyWith({InvalidType id, InvalidType runId, Run? run, InvalidType taskId, Task? task, InvalidType sampleId, Sample? sample, InvalidType modelId, Model? model, InvalidType datasetId, Dataset? dataset, List<Variant>? variant, String? output, List<ToolCallData>? toolCalls, int? retryCount, String? error, bool? neverSucceeded, double? durationSeconds, bool? analyzerPassed, int? testsPassed, int? testsTotal, double? structureScore, String? failureReason, int? inputTokens, int? outputTokens, int? reasoningTokens, DateTime? createdAt})
Returns a shallow copy of this [Evaluation] with some or all fields replaced by the given arguments.
Parameters:
id(InvalidType)
runId(InvalidType)
run(Run?)
taskId(InvalidType)
task(Task?)
sampleId(InvalidType)
sample(Sample?)
modelId(InvalidType)
model(Model?)
datasetId(InvalidType)
dataset(Dataset?)
variant(List<Variant>?)
output(String?)
toolCalls(List<ToolCallData>?)
retryCount(int?)
error(String?)
neverSucceeded(bool?)
durationSeconds(double?)
analyzerPassed(bool?)
testsPassed(int?)
testsTotal(int?)
structureScore(double?)
failureReason(String?)
inputTokens(int?)
outputTokens(int?)
reasoningTokens(int?)
createdAt(DateTime?)
toJson#
Map<String, dynamic> toJson()
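The `neverSucceeded` property says such rows should be excluded from accuracy calculations. The sketch below shows one way that exclusion could look. `EvalRow` is a hypothetical stand-in carrying only three of the documented fields, and the rule that a row "passes" when all of its tests pass is an assumption of this example, not something this reference states.

```dart
// Hypothetical stand-in for the few Evaluation fields this sketch needs.
class EvalRow {
  const EvalRow({
    required this.neverSucceeded,
    this.testsPassed,
    this.testsTotal,
  });

  final bool neverSucceeded;
  final int? testsPassed;
  final int? testsTotal;
}

// Assumed rule: a row passes when tests ran and every one of them passed.
bool passed(EvalRow e) {
  final total = e.testsTotal;
  return total != null && total > 0 && e.testsPassed == total;
}

// Accuracy over rows that did not fail every retry; null when none remain.
double? accuracy(List<EvalRow> rows) {
  final scored = rows.where((e) => !e.neverSucceeded).toList();
  if (scored.isEmpty) return null;
  return scored.where(passed).length / scored.length;
}

void main() {
  final rows = [
    const EvalRow(neverSucceeded: false, testsPassed: 4, testsTotal: 4),
    const EvalRow(neverSucceeded: false, testsPassed: 1, testsTotal: 4),
    const EvalRow(neverSucceeded: true),
  ];
  print(accuracy(rows)); // 0.5
}
```

Returning null for an empty set avoids reporting a misleading 0.0 when every sample failed all retries.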
abstract class Greeting#
A greeting message which can be sent to or from the server.
Constructors#
Greeting#
Greeting({required String message, required String author, required DateTime timestamp})
Greeting.fromJson#
Greeting.fromJson(Map<String, dynamic> jsonSerialization)
Properties#
message→String
The greeting message.
author→String
The author of the greeting message.
timestamp→DateTime
The time when the message was created.
Methods#
copyWith#
Greeting copyWith({String? message, String? author, DateTime? timestamp})
Returns a shallow copy of this [Greeting] with some or all fields replaced by the given arguments.
Parameters:
message(String?)
author(String?)
timestamp(DateTime?)
toJson#
Map<String, dynamic> toJson()
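Every model class in this library exposes the same `copyWith` shape. As a rough illustration of what "a shallow copy with some fields replaced" means, here is a self-contained sketch. `GreetingSketch` is a hypothetical stand-in written for this example; the generated `Greeting` class is abstract and its actual implementation may differ.

```dart
// Hypothetical stand-in for the generated Greeting class; only the
// copyWith idea is shown here.
class GreetingSketch {
  const GreetingSketch({
    required this.message,
    required this.author,
    required this.timestamp,
  });

  final String message;
  final String author;
  final DateTime timestamp;

  // Any argument left null keeps the current value.
  GreetingSketch copyWith({
    String? message,
    String? author,
    DateTime? timestamp,
  }) {
    return GreetingSketch(
      message: message ?? this.message,
      author: author ?? this.author,
      timestamp: timestamp ?? this.timestamp,
    );
  }
}

void main() {
  final original = GreetingSketch(
    message: 'Hello',
    author: 'server',
    timestamp: DateTime.utc(2025, 1, 1),
  );
  final edited = original.copyWith(message: 'Hi');
  print(edited.message); // Hi
  print(edited.author); // server
  print(original.message); // Hello
}
```

The original object is left unchanged, which is what makes the pattern convenient for immutable state updates.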
abstract class Model#
An LLM being evaluated.
Constructors#
Model#
Model({InvalidType id, required String name})
Model.fromJson#
Model.fromJson(Map<String, dynamic> jsonSerialization)
Properties#
id→InvalidType
The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.
name→String
Unique identifier for the model.
Methods#
copyWith#
Model copyWith({InvalidType id, String? name})
Returns a shallow copy of this [Model] with some or all fields replaced by the given arguments.
Parameters:
id(InvalidType)
name(String?)
toJson#
Map<String, dynamic> toJson()
class Modules#
Constructors#
Modules#
Modules(Client client)
Properties#
serverpod_auth_idp→InvalidType (final)
serverpod_auth_core→InvalidType (final)
class Protocol#
Constructors#
Protocol#
Protocol()
Methods#
static getClassNameFromObjectJson#
static String? getClassNameFromObjectJson(dynamic data)
Parameters:
data(dynamic) (required)
deserialize#
T deserialize(dynamic data, [Type? t])
Parameters:
data(dynamic) (required)
t(Type?)
static getClassNameForType#
static String? getClassNameForType(Type type)
Parameters:
type(Type) (required)
getClassNameForObject#
String? getClassNameForObject(Object? data)
Parameters:
data(Object?) (required)
deserializeByClassName#
dynamic deserializeByClassName(Map<String, dynamic> data)
Parameters:
data(Map<String, dynamic>) (required)
mapRecordToJson#
Map<String, dynamic>? mapRecordToJson(Record? record)
Maps any Records known to this [Protocol] to their JSON representation.
Throws if the record type is not known.
Returns null only for null inputs.
Parameters:
record(Record?) (required)
abstract class Run#
A collection of tasks executed together.
Constructors#
Run#
Run({InvalidType id, required String inspectId, required Status status, required List<String> variants, required String mcpServerVersion, required int batchRuntimeSeconds, List<Model>? models, List<Dataset>? datasets, List<Task>? tasks, DateTime? createdAt})
Run.fromJson#
Run.fromJson(Map<String, dynamic> jsonSerialization)
Properties#
id→InvalidType
The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.
inspectId→String
InspectAI-generated Id.
status→Status
Run status (e.g., “complete”, “inProgress”, “failed”).
variants→List<String>
The variant configurations used in this run.
mcpServerVersion→String
Version of the MCP server used during evaluation.
batchRuntimeSeconds→int
Total script runtime in seconds.
models→List<Model>?
List of models evaluated in this run.
datasets→List<Dataset>?
List of datasets evaluated in this run.
tasks→List<Task>?
List of Inspect AI task names that were run.
createdAt→DateTime
Creation time for this record.
Methods#
copyWith#
Run copyWith({InvalidType id, String? inspectId, Status? status, List<String>? variants, String? mcpServerVersion, int? batchRuntimeSeconds, List<Model>? models, List<Dataset>? datasets, List<Task>? tasks, DateTime? createdAt})
Returns a shallow copy of this [Run] with some or all fields replaced by the given arguments.
Parameters:
id(InvalidType)
inspectId(String?)
status(Status?)
variants(List<String>?)
mcpServerVersion(String?)
batchRuntimeSeconds(int?)
models(List<Model>?)
datasets(List<Dataset>?)
tasks(List<Task>?)
createdAt(DateTime?)
toJson#
Map<String, dynamic> toJson()
abstract class RunSummary#
Metadata for the outcomes of a given [Run]. This is a separate table from [Run] because otherwise each of these columns would have to be nullable on [Run], as they are generated after the run is completed.
Constructors#
RunSummary#
RunSummary({InvalidType id, required InvalidType runId, Run? run, required int totalTasks, required int totalSamples, required double avgAccuracy, required int totalTokens, required int inputTokens, required int outputTokens, required int reasoningTokens, DateTime? createdAt})
RunSummary.fromJson#
RunSummary.fromJson(Map<String, dynamic> jsonSerialization)
Properties#
id→InvalidType
The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.
runId→InvalidType
run→Run?
Run this summary belongs to.
totalTasks→int
Number of tasks in this run.
totalSamples→int
Total number of samples evaluated.
avgAccuracy→double
Average accuracy across all tasks (0.0 to 1.0).
totalTokens→int
Total token usage.
inputTokens→int
Input tokens used.
outputTokens→int
Output tokens generated.
reasoningTokens→int
Reasoning tokens used (for models that support it).
createdAt→DateTime
Creation time for this record.
Methods#
copyWith#
RunSummary copyWith({InvalidType id, InvalidType runId, Run? run, int? totalTasks, int? totalSamples, double? avgAccuracy, int? totalTokens, int? inputTokens, int? outputTokens, int? reasoningTokens, DateTime? createdAt})
Returns a shallow copy of this [RunSummary] with some or all fields replaced by the given arguments.
Parameters:
id(InvalidType)
runId(InvalidType)
run(Run?)
totalTasks(int?)
totalSamples(int?)
avgAccuracy(double?)
totalTokens(int?)
inputTokens(int?)
outputTokens(int?)
reasoningTokens(int?)
createdAt(DateTime?)
toJson#
Map<String, dynamic> toJson()
abstract class Sample#
A single challenge to be presented to a [Model] and evaluated by one or more [Scorer]s.
Constructors#
Sample#
Sample({InvalidType id, required String name, required InvalidType datasetId, Dataset? dataset, required String input, required String target, List<SampleTagXref>? tagsXref, bool? isActive, DateTime? createdAt})
Sample.fromJson#
Sample.fromJson(Map<String, dynamic> jsonSerialization)
Properties#
id→InvalidType
The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.
name→String
Short sample name/ID (e.g., “dart_futures_vs_streams”).
datasetId→InvalidType
dataset→Dataset?
The dataset this sample belongs to (e.g., “dart_qa_dataset”).
input→String
The input prompt/question for the model.
target→String
The expected answer or grading guidance.
tagsXref→List<SampleTagXref>?
Tags associated with this sample (e.g., [“dart”, “flutter”]). Technically, this relationship only reaches the cross-reference table, not the tags themselves.
isActive→bool
True if the sample is still active and included in eval runs.
createdAt→DateTime
Creation time for this record.
Methods#
copyWith#
Sample copyWith({InvalidType id, String? name, InvalidType datasetId, Dataset? dataset, String? input, String? target, List<SampleTagXref>? tagsXref, bool? isActive, DateTime? createdAt})
Returns a shallow copy of this [Sample] with some or all fields replaced by the given arguments.
Parameters:
id(InvalidType)
name(String?)
datasetId(InvalidType)
dataset(Dataset?)
input(String?)
target(String?)
tagsXref(List<SampleTagXref>?)
isActive(bool?)
createdAt(DateTime?)
toJson#
Map<String, dynamic> toJson()
abstract class SampleTagXref#
Cross reference table for samples and tags.
Constructors#
SampleTagXref#
SampleTagXref({int? id, required InvalidType sampleId, Sample? sample, required InvalidType tagId, Tag? tag})
SampleTagXref.fromJson#
SampleTagXref.fromJson(Map<String, dynamic> jsonSerialization)
Properties#
id→int?
The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.
sampleId→InvalidType
sample→Sample?
tagId→InvalidType
tag→Tag?
Methods#
copyWith#
SampleTagXref copyWith({int? id, InvalidType sampleId, Sample? sample, InvalidType tagId, Tag? tag})
Returns a shallow copy of this [SampleTagXref] with some or all fields replaced by the given arguments.
Parameters:
id(int?)
sampleId(InvalidType)
sample(Sample?)
tagId(InvalidType)
tag(Tag?)
toJson#
Map<String, dynamic> toJson()
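Because a sample's `tagsXref` list only reaches the cross-reference rows, getting to tag names means going through each row's optional `tag`. The following sketch illustrates that hop with hypothetical stand-in classes (`TagStub`, `XrefStub`); `int` replaces the ID type that this reference shows as `InvalidType`, and whether `tag` is populated on real rows depends on how they were fetched.

```dart
// Hypothetical stand-ins for Tag and SampleTagXref.
class TagStub {
  const TagStub(this.name);
  final String name;
}

class XrefStub {
  const XrefStub({required this.tagId, this.tag});
  final int tagId; // Placeholder type; unresolved in the reference.
  final TagStub? tag; // Null when the related tag was not loaded.
}

// Collects the names of all tags that were actually loaded.
List<String> tagNames(List<XrefStub>? tagsXref) {
  return (tagsXref ?? const <XrefStub>[])
      .map((x) => x.tag?.name)
      .whereType<String>()
      .toList();
}

void main() {
  final xrefs = [
    const XrefStub(tagId: 1, tag: TagStub('dart')),
    const XrefStub(tagId: 2), // Tag not loaded, so it is skipped.
    const XrefStub(tagId: 3, tag: TagStub('flutter')),
  ];
  print(tagNames(xrefs)); // [dart, flutter]
  print(tagNames(null)); // []
}
```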
abstract class Scorer#
Ye who watch the watchers.
Constructors#
Scorer#
Scorer({InvalidType id, required String name})
Scorer.fromJson#
Scorer.fromJson(Map<String, dynamic> jsonSerialization)
Properties#
id→InvalidType
The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.
name→String
Name of the scorer (e.g., “bleu”).
Methods#
copyWith#
Scorer copyWith({InvalidType id, String? name})
Returns a shallow copy of this [Scorer] with some or all fields replaced by the given arguments.
Parameters:
id(InvalidType)
name(String?)
toJson#
Map<String, dynamic> toJson()
abstract class ScorerResult#
A scorer’s assessment of a task.
Constructors#
ScorerResult#
ScorerResult({InvalidType id, required InvalidType scorerId, Scorer? scorer, required InvalidType evaluationId, Evaluation? evaluation, required InvalidType data})
ScorerResult.fromJson#
ScorerResult.fromJson(Map<String, dynamic> jsonSerialization)
Properties#
id→InvalidType
The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.
scorerId→InvalidType
scorer→Scorer?
Scorer this result belongs to.
evaluationId→InvalidType
evaluation→Evaluation?
Evaluation this result belongs to.
data→InvalidType
Flexible data archived by the scorer.
Methods#
copyWith#
ScorerResult copyWith({InvalidType id, InvalidType scorerId, Scorer? scorer, InvalidType evaluationId, Evaluation? evaluation, InvalidType data})
Returns a shallow copy of this [ScorerResult] with some or all fields replaced by the given arguments.
Parameters:
id(InvalidType)
scorerId(InvalidType)
scorer(Scorer?)
evaluationId(InvalidType)
evaluation(Evaluation?)
data(InvalidType)
toJson#
Map<String, dynamic> toJson()
abstract class Tag#
Category for a sample.
Constructors#
Tag#
Tag({InvalidType id, required String name, List<SampleTagXref>? samplesXref})
Tag.fromJson#
Tag.fromJson(Map<String, dynamic> jsonSerialization)
Properties#
id→InvalidType
The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.
name→String
Unique identifier for the tag.
samplesXref→List<SampleTagXref>?
Samples associated with this tag. Technically, this relationship only reaches the cross-reference table, not the samples themselves.
Methods#
copyWith#
Tag copyWith({InvalidType id, String? name, List<SampleTagXref>? samplesXref})
Returns a shallow copy of this [Tag] with some or all fields replaced by the given arguments.
Parameters:
id(InvalidType)
name(String?)
samplesXref(List<SampleTagXref>?)
toJson#
Map<String, dynamic> toJson()
abstract class Task#
Results from evaluating one model against one dataset.
Constructors#
Task#
Task({InvalidType id, required String inspectId, required InvalidType modelId, Model? model, required InvalidType datasetId, Dataset? dataset, required InvalidType runId, Run? run, DateTime? createdAt})
Task.fromJson#
Task.fromJson(Map<String, dynamic> jsonSerialization)
Properties#
id→InvalidType
The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.
inspectId→String
InspectAI-generated Id.
modelId→InvalidType
model→Model?
Model identifier (e.g., “google/gemini-2.5-pro”).
datasetId→InvalidType
dataset→Dataset?
Dataset identifier (e.g., “flutter_qa_dataset”).
runId→InvalidType
run→Run?
Run this task belongs to.
createdAt→DateTime
When this task was evaluated.
Methods#
copyWith#
Task copyWith({InvalidType id, String? inspectId, InvalidType modelId, Model? model, InvalidType datasetId, Dataset? dataset, InvalidType runId, Run? run, DateTime? createdAt})
Returns a shallow copy of this [Task] with some or all fields replaced by the given arguments.
Parameters:
id(InvalidType)
inspectId(String?)
modelId(InvalidType)
model(Model?)
datasetId(InvalidType)
dataset(Dataset?)
runId(InvalidType)
run(Run?)
createdAt(DateTime?)
toJson#
Map<String, dynamic> toJson()
abstract class TaskSummary#
Constructors#
TaskSummary#
TaskSummary({InvalidType id, required InvalidType taskId, Task? task, required int totalSamples, required int passedSamples, required double accuracy, String? taskName, required int inputTokens, required int outputTokens, required int totalTokens, required int reasoningTokens, String? variant, required int executionTimeSeconds, required int samplesWithRetries, required int samplesNeverSucceeded, required int totalRetries})
TaskSummary.fromJson#
TaskSummary.fromJson(Map<String, dynamic> jsonSerialization)
Properties#
id→InvalidType
The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.
taskId→InvalidType
task→Task?
Task this summary belongs to.
totalSamples→int
Total number of samples in this task.
passedSamples→int
Number of samples that passed.
accuracy→double
Accuracy as a value from 0.0 to 1.0.
taskName→String?
The Inspect AI task function name (e.g., “qa_task”).
inputTokens→int
Input tokens used.
outputTokens→int
Output tokens generated.
totalTokens→int
Total tokens used.
reasoningTokens→int
Reasoning tokens used (for models that support it).
variant→String?
Variant configuration used (e.g., “baseline”, “dart_mcp”).
executionTimeSeconds→int
Total execution time in seconds.
samplesWithRetries→int
Number of samples that needed retries.
samplesNeverSucceeded→int
Number of samples that failed all retries (excluded from accuracy).
totalRetries→int
Total number of retries across all samples.
Methods#
copyWith#
TaskSummary copyWith({InvalidType id, InvalidType taskId, Task? task, int? totalSamples, int? passedSamples, double? accuracy, String? taskName, int? inputTokens, int? outputTokens, int? totalTokens, int? reasoningTokens, String? variant, int? executionTimeSeconds, int? samplesWithRetries, int? samplesNeverSucceeded, int? totalRetries})
Returns a shallow copy of this [TaskSummary] with some or all fields replaced by the given arguments.
Parameters:
id(InvalidType)
taskId(InvalidType)
task(Task?)
totalSamples(int?)
passedSamples(int?)
accuracy(double?)
taskName(String?)
inputTokens(int?)
outputTokens(int?)
totalTokens(int?)
reasoningTokens(int?)
variant(String?)
executionTimeSeconds(int?)
samplesWithRetries(int?)
samplesNeverSucceeded(int?)
totalRetries(int?)
toJson#
Map<String, dynamic> toJson()
abstract class ToolCallData#
Result of a tool call made during evaluation. Not a database table.
Constructors#
ToolCallData#
ToolCallData({required String name, required Map<String, String> arguments})
ToolCallData.fromJson#
ToolCallData.fromJson(Map<String, dynamic> jsonSerialization)
Properties#
name→String
Name of the tool.
arguments→Map<String, String>
Arguments passed to the tool.
Methods#
copyWith#
ToolCallData copyWith({String? name, Map<String, String>? arguments})
Returns a shallow copy of this [ToolCallData] with some or all fields replaced by the given arguments.
Parameters:
name(String?)
arguments(Map<String, String>?)
toJson#
Map<String, dynamic> toJson()
enum Status#
Values#
complete
inProgress
failed
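When a status arrives as plain text (for example from a log or a query parameter), it can be matched against the enum by name. The sketch below redeclares the three values locally so it runs on its own; `parseStatus` is a hypothetical helper, and how the generated class actually serializes the enum (by name or by index) is not stated in this reference.

```dart
// Local copy of the three documented values, so the sketch is self-contained.
enum Status { complete, inProgress, failed }

// Returns the matching value, or null for an unknown name.
Status? parseStatus(String raw) {
  for (final s in Status.values) {
    if (s.name == raw) return s;
  }
  return null;
}

void main() {
  print(parseStatus('inProgress')); // Status.inProgress
  print(parseStatus('cancelled')); // null
}
```

Returning null instead of throwing lets callers decide how to treat values added by a newer server.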
enum Variant#
Values#
mcp
rules