eval_explorer_client#

class Client#

Constructors#

Client#

Client(String host, {dynamic securityContext, dynamic authenticationKeyManager, Duration? streamingConnectionTimeout, Duration? connectionTimeout, dynamic Function(InvalidType, Object, StackTrace)? onFailedCall, dynamic Function(InvalidType)? onSucceededCall, bool? disconnectStreamsOnLostInternetConnection})

Properties#

  • emailIdp (EndpointEmailIdp) (final)

  • jwtRefresh (EndpointJwtRefresh) (final)

  • googleIdp (EndpointGoogleIdp) (final)

  • modules (Modules) (final)

  • endpointRefLookup (Map<String, InvalidType>)

  • moduleLookup (Map<String, InvalidType>)


abstract class Dataset#

A dataset is an Inspect AI term that refers to a collection of samples.

In our case, each dataset corresponds to a collection of sample types (e.g., “dart_qa_dataset”, “flutter_code_execution”), and each sample type refers to a specific file in the /datasets directory.

Constructors#

Dataset#

Dataset({InvalidType id, required String name, bool? isActive})

Dataset.fromJson#

Dataset.fromJson(Map<String, dynamic> jsonSerialization)

Properties#

  • id (InvalidType)

    The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.

  • name (String)

  • isActive (bool)

Methods#

copyWith#

Dataset copyWith({InvalidType id, String? name, bool? isActive})

Returns a shallow copy of this [Dataset] with some or all fields replaced by the given arguments.

Parameters:

  • id (InvalidType)

  • name (String?)

  • isActive (bool?)

toJson#

Map<String, dynamic> toJson()

class EndpointEmailIdp#

By extending [EmailIdpBaseEndpoint], the email identity provider endpoints are made available on the server and enable the corresponding sign-in widget on the client.

Constructors#

EndpointEmailIdp#

EndpointEmailIdp(InvalidType caller)

Properties#

  • name (String)

Methods#

login#

Future<InvalidType> login({required String email, required String password})

Logs in the user and returns a new session.

Throws an [EmailAccountLoginException] in case of errors, with reason:

  • [EmailAccountLoginExceptionReason.invalidCredentials] if the email or password is incorrect.

  • [EmailAccountLoginExceptionReason.tooManyAttempts] if there have been too many failed login attempts.

Throws an [AuthUserBlockedException] if the auth user is blocked.

Parameters:

  • email (String) (required)

  • password (String) (required)

startRegistration#

Future<InvalidType> startRegistration({required String email})

Starts the registration for a new user account with an email-based login associated to it.

Upon successful completion of this method, an email will have been sent to [email] with a verification link, which the user must open to complete the registration.

Always returns an account request ID, which can be used to complete the registration. If the email is already registered, the returned ID will not be valid.

Parameters:

  • email (String) (required)

verifyRegistrationCode#

Future<String> verifyRegistrationCode({required InvalidType accountRequestId, required String verificationCode})

Verifies an account request code and returns a token that can be used to complete the account creation.

Throws an [EmailAccountRequestException] in case of errors, with reason:

  • [EmailAccountRequestExceptionReason.expired] if the account request has already expired.

  • [EmailAccountRequestExceptionReason.policyViolation] if the password does not comply with the password policy.

  • [EmailAccountRequestExceptionReason.invalid] if no request exists for the given [accountRequestId] or [verificationCode] is invalid.

Parameters:

  • accountRequestId (InvalidType) (required)

  • verificationCode (String) (required)

finishRegistration#

Future<InvalidType> finishRegistration({required String registrationToken, required String password})

Completes a new account registration, creating a new auth user with a profile and attaching the given email account to it.

Throws an [EmailAccountRequestException] in case of errors, with reason:

  • [EmailAccountRequestExceptionReason.expired] if the account request has already expired.

  • [EmailAccountRequestExceptionReason.policyViolation] if the password does not comply with the password policy.

  • [EmailAccountRequestExceptionReason.invalid] if the [registrationToken] is invalid.

Throws an [AuthUserBlockedException] if the auth user is blocked.

Returns a session for the newly created user.

Parameters:

  • registrationToken (String) (required)

  • password (String) (required)
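
The three registration calls above are meant to be chained: the ID from startRegistration feeds verifyRegistrationCode, whose token feeds finishRegistration. The following is a minimal sketch of that chaining, not the generated client itself. `RegistrationCalls`, `FakeRegistration`, `registerWithEmail`, and the `readCode` callback are hypothetical names introduced here for illustration, and the ID and session values are typed as `Object` only because this reference lists them as InvalidType.

```dart
// Hypothetical stand-in for the three registration calls documented above.
// In the generated client these methods live on EndpointEmailIdp.
abstract class RegistrationCalls {
  Future<Object> startRegistration({required String email});

  Future<String> verifyRegistrationCode({
    required Object accountRequestId,
    required String verificationCode,
  });

  Future<Object> finishRegistration({
    required String registrationToken,
    required String password,
  });
}

/// Runs the three steps in order and returns whatever session object
/// finishRegistration produces. [readCode] is a hypothetical callback that
/// obtains the verification code the user received by email.
Future<Object> registerWithEmail(
  RegistrationCalls idp, {
  required String email,
  required String password,
  required Future<String> Function() readCode,
}) async {
  // Step 1: always yields an ID, even for already-registered addresses.
  final accountRequestId = await idp.startRegistration(email: email);

  // Step 2: exchange the emailed code for a registration token.
  final registrationToken = await idp.verifyRegistrationCode(
    accountRequestId: accountRequestId,
    verificationCode: await readCode(),
  );

  // Step 3: set the password and create the account.
  return await idp.finishRegistration(
    registrationToken: registrationToken,
    password: password,
  );
}

// In-memory fake used only to exercise the flow without a server.
class FakeRegistration implements RegistrationCalls {
  @override
  Future<Object> startRegistration({required String email}) async => 'req-1';

  @override
  Future<String> verifyRegistrationCode({
    required Object accountRequestId,
    required String verificationCode,
  }) async {
    if (accountRequestId != 'req-1' || verificationCode != '123456') {
      throw StateError('invalid request or code');
    }
    return 'token-1';
  }

  @override
  Future<Object> finishRegistration({
    required String registrationToken,
    required String password,
  }) async =>
      'session-for-$registrationToken';
}
```

With the real client, the same sequence would presumably be called on the emailIdp property of Client, with the documented exceptions caught around steps 2 and 3.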

startPasswordReset#

Future<InvalidType> startPasswordReset({required String email})

Requests a password reset for [email].

If the email address is registered, an email with reset instructions will be sent out. If the email is unknown, this method will have no effect.

Always returns a password reset request ID, which can be used to complete the reset. If the email is not registered, the returned ID will not be valid.

Throws an [EmailAccountPasswordResetException] in case of errors, with reason:

  • [EmailAccountPasswordResetExceptionReason.tooManyAttempts] if the user has made too many attempts trying to request a password reset.

Parameters:

  • email (String) (required)

verifyPasswordResetCode#

Future<String> verifyPasswordResetCode({required InvalidType passwordResetRequestId, required String verificationCode})

Verifies a password reset code and returns a finishPasswordResetToken that can be used to finish the password reset.

Throws an [EmailAccountPasswordResetException] in case of errors, with reason:

  • [EmailAccountPasswordResetExceptionReason.expired] if the password reset request has already expired.

  • [EmailAccountPasswordResetExceptionReason.tooManyAttempts] if the user has made too many attempts trying to verify the password reset.

  • [EmailAccountPasswordResetExceptionReason.invalid] if no request exists for the given [passwordResetRequestId] or [verificationCode] is invalid.

If multiple steps are required to complete the password reset, this endpoint should be overridden to return credentials for the next step instead of the credentials for setting the password.

Parameters:

  • passwordResetRequestId (InvalidType) (required)

  • verificationCode (String) (required)

finishPasswordReset#

Future<void> finishPasswordReset({required String finishPasswordResetToken, required String newPassword})

Completes a password reset request by setting a new password.

The [finishPasswordResetToken] returned from [verifyPasswordResetCode] is used to validate the password reset request.

Throws an [EmailAccountPasswordResetException] in case of errors, with reason:

  • [EmailAccountPasswordResetExceptionReason.expired] if the password reset request has already expired.

  • [EmailAccountPasswordResetExceptionReason.policyViolation] if the new password does not comply with the password policy.

  • [EmailAccountPasswordResetExceptionReason.invalid] if the [finishPasswordResetToken] is invalid.

Throws an [AuthUserBlockedException] if the auth user is blocked.

Parameters:

  • finishPasswordResetToken (String) (required)

  • newPassword (String) (required)
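
The failure reasons listed for the password reset calls can be turned into user-facing messages with an exhaustive switch. This is a sketch under stated assumptions: `ResetFailure` is a local stand-in that mirrors only the four reasons named in this reference (the real EmailAccountPasswordResetExceptionReason enum may define more), and the message strings are invented for illustration.

```dart
// Local stand-in mirroring only the reasons named in this reference; the real
// EmailAccountPasswordResetExceptionReason enum may define more values.
enum ResetFailure { expired, tooManyAttempts, policyViolation, invalid }

/// Maps a failure reason to a hypothetical user-facing message.
/// The switch expression is exhaustive, so adding a value to the enum
/// becomes a compile-time error here until it is handled.
String describeResetFailure(ResetFailure reason) {
  return switch (reason) {
    ResetFailure.expired =>
      'This reset request has expired. Please request a new one.',
    ResetFailure.tooManyAttempts =>
      'Too many attempts. Please try again later.',
    ResetFailure.policyViolation =>
      'The new password does not meet the password policy.',
    ResetFailure.invalid =>
      'The reset code or token is not valid.',
  };
}
```

Switch expressions require Dart 3, which this package already appears to target given the Record type used by Protocol.mapRecordToJson.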


class EndpointGoogleIdp#

Constructors#

EndpointGoogleIdp#

EndpointGoogleIdp(InvalidType caller)

Properties#

  • name (String)

Methods#

login#

Future<InvalidType> login({required String idToken, required String? accessToken})

Validates a Google ID token and either logs in the associated user or creates a new user account if the Google account ID is not yet known.

If a new user is created, an associated [UserProfile] is also created.

Parameters:

  • idToken (String) (required)

  • accessToken (String?) (required)


class EndpointJwtRefresh#

By extending [RefreshJwtTokensEndpoint], the JWT token refresh endpoint is made available on the server and enables automatic token refresh on the client.

Constructors#

EndpointJwtRefresh#

EndpointJwtRefresh(InvalidType caller)

Properties#

  • name (String)

Methods#

refreshAccessToken#

Future<InvalidType> refreshAccessToken({required String refreshToken})

Creates a new token pair for the given [refreshToken].

Can throw the following exceptions:

  • [RefreshTokenMalformedException]: the refresh token is malformed and could not be parsed. Not expected to happen for tokens issued by the server.

  • [RefreshTokenNotFoundException]: the refresh token is unknown to the server. Either the token was deleted or it was generated by a different server.

  • [RefreshTokenExpiredException]: the refresh token has expired. Happens only if it has not been used within the configured refreshTokenLifetime.

  • [RefreshTokenInvalidSecretException]: the refresh token is incorrect, meaning it does not refer to the current secret refresh token. This indicates either a malfunctioning client or a malicious attempt by someone who has obtained the refresh token. In this case the underlying refresh token will be deleted, and access will end fully once the last access token has expired.

This endpoint is unauthenticated, meaning the client won’t include any authentication information with the call.

Parameters:

  • refreshToken (String) (required)


abstract class Evaluation#

Result of evaluating one sample.

Constructors#

Evaluation#

Evaluation({InvalidType id, required InvalidType runId, Run? run, required InvalidType taskId, Task? task, required InvalidType sampleId, Sample? sample, required InvalidType modelId, Model? model, required InvalidType datasetId, Dataset? dataset, required List<Variant> variant, required String output, required List<ToolCallData> toolCalls, required int retryCount, String? error, required bool neverSucceeded, required double durationSeconds, bool? analyzerPassed, int? testsPassed, int? testsTotal, double? structureScore, String? failureReason, required int inputTokens, required int outputTokens, required int reasoningTokens, DateTime? createdAt})

Evaluation.fromJson#

Evaluation.fromJson(Map<String, dynamic> jsonSerialization)

Properties#

  • id (InvalidType)

    The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.

  • runId (InvalidType)

  • run (Run?)

    The parent run.

  • taskId (InvalidType)

  • task (Task?)

    The parent task.

  • sampleId (InvalidType)

  • sample (Sample?)

    The sample that was evaluated.

  • modelId (InvalidType)

  • model (Model?)

    The model that was evaluated.

  • datasetId (InvalidType)

  • dataset (Dataset?)

    The dataset this sample belongs to (e.g., “flutter_qa_dataset”).

  • variant (List<Variant>)

    Variant configuration.

  • output (String)

    The actual output generated by the model.

  • toolCalls (List<ToolCallData>)

    Tool calls made during evaluation.

  • retryCount (int)

    Number of times this sample was retried.

  • error (String?)

    Error message if sample failed.

  • neverSucceeded (bool)

    True if all retries failed (exclude from accuracy calculations).

  • durationSeconds (double)

    Total time for this sample in seconds.

  • analyzerPassed (bool?)

    Whether flutter analyze passed.

  • testsPassed (int?)

    Number of tests passed.

  • testsTotal (int?)

    Total number of tests.

  • structureScore (double?)

    Code structure validation score (0.0-1.0).

  • failureReason (String?)

    Categorized failure reason: “analyzer_error”, “test_failure”, “missing_structure”.

  • inputTokens (int)

    Input tokens for this sample.

  • outputTokens (int)

    Output tokens for this sample.

  • reasoningTokens (int)

    Reasoning tokens for this sample.

  • createdAt (DateTime)

    When this evaluation was run.
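
Because testsPassed and testsTotal are both nullable, any ratio derived from them needs to handle missing values. The helper below is a small sketch of one way to do that; `testPassRate` is a hypothetical name, not part of the generated class, and returning null for missing or zero totals is a design assumption rather than documented behavior.

```dart
/// Returns the fraction of tests that passed, or null when the counts are
/// missing or there were no tests to run.
double? testPassRate(int? testsPassed, int? testsTotal) {
  if (testsPassed == null || testsTotal == null || testsTotal == 0) {
    return null;
  }
  // The / operator on two ints yields a double in Dart.
  return testsPassed / testsTotal;
}
```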

Methods#

copyWith#

Evaluation copyWith({InvalidType id, InvalidType runId, Run? run, InvalidType taskId, Task? task, InvalidType sampleId, Sample? sample, InvalidType modelId, Model? model, InvalidType datasetId, Dataset? dataset, List<Variant>? variant, String? output, List<ToolCallData>? toolCalls, int? retryCount, String? error, bool? neverSucceeded, double? durationSeconds, bool? analyzerPassed, int? testsPassed, int? testsTotal, double? structureScore, String? failureReason, int? inputTokens, int? outputTokens, int? reasoningTokens, DateTime? createdAt})

Returns a shallow copy of this [Evaluation] with some or all fields replaced by the given arguments.

Parameters:

  • id (InvalidType)

  • runId (InvalidType)

  • run (Run?)

  • taskId (InvalidType)

  • task (Task?)

  • sampleId (InvalidType)

  • sample (Sample?)

  • modelId (InvalidType)

  • model (Model?)

  • datasetId (InvalidType)

  • dataset (Dataset?)

  • variant (List<Variant>?)

  • output (String?)

  • toolCalls (List<ToolCallData>?)

  • retryCount (int?)

  • error (String?)

  • neverSucceeded (bool?)

  • durationSeconds (double?)

  • analyzerPassed (bool?)

  • testsPassed (int?)

  • testsTotal (int?)

  • structureScore (double?)

  • failureReason (String?)

  • inputTokens (int?)

  • outputTokens (int?)

  • reasoningTokens (int?)

  • createdAt (DateTime?)

toJson#

Map<String, dynamic> toJson()

abstract class Greeting#

A greeting message which can be sent to or from the server.

Constructors#

Greeting#

Greeting({required String message, required String author, required DateTime timestamp})

Greeting.fromJson#

Greeting.fromJson(Map<String, dynamic> jsonSerialization)

Properties#

  • message (String)

    The greeting message.

  • author (String)

    The author of the greeting message.

  • timestamp (DateTime)

    The time when the message was created.

Methods#

copyWith#

Greeting copyWith({String? message, String? author, DateTime? timestamp})

Returns a shallow copy of this [Greeting] with some or all fields replaced by the given arguments.

Parameters:

  • message (String?)

  • author (String?)

  • timestamp (DateTime?)

toJson#

Map<String, dynamic> toJson()

abstract class Model#

An LLM being evaluated.

Constructors#

Model#

Model({InvalidType id, required String name})

Model.fromJson#

Model.fromJson(Map<String, dynamic> jsonSerialization)

Properties#

  • id (InvalidType)

    The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.

  • name (String)

    Unique identifier for the model.

Methods#

copyWith#

Model copyWith({InvalidType id, String? name})

Returns a shallow copy of this [Model] with some or all fields replaced by the given arguments.

Parameters:

  • id (InvalidType)

  • name (String?)

toJson#

Map<String, dynamic> toJson()

class Modules#

Constructors#

Modules#

Modules(Client client)

Properties#

  • serverpod_auth_idp (InvalidType) (final)

  • serverpod_auth_core (InvalidType) (final)


class Protocol#

Constructors#

Protocol#

Protocol()

Methods#

static getClassNameFromObjectJson#

static String? getClassNameFromObjectJson(dynamic data)

Parameters:

  • data (dynamic) (required)

deserialize#

T deserialize(dynamic data, [Type? t])

Parameters:

  • data (dynamic) (required)

  • t (Type?)

static getClassNameForType#

static String? getClassNameForType(Type type)

Parameters:

  • type (Type) (required)

getClassNameForObject#

String? getClassNameForObject(Object? data)

Parameters:

  • data (Object?) (required)

deserializeByClassName#

dynamic deserializeByClassName(Map<String, dynamic> data)

Parameters:

  • data (Map<String, dynamic>) (required)

mapRecordToJson#

Map<String, dynamic>? mapRecordToJson(Record? record)

Maps any Records known to this [Protocol] to their JSON representation.

Throws in case the record type is not known.

This method will return null (only) for null inputs.

Parameters:

  • record (Record?) (required)


abstract class Run#

A collection of tasks executed together.

Constructors#

Run#

Run({InvalidType id, required String inspectId, required Status status, required List<String> variants, required String mcpServerVersion, required int batchRuntimeSeconds, List<Model>? models, List<Dataset>? datasets, List<Task>? tasks, DateTime? createdAt})

Run.fromJson#

Run.fromJson(Map<String, dynamic> jsonSerialization)

Properties#

  • id (InvalidType)

    The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.

  • inspectId (String)

    InspectAI-generated Id.

  • status (Status)

    Run status (e.g., “complete”, “inProgress”, “failed”).

  • variants (List<String>)

    The variant configurations used in this run.

  • mcpServerVersion (String)

    Version of the MCP server used during evaluation.

  • batchRuntimeSeconds (int)

    Total script runtime in seconds.

  • models (List<Model>?)

    List of models evaluated in this run.

  • datasets (List<Dataset>?)

    List of datasets evaluated in this run.

  • tasks (List<Task>?)

    List of Inspect AI task names that were run.

  • createdAt (DateTime)

    Creation time for this record.

Methods#

copyWith#

Run copyWith({InvalidType id, String? inspectId, Status? status, List<String>? variants, String? mcpServerVersion, int? batchRuntimeSeconds, List<Model>? models, List<Dataset>? datasets, List<Task>? tasks, DateTime? createdAt})

Returns a shallow copy of this [Run] with some or all fields replaced by the given arguments.

Parameters:

  • id (InvalidType)

  • inspectId (String?)

  • status (Status?)

  • variants (List<String>?)

  • mcpServerVersion (String?)

  • batchRuntimeSeconds (int?)

  • models (List<Model>?)

  • datasets (List<Dataset>?)

  • tasks (List<Task>?)

  • createdAt (DateTime?)

toJson#

Map<String, dynamic> toJson()

abstract class RunSummary#

Metadata for the outcomes of a given [Run]. This is a separate table from [Run] because otherwise each of these columns would have to be nullable on [Run], as they are generated after the run is completed.

Constructors#

RunSummary#

RunSummary({InvalidType id, required InvalidType runId, Run? run, required int totalTasks, required int totalSamples, required double avgAccuracy, required int totalTokens, required int inputTokens, required int outputTokens, required int reasoningTokens, DateTime? createdAt})

RunSummary.fromJson#

RunSummary.fromJson(Map<String, dynamic> jsonSerialization)

Properties#

  • id (InvalidType)

    The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.

  • runId (InvalidType)

  • run (Run?)

    Run this summary belongs to.

  • totalTasks (int)

    Number of tasks in this run.

  • totalSamples (int)

    Total number of samples evaluated.

  • avgAccuracy (double)

    Average accuracy across all tasks (0.0 to 1.0).

  • totalTokens (int)

    Total token usage.

  • inputTokens (int)

    Input tokens used.

  • outputTokens (int)

    Output tokens generated.

  • reasoningTokens (int)

    Reasoning tokens used (for models that support it).

  • createdAt (DateTime)

    Creation time for this record.

Methods#

copyWith#

RunSummary copyWith({InvalidType id, InvalidType runId, Run? run, int? totalTasks, int? totalSamples, double? avgAccuracy, int? totalTokens, int? inputTokens, int? outputTokens, int? reasoningTokens, DateTime? createdAt})

Returns a shallow copy of this [RunSummary] with some or all fields replaced by the given arguments.

Parameters:

  • id (InvalidType)

  • runId (InvalidType)

  • run (Run?)

  • totalTasks (int?)

  • totalSamples (int?)

  • avgAccuracy (double?)

  • totalTokens (int?)

  • inputTokens (int?)

  • outputTokens (int?)

  • reasoningTokens (int?)

  • createdAt (DateTime?)

toJson#

Map<String, dynamic> toJson()

abstract class Sample#

A single challenge to be presented to a [Model] and evaluated by one or more [Scorer]s.

Constructors#

Sample#

Sample({InvalidType id, required String name, required InvalidType datasetId, Dataset? dataset, required String input, required String target, List<SampleTagXref>? tagsXref, bool? isActive, DateTime? createdAt})

Sample.fromJson#

Sample.fromJson(Map<String, dynamic> jsonSerialization)

Properties#

  • id (InvalidType)

    The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.

  • name (String)

    Short sample name/ID (e.g., “dart_futures_vs_streams”).

  • datasetId (InvalidType)

  • dataset (Dataset?)

    The dataset this sample belongs to (e.g., “dart_qa_dataset”).

  • input (String)

    The input prompt/question for the model.

  • target (String)

    The expected answer or grading guidance.

  • tagsXref (List<SampleTagXref>?)

    Tags associated with this sample (e.g., [“dart”, “flutter”]). Technically, this relationship only reaches the cross-reference table, not the tags themselves.

  • isActive (bool)

    True if the sample is still active and included in eval runs.

  • createdAt (DateTime)

    Creation time for this record.

Methods#

copyWith#

Sample copyWith({InvalidType id, String? name, InvalidType datasetId, Dataset? dataset, String? input, String? target, List<SampleTagXref>? tagsXref, bool? isActive, DateTime? createdAt})

Returns a shallow copy of this [Sample] with some or all fields replaced by the given arguments.

Parameters:

  • id (InvalidType)

  • name (String?)

  • datasetId (InvalidType)

  • dataset (Dataset?)

  • input (String?)

  • target (String?)

  • tagsXref (List<SampleTagXref>?)

  • isActive (bool?)

  • createdAt (DateTime?)

toJson#

Map<String, dynamic> toJson()

abstract class SampleTagXref#

Cross reference table for samples and tags.

Constructors#

SampleTagXref#

SampleTagXref({int? id, required InvalidType sampleId, Sample? sample, required InvalidType tagId, Tag? tag})

SampleTagXref.fromJson#

SampleTagXref.fromJson(Map<String, dynamic> jsonSerialization)

Properties#

  • id (int?)

    The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.

  • sampleId (InvalidType)

  • sample (Sample?)

  • tagId (InvalidType)

  • tag (Tag?)

Methods#

copyWith#

SampleTagXref copyWith({int? id, InvalidType sampleId, Sample? sample, InvalidType tagId, Tag? tag})

Returns a shallow copy of this [SampleTagXref] with some or all fields replaced by the given arguments.

Parameters:

  • id (int?)

  • sampleId (InvalidType)

  • sample (Sample?)

  • tagId (InvalidType)

  • tag (Tag?)

toJson#

Map<String, dynamic> toJson()

abstract class Scorer#

Ye who watch the watchers.

Constructors#

Scorer#

Scorer({InvalidType id, required String name})

Scorer.fromJson#

Scorer.fromJson(Map<String, dynamic> jsonSerialization)

Properties#

  • id (InvalidType)

    The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.

  • name (String)

    Name of the scorer (e.g., “bleu”).

Methods#

copyWith#

Scorer copyWith({InvalidType id, String? name})

Returns a shallow copy of this [Scorer] with some or all fields replaced by the given arguments.

Parameters:

  • id (InvalidType)

  • name (String?)

toJson#

Map<String, dynamic> toJson()

abstract class ScorerResult#

A scorer’s assessment of a task.

Constructors#

ScorerResult#

ScorerResult({InvalidType id, required InvalidType scorerId, Scorer? scorer, required InvalidType evaluationId, Evaluation? evaluation, required InvalidType data})

ScorerResult.fromJson#

ScorerResult.fromJson(Map<String, dynamic> jsonSerialization)

Properties#

  • id (InvalidType)

    The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.

  • scorerId (InvalidType)

  • scorer (Scorer?)

    Scorer this summary belongs to.

  • evaluationId (InvalidType)

  • evaluation (Evaluation?)

    Evaluation this scorer result belongs to.

  • data (InvalidType)

    Flexible data archived by the scorer.

Methods#

copyWith#

ScorerResult copyWith({InvalidType id, InvalidType scorerId, Scorer? scorer, InvalidType evaluationId, Evaluation? evaluation, InvalidType data})

Returns a shallow copy of this [ScorerResult] with some or all fields replaced by the given arguments.

Parameters:

  • id (InvalidType)

  • scorerId (InvalidType)

  • scorer (Scorer?)

  • evaluationId (InvalidType)

  • evaluation (Evaluation?)

  • data (InvalidType)

toJson#

Map<String, dynamic> toJson()

abstract class Tag#

Category for a sample.

Constructors#

Tag#

Tag({InvalidType id, required String name, List<SampleTagXref>? samplesXref})

Tag.fromJson#

Tag.fromJson(Map<String, dynamic> jsonSerialization)

Properties#

  • id (InvalidType)

    The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.

  • name (String)

    Unique identifier for the tag.

  • samplesXref (List<SampleTagXref>?)

    Samples associated with this tag. Technically, this relationship only reaches the cross-reference table, not the samples themselves.

Methods#

copyWith#

Tag copyWith({InvalidType id, String? name, List<SampleTagXref>? samplesXref})

Returns a shallow copy of this [Tag] with some or all fields replaced by the given arguments.

Parameters:

  • id (InvalidType)

  • name (String?)

  • samplesXref (List<SampleTagXref>?)

toJson#

Map<String, dynamic> toJson()

abstract class Task#

Results from evaluating one model against one dataset.

Constructors#

Task#

Task({InvalidType id, required String inspectId, required InvalidType modelId, Model? model, required InvalidType datasetId, Dataset? dataset, required InvalidType runId, Run? run, DateTime? createdAt})

Task.fromJson#

Task.fromJson(Map<String, dynamic> jsonSerialization)

Properties#

  • id (InvalidType)

    The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.

  • inspectId (String)

    InspectAI-generated Id.

  • modelId (InvalidType)

  • model (Model?)

    Model identifier (e.g., “google/gemini-2.5-pro”).

  • datasetId (InvalidType)

  • dataset (Dataset?)

    Dataset identifier (e.g., “flutter_qa_dataset”).

  • runId (InvalidType)

  • run (Run?)

    Run this task belongs to.

  • createdAt (DateTime)

    When this task was evaluated.

Methods#

copyWith#

Task copyWith({InvalidType id, String? inspectId, InvalidType modelId, Model? model, InvalidType datasetId, Dataset? dataset, InvalidType runId, Run? run, DateTime? createdAt})

Returns a shallow copy of this [Task] with some or all fields replaced by the given arguments.

Parameters:

  • id (InvalidType)

  • inspectId (String?)

  • modelId (InvalidType)

  • model (Model?)

  • datasetId (InvalidType)

  • dataset (Dataset?)

  • runId (InvalidType)

  • run (Run?)

  • createdAt (DateTime?)

toJson#

Map<String, dynamic> toJson()

abstract class TaskSummary#

Constructors#

TaskSummary#

TaskSummary({InvalidType id, required InvalidType taskId, Task? task, required int totalSamples, required int passedSamples, required double accuracy, String? taskName, required int inputTokens, required int outputTokens, required int totalTokens, required int reasoningTokens, String? variant, required int executionTimeSeconds, required int samplesWithRetries, required int samplesNeverSucceeded, required int totalRetries})

TaskSummary.fromJson#

TaskSummary.fromJson(Map<String, dynamic> jsonSerialization)

Properties#

  • id (InvalidType)

    The database id, set if the object has been inserted into the database or if it has been fetched from the database. Otherwise, the id will be null.

  • taskId (InvalidType)

  • task (Task?)

    Task this summary belongs to.

  • totalSamples (int)

    Total number of samples in this task.

  • passedSamples (int)

    Number of samples that passed.

  • accuracy (double)

    Accuracy as a value from 0.0 to 1.0.

  • taskName (String?)

    The Inspect AI task function name (e.g., “qa_task”).

  • inputTokens (int)

    Input tokens used.

  • outputTokens (int)

    Output tokens generated.

  • totalTokens (int)

    Total tokens used.

  • reasoningTokens (int)

    Reasoning tokens used (for models that support it).

  • variant (String?)

    Variant configuration used (e.g., “baseline”, “dart_mcp”).

  • executionTimeSeconds (int)

    Total execution time in seconds.

  • samplesWithRetries (int)

    Number of samples that needed retries.

  • samplesNeverSucceeded (int)

    Number of samples that failed all retries (excluded from accuracy).

  • totalRetries (int)

    Total number of retries across all samples.
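
The note that samples which never succeeded are excluded from accuracy suggests a calculation along the following lines. This is only a sketch: the exact formula is not stated in this reference, so the denominator (totalSamples minus samplesNeverSucceeded) and the zero fallback are assumptions, and `accuracyExcludingNeverSucceeded` is a hypothetical helper name.

```dart
/// Accuracy over the samples that had at least one successful attempt.
/// Assumption: the denominator is totalSamples minus samplesNeverSucceeded.
double accuracyExcludingNeverSucceeded({
  required int totalSamples,
  required int passedSamples,
  required int samplesNeverSucceeded,
}) {
  final scoredSamples = totalSamples - samplesNeverSucceeded;
  // Avoid dividing by zero when every sample failed all of its retries.
  if (scoredSamples <= 0) return 0.0;
  return passedSamples / scoredSamples;
}
```

For example, 10 samples with 2 that never succeeded and 6 that passed would give 6 / 8 under this assumption.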

Methods#

copyWith#

TaskSummary copyWith({InvalidType id, InvalidType taskId, Task? task, int? totalSamples, int? passedSamples, double? accuracy, String? taskName, int? inputTokens, int? outputTokens, int? totalTokens, int? reasoningTokens, String? variant, int? executionTimeSeconds, int? samplesWithRetries, int? samplesNeverSucceeded, int? totalRetries})

Returns a shallow copy of this [TaskSummary] with some or all fields replaced by the given arguments.

Parameters:

  • id (InvalidType)

  • taskId (InvalidType)

  • task (Task?)

  • totalSamples (int?)

  • passedSamples (int?)

  • accuracy (double?)

  • taskName (String?)

  • inputTokens (int?)

  • outputTokens (int?)

  • totalTokens (int?)

  • reasoningTokens (int?)

  • variant (String?)

  • executionTimeSeconds (int?)

  • samplesWithRetries (int?)

  • samplesNeverSucceeded (int?)

  • totalRetries (int?)

toJson#

Map<String, dynamic> toJson()

abstract class ToolCallData#

Result of a tool call made during evaluation. Not a database table.

Constructors#

ToolCallData#

ToolCallData({required String name, required Map<String, String> arguments})

ToolCallData.fromJson#

ToolCallData.fromJson(Map<String, dynamic> jsonSerialization)

Properties#

  • name (String)

    Name of the tool.

  • arguments (Map<String, String>)

    Arguments passed to the tool.

Methods#

copyWith#

ToolCallData copyWith({String? name, Map<String, String>? arguments})

Returns a shallow copy of this [ToolCallData] with some or all fields replaced by the given arguments.

Parameters:

  • name (String?)

  • arguments (Map<String, String>?)

toJson#

Map<String, dynamic> toJson()

enum Status#

Values#

  • complete

  • inProgress

  • failed


enum Variant#

Values#

  • mcp

  • rules