Create Evaluator

bedrockagentcorecontrol_create_evaluator R Documentation

Creates a custom evaluator for agent quality assessment

Description

Creates a custom evaluator for agent quality assessment. Custom evaluators can use either LLM-as-a-Judge configurations with user-defined prompts, rating scales, and model settings, or code-based configurations with customer-managed Lambda functions to evaluate agent performance at tool call, trace, or session levels.

Usage

bedrockagentcorecontrol_create_evaluator(clientToken, evaluatorName,
  description, evaluatorConfig, level, kmsKeyArn, tags)

Arguments

clientToken

A unique, case-sensitive identifier to ensure that the API request completes no more than one time. If you don't specify this field, a value is randomly generated for you. If this token matches a previous request, the service ignores the request, but doesn't return an error. For more information, see Ensuring idempotency.

evaluatorName

[required] The name of the evaluator. Must be unique within your account.

description

The description of the evaluator that explains its purpose and evaluation criteria.

evaluatorConfig

[required] The configuration for the evaluator. Specify either LLM-as-a-Judge settings with instructions, rating scale, and model configuration, or code-based settings with a customer-managed Lambda function.

level

[required] The evaluation level that determines the scope of evaluation. Valid values are TOOL_CALL for individual tool invocations, TRACE for single request-response interactions, or SESSION for entire conversation sessions.

kmsKeyArn

The Amazon Resource Name (ARN) of a customer managed KMS key to use for encrypting sensitive evaluator data, including instructions and rating scale. If you don't specify a KMS key, the evaluator data is encrypted with an Amazon Web Services owned key. Only symmetric encryption KMS keys are supported. For more information, see Encryption at rest for AgentCore Evaluations.

tags

A map of tag keys and values to assign to an AgentCore Evaluator. Tags enable you to categorize your resources in different ways, for example, by purpose, owner, or environment.
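
The evaluatorConfig and level arguments described above can be assembled as plain R lists before the call. The following is a minimal sketch for the code-based variant, not an authoritative recipe. The helper name build_code_based_args, the evaluator name, the Lambda ARN, and the 30-second timeout are hypothetical placeholders, and the sketch assumes that exactly one of llmAsAJudge or codeBased is supplied in evaluatorConfig.

```r
# Hypothetical helper: assembles arguments for a code-based evaluator.
# Field names follow the request syntax on this page; values are placeholders.
build_code_based_args <- function(name, lambda_arn,
                                  level = c("TOOL_CALL", "TRACE", "SESSION"),
                                  timeout_seconds = 30) {
  # match.arg() stops with an error for anything other than the
  # three documented evaluation levels.
  level <- match.arg(level)
  list(
    evaluatorName = name,
    evaluatorConfig = list(
      codeBased = list(
        lambdaConfig = list(
          lambdaArn = lambda_arn,
          lambdaTimeoutInSeconds = timeout_seconds
        )
      )
    ),
    level = level
  )
}

code_args <- build_code_based_args(
  name = "example_code_evaluator",
  lambda_arn = "arn:aws:lambda:us-east-1:111122223333:function:example-evaluator",
  level = "SESSION"
)

# With a client object `svc` (as used in the request syntax on this page),
# the request could then be sent with:
# resp <- do.call(svc$create_evaluator, code_args)
```

Validating level locally surfaces a typo before any request is made. The service remains the authority on which values it accepts.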

Value

A list with the following syntax:

list(
  evaluatorArn = "string",
  evaluatorId = "string",
  createdAt = as.POSIXct(
    "2015-01-01"
  ),
  status = "ACTIVE"|"CREATING"|"CREATE_FAILED"|"UPDATING"|"UPDATE_FAILED"|"DELETING"
)
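
The status values listed above include CREATING and CREATE_FAILED, which suggests that the call can return before the evaluator is ready to use. As a rough sketch of how a caller might inspect the result, the two helpers below are hypothetical and the response is mocked with placeholder values, so nothing here contacts the service.

```r
# Hypothetical helper: TRUE when the returned evaluator reports ACTIVE.
evaluator_is_active <- function(resp) {
  identical(resp$status, "ACTIVE")
}

# Hypothetical helper: TRUE when the status is one of the documented
# failure states.
evaluator_has_failed <- function(resp) {
  isTRUE(resp$status %in% c("CREATE_FAILED", "UPDATE_FAILED"))
}

# Mocked response with the documented shape; all values are placeholders.
mock_resp <- list(
  evaluatorArn = "example-arn",
  evaluatorId = "example-id",
  createdAt = as.POSIXct("2015-01-01", tz = "UTC"),
  status = "CREATING"
)

evaluator_is_active(mock_resp)   # FALSE
evaluator_has_failed(mock_resp)  # FALSE
```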

Request syntax

svc$create_evaluator(
  clientToken = "string",
  evaluatorName = "string",
  description = "string",
  evaluatorConfig = list(
    llmAsAJudge = list(
      instructions = "string",
      ratingScale = list(
        numerical = list(
          list(
            definition = "string",
            value = 123.0,
            label = "string"
          )
        ),
        categorical = list(
          list(
            definition = "string",
            label = "string"
          )
        )
      ),
      modelConfig = list(
        bedrockEvaluatorModelConfig = list(
          modelId = "string",
          inferenceConfig = list(
            maxTokens = 123,
            temperature = 123.0,
            topP = 123.0,
            stopSequences = list(
              "string"
            )
          ),
          additionalModelRequestFields = list()
        )
      )
    ),
    codeBased = list(
      lambdaConfig = list(
        lambdaArn = "string",
        lambdaTimeoutInSeconds = 123
      )
    )
  ),
  level = "TOOL_CALL"|"TRACE"|"SESSION",
  kmsKeyArn = "string",
  tags = list(
    "string"
  )
)
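
The request syntax above lists every field at once. In practice a request fills in only the parts it needs. The following sketch builds the arguments for an LLM-as-a-Judge evaluator with a three-point numerical rating scale. The evaluator name, instructions text, scale labels, model ID, and inference settings are all placeholders chosen for illustration, and the sketch assumes that ratingScale takes either numerical or categorical entries rather than both.

```r
# Placeholder rating scale: three numerical levels, each with a label
# and a definition, following the field names in the request syntax.
rating_scale <- list(
  numerical = list(
    list(value = 1, label = "Poor",
         definition = "The response does not address the request."),
    list(value = 3, label = "Fair",
         definition = "The response partly addresses the request."),
    list(value = 5, label = "Good",
         definition = "The response fully addresses the request.")
  )
)

# Arguments for the call; optional fields such as clientToken,
# kmsKeyArn, and tags are left out.
judge_args <- list(
  evaluatorName = "example_helpfulness_judge",
  description = "Rates how well a response addresses the user's request.",
  evaluatorConfig = list(
    llmAsAJudge = list(
      instructions = "Rate how well the response addresses the request.",
      ratingScale = rating_scale,
      modelConfig = list(
        bedrockEvaluatorModelConfig = list(
          modelId = "example-model-id",  # placeholder, not a real model ID
          inferenceConfig = list(maxTokens = 512, temperature = 0)
        )
      )
    )
  ),
  level = "TRACE"
)

# With a client object `svc` (as used in the request syntax above),
# the request could then be sent with:
# resp <- do.call(svc$create_evaluator, judge_args)
```

Building the list first and passing it through do.call() keeps the request easy to inspect or reuse before anything is sent.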