Start Data Quality Rule Recommendation Run

glue_start_data_quality_rule_recommendation_run R Documentation

Starts a recommendation run that is used to generate rules when you don't know what rules to write

Description

Starts a recommendation run that is used to generate rules when you don't know what rules to write. Glue Data Quality analyzes the data and recommends a potential ruleset. You can then triage the generated ruleset and modify it as needed.

Recommendation runs are automatically deleted after 90 days.

Usage

glue_start_data_quality_rule_recommendation_run(DataSource, Role,
  NumberOfWorkers, Timeout, CreatedRulesetName, ClientToken)

Arguments

DataSource

[required] The data source (Glue table) associated with this run.

Role

[required] An IAM role supplied to encrypt the results of the run.

NumberOfWorkers

The number of G.1X workers to be used in the run. The default is 5.

Timeout

The timeout for a run in minutes. This is the maximum time that a run can consume resources before it is terminated and enters TIMEOUT status. The default is 2,880 minutes (48 hours).

CreatedRulesetName

A name for the ruleset.

ClientToken

Used for idempotency. Setting it to a random ID (such as a UUID) is recommended, to avoid creating or starting multiple instances of the same resource.
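
As an illustration of the ClientToken advice, the sketch below builds a random, UUID-shaped string using base R only. The helper name new_client_token() is hypothetical and not part of paws; it relies on sample(), which is not a cryptographically secure generator, so treat it as a convenience for idempotency tokens rather than for secrets.

```r
# Hypothetical helper (not part of paws): build a version-4-style UUID string.
new_client_token <- function() {
  # 16 random bytes as integers in 0..255.
  bytes <- sample(0:255, 16, replace = TRUE)
  # Set the version nibble to 4 and the variant bits to binary 10.
  bytes[7] <- bitwOr(bitwAnd(bytes[7], 15L), 64L)
  bytes[9] <- bitwOr(bitwAnd(bytes[9], 63L), 128L)
  hex <- sprintf("%02x", bytes)
  # Join the bytes in the usual 8-4-4-4-12 layout.
  paste(
    paste(hex[1:4], collapse = ""),
    paste(hex[5:6], collapse = ""),
    paste(hex[7:8], collapse = ""),
    paste(hex[9:10], collapse = ""),
    paste(hex[11:16], collapse = ""),
    sep = "-"
  )
}

token <- new_client_token()
```

If the uuid package is installed, uuid::UUIDgenerate() is an alternative. Reusing the same token when retrying a request is what lets the service recognize the retry as a duplicate.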

Value

A list with the following syntax:

list(
  RunId = "string"
)
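
The returned RunId is the handle for checking on the run later. The sketch below shows one possible way to poll for completion. It assumes svc is a Glue client (for example, one created with paws::glue()), that the companion operation get_data_quality_rule_recommendation_run is available on it, and that its response carries a Status field. The helper name wait_for_run() and its polling defaults are hypothetical.

```r
# Hypothetical helper: poll a recommendation run until it leaves the
# in-progress states, then return the last response.
wait_for_run <- function(svc, run_id, poll_seconds = 30, max_polls = 20) {
  for (i in seq_len(max_polls)) {
    run <- svc$get_data_quality_rule_recommendation_run(RunId = run_id)
    # Assumed in-progress statuses; anything else is treated as final.
    if (!run$Status %in% c("STARTING", "RUNNING", "STOPPING")) {
      return(run)
    }
    Sys.sleep(poll_seconds)
  }
  stop("Run ", run_id, " did not finish within ", max_polls, " polls")
}
```

The caller would pass the RunId from the list above, for example wait_for_run(svc, resp$RunId), and then inspect the Status of the result.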

Request syntax

svc$start_data_quality_rule_recommendation_run(
  DataSource = list(
    GlueTable = list(
      DatabaseName = "string",
      TableName = "string",
      CatalogId = "string",
      ConnectionName = "string",
      AdditionalOptions = list(
        "string"
      )
    )
  ),
  Role = "string",
  NumberOfWorkers = 123,
  Timeout = 123,
  CreatedRulesetName = "string",
  ClientToken = "string"
)
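
To make the request syntax concrete, here is a minimal sketch of a wrapper that fills in only the two required arguments plus a few optional ones. The function name start_recommendation_run(), its argument names, and the Timeout of 120 minutes are illustrative assumptions. The svc argument is expected to be a Glue client such as the one returned by paws::glue().

```r
# Hypothetical wrapper around the request shown above; returns the RunId.
start_recommendation_run <- function(svc, database, table, role,
                                     ruleset_name, client_token) {
  resp <- svc$start_data_quality_rule_recommendation_run(
    DataSource = list(
      GlueTable = list(
        DatabaseName = database,
        TableName = table
      )
    ),
    Role = role,
    NumberOfWorkers = 5,   # the documented default, stated explicitly
    Timeout = 120,         # minutes; illustrative value, default is 2,880
    CreatedRulesetName = ruleset_name,
    ClientToken = client_token
  )
  resp$RunId
}

# Usage against a real account (needs the paws package and AWS credentials;
# all values below are placeholders):
# svc <- paws::glue()
# run_id <- start_recommendation_run(
#   svc, "my_database", "my_table", "my-glue-role",
#   "my_recommended_ruleset", "my-client-token"
# )
```

Taking the client as an argument keeps the wrapper easy to exercise with a stand-in object, without contacting AWS.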