Start PII Entities Detection Job

comprehend_start_pii_entities_detection_job R Documentation

Starts an asynchronous PII entity detection job for a collection of documents

Description

Starts an asynchronous PII entity detection job for a collection of documents.

Usage

comprehend_start_pii_entities_detection_job(InputDataConfig,
  OutputDataConfig, Mode, RedactionConfig, DataAccessRoleArn, JobName,
  LanguageCode, ClientRequestToken, Tags)

Arguments

InputDataConfig

[required] The input properties for a PII entities detection job.

OutputDataConfig

[required] Provides configuration parameters for the output of PII entity detection jobs.

Mode

[required] Specifies whether the output provides the locations (offsets) of PII entities or a file in which PII entities are redacted.

RedactionConfig

Provides configuration parameters for PII entity redaction.

This parameter is required if you set the Mode parameter to ONLY_REDACTION. In that case, you must provide a RedactionConfig definition that includes the PiiEntityTypes parameter.
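
The rule above can be checked before a request is sent. The following is a minimal sketch, not part of the package: `build_redaction_config` is a hypothetical helper, and it assumes that RedactionConfig can be left out when Mode is ONLY_OFFSETS and that MaskCharacter only matters when MaskMode is MASK.

```r
# Hypothetical helper (not part of paws): builds a RedactionConfig list and
# enforces the documented rule that ONLY_REDACTION needs PiiEntityTypes.
build_redaction_config <- function(mode, pii_entity_types = NULL,
                                   mask_mode = "MASK", mask_character = "*") {
  mode <- match.arg(mode, c("ONLY_REDACTION", "ONLY_OFFSETS"))
  if (mode == "ONLY_OFFSETS") {
    # Assumption: no redaction settings are needed when only offsets are returned.
    return(NULL)
  }
  if (length(pii_entity_types) == 0) {
    stop("Mode ONLY_REDACTION requires a RedactionConfig with PiiEntityTypes")
  }
  config <- list(
    PiiEntityTypes = as.list(pii_entity_types),
    MaskMode = mask_mode
  )
  if (mask_mode == "MASK") {
    # Assumption: MaskCharacter is only relevant for MaskMode = "MASK".
    config$MaskCharacter <- mask_character
  }
  config
}

redaction <- build_redaction_config("ONLY_REDACTION", c("EMAIL", "SSN"))
```

The resulting list can be passed as the RedactionConfig argument. Calling the helper with ONLY_REDACTION and no entity types stops with an error instead of sending an incomplete request.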

DataAccessRoleArn

[required] The Amazon Resource Name (ARN) of the IAM role that grants Amazon Comprehend read access to your input data.

JobName

The identifier of the job.

LanguageCode

[required] The language of the input documents. Enter the language code for English (en) or Spanish (es).

ClientRequestToken

A unique identifier for the request. If you don't set the client request token, Amazon Comprehend generates one.

Tags

Tags to associate with the PII entities detection job. A tag is a key-value pair that adds metadata to a resource used by Amazon Comprehend. For example, a tag with "Sales" as the key might be added to a resource to indicate its use by the sales department.
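
Tags are passed as a list of Key/Value lists, which is awkward to type by hand. As an illustration only, the hypothetical helper `as_comprehend_tags` below converts a named character vector into that shape; the tag values are made up for the example.

```r
# Hypothetical helper (not part of paws): converts a named character vector
# into the list-of-lists structure expected by the Tags argument.
as_comprehend_tags <- function(tags) {
  stopifnot(is.character(tags), !is.null(names(tags)), all(nzchar(names(tags))))
  lapply(seq_along(tags), function(i) {
    list(Key = names(tags)[[i]], Value = tags[[i]])
  })
}

# "Sales" is the key from the description above; the values are placeholders.
tags <- as_comprehend_tags(c(Sales = "true", Project = "pii-audit"))
```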

Value

A list with the following syntax:

list(
  JobId = "string",
  JobArn = "string",
  JobStatus = "SUBMITTED"|"IN_PROGRESS"|"COMPLETED"|"FAILED"|"STOP_REQUESTED"|"STOPPED"
)
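
Because the job is asynchronous, the returned JobStatus is normally an early state rather than a final one. The sketch below shows one way to inspect the returned list. The `resp` value is a hand-written stand-in for a real response, and treating COMPLETED, FAILED, and STOPPED as the final states is an assumption based on the status names.

```r
# Stand-in for a real response; the IDs here are placeholders, not real values.
resp <- list(
  JobId = "example-job-id",
  JobArn = "example-job-arn",
  JobStatus = "SUBMITTED"
)

# Assumption: these three statuses mean the job will not change state again.
job_is_finished <- function(status) {
  status %in% c("COMPLETED", "FAILED", "STOPPED")
}

if (!job_is_finished(resp$JobStatus)) {
  message("Job ", resp$JobId, " is still running with status ", resp$JobStatus)
}
```

Keep JobId: it is the handle needed to look the job up again later.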

Request syntax

svc$start_pii_entities_detection_job(
  InputDataConfig = list(
    S3Uri = "string",
    InputFormat = "ONE_DOC_PER_FILE"|"ONE_DOC_PER_LINE",
    DocumentReaderConfig = list(
      DocumentReadAction = "TEXTRACT_DETECT_DOCUMENT_TEXT"|"TEXTRACT_ANALYZE_DOCUMENT",
      DocumentReadMode = "SERVICE_DEFAULT"|"FORCE_DOCUMENT_READ_ACTION",
      FeatureTypes = list(
        "TABLES"|"FORMS"
      )
    )
  ),
  OutputDataConfig = list(
    S3Uri = "string",
    KmsKeyId = "string"
  ),
  Mode = "ONLY_REDACTION"|"ONLY_OFFSETS",
  RedactionConfig = list(
    PiiEntityTypes = list(
      "BANK_ACCOUNT_NUMBER"|"BANK_ROUTING"|"CREDIT_DEBIT_NUMBER"|"CREDIT_DEBIT_CVV"|"CREDIT_DEBIT_EXPIRY"|"PIN"|"EMAIL"|"ADDRESS"|"NAME"|"PHONE"|"SSN"|"DATE_TIME"|"PASSPORT_NUMBER"|"DRIVER_ID"|"URL"|"AGE"|"USERNAME"|"PASSWORD"|"AWS_ACCESS_KEY"|"AWS_SECRET_KEY"|"IP_ADDRESS"|"MAC_ADDRESS"|"ALL"|"LICENSE_PLATE"|"VEHICLE_IDENTIFICATION_NUMBER"|"UK_NATIONAL_INSURANCE_NUMBER"|"CA_SOCIAL_INSURANCE_NUMBER"|"US_INDIVIDUAL_TAX_IDENTIFICATION_NUMBER"|"UK_UNIQUE_TAXPAYER_REFERENCE_NUMBER"|"IN_PERMANENT_ACCOUNT_NUMBER"|"IN_NREGA"|"INTERNATIONAL_BANK_ACCOUNT_NUMBER"|"SWIFT_CODE"|"UK_NATIONAL_HEALTH_SERVICE_NUMBER"|"CA_HEALTH_NUMBER"|"IN_AADHAAR"|"IN_VOTER_NUMBER"
    ),
    MaskMode = "MASK"|"REPLACE_WITH_PII_ENTITY_TYPE",
    MaskCharacter = "string"
  ),
  DataAccessRoleArn = "string",
  JobName = "string",
  LanguageCode = "en"|"es"|"fr"|"de"|"it"|"pt"|"ar"|"hi"|"ja"|"ko"|"zh"|"zh-TW",
  ClientRequestToken = "string",
  Tags = list(
    list(
      Key = "string",
      Value = "string"
    )
  )
)
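
To make the syntax above concrete, here is a sketch of a redaction request with only the required arguments plus RedactionConfig and JobName filled in. The bucket names, account ID, role name, and job name are all placeholders, and `start_pii_job` is a hypothetical wrapper added so the arguments can be built and inspected without AWS credentials.

```r
# All S3 URIs, the account ID, the role name, and the job name are placeholders.
job_args <- list(
  InputDataConfig = list(
    S3Uri = "s3://example-input-bucket/documents/",
    InputFormat = "ONE_DOC_PER_LINE"
  ),
  OutputDataConfig = list(
    S3Uri = "s3://example-output-bucket/pii-results/"
  ),
  Mode = "ONLY_REDACTION",
  RedactionConfig = list(
    PiiEntityTypes = list("EMAIL", "PHONE", "SSN"),
    MaskMode = "REPLACE_WITH_PII_ENTITY_TYPE"
  ),
  DataAccessRoleArn = "arn:aws:iam::123456789012:role/ExampleComprehendRole",
  JobName = "example-pii-redaction",
  LanguageCode = "en"
)

# Hypothetical wrapper: forwards the argument list to the client operation.
start_pii_job <- function(svc, args) {
  do.call(svc$start_pii_entities_detection_job, args)
}

# With the paws package installed and credentials configured, the call
# would look like this (left commented out so the sketch runs offline):
# svc <- paws::comprehend()
# resp <- start_pii_job(svc, job_args)
# resp$JobId
```

Keeping the arguments in a plain list makes it easy to reuse the same configuration for several input locations by changing only `job_args$InputDataConfig$S3Uri`.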