Client

emr R Documentation

Amazon EMR

Description

Amazon EMR is a web service that makes it easier to process large amounts of data efficiently. Amazon EMR uses Hadoop processing combined with several Amazon Web Services services to do tasks such as web indexing, data mining, log file analysis, machine learning, scientific simulation, and data warehouse management.

Usage

emr(config = list(), credentials = list(), endpoint = NULL, region = NULL)

Arguments

config

Optional configuration of credentials, endpoint, and/or region.

  • credentials:

    • creds:

      • access_key_id: AWS access key ID

      • secret_access_key: AWS secret access key

      • session_token: AWS temporary session token

    • profile: The name of a profile to use. If not given, then the default profile is used.

    • anonymous: Set anonymous credentials.

  • endpoint: The complete URL to use for the constructed client.

  • region: The AWS Region used in instantiating the client.

  • close_connection: Immediately close all HTTP connections.

  • timeout: The time in seconds until a timeout exception is thrown when attempting to make a connection. The default is 60 seconds.

  • s3_force_path_style: Set this to TRUE to force the request to use path-style addressing, i.e. http://s3.amazonaws.com/BUCKET/KEY.

  • sts_regional_endpoint: Set the STS regional endpoint resolver to "regional" or "legacy". See https://docs.aws.amazon.com/sdkref/latest/guide/feature-sts-regionalized-endpoints.html

credentials

Optional credentials shorthand for the config parameter

  • creds:

    • access_key_id: AWS access key ID

    • secret_access_key: AWS secret access key

    • session_token: AWS temporary session token

  • profile: The name of a profile to use. If not given, then the default profile is used.

  • anonymous: Set anonymous credentials.

endpoint

Optional shorthand for complete URL to use for the constructed client.

region

Optional shorthand for AWS Region used in instantiating the client.
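
The config list and the shorthand arguments describe the same settings. The following is a minimal sketch, assuming the paws package is installed and that a profile named "analytics" exists in your AWS configuration. The helper make_emr_config, the profile name, and the Region are made up for illustration and are not part of the package.

```r
# Hypothetical helper (not part of paws): assemble a `config` list from
# a profile name, a Region, and a connection timeout in seconds.
make_emr_config <- function(profile, region, timeout = 60) {
  list(
    credentials = list(profile = profile),
    region = region,
    timeout = timeout
  )
}

cfg <- make_emr_config(profile = "analytics", region = "us-east-1", timeout = 30)
str(cfg)

# Building the client requires the paws package, so guard the call.
if (requireNamespace("paws", quietly = TRUE)) {
  # Full form: everything passed through `config`.
  svc <- paws::emr(config = cfg)
  # Shorthand form for the credentials and region settings only.
  svc2 <- paws::emr(
    credentials = list(profile = "analytics"),
    region = "us-east-1"
  )
}
```

Settings such as timeout and close_connection have no shorthand argument, so they can only be supplied through config.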

Value

A client for the service. You can call the service's operations using syntax like svc$operation(...), where svc is the name you've assigned to the client. The available operations are listed in the Operations section.
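
Operation results come back as nested R lists. The sketch below shows one way to flatten such a result into a data frame. The resp object is made-up sample data shaped like a list_clusters result (a Clusters list plus a Marker). The cluster IDs and names are invented, and a real response carries more fields.

```r
# Made-up sample shaped like a list_clusters() result. With a real client
# the call would be: resp <- svc$list_clusters(ClusterStates = list("RUNNING"))
resp <- list(
  Clusters = list(
    list(Id = "j-EXAMPLE1", Name = "nightly-etl", Status = list(State = "RUNNING")),
    list(Id = "j-EXAMPLE2", Name = "adhoc", Status = list(State = "WAITING"))
  ),
  Marker = character(0)
)

# Pull the fields of interest out of each cluster entry.
clusters <- data.frame(
  id = vapply(resp$Clusters, function(x) x$Id, character(1)),
  name = vapply(resp$Clusters, function(x) x$Name, character(1)),
  state = vapply(resp$Clusters, function(x) x$Status$State, character(1)),
  stringsAsFactors = FALSE
)
print(clusters)
```

vapply with character(1) fails loudly if a field is missing or not a single string, which is usually preferable to a silently malformed table.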

Service syntax

svc <- emr(
  config = list(
    credentials = list(
      creds = list(
        access_key_id = "string",
        secret_access_key = "string",
        session_token = "string"
      ),
      profile = "string",
      anonymous = "logical"
    ),
    endpoint = "string",
    region = "string",
    close_connection = "logical",
    timeout = "numeric",
    s3_force_path_style = "logical",
    sts_regional_endpoint = "string"
  ),
  credentials = list(
    creds = list(
      access_key_id = "string",
      secret_access_key = "string",
      session_token = "string"
    ),
    profile = "string",
    anonymous = "logical"
  ),
  endpoint = "string",
  region = "string"
)

Operations

add_instance_fleet
Adds an instance fleet to a running cluster
add_instance_groups
Adds one or more instance groups to a running cluster
add_job_flow_steps
AddJobFlowSteps adds new steps to a running cluster
add_tags
Adds tags to an Amazon EMR resource, such as a cluster or an Amazon EMR Studio
cancel_steps
Cancels a pending step or steps in a running cluster
create_security_configuration
Creates a security configuration, which is stored in the service and can be specified when a cluster is created
create_studio
Creates a new Amazon EMR Studio
create_studio_session_mapping
Maps a user or group to the Amazon EMR Studio specified by StudioId, and applies a session policy to refine Studio permissions for that user or group
delete_security_configuration
Deletes a security configuration
delete_studio
Removes an Amazon EMR Studio from the Studio metadata store
delete_studio_session_mapping
Removes a user or group from an Amazon EMR Studio
describe_cluster
Provides cluster-level details including status, hardware and software configuration, VPC settings, and so on
describe_job_flows
This API is no longer supported and will eventually be removed
describe_notebook_execution
Provides details of a notebook execution
describe_release_label
Provides Amazon EMR release label details, such as the releases available in the Region where the API request is run, and the available applications for a specific Amazon EMR release label
describe_security_configuration
Provides the details of a security configuration by returning the configuration JSON
describe_step
Provides more detail about the cluster step
describe_studio
Returns details for the specified Amazon EMR Studio including ID, Name, VPC, Studio access URL, and so on
get_auto_termination_policy
Returns the auto-termination policy for an Amazon EMR cluster
get_block_public_access_configuration
Returns the Amazon EMR block public access configuration for your Amazon Web Services account in the current Region
get_cluster_session_credentials
Provides temporary, HTTP basic credentials that are associated with a given runtime IAM role and used by a cluster with fine-grained access control activated
get_managed_scaling_policy
Fetches the attached managed scaling policy for an Amazon EMR cluster
get_studio_session_mapping
Fetches mapping details for the specified Amazon EMR Studio and identity (user or group)
list_bootstrap_actions
Provides information about the bootstrap actions associated with a cluster
list_clusters
Provides the status of all clusters visible to this Amazon Web Services account
list_instance_fleets
Lists all available details about the instance fleets in a cluster
list_instance_groups
Provides all available details about the instance groups in a cluster
list_instances
Provides information for all active Amazon EC2 instances and Amazon EC2 instances terminated in the last 30 days, up to a maximum of 2,000
list_notebook_executions
Provides summaries of all notebook executions
list_release_labels
Retrieves release labels of Amazon EMR services in the Region where the API is called
list_security_configurations
Lists all the security configurations visible to this account, providing their creation dates and times, and their names
list_steps
Provides a list of steps for the cluster in reverse order unless you specify stepIds with the request or filter by StepStates
list_studios
Returns a list of all Amazon EMR Studios associated with the Amazon Web Services account
list_studio_session_mappings
Returns a list of all user or group session mappings for the Amazon EMR Studio specified by StudioId
list_supported_instance_types
A list of the instance types that Amazon EMR supports
modify_cluster
Modifies the number of steps that can be executed concurrently for the cluster specified using ClusterID
modify_instance_fleet
Modifies the target On-Demand and target Spot capacities for the instance fleet with the specified InstanceFleetID within the cluster specified using ClusterID
modify_instance_groups
ModifyInstanceGroups modifies the number of nodes and configuration settings of an instance group
put_auto_scaling_policy
Creates or updates an automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster
put_auto_termination_policy
Creates or updates an auto-termination policy for an Amazon EMR cluster
put_block_public_access_configuration
Creates or updates an Amazon EMR block public access configuration for your Amazon Web Services account in the current Region
put_managed_scaling_policy
Creates or updates a managed scaling policy for an Amazon EMR cluster
remove_auto_scaling_policy
Removes an automatic scaling policy from a specified instance group within an Amazon EMR cluster
remove_auto_termination_policy
Removes an auto-termination policy from an Amazon EMR cluster
remove_managed_scaling_policy
Removes a managed scaling policy from a specified Amazon EMR cluster
remove_tags
Removes tags from an Amazon EMR resource, such as a cluster or Amazon EMR Studio
run_job_flow
RunJobFlow creates and starts running a new cluster (job flow)
set_keep_job_flow_alive_when_no_steps
Use SetKeepJobFlowAliveWhenNoSteps to configure a cluster (job flow) to terminate after the step execution
set_termination_protection
SetTerminationProtection locks a cluster (job flow) so the Amazon EC2 instances in the cluster cannot be terminated by user intervention, an API call, or in the event of a job-flow error
set_unhealthy_node_replacement
Specify whether to enable unhealthy node replacement, which lets Amazon EMR gracefully replace core nodes on a cluster if any nodes become unhealthy
set_visible_to_all_users
The SetVisibleToAllUsers parameter is no longer supported
start_notebook_execution
Starts a notebook execution
stop_notebook_execution
Stops a notebook execution
terminate_job_flows
TerminateJobFlows shuts a list of clusters (job flows) down
update_studio
Updates an Amazon EMR Studio configuration, including attributes such as name, description, and subnets
update_studio_session_mapping
Updates the session policy attached to the user or group for the specified Amazon EMR Studio
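
Several of the list_* operations return results one page at a time. The loop below is a minimal sketch of collecting all pages, assuming list_clusters returns a Marker element that can be passed back as the Marker argument; check the operation's own help page before relying on this. Both list_all_clusters and fake_svc are hypothetical names introduced here so the loop can be tried without AWS access.

```r
# Hypothetical helper (not part of paws): collect every page of
# list_clusters() by passing the returned Marker back in.
list_all_clusters <- function(svc, ...) {
  clusters <- list()
  marker <- NULL
  repeat {
    resp <- if (is.null(marker)) {
      svc$list_clusters(...)
    } else {
      svc$list_clusters(..., Marker = marker)
    }
    clusters <- c(clusters, resp$Clusters)
    marker <- resp$Marker
    # An absent or empty Marker means there are no more pages.
    if (length(marker) == 0 || !nzchar(marker)) break
  }
  clusters
}

# Stand-in client that serves two pages of invented data.
fake_svc <- list(
  list_clusters = function(Marker = NULL, ...) {
    if (is.null(Marker)) {
      list(Clusters = list(list(Id = "j-EXAMPLE1")), Marker = "page2")
    } else {
      list(Clusters = list(list(Id = "j-EXAMPLE2")), Marker = character(0))
    }
  }
)

all_clusters <- list_all_clusters(fake_svc)
length(all_clusters)
```

With a real client, the same helper would be called as list_all_clusters(svc), with any filters such as ClusterStates passed through the dots.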

Examples

## Not run: 
svc <- emr()
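# Placeholder values in the style of the Service syntax section.
svc$add_instance_fleet(
  ClusterId = "string",
  InstanceFleet = list(
    InstanceFleetType = "TASK",
    TargetOnDemandCapacity = 123,
    InstanceTypeConfigs = list(list(InstanceType = "string"))
  )
)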

## End(Not run)