Create Optimization Job
| sagemaker_create_optimization_job | R Documentation |
Creates a job that optimizes a model for inference performance
Description
Creates a job that optimizes a model for inference performance. To create the job, you provide the location of a source model, and you provide the settings for the optimization techniques that you want the job to apply. When the job completes successfully, SageMaker uploads the new optimized model to the output destination that you specify.
For more information about how to use this action, and about the supported optimization techniques, see Optimize model inference with Amazon SageMaker.
Usage
sagemaker_create_optimization_job(OptimizationJobName, RoleArn,
  ModelSource, DeploymentInstanceType, MaxInstanceCount,
  OptimizationEnvironment, OptimizationConfigs, OutputConfig,
  StoppingCondition, Tags, VpcConfig)
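As a minimal sketch of how the required arguments fit together, the argument list can be built once and then forwarded to a client. Everything concrete here is an assumption for illustration: the job name, role ARN, bucket URIs, the `OPTION_QUANTIZE` setting, and the `submit_optimization_job` helper are hypothetical and not part of this page.

```r
# Sketch only: every ARN, bucket, and name below is a hypothetical placeholder.
job_args <- list(
  OptimizationJobName = "example-optimization-job",
  RoleArn = "arn:aws:iam::123456789012:role/ExampleSageMakerRole",
  ModelSource = list(
    S3 = list(S3Uri = "s3://example-bucket/source-model/")
  ),
  DeploymentInstanceType = "ml.g5.2xlarge",
  OptimizationConfigs = list(
    list(
      ModelQuantizationConfig = list(
        # Assumed, illustrative setting; check which keys your container supports.
        OverrideEnvironment = list(OPTION_QUANTIZE = "awq")
      )
    )
  ),
  OutputConfig = list(S3OutputLocation = "s3://example-bucket/optimized-model/"),
  StoppingCondition = list(MaxRuntimeInSeconds = 3600)
)

# Hypothetical helper: forwards the argument list to a SageMaker client.
# `svc` is expected to expose create_optimization_job(), as in the
# request syntax on this page.
submit_optimization_job <- function(svc, args) {
  do.call(svc$create_optimization_job, args)
}
```

With valid credentials and real resources, a client created with `paws::sagemaker()` could be passed as `svc`; the block itself makes no network calls.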
Arguments
OptimizationJobName |
[required] A custom name for the new optimization job. |
RoleArn |
[required] The Amazon Resource Name (ARN) of an IAM role that enables Amazon SageMaker AI to perform tasks on your behalf. During model optimization, Amazon SageMaker AI needs your permission to:
You grant permissions for all of these tasks to an IAM role. To pass this role to Amazon SageMaker AI, the caller of this API must have the |
ModelSource |
[required] The location of the source model to optimize with an optimization job. |
DeploymentInstanceType |
[required] The type of instance that hosts the optimized model that you create with the optimization job. |
MaxInstanceCount |
The maximum number of instances to use for the optimization job. |
OptimizationEnvironment |
The environment variables to set in the model container. |
OptimizationConfigs |
[required] Settings for each of the optimization techniques that the job applies. |
OutputConfig |
[required] Details for where to store the optimized model that you create with the optimization job. |
StoppingCondition |
[required] Specifies a limit to how long a job can run. When the job reaches the time limit, SageMaker ends the job. Use this limit to cap costs. To stop a training job, SageMaker sends the algorithm the
The training algorithms provided by SageMaker automatically save the intermediate results of a model training job when possible. This attempt to save artifacts is best-effort only, because the model might not be in a state from which it can be saved. For example, if training has just started, the model might not be ready to save. When saved, this intermediate data is a valid model artifact. You can use it to create a model with
The Neural Topic Model (NTM) currently does not support saving intermediate model artifacts. When training NTMs, make sure that the maximum runtime is sufficient for the training job to complete. |
Tags |
A list of key-value pairs associated with the optimization job. For more information, see Tagging Amazon Web Services resources in the Amazon Web Services General Reference Guide. |
VpcConfig |
A VPC in Amazon VPC that your optimized model has access to. |
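The `Tags` argument expects a list of `Key`/`Value` pairs rather than a named vector. A small sketch of one way to build that shape follows; `tags_from_named` is a hypothetical helper introduced here for illustration, and the tag names are placeholders.

```r
# Hypothetical helper: convert a named vector such as c(team = "ml")
# into the list-of-lists shape used by the Tags argument.
tags_from_named <- function(x) {
  stopifnot(!is.null(names(x)), all(nzchar(names(x))))
  lapply(seq_along(x), function(i) {
    list(Key = names(x)[[i]], Value = as.character(x[[i]]))
  })
}

# Placeholder tag names and values.
tags <- tags_from_named(c(team = "ml", stage = "dev"))
```

The resulting `tags` object can be passed as `Tags = tags` in the call.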
Value
A list with the following syntax:
list(
  OptimizationJobArn = "string"
)
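The returned list carries only the job ARN. As a hedged sketch, the job name can be recovered from it if the ARN ends in `/<job name>`; that ARN layout and the sample value below are assumptions, not something this page states.

```r
# Hypothetical return value; the ARN shape shown here is an assumption.
result <- list(
  OptimizationJobArn =
    "arn:aws:sagemaker:us-west-2:123456789012:optimization-job/example-job"
)

# Everything after the last "/" is taken to be the job name.
job_name <- sub("^.*/", "", result$OptimizationJobArn)
```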
Request syntax
svc$create_optimization_job(
  OptimizationJobName = "string",
  RoleArn = "string",
  ModelSource = list(
    S3 = list(
      S3Uri = "string",
      ModelAccessConfig = list(
        AcceptEula = TRUE|FALSE
      )
    ),
    SageMakerModel = list(
      ModelName = "string"
    )
  ),
  DeploymentInstanceType = "ml.p4d.24xlarge"|"ml.p4de.24xlarge"|"ml.p5.48xlarge"|"ml.p5e.48xlarge"|"ml.p5en.48xlarge"|"ml.g4dn.xlarge"|"ml.g4dn.2xlarge"|"ml.g4dn.4xlarge"|"ml.g4dn.8xlarge"|"ml.g4dn.12xlarge"|"ml.g4dn.16xlarge"|"ml.g5.xlarge"|"ml.g5.2xlarge"|"ml.g5.4xlarge"|"ml.g5.8xlarge"|"ml.g5.12xlarge"|"ml.g5.16xlarge"|"ml.g5.24xlarge"|"ml.g5.48xlarge"|"ml.g6.xlarge"|"ml.g6.2xlarge"|"ml.g6.4xlarge"|"ml.g6.8xlarge"|"ml.g6.12xlarge"|"ml.g6.16xlarge"|"ml.g6.24xlarge"|"ml.g6.48xlarge"|"ml.g6e.xlarge"|"ml.g6e.2xlarge"|"ml.g6e.4xlarge"|"ml.g6e.8xlarge"|"ml.g6e.12xlarge"|"ml.g6e.16xlarge"|"ml.g6e.24xlarge"|"ml.g6e.48xlarge"|"ml.inf2.xlarge"|"ml.inf2.8xlarge"|"ml.inf2.24xlarge"|"ml.inf2.48xlarge"|"ml.trn1.2xlarge"|"ml.trn1.32xlarge"|"ml.trn1n.32xlarge",
  MaxInstanceCount = 123,
  OptimizationEnvironment = list(
    "string"
  ),
  OptimizationConfigs = list(
    list(
      ModelQuantizationConfig = list(
        Image = "string",
        OverrideEnvironment = list(
          "string"
        )
      ),
      ModelCompilationConfig = list(
        Image = "string",
        OverrideEnvironment = list(
          "string"
        )
      ),
      ModelShardingConfig = list(
        Image = "string",
        OverrideEnvironment = list(
          "string"
        )
      ),
      ModelSpeculativeDecodingConfig = list(
        Technique = "EAGLE",
        TrainingDataSource = list(
          S3Uri = "string",
          S3DataType = "S3Prefix"|"ManifestFile"
        )
      )
    )
  ),
  OutputConfig = list(
    KmsKeyId = "string",
    S3OutputLocation = "string",
    SageMakerModel = list(
      ModelName = "string"
    )
  ),
  StoppingCondition = list(
    MaxRuntimeInSeconds = 123,
    MaxWaitTimeInSeconds = 123,
    MaxPendingTimeInSeconds = 123
  ),
  Tags = list(
    list(
      Key = "string",
      Value = "string"
    )
  ),
  VpcConfig = list(
    SecurityGroupIds = list(
      "string"
    ),
    Subnets = list(
      "string"
    )
  )
)
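Two of the nested fields above are easy to get wrong when written by hand, so here is a short sketch of how they might be built. It assumes that map-typed fields such as `OptimizationEnvironment`, shown above as `list("string")`, are supplied as a named list of variable names to values. The helper `stopping_condition_from_hours` and the variable `EXAMPLE_VARIABLE` are hypothetical.

```r
# Hypothetical helper: turn a runtime budget in hours into a
# StoppingCondition list with an integer number of seconds.
stopping_condition_from_hours <- function(hours) {
  stopifnot(is.numeric(hours), length(hours) == 1, hours > 0)
  list(MaxRuntimeInSeconds = as.integer(round(hours * 3600)))
}

# Assumption: map-typed fields take a named list (variable name -> value).
# The variable below is a made-up placeholder, not a documented setting.
optimization_environment <- list(EXAMPLE_VARIABLE = "example-value")

stopping_condition <- stopping_condition_from_hours(1.5)
```

Both objects can then be passed as `OptimizationEnvironment = optimization_environment` and `StoppingCondition = stopping_condition`.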