SageMaker endpoint production variants should have an initial instance count greater than 1
A SageMaker endpoint backed by a single instance has no redundancy. If that instance fails or gets interrupted during maintenance, inference requests fail until a replacement launches. In a production pipeline, that recovery window cascades into application-level outages.
Running at least two instances per production variant distributes traffic and tolerates a single-instance failure without downtime. The marginal cost of a second instance is almost always less than the cost of an unplanned inference outage.
Retrofit consideration
Increasing initial_instance_count on an existing endpoint requires creating a new aws_sagemaker_endpoint_configuration and updating the endpoint to point at it, which redeploys the endpoint onto new instances. Review the endpoint's deployment configuration (blue/green or rolling) before applying.
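The retrofit described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the "example-" prefix and the "example-endpoint" name are hypothetical, and it assumes an AWS provider version that supports name_prefix on aws_sagemaker_endpoint_configuration.

```hcl
resource "aws_sagemaker_endpoint_configuration" "this" {
  # Assumption: name_prefix (instead of a fixed name) lets each replacement
  # configuration get a unique name, so old and new can coexist briefly.
  name_prefix = "example-"

  production_variants {
    variant_name           = "AllTraffic"
    model_name             = "example-sagemaker-model"
    instance_type          = "ml.t2.medium"
    initial_instance_count = 2
  }

  lifecycle {
    # Create the replacement configuration before destroying the old one.
    create_before_destroy = true
  }
}

resource "aws_sagemaker_endpoint" "this" {
  # Hypothetical endpoint name, used for illustration only.
  name                 = "example-endpoint"
  endpoint_config_name = aws_sagemaker_endpoint_configuration.this.name
}
```

With this layout, a change to initial_instance_count should produce a new configuration first and then an in-place update of the endpoint, rather than leaving the endpoint pointing at a deleted configuration.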
Implementation
Choose the approach that matches how you manage Terraform.
Use AWS provider resources directly. See docs for the resources involved: aws_sagemaker_endpoint_configuration.
resource "aws_sagemaker_endpoint_configuration" "this" {
  name = "pofix-abc123"

  production_variants {
    initial_instance_count = 2
    instance_type          = "ml.t2.medium"
    model_name             = "example-sagemaker-model"
    variant_name           = "AllTraffic"
  }
}
What this control checks
aws_sagemaker_endpoint_configuration defines one or more production_variants blocks. Each block's initial_instance_count controls how many ML instances back that variant at launch. The control requires every variant to set it to 2 or higher. A value of 1 fails. Endpoints with multiple production variants must each meet the threshold independently.
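As an illustration of the per-variant requirement, here is a sketch of a configuration with two production variants, each meeting the threshold on its own. The resource name, variant names, model names, and traffic weights are hypothetical.

```hcl
resource "aws_sagemaker_endpoint_configuration" "ab_test" {
  # Hypothetical configuration name.
  name = "example-ab-test"

  # First variant: passes because initial_instance_count is 2.
  production_variants {
    variant_name           = "VariantA"
    model_name             = "example-model-a"
    instance_type          = "ml.t2.medium"
    initial_instance_count = 2
    initial_variant_weight = 0.9
  }

  # Second variant: evaluated independently, so it also needs 2 or more,
  # even though it receives only a small share of traffic.
  production_variants {
    variant_name           = "VariantB"
    model_name             = "example-model-b"
    instance_type          = "ml.t2.medium"
    initial_instance_count = 2
    initial_variant_weight = 0.1
  }
}
```

Setting either variant to 1 would cause the whole configuration to fail the check, regardless of the other variant's value.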
Common pitfalls
Endpoint configurations are immutable
SageMaker endpoint configurations cannot be updated in place. Changing initial_instance_count in Terraform destroys and recreates the aws_sagemaker_endpoint_configuration, which then requires updating the associated aws_sagemaker_endpoint. Without a create_before_destroy lifecycle rule or a blue-green naming strategy, that sequence causes downtime.
Auto-scaling does not satisfy the initial count check
Auto-scaling minimum capacity and initial_instance_count are not the same thing. An aws_appautoscaling_target with min_capacity of 2 does not change the static value stored in the endpoint configuration. This control checks the configuration, not runtime counts, so you still need initial_instance_count >= 2 set explicitly.
Serverless inference variants are not applicable
This control applies only to real-time endpoints with provisioned instances. If you use serverless_config inside a production_variants block, initial_instance_count is unused and initial_variant_weight handles traffic distribution instead. Serverless variants may be flagged as non-compliant or skipped depending on the Config rule implementation, so confirm how your policy engine handles them before drawing conclusions.
Shadow variants vs production variants
shadow_production_variants is a separate block from production_variants in aws_sagemaker_endpoint_configuration, and this control only evaluates the latter. If your compliance output shows fewer resources than expected, confirm you are not conflating shadow (testing) variants with production ones.
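To make the auto-scaling pitfall concrete, the sketch below pairs a scaling target with an endpoint configuration. The endpoint name "example-endpoint" and the max_capacity of 4 are hypothetical values chosen for illustration.

```hcl
resource "aws_sagemaker_endpoint_configuration" "this" {
  name = "example-config"

  production_variants {
    variant_name  = "AllTraffic"
    model_name    = "example-sagemaker-model"
    instance_type = "ml.t2.medium"

    # This static value is what the control evaluates. It must be 2 or
    # higher even though the scaling target below also enforces a floor.
    initial_instance_count = 2
  }
}

resource "aws_appautoscaling_target" "variant" {
  service_namespace  = "sagemaker"
  scalable_dimension = "sagemaker:variant:DesiredInstanceCount"

  # Assumption: an endpoint named "example-endpoint" uses the configuration
  # above and exposes the "AllTraffic" variant.
  resource_id = "endpoint/example-endpoint/variant/AllTraffic"

  # Runtime floor only; it does not alter the endpoint configuration.
  min_capacity = 2
  max_capacity = 4
}
```

Keeping min_capacity and initial_instance_count aligned avoids a configuration that passes at runtime but fails the static check.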
Audit evidence
Config rule evaluation results showing all aws_sagemaker_endpoint_configuration resources as COMPLIANT confirm every production variant meets the instance count threshold. Console screenshots of the endpoint configuration details page, with the 'Initial instance count' column visible for each variant, work as point-in-time evidence. CloudTrail CreateEndpointConfig events corroborate the values submitted at creation. Output from Prowler or Steampipe, filtered to this rule, covers the gaps between Config snapshots.
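If the managed rule is not yet deployed, one way to produce the Config evaluation results mentioned above is to enable it from Terraform. This is a sketch under assumptions: the rule name is hypothetical, the source identifier is taken from the tool mappings for this control, and an AWS Config configuration recorder is assumed to already be running in the account and region.

```hcl
resource "aws_config_config_rule" "sagemaker_prod_instance_count" {
  # Hypothetical rule name; any unique name works.
  name = "sagemaker-endpoint-config-prod-instance-count"

  source {
    # AWS-managed rule, referenced by the identifier listed for this control.
    owner             = "AWS"
    source_identifier = "SAGEMAKER_ENDPOINT_CONFIG_PROD_INSTANCE_COUNT"
  }
}
```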
Tool mappings
Use these identifiers to cross-reference this control across tools, reports, and evidence.
Compliance.tf Control: sagemaker_endpoint_configuration_prod_instance_count_greater_than_one
AWS Config Managed Rule: SAGEMAKER_ENDPOINT_CONFIG_PROD_INSTANCE_COUNT
Powerpipe Control: aws_compliance.control.sagemaker_endpoint_configuration_prod_instance_count_greater_than_one
Prowler Check: sagemaker_endpoint_config_prod_variant_instances
AWS Security Hub Control: SageMaker.4
Last reviewed: 2026-03-09