SageMaker endpoint production variants should have an initial instance count greater than 1

A SageMaker endpoint backed by a single instance has no redundancy. If that instance fails or gets interrupted during maintenance, inference requests fail until a replacement launches. In a production pipeline, that recovery window cascades into application-level outages.

Running at least two instances per production variant spreads traffic across them, and SageMaker attempts to place the instances in different Availability Zones, so the variant can keep serving through a single-instance failure. The marginal cost of a second instance is almost always less than the cost of an unplanned inference outage.

Retrofit consideration

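Increasing initial_instance_count on an existing endpoint requires creating a new aws_sagemaker_endpoint_configuration and updating the endpoint to point at it, which triggers a new deployment. By default SageMaker performs a blue/green update; a rolling update applies only if the endpoint's deployment_config specifies one. Review the deployment_config on your aws_sagemaker_endpoint before applying.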
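How traffic moves to the new configuration is set on the endpoint itself. The following is a minimal sketch of an aws_sagemaker_endpoint with an explicit deployment_config, assuming the endpoint configuration is the one labeled "this" in the Implementation section; the endpoint name and the timing values are illustrative assumptions, not recommendations.

```hcl
# Sketch only: the endpoint name and wait times below are hypothetical values.
resource "aws_sagemaker_endpoint" "this" {
  name                 = "example-endpoint"
  endpoint_config_name = aws_sagemaker_endpoint_configuration.this.name

  deployment_config {
    blue_green_update_policy {
      traffic_routing_configuration {
        # Shift all traffic to the new fleet in one step.
        type                     = "ALL_AT_ONCE"
        wait_interval_in_seconds = 300
      }

      # Keep the old fleet around briefly after the shift completes.
      termination_wait_in_seconds = 120
    }
  }
}
```

Stating the policy explicitly makes the update behavior visible in code review instead of relying on the service default.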

Implementation

Choose the approach that matches how you manage Terraform.

Use AWS provider resources directly. See docs for the resources involved: aws_sagemaker_endpoint_configuration.

resource "aws_sagemaker_endpoint_configuration" "this" {
  name = "pofix-abc123"

  production_variants {
    initial_instance_count = 2
    instance_type          = "ml.t2.medium"
    model_name             = "example-sagemaker-model"
    variant_name           = "AllTraffic"
  }
}

What this control checks

aws_sagemaker_endpoint_configuration defines one or more production_variants blocks. Each block's initial_instance_count controls how many ML instances back that variant at launch. The control requires every variant to set it to 2 or higher. A value of 1 fails. Endpoints with multiple production variants must each meet the threshold independently.
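One way to catch a violation before it reaches the control is to validate the count at plan time. The sketch below assumes Terraform 1.3 or later (for optional attribute defaults); the variable name "variants", the resource label "validated", and the configuration name are hypothetical.

```hcl
# Hypothetical input: one entry per production variant, keyed by variant name.
variable "variants" {
  type = map(object({
    model_name             = string
    instance_type          = string
    initial_instance_count = number
    initial_variant_weight = optional(number, 1)
  }))

  # Every variant must meet the threshold independently.
  validation {
    condition = alltrue([
      for v in values(var.variants) : v.initial_instance_count >= 2
    ])
    error_message = "Every production variant must set initial_instance_count to 2 or higher."
  }
}

resource "aws_sagemaker_endpoint_configuration" "validated" {
  name = "example-validated-config"

  # Emit one production_variants block per map entry.
  dynamic "production_variants" {
    for_each = var.variants
    content {
      variant_name           = production_variants.key
      model_name             = production_variants.value.model_name
      instance_type          = production_variants.value.instance_type
      initial_instance_count = production_variants.value.initial_instance_count
      initial_variant_weight = production_variants.value.initial_variant_weight
    }
  }
}
```

With this in place, a variant declared with initial_instance_count = 1 fails at terraform plan with the error message above, rather than surfacing later as a non-compliant resource.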

Common pitfalls

  • Endpoint configurations are immutable

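    SageMaker endpoint configurations cannot be updated in place. Changing initial_instance_count in Terraform forces replacement of the aws_sagemaker_endpoint_configuration, and the associated aws_sagemaker_endpoint must then be updated to reference the new one. Without a create_before_destroy lifecycle rule or a blue-green naming strategy, the old configuration is destroyed before its replacement exists, which can cause downtime. Because configuration names must be unique, create_before_destroy with a fixed name fails on a name conflict, so pair it with name_prefix or a name that changes with the configuration.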

  • Auto-scaling does not satisfy the initial count check

    Auto-scaling minimum capacity and initial_instance_count are not the same thing. An aws_appautoscaling_target with min_capacity of 2 does not change the static value stored in the endpoint configuration. This control checks the configuration, not runtime counts, so you still need initial_instance_count >= 2 set explicitly.

  • Serverless inference variants are not applicable

    This control applies only to real-time endpoints with provisioned instances. If you use serverless_config inside a production_variants block, initial_instance_count does not apply; capacity is governed by serverless_config settings such as max_concurrency and memory_size_in_mb instead. Serverless variants may be flagged as non-compliant or skipped depending on the Config rule implementation, so confirm how your policy engine handles them before drawing conclusions.

  • Shadow variants vs production variants

    shadow_production_variants is a separate block from production_variants in aws_sagemaker_endpoint_configuration, and this control only evaluates the latter. If your compliance output shows fewer resources than expected, confirm you are not conflating shadow (testing) variants with production ones.
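The first two pitfalls can be addressed together. This is a minimal sketch, assuming an AWS provider version that supports name_prefix on aws_sagemaker_endpoint_configuration; the resource labels, names, instance type, and max_capacity are illustrative assumptions.

```hcl
# Sketch only: labels, names, and sizes are hypothetical.
resource "aws_sagemaker_endpoint_configuration" "blue_green" {
  # A generated suffix lets the replacement coexist with the old configuration.
  name_prefix = "example-config-"

  production_variants {
    variant_name = "AllTraffic"
    model_name   = "example-sagemaker-model"
    # Non-burstable type chosen because auto-scaling is used below.
    instance_type = "ml.m5.large"
    # The control evaluates this static value, so set it explicitly.
    initial_instance_count = 2
  }

  lifecycle {
    # Create the new configuration before removing the old one.
    create_before_destroy = true
  }
}

resource "aws_sagemaker_endpoint" "blue_green" {
  name                 = "example-endpoint-bg"
  endpoint_config_name = aws_sagemaker_endpoint_configuration.blue_green.name
}

# Auto-scaling floor kept in line with the initial count; it does not replace it.
resource "aws_appautoscaling_target" "variant" {
  service_namespace  = "sagemaker"
  resource_id        = "endpoint/${aws_sagemaker_endpoint.blue_green.name}/variant/AllTraffic"
  scalable_dimension = "sagemaker:variant:DesiredInstanceCount"
  min_capacity       = 2
  max_capacity       = 4
}
```

Keeping min_capacity equal to initial_instance_count means the runtime floor and the value the control inspects stay consistent.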

Audit evidence

Config rule evaluation results showing all aws_sagemaker_endpoint_configuration resources as COMPLIANT confirm every production variant meets the instance count threshold. Console screenshots of the endpoint configuration details page, with the 'Initial instance count' column visible for each variant, work as point-in-time evidence. CloudTrail CreateEndpointConfig events corroborate the values submitted at creation. Output from Prowler or Steampipe, filtered to this rule, covers the gaps between Config snapshots.
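If the managed rule is not yet deployed, it can be enabled from Terraform. The sketch below assumes an AWS Config configuration recorder is already running in the account and region; the resource label and rule name are hypothetical, and the source identifier is the one listed under Tool mappings.

```hcl
# Sketch only: requires an existing AWS Config recorder (not shown here).
resource "aws_config_config_rule" "sagemaker_prod_instance_count" {
  name = "sagemaker-endpoint-config-prod-instance-count"

  source {
    owner             = "AWS"
    source_identifier = "SAGEMAKER_ENDPOINT_CONFIG_PROD_INSTANCE_COUNT"
  }
}
```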

Tool mappings

Use these identifiers to cross-reference this control across tools, reports, and evidence.

  • Compliance.tf Control: sagemaker_endpoint_configuration_prod_instance_count_greater_than_one

  • AWS Config Managed Rule: SAGEMAKER_ENDPOINT_CONFIG_PROD_INSTANCE_COUNT

  • Powerpipe Control: aws_compliance.control.sagemaker_endpoint_configuration_prod_instance_count_greater_than_one

  • Prowler Check: sagemaker_endpoint_config_prod_variant_instances

  • AWS Security Hub Control: SageMaker.4

Last reviewed: 2026-03-09