ElastiCache for Redis replication groups should have automatic failover enabled
When a Redis primary node fails without automatic failover, the replication group requires manual intervention to promote a replica. During that window, all write operations fail and applications depending on the cache layer degrade or fall over entirely. Automatic failover lets ElastiCache detect the failure and promote a healthy replica within seconds, keeping write availability intact without operator involvement.
The cost of enabling this is negligible compared to the operational risk. You already pay for replicas; automatic failover simply ensures they're used when needed rather than sitting idle during an incident.
Retrofit consideration
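Enabling automatic failover on an existing replication group requires at least one read replica, so add a replica first if the group currently has a single node. Depending on which other configuration changes are applied at the same time, the modification may cause a brief interruption.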
Implementation
Choose the approach that matches how you manage Terraform.
If you use terraform-aws-modules/elasticache/aws, set the right module inputs for this control. You can later migrate to the compliance.tf module with minimal changes because it is compatible by design.
module "elasticache" {
  source  = "terraform-aws-modules/elasticache/aws"
  version = ">=1.0.0,<2.0.0"
  description          = "Redis cluster"
  engine               = "redis"
  engine_version       = "7.1"
  node_type            = "cache.t3.micro"
  # One primary plus one replica; automatic failover needs at least two nodes.
  num_cache_clusters   = 2
  replication_group_id = "abc123"
  subnet_ids           = ["subnet-12345678", "subnet-12345678"]
  vpc_id               = "vpc-12345678"
  # The setting this control evaluates.
  automatic_failover_enabled = true
}
Use AWS provider resources directly. See docs for the resources involved: aws_elasticache_replication_group.
resource "aws_elasticache_replication_group" "this" {
  at_rest_encryption_enabled = true
  # Placeholder value only; supply real tokens from a secret store, not source control.
  auth_token                 = "PofixExampleAuthToken32CharsLng"
  description                = "pofix example replication group"
  node_type                  = "cache.t3.micro"
  # One primary plus one replica; automatic failover needs at least two nodes.
  num_cache_clusters         = 2
  replication_group_id       = "pofix-abc123"
  snapshot_retention_limit   = 15
  subnet_group_name          = "example-subnet-group"
  transit_encryption_enabled = true
  # The setting this control evaluates.
  automatic_failover_enabled = true
}
What this control checks
In Terraform, aws_elasticache_replication_group must have automatic_failover_enabled = true. The argument defaults to false, so omitting it causes the control to fail. Automatic failover also requires at least one replica: num_cache_clusters must be 2 or greater, or replicas_per_node_group must be at least 1 when using cluster mode. A replication group with automatic_failover_enabled = true but only a single node will fail to apply. Setting multi_az_enabled = true places replicas in separate Availability Zones and pairs well with this control, but the policy evaluates only the automatic_failover_enabled flag.
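For the cluster mode case described above, a minimal sketch might look like the following. The resource name, replication group ID, node type, and parameter group are illustrative assumptions, and the top-level num_node_groups and replicas_per_node_group arguments assume a recent AWS provider (5.x); the subnet group is omitted for brevity.

```hcl
# Sketch only: names and sizes are hypothetical, not taken from a real environment.
resource "aws_elasticache_replication_group" "sharded" {
  replication_group_id = "example-sharded"
  description          = "Cluster mode enabled example"
  engine               = "redis"
  engine_version       = "7.1"
  node_type            = "cache.t4g.micro"
  parameter_group_name = "default.redis7.cluster.on"

  # Cluster mode: two shards, each with one replica.
  # Do not combine these arguments with num_cache_clusters.
  num_node_groups         = 2
  replicas_per_node_group = 1

  # The flag the policy evaluates.
  automatic_failover_enabled = true

  # Optional companion setting; not evaluated by this control.
  multi_az_enabled = true
}
```

With replicas_per_node_group set to 1, every shard has a replica that can be promoted, which satisfies the replica requirement for automatic failover.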
Common pitfalls
Single-node groups fail at apply time, not silently
Setting automatic_failover_enabled = true with num_cache_clusters = 1 causes a Terraform apply error. You need at least two cache clusters (one primary, one replica) for automatic failover to function. If num_cache_clusters is set dynamically, validate it is always >= 2 whenever failover is enabled.
Cluster mode disabled vs enabled syntax differences
When using cluster mode (sharding), replica count is controlled by replicas_per_node_group inside aws_elasticache_replication_group, not num_cache_clusters. Ensure replicas_per_node_group is at least 1 per shard. Mixing both arguments produces a conflict error.
T2 node types don't support automatic failover
Older cache.t2.* node types don't support Multi-AZ with automatic failover. Specifying automatic_failover_enabled = true on a cache.t2.micro instance results in an API rejection. Switch to a current-generation node type such as cache.t4g.* to avoid this.
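One way to guard against the single-node pitfall when the node count is set dynamically is a lifecycle precondition, which surfaces the problem at plan time instead of as an API error during apply. This is a sketch under assumptions: the variable names, resource name, and replication group ID are hypothetical, and precondition blocks require Terraform 1.2 or later.

```hcl
# Hypothetical input variables for illustration.
variable "num_cache_clusters" {
  type    = number
  default = 2
}

variable "automatic_failover_enabled" {
  type    = bool
  default = true
}

resource "aws_elasticache_replication_group" "guarded" {
  replication_group_id       = "example-guarded"
  description                = "Example with a failover precondition"
  node_type                  = "cache.t4g.micro"
  num_cache_clusters         = var.num_cache_clusters
  automatic_failover_enabled = var.automatic_failover_enabled

  lifecycle {
    # Passes when failover is disabled, or when there are at least two nodes.
    precondition {
      condition     = !var.automatic_failover_enabled || var.num_cache_clusters >= 2
      error_message = "automatic_failover_enabled requires num_cache_clusters >= 2."
    }
  }
}
```

The precondition only catches the node-count mismatch. The control itself still fails if automatic_failover_enabled is set to false.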
Audit evidence
AWS Config rule evaluations showing all AWS::ElastiCache::ReplicationGroup resources as COMPLIANT are the primary evidence, or equivalent output from a CSPM tool. The aws elasticache describe-replication-groups CLI should show AutomaticFailover: enabled for every group. Console screenshots confirming "Auto Failover: Enabled" work as supplementary evidence.
For continuous assurance, AWS Config conformance pack or Security Hub findings scoped to this control should show consistently compliant evaluations over time.
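If Terraform is also part of your assurance tooling, a check block can report drift on each plan or apply as a supplement to the evidence sources above. This is only a sketch: it assumes Terraform 1.5 or later, and the check name and replication group ID are placeholders. A failed check produces a warning and does not block the run.

```hcl
# Sketch: re-reads the live replication group and asserts that failover is on.
check "elasticache_failover_enabled" {
  data "aws_elasticache_replication_group" "target" {
    replication_group_id = "pofix-abc123" # placeholder ID
  }

  assert {
    condition     = data.aws_elasticache_replication_group.target.automatic_failover_enabled
    error_message = "Automatic failover is not enabled on this replication group."
  }
}
```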
Tool mappings
Use these identifiers to cross-reference this control across tools, reports, and evidence.
Compliance.tf Control: elasticache_replication_group_auto_failover_enabled
AWS Config Managed Rule: ELASTICACHE_REPL_GRP_AUTO_FAILOVER_ENABLED
Checkov Check: CKV2_AWS_50
Powerpipe Control: aws_compliance.control.elasticache_replication_group_auto_failover_enabled
Prowler Checks: elasticache_redis_cluster_automatic_failover_enabled, elasticache_redis_cluster_multi_az_enabled
AWS Security Hub Control: ElastiCache.3
Last reviewed: 2026-03-09