Skip to content

Glue Spark jobs should run on supported versions of AWS Glue

Older AWS Glue versions (0.9, 1.0, 2.0) lack security patches, performance improvements, and updated Apache Spark runtimes. Running ETL workloads on unsupported runtimes exposes your data pipelines to known vulnerabilities and cuts off access to features like auto-scaling and improved memory management that shipped with Glue 3.0 and 4.0.

AWS deprecates older Glue versions on a published schedule, and jobs still pinned to them fail when support ends. Enforcing a minimum version in Terraform keeps your data processing stack on a patched runtime and avoids emergency migration work when deprecation hits.

Retrofit consideration

Upgrading glue_version changes the underlying Spark runtime, which means regression testing ETL scripts before promoting to any higher environment. Jobs that depend on Spark 2.x APIs or deprecated Glue constructs will need code changes, not just a version bump.

Implementation

Choose the approach that matches how you manage Terraform.

Use AWS provider resources directly. See docs for the resources involved: aws_glue_job.

resource "aws_glue_job" "this" {
  command {
    script_location = "s3://pofix-example/script.py"
  }
  glue_version = "4.0"
  name         = "pofix-abc123"
  role_arn     = "arn:aws:iam::123456789012:role/example-role"
}

What this control checks

The glue_version argument in aws_glue_job must be set to "3.0" or "4.0" (or any later supported version). Values of "0.9", "1.0", or "2.0" fail the control. If glue_version is omitted, the default varies by command type; since the control evaluates the effective version, explicitly setting it to a supported value is the safest approach. The command block's name should be "glueetl" or "gluestreaming" for Spark-type jobs. Python Shell jobs ("pythonshell") are not Spark jobs and are typically out of scope for this control.

Common pitfalls

  • Omitting glue_version relies on AWS defaults

    Omit glue_version and AWS picks a default that varies by command type and Terraform provider version. Configurations written before Glue 3.0 existed commonly leave this argument out entirely. Set it explicitly on every aws_glue_job resource; otherwise, Terraform plan output and the control's evaluation can diverge.

  • Spark runtime differences between Glue 2.0 and 3.0

    Glue 3.0 runs Apache Spark 3.1 while Glue 2.0 runs Spark 2.4. Some Spark SQL functions, DataFrame APIs, and Hadoop dependency versions changed between these releases. Bumping glue_version from "2.0" to "3.0" without testing can break ETL scripts that rely on Spark 2.x behavior.

  • Glue 4.0 changes Python and Spark versions

    Glue 4.0 ships with Spark 3.3 and requires Python 3.10. If your scripts use Python 3.6 syntax or libraries that haven't been tested against 3.10, they will fail at runtime. Validate dependencies and run integration tests before pinning production jobs to "4.0".

  • Non-Spark jobs may produce false evaluation

    Python Shell jobs set command.name = "pythonshell" and don't use the Spark runtime at all. If the policy engine evaluates all aws_glue_job resources without filtering on command type, these jobs will show as non-compliant even though the glue_version constraint doesn't apply to them. Confirm whether your policy implementation scopes to "glueetl" and "gluestreaming" command types only.

Audit evidence

AWS Config rule evaluation results for the AWS::Glue::Job resource type, showing each job as COMPLIANT with GlueVersion at 3.0 or higher, are the primary artifact. Supplementary evidence includes an aws glue get-jobs CLI export filtered to show the GlueVersion field across all regions, or a console screenshot from the Glue job details page showing the version value. If Security Hub is aggregating Config findings across accounts, a Security Hub findings export for this control covers all accounts in a single artifact.

Tool mappings

Use these identifiers to cross-reference this control across tools, reports, and evidence.

  • Compliance.tf Control: glue_spark_job_runs_on_version_3_or_higher

  • AWS Config Managed Rule: GLUE_SPARK_JOB_SUPPORTED_VERSION

  • Powerpipe Control: aws_compliance.control.glue_spark_job_runs_on_version_3_or_higher

  • AWS Security Hub Control: Glue.4

Last reviewed: 2026-03-09