SageMaker notebook instances should not have direct internet access

A SageMaker notebook instance with direct internet access can exfiltrate training data, model artifacts, or credentials to any external endpoint. Attackers who compromise a notebook gain an unrestricted outbound path. Disabling direct internet access forces traffic through your VPC, where security groups, NACLs, and VPC endpoints control exactly which services the notebook can reach.

This also prevents notebooks from being reachable inbound from the internet, reducing the attack surface of your ML workloads.

Retrofit consideration

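The direct internet access setting cannot be changed on an existing notebook instance. To disable it, you must stop, delete, and recreate the instance; in Terraform, changing direct_internet_access forces replacement of the resource, so back up anything stored on the instance volume first. You must also give the new instance a path to the AWS services it uses: either provision VPC endpoints for SageMaker API, SageMaker Runtime, S3, and any other services the notebook calls, or provide NAT-based egress. Without one of these, the instance loses connectivity to those services entirely.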
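
The endpoints named above can be sketched in Terraform roughly as follows. This is a minimal sketch under stated assumptions, not a complete network design: the variable names (region, vpc_id, private_subnet_ids, endpoint_security_group_ids, private_route_table_ids) are hypothetical placeholders for values from your own configuration, and you would add further interface endpoints for any other AWS services the notebook calls.

```hcl
# Hypothetical inputs; replace with values from your own configuration.
variable "region" { type = string }
variable "vpc_id" { type = string }
variable "private_subnet_ids" { type = list(string) }
variable "endpoint_security_group_ids" { type = list(string) }
variable "private_route_table_ids" { type = list(string) }

locals {
  # Interface endpoint service suffixes named in this control.
  interface_services = ["sagemaker.api", "sagemaker.runtime"]
}

# One interface endpoint per SageMaker service.
resource "aws_vpc_endpoint" "interface" {
  for_each = toset(local.interface_services)

  vpc_id             = var.vpc_id
  service_name       = "com.amazonaws.${var.region}.${each.key}"
  vpc_endpoint_type  = "Interface"
  subnet_ids         = var.private_subnet_ids
  security_group_ids = var.endpoint_security_group_ids

  # Requires DNS support and DNS hostnames to be enabled on the VPC.
  private_dns_enabled = true
}

# S3 uses a gateway endpoint, which attaches to route tables
# rather than to subnets and security groups.
resource "aws_vpc_endpoint" "s3" {
  vpc_id            = var.vpc_id
  service_name      = "com.amazonaws.${var.region}.s3"
  vpc_endpoint_type = "Gateway"
  route_table_ids   = var.private_route_table_ids
}
```

The for_each over a local list keeps the interface endpoints uniform; adding another service is a one-line change to interface_services.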

Implementation

Choose the approach that matches how you manage Terraform.

Use AWS provider resources directly. The resource involved is aws_sagemaker_notebook_instance; see the AWS provider documentation for its full argument reference.

resource "aws_sagemaker_notebook_instance" "this" {
  instance_type = "ml.t2.medium"
  name          = "pofix-abc123"
  role_arn      = "arn:aws:iam::123456789012:role/example-role"

  # Both are required when direct internet access is disabled.
  security_groups = ["sg-12345678"]
  subnet_id       = "subnet-12345678"

  # Defaults to "Enabled"; set explicitly to pass this control.
  direct_internet_access = "Disabled"
}

What this control checks

The aws_sagemaker_notebook_instance resource has a direct_internet_access argument that defaults to "Enabled". To pass this control, set it to "Disabled". When disabled, you must also specify subnet_id and security_groups to place the notebook inside a VPC; without them, instance creation fails. The control fails any instance where direct_internet_access is "Enabled" or omitted.
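
As one way to surface a regression during plan, a check block can assert the setting. This is a sketch that assumes Terraform 1.5 or later and the resource address aws_sagemaker_notebook_instance.this from the Implementation example; the check name is hypothetical. A failed check is reported as a warning and does not block the apply, so it complements the policy tools listed under Tool mappings rather than replacing them.

```hcl
# Assumes Terraform >= 1.5 and that aws_sagemaker_notebook_instance.this
# is declared in the same module.
check "notebook_direct_internet_access_disabled" {
  assert {
    condition     = aws_sagemaker_notebook_instance.this.direct_internet_access == "Disabled"
    error_message = "SageMaker notebook instance must set direct_internet_access to \"Disabled\"."
  }
}
```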

Common pitfalls

  • Default value enables internet access

    Omit direct_internet_access and the instance launches with internet access enabled. The argument defaults to "Enabled", so you have to set it explicitly to "Disabled". There is no warning when the default is used.

  • Missing VPC endpoints break notebook functionality

    Before disabling direct internet access, provision VPC endpoints for com.amazonaws.<region>.sagemaker.api, com.amazonaws.<region>.sagemaker.runtime, and com.amazonaws.<region>.s3 (gateway endpoint), or set up NAT-based egress. Without one of those paths, the instance appears healthy but cannot reach SageMaker APIs or S3.

  • Security group must allow HTTPS outbound to VPC endpoints

    The security group attached via security_groups must allow outbound HTTPS (port 443) to the security group or CIDR range of your interface endpoints and to the S3 prefix list used by the gateway endpoint. A restrictive egress rule that blocks this traffic silently breaks notebook kernel startup and package installation.

  • Lifecycle configurations that fetch external packages fail silently

    Lifecycle configs that run pip install from PyPI or conda install from public channels will fail once direct internet access is off. Route outbound through a NAT gateway or point to a private package mirror; otherwise the lifecycle_config_name reference deploys cleanly but the setup scripts fail or time out at instance start, and the cause is easy to miss unless you check the lifecycle configuration logs in CloudWatch.
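
The security group pitfall above can be sketched in Terraform as follows. It is a standalone fragment under stated assumptions: the variables (vpc_id, endpoint_security_group_id, s3_prefix_list_id) are hypothetical placeholders, the endpoints are assumed to exist already, and the aws_vpc_security_group_egress_rule and aws_vpc_security_group_ingress_rule resources require a recent AWS provider version.

```hcl
# Hypothetical inputs; replace with values from your own configuration.
variable "vpc_id" { type = string }
variable "endpoint_security_group_id" { type = string } # SG attached to the interface endpoints
variable "s3_prefix_list_id" { type = string }          # prefix list of the S3 gateway endpoint

# Security group for the notebook instance. Terraform removes the
# default allow-all egress rule, so only the rules below apply.
resource "aws_security_group" "notebook" {
  name_prefix = "sagemaker-notebook-"
  description = "SageMaker notebook without direct internet access"
  vpc_id      = var.vpc_id
}

# Outbound HTTPS to the interface endpoints (SageMaker API and Runtime).
resource "aws_vpc_security_group_egress_rule" "to_interface_endpoints" {
  security_group_id            = aws_security_group.notebook.id
  referenced_security_group_id = var.endpoint_security_group_id
  ip_protocol                  = "tcp"
  from_port                    = 443
  to_port                      = 443
}

# Outbound HTTPS to S3 through the gateway endpoint's prefix list.
resource "aws_vpc_security_group_egress_rule" "to_s3" {
  security_group_id = aws_security_group.notebook.id
  prefix_list_id    = var.s3_prefix_list_id
  ip_protocol       = "tcp"
  from_port         = 443
  to_port           = 443
}

# The endpoint security group must also accept HTTPS from the notebook.
resource "aws_vpc_security_group_ingress_rule" "endpoints_from_notebook" {
  security_group_id            = var.endpoint_security_group_id
  referenced_security_group_id = aws_security_group.notebook.id
  ip_protocol                  = "tcp"
  from_port                    = 443
  to_port                      = 443
}
```

Pass aws_security_group.notebook.id in the notebook's security_groups list. If lifecycle configs need public package repositories, this egress set is deliberately too narrow; NAT egress or a private mirror would need its own rule.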

Audit evidence

Auditors expect AWS Config rule evaluation results from the sagemaker-notebook-no-direct-internet-access managed rule showing all notebook instances as COMPLIANT. Supporting evidence includes the SageMaker console showing each notebook's network configuration with "Direct internet access" set to "Disabled" and a VPC and subnet assignment visible. CloudTrail CreateNotebookInstance events should show DirectInternetAccess: Disabled in the request parameters; because the setting cannot be changed after creation, the creation event is the authoritative record for each instance.

For ongoing coverage, Config conformance pack output or Security Hub findings filtered to this control, showing a sustained compliant state across the audit period, serve as evidence of continuous compliance rather than a point-in-time snapshot.
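
If the managed rule is not yet deployed, it can be enabled from Terraform along these lines. This is a minimal sketch that assumes an AWS Config configuration recorder is already running in the account and region; the Terraform resource label is an arbitrary choice.

```hcl
# Assumes an AWS Config configuration recorder already exists and is
# recording in this account and region.
resource "aws_config_config_rule" "sagemaker_notebook_no_direct_internet_access" {
  name = "sagemaker-notebook-no-direct-internet-access"

  source {
    owner             = "AWS"
    source_identifier = "SAGEMAKER_NOTEBOOK_NO_DIRECT_INTERNET_ACCESS"
  }
}
```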

Framework-specific interpretation

SOC 2: CC6.1 and CC6.6 call for logical access controls and restricted network traffic. A notebook with direct internet access bypasses your network boundary entirely. Disabling it puts ML compute under the same segmentation policy as other production workloads, which is what auditors ask to see.

PCI DSS v4.0: Requirement 1.3 restricts inbound and outbound traffic to only what is necessary. A notebook with direct internet access permits unrestricted outbound connectivity by default, which fails that test outright. Disabling direct access and routing through VPC endpoints limits traffic to explicitly authorized flows.

HIPAA Omnibus Rule 2013: SageMaker notebooks processing ePHI with unrestricted internet access create an uncontrolled egress channel for protected data. The HIPAA technical safeguards at 45 CFR 164.312(a) and 164.312(b) require access controls and audit controls; disabling direct internet access limits the network paths through which ePHI can leave the environment.

NIST SP 800-53 Rev 5: SC-7 (Boundary Protection) and AC-4 (Information Flow Enforcement) both apply here. Disabling direct internet access puts a managed interface at the notebook boundary, so traffic to and from external networks goes through VPC controls rather than an open internet path.

FedRAMP Moderate Baseline Rev 4: At the Moderate baseline, SC-7 requires federal data to traverse monitored, controlled network boundaries. Routing all SageMaker notebook traffic through VPC-based controls helps satisfy that requirement for ML workloads.

Tool mappings

Use these identifiers to cross-reference this control across tools, reports, and evidence.

  • Compliance.tf Control: sagemaker_notebook_instance_direct_internet_access_disabled

  • AWS Config Managed Rule: SAGEMAKER_NOTEBOOK_NO_DIRECT_INTERNET_ACCESS

  • Checkov Check: CKV_AWS_122

  • Powerpipe Control: aws_compliance.control.sagemaker_notebook_instance_direct_internet_access_disabled

  • Prowler Check: sagemaker_notebook_instance_without_direct_internet_access_configured

  • AWS Security Hub Control: SageMaker.1

Last reviewed: 2026-03-09