Amazon SageMaker is AWS’s fully managed machine learning service. In a typical ML workflow on AWS, ML engineers and data scientists use SageMaker Notebook Instances to write code for data processing, model training, and deployment.
Visual Studio Code is a popular source code editor known for its developer tooling and large extension ecosystem.
This post covers two methods for setting up VS Code on SageMaker Notebook Instances. The first method involves manually creating a notebook instance and using SageMaker lifecycle configuration scripts to automate the installation and setup of code-server with Jupyter. The second method uses a CloudFormation template to fully automate the setup process.
Manual Setup
Lifecycle Configuration Scripts
We’ll use two Lifecycle Configuration scripts to install and set up code-server.
Create Notebook: The install-codeserver.sh script installs code-server. This script runs only during the initial setup when creating a new notebook instance. It does not run on existing notebook instances.
Start Notebook: The setup-codeserver.sh script configures code-server on SageMaker. It runs each time the associated notebook instance starts, including during its initial creation. If the instance is already running, the script will execute the next time the notebook is stopped and restarted.
These scripts were created by AWS Solutions Architects, and the GitHub repository can be found here.
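The two scripts can also be registered as a lifecycle configuration programmatically rather than through the console. The following is a minimal sketch, assuming the script bodies are available as strings; the configuration name and the placeholder script bodies are hypothetical, and the helper only builds the request so it can be inspected before anything is sent to AWS.

```python
import base64


def encode_script(script_text: str) -> str:
    """Base64-encode a shell script for the lifecycle hook `Content` field."""
    return base64.b64encode(script_text.encode("utf-8")).decode("ascii")


def build_lifecycle_config_request(name: str, on_create: str, on_start: str) -> dict:
    """Build keyword arguments for SageMaker's CreateNotebookInstanceLifecycleConfig."""
    return {
        "NotebookInstanceLifecycleConfigName": name,
        # Runs once, when the notebook instance is first created.
        "OnCreate": [{"Content": encode_script(on_create)}],
        # Runs on every start, including the first one.
        "OnStart": [{"Content": encode_script(on_start)}],
    }


if __name__ == "__main__":
    # Placeholder bodies; in practice these would be the contents of
    # install-codeserver.sh and setup-codeserver.sh read from disk.
    request = build_lifecycle_config_request(
        "codeserver-lifecycle-config",  # hypothetical name
        on_create="#!/bin/bash\necho 'install step'\n",
        on_start="#!/bin/bash\necho 'setup step'\n",
    )
    print(sorted(request))
    # With boto3 installed and AWS credentials configured, the request
    # could then be sent as:
    #   import boto3
    #   boto3.client("sagemaker").create_notebook_instance_lifecycle_config(**request)
```

Keeping the request builder separate from the API call makes it easy to check the encoded content locally before creating any AWS resources.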
Associating the Lifecycle Configuration Script with a Notebook Instance
To associate the Lifecycle Configuration script with a notebook instance, update the settings during instance creation:
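The same association can be expressed through the SageMaker API. The sketch below builds the arguments for creating a new instance with a lifecycle configuration attached, and for attaching one to an existing instance. All names and the role ARN are hypothetical placeholders, and, to my understanding, an existing instance has to be stopped before it can be updated.

```python
def build_create_notebook_request(
    notebook_name: str,
    instance_type: str,
    role_arn: str,
    lifecycle_config_name: str,
    volume_size_gb: int = 30,
) -> dict:
    """Build keyword arguments for SageMaker's CreateNotebookInstance."""
    return {
        "NotebookInstanceName": notebook_name,
        "InstanceType": instance_type,
        "RoleArn": role_arn,
        # Both the on-create and on-start scripts run for a new instance.
        "LifecycleConfigName": lifecycle_config_name,
        "VolumeSizeInGB": volume_size_gb,
    }


def build_attach_request(notebook_name: str, lifecycle_config_name: str) -> dict:
    """Build keyword arguments for SageMaker's UpdateNotebookInstance.

    For an existing instance only the on-start script will run, the next
    time the instance is started.
    """
    return {
        "NotebookInstanceName": notebook_name,
        "LifecycleConfigName": lifecycle_config_name,
    }


if __name__ == "__main__":
    create_request = build_create_notebook_request(
        "vscode-notebook",  # hypothetical instance name
        "ml.t3.medium",
        "arn:aws:iam::123456789012:role/ExampleSageMakerRole",  # placeholder ARN
        "codeserver-lifecycle-config",  # hypothetical config name
    )
    print(create_request["LifecycleConfigName"])
    # With boto3 installed and credentials configured:
    #   import boto3
    #   boto3.client("sagemaker").create_notebook_instance(**create_request)
```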
Once the notebook instance launches, we should see the option to open code-server in the browser:
When opened, VS Code will appear in a new tab:
CloudFormation Setup
The CloudFormation setup provides an automated way to deploy a SageMaker notebook instance within a properly configured VPC.
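One part of a properly configured VPC is that the public subnet's CIDR range falls inside the VPC's range. The template takes both ranges as parameters (`VpcCIDR` and `PublicSubnetCIDR`), so custom values can be checked before deployment. A minimal sketch using Python's standard `ipaddress` module; the helper name is illustrative and not part of the template:

```python
import ipaddress


def subnet_fits_vpc(vpc_cidr: str, subnet_cidr: str) -> bool:
    """Return True when subnet_cidr lies entirely inside vpc_cidr."""
    vpc = ipaddress.ip_network(vpc_cidr)
    subnet = ipaddress.ip_network(subnet_cidr)
    return subnet.subnet_of(vpc)


if __name__ == "__main__":
    # Default values of the VpcCIDR and PublicSubnetCIDR parameters.
    print(subnet_fits_vpc("10.0.0.0/16", "10.0.0.0/24"))  # True
    # A subnet range outside the VPC range does not fit.
    print(subnet_fits_vpc("10.0.0.0/16", "10.1.0.0/24"))  # False
```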
AWSTemplateFormatVersion: '2010-09-09'
Description: CloudFormation template to create a VPC with a single public
  subnet, and SageMaker notebook instance with a GitHub repository cloned into
  it.
Parameters:
  VpcCIDR:
    Description: Please enter the IP range (CIDR notation) for this VPC
    Type: String
    Default: 10.0.0.0/16
  PublicSubnetCIDR:
    Description: Please enter the IP range (CIDR notation) for the public subnet
    Type: String
    Default: 10.0.0.0/24
  SageMakerInstanceType:
    Description: The instance type of SageMaker notebook to be provisioned.
    Type: String
    Default: ml.t3.medium
    AllowedValues:
      - ml.t3.medium
      - ml.t3.large
      - ml.t3.xlarge
      - ml.t3.2xlarge
  VolumeSizeInGB:
    Description: The size of the EBS volume, in gigabytes, that is attached to the
      notebook instance.
    Type: Number
    Default: 30
    MinValue: 5
    MaxValue: 16384
  DefaultCodeRepository:
    Description: The URL or name of the Git repository to associate with the
      notebook instance as its default code repository.
    Type: String
    Default: ''
  S3BucketName:
    Description: The name of the S3 bucket to grant full access.
    Type: String
Conditions:
  SpecifiedGitHubRepo: !Not
    - !Equals
      - !Ref DefaultCodeRepository
      - ''
Resources:
  VPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: !Ref VpcCIDR
      EnableDnsSupport: true
      EnableDnsHostnames: true
      Tags:
        - Key: Name
          Value: !Sub ${AWS::StackName}-vpc
  PublicSubnet:
    Type: AWS::EC2::Subnet
    Properties:
      VpcId: !Ref VPC
      CidrBlock: !Ref PublicSubnetCIDR
      MapPublicIpOnLaunch: true
      Tags:
        - Key: Name
          Value: !Sub ${AWS::StackName}-public-subnet
  InternetGateway:
    Type: AWS::EC2::InternetGateway
    Properties:
      Tags:
        - Key: Name
          Value: !Sub ${AWS::StackName}-igw
  AttachGateway:
    Type: AWS::EC2::VPCGatewayAttachment
    Properties:
      VpcId: !Ref VPC
      InternetGatewayId: !Ref InternetGateway
  PublicRouteTable:
    Type: AWS::EC2::RouteTable
    Properties:
      VpcId: !Ref VPC
      Tags:
        - Key: Name
          Value: !Sub ${AWS::StackName}-public-routetable
  PublicRoute:
    Type: AWS::EC2::Route
    # The route can only be created once the gateway is attached to the VPC.
    DependsOn: AttachGateway
    Properties:
      RouteTableId: !Ref PublicRouteTable
      DestinationCidrBlock: 0.0.0.0/0
      GatewayId: !Ref InternetGateway
  PublicSubnetRouteTableAssociation:
    Type: AWS::EC2::SubnetRouteTableAssociation
    Properties:
      SubnetId: !Ref PublicSubnet
      RouteTableId: !Ref PublicRouteTable
  NotebookExecutionRole:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Sub ${AWS::StackName}-sagemaker-execution-role
      AssumeRolePolicyDocument:
        Version: '2012-10-17'
        Statement:
          - Effect: Allow
            Principal:
              Service:
                - sagemaker.amazonaws.com
            Action:
              - sts:AssumeRole
      Policies:
        - PolicyName: S3BucketAccessPolicy
          PolicyDocument:
            Version: '2012-10-17'
            Statement:
              - Effect: Allow
                Action: s3:*
                Resource:
                  - !Sub arn:aws:s3:::${S3BucketName}
                  - !Sub arn:aws:s3:::${S3BucketName}/*
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/AmazonSageMakerFullAccess
        - arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryFullAccess
  NotebookLifecycleConfig:
    Type: AWS::SageMaker::NotebookInstanceLifecycleConfig
    Properties:
      NotebookInstanceLifecycleConfigName: !Sub ${AWS::StackName}-lifecycle-config
      OnCreate:
- Content: IyEvYmluL2Jhc2gKc2V0IC1ldXgKCkNPREVfU0VSVkVSX1ZFUlNJT049IjQuMTYuMSIKQ09ERV9TRVJWRVJfSU5TVEFMTF9MT0M9Ii9ob21lL2VjMi11c2VyL1NhZ2VNYWtlci8uY3MiClhER19EQVRBX0hPTUU9Ii9ob21lL2VjMi11c2VyL1NhZ2VNYWtlci8ueGRnL2RhdGEiClhER19DT05GSUdfSE9NRT0iL2hvbWUvZWMyLXVzZXIvU2FnZU1ha2VyLy54ZGcvY29uZmlnIgpJTlNUQUxMX1BZVEhPTl9FWFRFTlNJT049MQpDUkVBVEVfTkVXX0NPTkRBX0VOVj0xCkNPTkRBX0VOVl9MT0NBVElPTj0nL2hvbWUvZWMyLXVzZXIvU2FnZU1ha2VyLy5jcy9jb25kYS9lbnZzL2NvZGVzZXJ2ZXJfcHkzOScKQ09OREFfRU5WX1BZVEhPTl9WRVJTSU9OPSIzLjkiCklOU1RBTExfRE9DS0VSX0VYVEVOU0lPTj0xClVTRV9DVVNUT01fRVhURU5TSU9OX0dBTExFUlk9MAoKc3VkbyAtdSBlYzItdXNlciAtaSA8PEVPRgoKdW5zZXQgU1VET19VSUQKCiMjIyMjIyMjIyMjIyMKIyAgSU5TVEFMTCAgIwojIyMjIyMjIyMjIyMjCgojIHNldCB0aGUgZGF0YSBhbmQgY29uZmlnIGhvbWUgZW52IHZhcmlhYmxlIGZvciBjb2RlLXNlcnZlcgpleHBvcnQgWERHX0RBVEFfSE9NRT0kWERHX0RBVEFfSE9NRQpleHBvcnQgWERHX0NPTkZJR19IT01FPSRYREdfQ09ORklHX0hPTUUKZXhwb3J0IFBBVEg9IiRDT0RFX1NFUlZFUl9JTlNUQUxMX0xPQy9iaW4vOiRQQVRIIgoKIyBpbnN0YWxsIGNvZGUtc2VydmVyIHN0YW5kYWxvbmUKbWtkaXIgLXAgJHtDT0RFX1NFUlZFUl9JTlNUQUxMX0xPQ30vbGliICR7Q09ERV9TRVJWRVJfSU5TVEFMTF9MT0N9L2JpbgpjdXJsIC1mTCBodHRwczovL2dpdGh1Yi5jb20vY29kZXIvY29kZS1zZXJ2ZXIvcmVsZWFzZXMvZG93bmxvYWQvdiRDT0RFX1NFUlZFUl9WRVJTSU9OL2NvZGUtc2VydmVyLSRDT0RFX1NFUlZFUl9WRVJTSU9OLWxpbnV4LWFtZDY0LnRhci5neiBcCnwgdGFyIC1DICR7Q09ERV9TRVJWRVJfSU5TVEFMTF9MT0N9L2xpYiAteHoKbXYgJHtDT0RFX1NFUlZFUl9JTlNUQUxMX0xPQ30vbGliL2NvZGUtc2VydmVyLSRDT0RFX1NFUlZFUl9WRVJTSU9OLWxpbnV4LWFtZDY0ICR7Q09ERV9TRVJWRVJfSU5TVEFMTF9MT0N9L2xpYi9jb2RlLXNlcnZlci0kQ09ERV9TRVJWRVJfVkVSU0lPTgpsbiAtcyAke0NPREVfU0VSVkVSX0lOU1RBTExfTE9DfS9saWIvY29kZS1zZXJ2ZXItJENPREVfU0VSVkVSX1ZFUlNJT04vYmluL2NvZGUtc2VydmVyICR7Q09ERV9TRVJWRVJfSU5TVEFMTF9MT0N9L2Jpbi9jb2RlLXNlcnZlcgoKIyBjcmVhdGUgc2VwYXJhdGUgY29uZGEgZW52aXJvbm1lbnQKaWYgWyAkQ1JFQVRFX05FV19DT05EQV9FTlYgLWVxIDEgXQp0aGVuCiAgICBjb25kYSBjcmVhdGUgLS1wcmVmaXggJENPTkRBX0VOVl9MT0NBVElPTiBweXRob249JENPTkRBX0VOVl9QWVRIT05fVkVSU0lPTiAteQpmaQoKIyBpbnN0YWxsIG1zLXB5dGhvbiBleHRlbnNpb24KaWYgWyAkVVNFX0NVU1RPTV9FWFRFTlNJT05fR0FMTEVSWSAtZ
XEgMCAtYSAkSU5TVEFMTF9QWVRIT05fRVhURU5TSU9OIC1lcSAxIF0KdGhlbgogICAgY29kZS1zZXJ2ZXIgLS1pbnN0YWxsLWV4dGVuc2lvbiBtcy1weXRob24ucHl0aG9uIC0tZm9yY2UKCiAgICAjIGlmIHRoZSBuZXcgY29uZGEgZW52IHdhcyBjcmVhdGVkLCBhZGQgY29uZmlndXJhdGlvbiB0byBzZXQgYXMgZGVmYXVsdAogICAgaWYgWyAkQ1JFQVRFX05FV19DT05EQV9FTlYgLWVxIDEgXQogICAgdGhlbgogICAgICAgIENPREVfU0VSVkVSX01BQ0hJTkVfU0VUVElOR1NfRklMRT0iJFhER19EQVRBX0hPTUUvY29kZS1zZXJ2ZXIvTWFjaGluZS9zZXR0aW5ncy5qc29uIgogICAgICAgIGlmIGdyZXAgLXEgInB5dGhvbi5kZWZhdWx0SW50ZXJwcmV0ZXJQYXRoIiAiXCRDT0RFX1NFUlZFUl9NQUNISU5FX1NFVFRJTkdTX0ZJTEUiCiAgICAgICAgdGhlbgogICAgICAgICAgICBlY2hvICJEZWZhdWx0IGludGVyZXByZXRlciBwYXRoIGlzIGFscmVhZHkgc2V0LiIKICAgICAgICBlbHNlCiAgICAgICAgICAgIGNhdCA+PlwkQ09ERV9TRVJWRVJfTUFDSElORV9TRVRUSU5HU19GSUxFIDw8LSBNQUNISU5FU0VUVElOR1MKewogICAgInB5dGhvbi5kZWZhdWx0SW50ZXJwcmV0ZXJQYXRoIjogIiRDT05EQV9FTlZfTE9DQVRJT04vYmluIgp9Ck1BQ0hJTkVTRVRUSU5HUwogICAgICAgIGZpCiAgICBmaQpmaQoKIyBpbnN0YWxsIGRvY2tlciBleHRlbnNpb24KaWYgWyAkVVNFX0NVU1RPTV9FWFRFTlNJT05fR0FMTEVSWSAtZXEgMCAtYSAkSU5TVEFMTF9ET0NLRVJfRVhURU5TSU9OIC1lcSAxIF0KdGhlbgogICAgY29kZS1zZXJ2ZXIgLS1pbnN0YWxsLWV4dGVuc2lvbiBtcy1henVyZXRvb2xzLnZzY29kZS1kb2NrZXIgLS1mb3JjZQpmaQoKRU9G
      OnStart:
- Content: IyEvYmluL2Jhc2gKc2V0IC1ldXgKCkNPREVfU0VSVkVSX1ZFUlNJT049IjQuMTYuMSIKQ09ERV9TRVJWRVJfSU5TVEFMTF9MT0M9Ii9ob21lL2VjMi11c2VyL1NhZ2VNYWtlci8uY3MiClhER19EQVRBX0hPTUU9Ii9ob21lL2VjMi11c2VyL1NhZ2VNYWtlci8ueGRnL2RhdGEiClhER19DT05GSUdfSE9NRT0iL2hvbWUvZWMyLXVzZXIvU2FnZU1ha2VyLy54ZGcvY29uZmlnIgpDUkVBVEVfTkVXX0NPTkRBX0VOVj0xCkNPTkRBX0VOVl9MT0NBVElPTj0nL2hvbWUvZWMyLXVzZXIvU2FnZU1ha2VyLy5jcy9jb25kYS9lbnZzL2NvZGVzZXJ2ZXJfcHkzOScKVVNFX0NVU1RPTV9FWFRFTlNJT05fR0FMTEVSWT0wCkVYVEVOU0lPTl9HQUxMRVJZX0NPTkZJRz0ne3tcInNlcnZpY2VVcmxcIjpcIlwiLFwiY2FjaGVVcmxcIjpcIlwiLFwiaXRlbVVybFwiOlwiXCIsXCJjb250cm9sVXJsXCI6XCJcIixcInJlY29tbWVuZGF0aW9uc1VybFwiOlwiXCJ9fScKCkxBVU5DSEVSX0VOVFJZX1RJVExFPSdDb2RlIFNlcnZlcicKUFJPWFlfUEFUSD0nY29kZXNlcnZlcicKTEFCXzNfRVhURU5TSU9OX0RPV05MT0FEX1VSTD0naHR0cHM6Ly9naXRodWIuY29tL2F3cy1zYW1wbGVzL2FtYXpvbi1zYWdlbWFrZXItY29kZXNlcnZlci9yZWxlYXNlcy9kb3dubG9hZC92MC4yLjAvc2FnZW1ha2VyLWpwcm94eS1sYXVuY2hlci1leHQtMC4yLjAudGFyLmd6JwoKZXhwb3J0IFhER19EQVRBX0hPTUU9JFhER19EQVRBX0hPTUUKZXhwb3J0IFhER19DT05GSUdfSE9NRT0kWERHX0NPTkZJR19IT01FCmV4cG9ydCBQQVRIPSIke0NPREVfU0VSVkVSX0lOU1RBTExfTE9DfS9iaW4vOiRQQVRIIgoKRVhUX0dBTExFUllfSlNPTj0nJwppZiBbICRVU0VfQ1VTVE9NX0VYVEVOU0lPTl9HQUxMRVJZIC1lcSAxIF0KdGhlbgogICAgRVhUX0dBTExFUllfSlNPTj0iJ0VYVEVOU0lPTlNfR0FMTEVSWSc6ICckRVhURU5TSU9OX0dBTExFUllfQ09ORklHJyIKZmkKCkpVUFlURVJfQ09ORklHX0ZJTEU9Ii9ob21lL2VjMi11c2VyLy5qdXB5dGVyL2p1cHl0ZXJfbm90ZWJvb2tfY29uZmlnLnB5IgppZiBncmVwIC1xICIkQ09ERV9TRVJWRVJfSU5TVEFMTF9MT0MvYmluIiAiJEpVUFlURVJfQ09ORklHX0ZJTEUiCnRoZW4KICAgIGVjaG8gIlNlcnZlci1wcm94eSBjb25maWd1cmF0aW9uIGFscmVhZHkgc2V0IGluIEp1cHl0ZXIgbm90ZWJvb2sgY29uZmlnLiIKZWxzZQogICAgY2F0ID4+L2hvbWUvZWMyLXVzZXIvLmp1cHl0ZXIvanVweXRlcl9ub3RlYm9va19jb25maWcucHkgPDxFT0MKYy5TZXJ2ZXJQcm94eS5zZXJ2ZXJzID0gewogICckUFJPWFlfUEFUSCc6IHsKICAgICAgJ2xhdW5jaGVyX2VudHJ5JzogewogICAgICAgICAgICAnZW5hYmxlZCc6IFRydWUsCiAgICAgICAgICAgICd0aXRsZSc6ICckTEFVTkNIRVJfRU5UUllfVElUTEUnLAogICAgICAgICAgICAnaWNvbl9wYXRoJzogJ2NvZGVzZXJ2ZXIuc3ZnJwogICAgICB9LAogICAgICAnY29tbWFuZCc6IFsnJENPREVfU0VSVkVSX
0lOU1RBTExfTE9DL2Jpbi9jb2RlLXNlcnZlcicsICctLWF1dGgnLCAnbm9uZScsICctLWRpc2FibGUtdGVsZW1ldHJ5JywgJy0tYmluZC1hZGRyJywgJzEyNy4wLjAuMTp7cG9ydH0nXSwKICAgICAgJ2Vudmlyb25tZW50JyA6IHsKICAgICAgICAgICAgICAgICAgICAgICAgJ1hER19EQVRBX0hPTUUnIDogJyRYREdfREFUQV9IT01FJywgCiAgICAgICAgICAgICAgICAgICAgICAgICdYREdfQ09ORklHX0hPTUUnOiAnJFhER19DT05GSUdfSE9NRScsCiAgICAgICAgICAgICAgICAgICAgICAgICdTSEVMTCc6ICcvYmluL2Jhc2gnLAogICAgICAgICAgICAgICAgICAgICAgICAkRVhUX0dBTExFUllfSlNPTgogICAgICAgICAgICAgICAgICAgICAgfSwKICAgICAgJ2Fic29sdXRlX3VybCc6IEZhbHNlLAogICAgICAndGltZW91dCc6IDMwCiAgfQp9CkVPQwpmaQoKSlVQWVRFUl9MQUJfVkVSU0lPTj0kKC9ob21lL2VjMi11c2VyL2FuYWNvbmRhMy9lbnZzL0p1cHl0ZXJTeXN0ZW1FbnYvYmluL2p1cHl0ZXItbGFiIC0tdmVyc2lvbikKCnN1ZG8gLXUgZWMyLXVzZXIgLWkgPDxFT0YKCmlmIFsgJENSRUFURV9ORVdfQ09OREFfRU5WIC1lcSAxIF0KdGhlbgogICAgY29uZGEgY29uZmlnIC0tYWRkIGVudnNfZGlycyAiJHtDT05EQV9FTlZfTE9DQVRJT04lLyp9IgpmaQoKaWYgW1sgJEpVUFlURVJfTEFCX1ZFUlNJT04gPT0gMSogXV0KdGhlbgogICAgc291cmNlIC9ob21lL2VjMi11c2VyL2FuYWNvbmRhMy9iaW4vYWN0aXZhdGUgSnVweXRlclN5c3RlbUVudgogICAgZWNobyAiSW5zdGFsbGluZyBqdXB5dGVyLXNlcnZlci1wcm94eS4iCiAgICBwaXAgaW5zdGFsbCBqdXB5dGVyLXNlcnZlci1wcm94eT09MS42LjAKICAgIGNvbmRhIGRlYWN0aXZhdGUKCiAgICBlY2hvICJKdXB5dGVyTGFiIGV4dGVuc2lvbiBmb3IgSnVweXRlckxhYiAxIGlzIG5vdCBzdXBwb3J0ZWQuIFlvdSBjYW4gc3RpbGwgYWNjZXNzIGNvZGUtc2VydmVyIGJ5IHR5cGluZyB0aGUgY29kZS1zZXJ2ZXIgVVJMIGluIHRoZSBicm93c2VyIGFkZHJlc3MgYmFyLiIKZWxzZQogICAgc291cmNlIC9ob21lL2VjMi11c2VyL2FuYWNvbmRhMy9iaW4vYWN0aXZhdGUgSnVweXRlclN5c3RlbUVudgoKICAgIG1rZGlyIC1wICRDT0RFX1NFUlZFUl9JTlNUQUxMX0xPQy9sYWJfZXh0CiAgICBjdXJsIC1MICRMQUJfM19FWFRFTlNJT05fRE9XTkxPQURfVVJMID4gJENPREVfU0VSVkVSX0lOU1RBTExfTE9DL2xhYl9leHQvc2FnZW1ha2VyLWpwcm94eS1sYXVuY2hlci1leHQudGFyLmd6CiAgICBwaXAgaW5zdGFsbCAkQ09ERV9TRVJWRVJfSU5TVEFMTF9MT0MvbGFiX2V4dC9zYWdlbWFrZXItanByb3h5LWxhdW5jaGVyLWV4dC50YXIuZ3oKCiAgICBqdXB5dGVyIGxhYmV4dGVuc2lvbiBkaXNhYmxlIGp1cHl0ZXJsYWItc2VydmVyLXByb3h5CgogICAgY29uZGEgZGVhY3RpdmF0ZQpmaQpFT0YKCmlmIFtbIC1mIC9ob21lL2VjMi11c2VyL2Jpbi9kb2NrZXJkLXJvb3RsZXNzLnNoIF1dOyB0aGVuCgllY
2hvICJSdW5uaW5nIGluIHJvb3RsZXNzIG1vZGU7IHBsZWFzZSByZXN0YXJ0IEp1cHl0ZXIgU2VydmVyIGZyb20gdGhlICdGaWxlJyA+ICdTaHV0IERvd24nIG1lbnUgYW5kIHJlLW9wZW4gSnVweXRlci9KdXB5dGVyTGFiLiIKZWxzZQoJZWNobyAiUm9vdCBtb2RlLiBSZXN0YXJ0aW5nIEp1cHl0ZXIgU2VydmVyLi4uIgogICAgc3VkbyBzeXN0ZW1jdGwgcmVzdGFydCBqdXB5dGVyLXNlcnZlcgpmaQo=
  NotebookSecurityGroup:
    Type: AWS::EC2::SecurityGroup
    Properties:
      GroupDescription: Security group for SageMaker Notebook instance allowing
        all outbound traffic and no inbound traffic
      VpcId: !Ref VPC
  NotebookInstance:
    Type: AWS::SageMaker::NotebookInstance
    Properties:
      NotebookInstanceName: !Sub ${AWS::StackName}-notebook
      InstanceType: !Ref SageMakerInstanceType
      RoleArn: !GetAtt NotebookExecutionRole.Arn
      SubnetId: !Ref PublicSubnet
      LifecycleConfigName: !GetAtt NotebookLifecycleConfig.NotebookInstanceLifecycleConfigName
      DirectInternetAccess: Enabled
      VolumeSizeInGB: !Ref VolumeSizeInGB
      DefaultCodeRepository: !If
        - SpecifiedGitHubRepo
        - !Ref DefaultCodeRepository
        - !Ref AWS::NoValue
      SecurityGroupIds:
        - !Ref NotebookSecurityGroup
CloudFormation Template Breakdown
The provided template creates a VPC with a single public subnet, along with a SageMaker notebook instance that can optionally clone a GitHub repository into its environment.
Component | Description |
---|---|
VPC and Subnet Configuration | A public subnet is created within the VPC, with an attached Internet Gateway to allow outbound internet access. A route table and its associated route are configured so that the subnet has the necessary internet connectivity. |
SageMaker Notebook Instance | A SageMaker notebook instance is created with the specified instance type, volume size, and an optional, pre-existing code repository. The instance is placed in the public subnet, so it has internet access for downloading dependencies, interacting with S3, and other tasks. Lifecycle configuration scripts install and configure VS Code (using code-server) within the notebook instance, providing an integrated development environment. The shell scripts must be supplied as base64-encoded strings, which can be produced using Python or tools like base64encode.org. The execution role is configured with permissions for SageMaker, ECR, and a specific S3 bucket. |
Security Group Configuration | A security group is created specifically for the SageMaker notebook instance. It allows all outbound traffic, so the notebook can reach necessary external services (e.g., GitHub, S3). Inbound traffic is restricted by default. |
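The base64 step mentioned in the table can be done with Python's standard library, and the same approach works in reverse for inspecting what an embedded `Content` value does. A minimal sketch follows; the 16384-character limit reflects my understanding of the SageMaker API constraint on the encoded content and should be treated as an assumption, and the function names are illustrative.

```python
import base64

# Assumed maximum length of the encoded `Content` string.
MAX_CONTENT_LENGTH = 16384


def encode_lifecycle_script(script_text: str) -> str:
    """Encode a shell script as base64 for a lifecycle hook `Content` value."""
    encoded = base64.b64encode(script_text.encode("utf-8")).decode("ascii")
    if len(encoded) > MAX_CONTENT_LENGTH:
        raise ValueError(
            f"Encoded script is {len(encoded)} characters; "
            f"the assumed limit is {MAX_CONTENT_LENGTH}."
        )
    return encoded


def decode_lifecycle_script(content: str) -> str:
    """Decode a `Content` value back into the shell script it contains."""
    return base64.b64decode(content).decode("utf-8")


if __name__ == "__main__":
    # The first two lines of both scripts embedded in the template.
    header = "#!/bin/bash\nset -eux\n"
    encoded = encode_lifecycle_script(header)
    print(encoded)
    print(decode_lifecycle_script(encoded))
```

The encoded header matches the opening characters of both `Content` values in the template, which is a quick way to confirm that an encoded string starts with the expected script.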
Once the CloudFormation stack is deployed, the VPC, subnet, and SageMaker notebook instance are created according to the specifications in the template. The lifecycle configuration scripts run automatically: the on-create script installs code-server when the instance is first created, and the on-start script configures it each time the instance starts. The notebook instance is then ready for use with the specified GitHub repository (if one was provided) and the connected S3 bucket.
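Deployment itself can be scripted as well. Because the template creates an IAM role with a custom `RoleName`, CloudFormation requires the `CAPABILITY_NAMED_IAM` acknowledgement when the stack is created. The sketch below builds the arguments for a `create_stack` call; the stack name, bucket name, and template placeholder are hypothetical.

```python
def build_create_stack_request(stack_name: str, template_body: str, parameters: dict) -> dict:
    """Build keyword arguments for CloudFormation's CreateStack."""
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        # CloudFormation expects parameters as a list of key/value pairs.
        "Parameters": [
            {"ParameterKey": key, "ParameterValue": str(value)}
            for key, value in parameters.items()
        ],
        # Needed because the template sets an explicit RoleName on the IAM role.
        "Capabilities": ["CAPABILITY_NAMED_IAM"],
    }


if __name__ == "__main__":
    request = build_create_stack_request(
        "vscode-sagemaker",  # hypothetical stack name
        "<contents of the template file>",  # placeholder, not a real template
        {
            "S3BucketName": "my-example-bucket",  # hypothetical bucket
            "SageMakerInstanceType": "ml.t3.medium",
            "VolumeSizeInGB": 30,
        },
    )
    print(request["Parameters"])
    # With boto3 installed and credentials configured:
    #   import boto3
    #   cfn = boto3.client("cloudformation")
    #   cfn.create_stack(**request)
    #   cfn.get_waiter("stack_create_complete").wait(StackName=request["StackName"])
```

Parameters that are left out, such as `DefaultCodeRepository`, fall back to the defaults declared in the template.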