Simplify MLOps: Store SageMaker AI Project templates in S3
Teams struggle to reliably ship ML projects because templates are hard to distribute and govern.
The fix is simple and surprisingly powerful: Amazon SageMaker AI Projects can now read custom CloudFormation templates stored in Amazon S3. Engineering and governance teams can manage ModelOps templates with familiar S3 controls (versioning, lifecycle rules, cross-region replication, and bucket policies) while data scientists get self-service access from SageMaker Studio.
Executive summary
Amazon SageMaker AI Projects’ new S3-based templates remove the Service Catalog requirement for custom templates. Store CloudFormation YAML files in S3, tag them so Studio can see them, and use S3 features to version, replicate, and control access. The mlops-github-actions sample shows an enterprise-ready template that wires GitHub + GitHub Actions to SageMaker Pipelines, the Model Registry, and event-driven automation (Lambda, EventBridge). Follow a least-privilege provisioning pattern (AmazonSageMakerProjectsLaunchRole) so practitioners can self-provision without broad IAM permissions.
What you’ll learn
- What S3-based templates are and why they matter for MLOps/ModelOps
- Quick setup steps to make templates available in SageMaker Studio
- A migration checklist from Service Catalog to S3 templates
- Practical security and governance patterns, plus a small IAM example
- Limitations, gotchas, and a pilot playbook to get started
Quick definitions (plain English)
- CloudFormation — AWS’s infrastructure-as-code format (here: templates must be valid YAML).
- SageMaker Studio — the IDE/console where data scientists create and launch ML projects.
- AWS Service Catalog — an AWS service that historically distributed custom SageMaker templates (still used for AWS-built templates).
S3-based templates remove much of the administrative complexity previously required by Service Catalog, letting teams manage templates through familiar S3 controls.
Why S3-based templates matter for MLOps
Enterprises standardize ML delivery to reduce risk and accelerate value, but the traditional distribution path for templates added administrative friction: Service Catalog portfolios, product lifecycles, and extra IAM roles. Storing templates in S3 aligns ModelOps with normal DevOps flows—templates become first-class artifacts that live in storage, are versioned, and can be promoted through CI. The result: faster onboarding, auditable template histories, and simpler cross-account sharing.
How it works (core requirements)
High-level flow: put a valid CloudFormation YAML file into an S3 bucket, tag the object so SageMaker Studio can surface it, and configure your SageMaker domain to point at that bucket/prefix.
Musts (Do / Don’t / Why)
- Do: Upload CloudFormation YAML files to S3. Why: SageMaker reads CloudFormation YAML to provision projects.
- Do: Tag each template object with sagemaker:studio-visibility=true. Why: This is the visibility flag Studio uses to show custom templates.
- Do: Add the domain tag sagemaker:projectS3TemplatesLocation pointing to s3://<bucket>/<prefix>/. Why: This domain-level tag tells Studio where to surface templates.
- Do: Enable CORS on the bucket and set bucket policies for cross-account access if needed. Why: Studio needs S3 access, and cross-account sharing is a common enterprise requirement (a CORS sketch follows this list).
- Don’t: Expect built-in AWS-provided templates to move to S3. Why: AWS-managed templates remain in Service Catalog.
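If CORS is the blocker, it is a one-time bucket setting. Below is a minimal boto3 sketch, assuming a hypothetical bucket name; the wide-open AllowedOrigins value is an assumption for illustration, so tighten it to your Studio origin per the SageMaker documentation.

```python
# Minimal sketch: apply a CORS rule so SageMaker Studio can read template objects.
# Bucket name and the permissive AllowedOrigins value are placeholders / assumptions;
# restrict AllowedOrigins to your Studio URL in production.
import boto3

s3 = boto3.client("s3")
s3.put_bucket_cors(
    Bucket="my-company-sagemaker-templates",  # hypothetical bucket name
    CORSConfiguration={
        "CORSRules": [
            {
                "AllowedMethods": ["GET", "HEAD"],
                "AllowedOrigins": ["*"],  # assumption: narrow this to your Studio origin
                "AllowedHeaders": ["*"],
                "MaxAgeSeconds": 3000,
            }
        ]
    },
)
```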
Quick setup: 6 steps to get templates showing in Studio
- Create an S3 bucket or choose an existing one (example path: s3://my-company-sagemaker-templates/prod/).
- Enable S3 versioning and, optionally, cross-region replication and lifecycle rules.
- Upload your CloudFormation YAML templates to the bucket and tag each object with sagemaker:studio-visibility=true.
- Configure the SageMaker domain with the tag sagemaker:projectS3TemplatesLocation=s3://my-company-sagemaker-templates/prod/ (a boto3 sketch of these upload and tagging steps follows).
- Set bucket policies (and CORS) to allow the SageMaker domain role and any cross-account principals to read the objects.
- Open SageMaker Studio; templates in that bucket/prefix should appear in the Create Project flow.
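The upload and tagging steps are the ones teams most often script. A minimal boto3 sketch follows, assuming a hypothetical bucket, prefix, template file, and domain ID; adjust names to your environment.

```python
# Minimal sketch: upload and tag a template object, then tag the SageMaker domain.
# Bucket, prefix, file name, and domain ID are hypothetical placeholders.
import boto3

BUCKET = "my-company-sagemaker-templates"
PREFIX = "prod"
DOMAIN_ID = "d-xxxxxxxxxxxx"  # your SageMaker domain ID
KEY = f"{PREFIX}/template.yaml"

s3 = boto3.client("s3")
sm = boto3.client("sagemaker")

# Upload the CloudFormation YAML and mark it visible to Studio.
s3.upload_file(Filename="template.yaml", Bucket=BUCKET, Key=KEY)
s3.put_object_tagging(
    Bucket=BUCKET,
    Key=KEY,
    Tagging={"TagSet": [{"Key": "sagemaker:studio-visibility", "Value": "true"}]},
)

# Point the domain at the bucket/prefix that holds the templates.
domain_arn = sm.describe_domain(DomainId=DOMAIN_ID)["DomainArn"]
sm.add_tags(
    ResourceArn=domain_arn,
    Tags=[{
        "Key": "sagemaker:projectS3TemplatesLocation",
        "Value": f"s3://{BUCKET}/{PREFIX}/",
    }],
)
```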
Example: What a single CloudFormation template can provision
The aws-samples/sagemaker-custom-project-templates repo (see mlops-github-actions) demonstrates a production-ready template that provisions:
- A GitHub repository scaffold
- GitHub Actions CI/CD workflows
- SageMaker Pipelines for preprocessing, training, and evaluation
- SageMaker Model Registry integration and model groups
- Staging and production promotion gates (manual approval)
- Event-driven automation with Amazon EventBridge and Lambda
- Secrets stored in AWS Secrets Manager for credentials
A single CloudFormation template can provision an end-to-end, governed CI/CD workflow that ties your Git repo to SageMaker Pipelines and the Model Registry.
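To make the event-driven piece concrete, the sketch below shows one common pattern with boto3: an EventBridge rule that fires when a model package is approved and invokes a deployment Lambda. It is an illustration of the pattern, not the sample repository's exact wiring, and the model package group name and Lambda ARN are placeholders.

```python
# Illustrative sketch (not the sample repo's exact wiring): an EventBridge rule that
# fires when a model package in a given group is approved, invoking a deployment Lambda.
import json
import boto3

events = boto3.client("events")
rule_name = "mlops-model-approved"

events.put_rule(
    Name=rule_name,
    EventPattern=json.dumps({
        "source": ["aws.sagemaker"],
        "detail-type": ["SageMaker Model Package State Change"],
        "detail": {
            "ModelPackageGroupName": ["my-project-models"],   # placeholder model group
            "ModelApprovalStatus": ["Approved"],
        },
    }),
    State="ENABLED",
)

events.put_targets(
    Rule=rule_name,
    Targets=[{
        "Id": "deploy-on-approval",
        "Arn": "arn:aws:lambda:us-east-1:123456789012:function:deploy-staging",  # placeholder
    }],
)
```

EventBridge also needs permission to invoke the Lambda (a resource-based policy on the function), which a template would normally declare alongside the rule.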
Security & IAM: the assume-role pattern
Avoid granting broad IAM permissions to individual practitioners. Follow the least-privilege provisioning pattern:
- Create a dedicated provisioning role (recommended name: AmazonSageMakerProjectsLaunchRole). This role has the permissions needed to create the infrastructure defined by the template.
- Grant data scientists only the right to assume that role when creating a project.
- Keep day-to-day operational permissions for the team minimal; provisioning is mediated through the launch role.
Sample assume-role trust statement (minimal, illustrative):
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": { "AWS": "arn:aws:iam::123456789012:role/SageMakerDomainUserRole" },
"Action": "sts:AssumeRole",
"Condition": {}
}
]
}
Grant the practitioner principal only the sts:AssumeRole permission on that role. The provisioning role should itself have narrowly scoped permissions to create the specific resources your template needs.
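One way to wire the practitioner side is an inline policy that grants nothing but sts:AssumeRole on the launch role. The boto3 sketch below assumes the role names used in the trust statement above; treat it as illustrative rather than a complete IAM setup.

```python
# Minimal sketch: grant a practitioner role only sts:AssumeRole on the launch role.
# Role names and account ID are placeholders; add Condition keys to scope further.
import json
import boto3

iam = boto3.client("iam")

iam.put_role_policy(
    RoleName="SageMakerDomainUserRole",  # the practitioners' Studio execution role
    PolicyName="AllowAssumeProjectsLaunchRole",
    PolicyDocument=json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": "arn:aws:iam::123456789012:role/AmazonSageMakerProjectsLaunchRole",
        }],
    }),
)
```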
Migration checklist: Service Catalog → S3-based templates
- Inventory current Service Catalog templates and note Service Catalog–specific roles or references.
- Modify templates to remove Service Catalog artifacts and replace with SageMaker-friendly roles and parameters.
- Enable S3 versioning on the target bucket and set lifecycle rules (suggestion: retain the last 30 versions, archive older ones; see the sketch after this checklist).
- Upload templates to S3 and tag them with sagemaker:studio-visibility=true.
- Set the SageMaker domain tag sagemaker:projectS3TemplatesLocation to the S3 prefix.
- Test with a single pilot team: launch a project from Studio, verify the resources provisioned, and validate the assume-role flow.
- Document cleanup steps for repos, pipelines, model groups, and endpoints—some resources may require manual deletion.
- Retire the old Service Catalog products (only after pilot validation and stakeholder approval).
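For the versioning and lifecycle item above, the following boto3 sketch enables versioning and keeps the newest 30 noncurrent versions in S3 Standard while transitioning older ones to Glacier. The bucket name, prefix, and retention values are suggestions, not requirements.

```python
# Minimal sketch: enable versioning, keep the newest 30 noncurrent template versions,
# and transition older noncurrent versions to Glacier after 30 days.
import boto3

s3 = boto3.client("s3")
BUCKET = "my-company-sagemaker-templates"  # placeholder

s3.put_bucket_versioning(
    Bucket=BUCKET,
    VersioningConfiguration={"Status": "Enabled"},
)

s3.put_bucket_lifecycle_configuration(
    Bucket=BUCKET,
    LifecycleConfiguration={
        "Rules": [{
            "ID": "retain-30-template-versions",
            "Status": "Enabled",
            "Filter": {"Prefix": "prod/"},
            "NoncurrentVersionTransitions": [{
                "NoncurrentDays": 30,
                "StorageClass": "GLACIER",
                "NewerNoncurrentVersions": 30,  # keep the 30 newest noncurrent versions in place
            }],
        }],
    },
)
```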
Governance, CI and operational best practices
- Keep source in Git: Maintain template source files in a repo. Use PRs and CI to validate templates before publishing to S3.
- Publish with automation: CI should run CloudFormation lints and then upload artifacts to S3 and apply the visibility tag.
- Audit and alerts: Enable CloudTrail for S3, set S3 event notifications for object creates/changes, and trigger a Lambda that validates tags and notifies Slack/Teams on unexpected changes (a handler sketch follows this list).
- Cross-account access: Use bucket policies and IAM roles for controlled sharing. Consider object lock or MFA delete for sensitive templates.
- Template lifecycle: Use versioning and lifecycle rules; adopt a deprecation process (e.g., create a deprecation tag and timeline before removal).
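The audit-and-alerts item above can be a small Lambda subscribed to the bucket's object-created notifications. The handler sketch below checks the visibility tag and publishes to an SNS topic (standing in for Slack/Teams); the topic ARN and bucket layout are placeholders.

```python
# Illustrative Lambda handler: on an S3 object-created event, check the visibility tag
# and publish an alert if it is missing. SNS stands in for Slack/Teams notification.
import boto3

s3 = boto3.client("s3")
sns = boto3.client("sns")
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:template-audit-alerts"  # placeholder

def handler(event, context):
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        tags = s3.get_object_tagging(Bucket=bucket, Key=key)["TagSet"]
        tag_map = {t["Key"]: t["Value"] for t in tags}
        if tag_map.get("sagemaker:studio-visibility") != "true":
            sns.publish(
                TopicArn=TOPIC_ARN,
                Subject="SageMaker template missing visibility tag",
                Message=f"s3://{bucket}/{key} was uploaded without sagemaker:studio-visibility=true",
            )
```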
Limitations & gotchas
- AWS-built templates: Built-in AWS ModelOps templates still live in Service Catalog and won’t be available in S3.
- Tagging discipline: Missing or incorrect tags prevent templates from being visible in Studio—automate tag checks in CI.
- CORS and bucket policies: Forgetting CORS or misconfiguring bucket policies is a common blocker for Studio access.
- Cleanup and cost: Templates can create persistent resources (endpoints, model registry entries). Track cost and include teardown steps in the template’s README or pipeline.
- Template drift: Central templates can evolve; consumers must have a clear upgrade path to avoid environment drift.
Troubleshooting quick hits
- Templates not visible in Studio? Check the object tag sagemaker:studio-visibility=true and the domain tag sagemaker:projectS3TemplatesLocation (a diagnostic sketch follows this list).
- Access denied when launching a project? Verify the bucket policy, CORS, and that the SageMaker domain role can read the object.
- Resources still billing after delete? Inspect pipelines, endpoints, and model groups—delete those manually or add a cleanup Lambda.
- Template updates not picked up? Ensure you upload new object versions to S3; Studio will surface templates from the configured prefix.
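For the first item, a quick boto3 check confirms both tags without clicking through the console. The bucket, key, and domain ID below are placeholders.

```python
# Diagnostic sketch: confirm the object visibility tag and the domain templates-location tag.
import boto3

s3 = boto3.client("s3")
sm = boto3.client("sagemaker")

obj_tags = s3.get_object_tagging(
    Bucket="my-company-sagemaker-templates", Key="prod/template.yaml"
)["TagSet"]
print("object tags:", obj_tags)  # expect sagemaker:studio-visibility=true

domain_arn = sm.describe_domain(DomainId="d-xxxxxxxxxxxx")["DomainArn"]
domain_tags = sm.list_tags(ResourceArn=domain_arn)["Tags"]
print("domain tags:", domain_tags)  # expect sagemaker:projectS3TemplatesLocation=s3://.../prod/
```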
Pilot playbook (who, timeline, metrics)
- Who: One infra owner, one governance lead, two ML engineers, one data scientist team.
- Timeline: 2–4 weeks (week 1: provisioning role and S3 setup; week 2: CI and publishing; week 3: pilot launches; week 4: iterate).
- Success metrics: time-to-first-project (target: reduce by 50%), percent of projects launched from approved templates, incidents related to environment misconfiguration.
Costs and cleanup reminder
Templates themselves are free, but resources provisioned by templates (endpoints, instances, storage) incur AWS charges. Include teardown automation in your templates or document manual cleanup steps. Monitor costs and include resource tagging for chargeback visibility.
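A teardown helper can live in the template's repo or a cleanup pipeline. The sketch below assumes project resources carry the sagemaker:project-name tag (which SageMaker Projects typically applies) and deletes matching real-time endpoints; adapt the tag key and resource types to whatever your template actually creates.

```python
# Illustrative teardown helper: delete real-time endpoints tagged with a given project name.
# The tag key/value are assumptions; endpoint configs and models would need similar handling.
import boto3

sm = boto3.client("sagemaker")
PROJECT_TAG_KEY = "sagemaker:project-name"
PROJECT_NAME = "my-pilot-project"  # placeholder

paginator = sm.get_paginator("list_endpoints")
for page in paginator.paginate():
    for ep in page["Endpoints"]:
        tags = sm.list_tags(ResourceArn=ep["EndpointArn"])["Tags"]
        if any(t["Key"] == PROJECT_TAG_KEY and t["Value"] == PROJECT_NAME for t in tags):
            print("deleting", ep["EndpointName"])
            sm.delete_endpoint(EndpointName=ep["EndpointName"])
```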
Where to look next
Inspect the example repository to see the exact CloudFormation wiring and CI approach: aws-samples/sagemaker-custom-project-templates (mlops-github-actions). The repo contains template.yaml, CI examples, and README instructions to run the sample end-to-end.
Administrators define and govern templates; ML engineers and data scientists consume approved templates via self-service in SageMaker Studio.
Authors of the guidance and examples: Christian Kamwangala, Sandeep Raveesh, and Paolo Di Francesco (AWS solution architects).
Key takeaways
- S3-based templates make ModelOps templates easier to version, audit, and share across accounts.
- Lean provisioning through a dedicated AmazonSageMakerProjectsLaunchRole enforces least privilege while enabling self-service.
- Automate the template lifecycle: keep templates in Git, validate via CI, and publish to S3 with controlled pipelines and alerts.
Ready to pilot? Pick one template, move it into a repo with CI validation, publish it to an S3 prefix with versioning turned on, and run a single-team pilot using the migration checklist above.