Atria - CDK Stack Refactoring Migration Report
Executive Summary
This document details the comprehensive refactoring of the Atria CDK infrastructure from a monolithic single-stack architecture to a modular nested stack architecture. The migration was necessary to overcome AWS CloudFormation's 500-resource limit while improving maintainability, deployment isolation, and scalability.
Key Achievements
| Metric | Before | After |
|---|---|---|
| Stack Architecture | Monolithic (~575 resources) | 7 Nested Stacks + Main Stack |
| Main Stack Lines | ~4,314 | 1,839 (57% reduction after legacy cleanup) |
| New Stack Files | 0 | 7 nested stacks |
| Shared Utilities | 0 | 3 modules |
| Total New TypeScript | 0 | ~3,608 lines |
| Pull Requests Merged | 1 | 8 |
| Lines Removed | - | 2,475 lines (legacy code cleanup) |
Problem Statement
The original Atria CDK infrastructure was implemented as a single monolithic stack (cdk-stack.ts) containing all resources for multiple application features. This approach led to:
Resource Limit Risk: CloudFormation stacks are limited to 500 resources, and the Atria stack was approaching this limit.
Deployment Coupling: Changes to one feature required full stack deployment, increasing risk and deployment time.
Maintainability Challenges: A single 4,250+ line file became difficult to navigate and maintain.
Team Collaboration: Multiple developers couldn't work on different features without merge conflicts.
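The resource-limit risk in the first item can be checked mechanically. The sketch below is a minimal illustration, not part of the Atria codebase: the `StackPlan` type and all per-stack counts are invented, chosen only so that they add up to the approximate 575-resource figure quoted in this report. It relies on one CloudFormation fact: a nested stack counts as a single `AWS::CloudFormation::Stack` resource in its parent and has its own resource quota.

```typescript
// CloudFormation's per-stack resource quota referenced in this report.
const MAX_RESOURCES_PER_STACK = 500;

interface StackPlan {
  name: string;
  resourceCount: number;
}

// Returns the names of all stacks whose resource count exceeds the quota.
function stacksOverLimit(plans: StackPlan[]): string[] {
  return plans
    .filter((plan) => plan.resourceCount > MAX_RESOURCES_PER_STACK)
    .map((plan) => plan.name);
}

// The monolithic stack, using the approximate figure from this report.
const monolith: StackPlan[] = [{ name: "cdk-stack", resourceCount: 575 }];

// Hypothetical split: every number below is invented for illustration.
const nestedPlans: StackPlan[] = [
  { name: "site-pictures", resourceCount: 20 },
  { name: "gref-installs", resourceCount: 35 },
  { name: "provisioning", resourceCount: 25 },
  { name: "device-health", resourceCount: 90 },
  { name: "site-survey", resourceCount: 60 },
  { name: "admin-api", resourceCount: 30 },
  { name: "installer", resourceCount: 140 },
];

// The parent keeps its own resources (hypothetical 175) plus one
// AWS::CloudFormation::Stack resource per nested stack.
const mainStack: StackPlan = {
  name: "main",
  resourceCount: 175 + nestedPlans.length,
};

const monolithViolations = stacksOverLimit(monolith);
const splitViolations = stacksOverLimit([mainStack].concat(nestedPlans));
```

With these assumed numbers the monolith is flagged while every stack in the split plan stays well under the quota.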
Solution Architecture
Nested Stack Pattern
We adopted AWS CDK's nested stack pattern, which allows CloudFormation stacks to contain other stacks as resources. This effectively increases the resource limit by distributing resources across multiple child stacks while maintaining a single deployment unit.
┌─────────────────────────────────────────────────────────────────┐
│ Main Stack (cdk-stack.ts) │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Stateful Resources │ │
│ │ • Cognito User Pool & Identity Pool │ │
│ │ • DynamoDB Tables (all with RETAIN policy) │ │
│ │ • S3 Shared Bucket │ │
│ │ • Device Dashboard Stack │ │
│ │ • Amplify App │ │
│ │ • SSM Parameters │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │Site Pictures │ │GREF Installs │ │ Provisioning │ │
│ │Nested Stack │ │Nested Stack │ │Nested Stack │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │Device Health │ │ Site Survey │ │ Admin API │ │
│ │Nested Stack │ │Nested Stack │ │Nested Stack │ │
│ └──────────────┘ └──────────────┘ └──────────────┘ │
│ │
│ ┌────────────────────────────────────────────────────┐ │
│ │ Installer Nested Stack │ │
│ │ (Largest - includes Step Functions, EventBridge) │ │
│ └────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
Key Design Decisions
Stateful vs Stateless Separation: DynamoDB tables, Cognito, and S3 buckets remain in the main stack with RETAIN policies. Only stateless resources (Lambdas, API Gateways, IAM Roles) are moved to nested stacks.
Props-Based Resource Sharing: Nested stacks receive references to stateful resources via TypeScript props interfaces, ensuring type safety and clear contracts.
SSM Parameter Store: For resources that can't be passed via props (Cognito ARN for authorizers), we use SSM Parameter Store as an intermediary.
CfnAuthorizer Pattern: Nested stacks use CfnAuthorizer instead of CognitoUserPoolsAuthorizer to avoid cross-stack reference issues.
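The CfnAuthorizer decision comes down to giving the nested stack plain strings rather than a reference to the User Pool construct. The sketch below is an assumption-laden illustration, not the Atria implementation: `CognitoAuthorizerProps` is a local type that mirrors a subset of the fields accepted by `AWS::ApiGateway::Authorizer`, and the helper name, authorizer name, REST API ID, and ARN are all placeholders.

```typescript
// Local mirror of a subset of the AWS::ApiGateway::Authorizer properties.
interface CognitoAuthorizerProps {
  name: string;
  restApiId: string;
  type: "COGNITO_USER_POOLS";
  identitySource: string;
  providerArns: string[];
}

// Builds authorizer props from plain strings, so the nested stack never
// holds a reference to the UserPool construct owned by the main stack.
function buildCognitoAuthorizerProps(
  authorizerName: string,
  restApiId: string,
  userPoolArn: string
): CognitoAuthorizerProps {
  return {
    name: authorizerName,
    restApiId: restApiId,
    type: "COGNITO_USER_POOLS",
    // Clients are expected to send the Cognito token in this header.
    identitySource: "method.request.header.Authorization",
    providerArns: [userPoolArn],
  };
}

// Placeholder values. Inside a nested stack, an object like this could be
// handed to `new apigateway.CfnAuthorizer(this, "Authorizer", props)`.
const authorizerProps = buildCognitoAuthorizerProps(
  "Atria-AdminApiAuthorizer",
  "abc123restapi",
  "arn:aws:cognito-idp:us-east-1:111122223333:userpool/us-east-1_EXAMPLE"
);
```

The ARN can arrive through the props interface or be read from SSM Parameter Store; either way only a string crosses the stack boundary.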
Migration Phases
Phase 0: RETAIN Policy Setup
Branch: backup/pre-nested-stack-migration
Commit: 9b7e0f8
Date: 2026-01-21
Set RemovalPolicy.RETAIN on all stateful resources to prevent accidental data loss during migration:
All DynamoDB tables
S3 Shared Bucket
Cognito User Pool
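One way to confirm the Phase 0 setup before migrating is to scan the synthesized template for stateful resources that are not retained. This is a sketch under assumptions: the `CfnTemplate` type, the helper name, and the sample template are invented for illustration. It relies on the fact that a RETAIN removal policy appears in the template as `DeletionPolicy: Retain` and `UpdateReplacePolicy: Retain`.

```typescript
// Minimal shape of a synthesized CloudFormation template.
interface TemplateResource {
  Type: string;
  DeletionPolicy?: string;
  UpdateReplacePolicy?: string;
}

interface CfnTemplate {
  Resources: { [logicalId: string]: TemplateResource };
}

// Resource types treated as stateful in this report.
const STATEFUL_TYPES = [
  "AWS::DynamoDB::Table",
  "AWS::S3::Bucket",
  "AWS::Cognito::UserPool",
];

// Returns the logical IDs of stateful resources that are not fully retained.
function findUnretained(template: CfnTemplate): string[] {
  return Object.keys(template.Resources).filter((logicalId) => {
    const resource = template.Resources[logicalId];
    const stateful = STATEFUL_TYPES.indexOf(resource.Type) !== -1;
    const retained =
      resource.DeletionPolicy === "Retain" &&
      resource.UpdateReplacePolicy === "Retain";
    return stateful && !retained;
  });
}

// Hypothetical template fragment: the bucket is deliberately misconfigured.
const sampleTemplate: CfnTemplate = {
  Resources: {
    DevicesTable: {
      Type: "AWS::DynamoDB::Table",
      DeletionPolicy: "Retain",
      UpdateReplacePolicy: "Retain",
    },
    SharedBucket: { Type: "AWS::S3::Bucket", DeletionPolicy: "Delete" },
    SomeLambda: { Type: "AWS::Lambda::Function" },
  },
};

const unretained = findUnretained(sampleTemplate);
```

For the sample above, only `SharedBucket` is reported; the Lambda is ignored because it is stateless.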
Phase 1: SSM Parameters & Shared Utilities
Branch: feature/phase-1-ssm-parameters
Commit: 556f094
Date: 2026-01-21
Created shared infrastructure for cross-stack communication:
lib/shared/ssm-parameters.ts - SSM path constants and helper classes
lib/shared/api-utils.ts - Reusable API Gateway utilities
lib/shared/index.ts - Barrel exports
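Centralizing the SSM paths as constants makes them easy to validate. The sketch below reuses only the three paths quoted in this report; the `listPaths` and `invalidPaths` helpers and the `/atria/...` format rule are assumptions made for illustration and are not taken from `ssm-parameters.ts`.

```typescript
// Path constants in the style of lib/shared/ssm-parameters.ts. Only the
// entries quoted in this report are listed; the real file contains more.
const SSM_PATHS = {
  COGNITO: {
    USER_POOL_ID: "/atria/cognito/user-pool-id",
    USER_POOL_ARN: "/atria/cognito/user-pool-arn",
  },
  S3: {
    SHARED_BUCKET_NAME: "/atria/s3/shared-bucket-name",
  },
};

type PathGroups = { [group: string]: { [key: string]: string } };

// Flattens the grouped constants into a plain list of parameter names.
function listPaths(groups: PathGroups): string[] {
  const paths: string[] = [];
  Object.keys(groups).forEach((group) => {
    Object.keys(groups[group]).forEach((key) => {
      paths.push(groups[group][key]);
    });
  });
  return paths;
}

// Assumed convention: paths live under /atria/ and each segment uses
// lowercase letters, digits, and hyphens.
const ATRIA_PATH = /^\/atria(\/[a-z0-9-]+)+$/;

// Returns every path that does not follow the assumed convention.
function invalidPaths(groups: PathGroups): string[] {
  return listPaths(groups).filter((path) => !ATRIA_PATH.test(path));
}

const allPaths = listPaths(SSM_PATHS);
const badPaths = invalidPaths(SSM_PATHS);
```

A check like this could run in a unit test so that a mistyped path is caught before a nested stack tries to read it at deploy time.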
Phase 2: Site Pictures Migration
Branch: feature/phase-2-site-pictures-nested-stack
Commits: b4231e0, a1c4535, 5bb214d, ee52b90
Date: 2026-01-21
First applet migrated as proof of concept:
Created site-pictures-stack.ts (338 lines)
Lambda renamed to Atria-SitePicturesLambda to avoid conflicts
API Gateway with full CORS support
Phase 3: GREF Installs Migration
Branch: feature/phase-3-site-pictures-gref-installs
Commit: 4d2cf0d
Date: 2026-01-21
Migrated GREF Installs applet:
Created gref-installs-stack.ts (353 lines)
2 Lambda functions, 2 DynamoDB table references
Phase 4: Provisioning Migration
Branch: feature/phase-4-provisioning
Commit: 5e9dfed
Date: 2026-01-21
Migrated Provisioning applet:
Created provisioning-stack.ts (276 lines)
IoT Wireless permissions included
Phase 5: Device Health Migration
Branch: feature/phase-5-device-health
Commit: bd016b1
Date: 2026-01-21
Migrated Device Health applet (complex):
Created device-health-stack.ts (568 lines)
5 Lambda functions (Readings, Worker, Site Health, User Devices, Feedback)
Timestream database permissions
EventBridge scheduled rules
Phase 6: Site Survey Migration
Branch: feature/phase-6-site-survey
Commit: f2a442b
Date: 2026-01-21
Migrated Site Survey applet:
Created site-survey-stack.ts (529 lines)
CRUD Lambda + Reports Lambda
Email notification integration
SES permissions
Phase 7: Admin API Migration
Branch: feature/phase-7-admin-api
Commit: a635395
Date: 2026-01-21
Migrated Admin API (platform-level):
Created admin-api-stack.ts (324 lines)
User management Lambda with Cognito admin permissions
API routes: /users, /users/{userId}, /tags
Phase 8: Installer Migration
Branch: feature/phase-8-installer
Commit: 95a9009
Date: 2026-01-21
Migrated Installer applet (largest and most complex):
Created installer-stack.ts (934 lines)
6 Lambda functions including PythonFunction construct
PythonLayerVersion for image processing
Step Functions state machine
EventBridge scheduled rule (15-minute interval)
Complex API Gateway with multipart form-data support
File Changes Summary
New Files Created
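| File | Lines | Purpose |
|---|---|---|
| lib/stacks/site-pictures-stack.ts | 338 | Site Pictures nested stack |
| lib/stacks/gref-installs-stack.ts | 353 | GREF Installs nested stack |
| lib/stacks/provisioning-stack.ts | 276 | Provisioning nested stack |
| lib/stacks/device-health-stack.ts | 568 | Device Health nested stack |
| lib/stacks/site-survey-stack.ts | 529 | Site Survey nested stack |
| lib/stacks/admin-api-stack.ts | 324 | Admin API nested stack |
| lib/stacks/installer-stack.ts | 934 | Installer nested stack |
| | 13 | Barrel exports |
| lib/shared/ssm-parameters.ts | 103 | SSM utilities |
| lib/shared/api-utils.ts | 152 | API Gateway utilities |
| lib/shared/index.ts | 18 | Barrel exports |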
Total New TypeScript: ~3,608 lines
Modified Files
| File | Insertions | Deletions | Net Change |
|---|---|---|---|
| | +1,178 | -922 | +256 |
Overall Statistics
12 files changed, 4,786 insertions(+), 922 deletions(-)
Nested Stack Details
1. Site Pictures Nested Stack
File: lib/stacks/site-pictures-stack.ts
Resources:
IAM Role: Atria-SitePicturesLambdaRole
Lambda: Atria-SitePicturesLambda
API Gateway: Atria-SitePicturesAPI
Props Interface:
interface SitePicturesNestedStackProps {
  // No required props - reads from SSM
}
API Routes:
GET /pictures/{siteId} - Get pictures for a site
POST /pictures/{siteId} - Upload picture
DELETE /pictures/{siteId}/{pictureKey} - Delete picture
2. GREF Installs Nested Stack
File: lib/stacks/gref-installs-stack.ts
Resources:
IAM Role: Atria-GrefInstallsLambdaRole
Lambda: Atria-GrefInstallsLambda
Lambda: Atria-GrefInstallsDevicesLambda
API Gateway: Atria-GrefInstallsAPI
Props Interface:
interface GrefInstallsNestedStackProps {
  resourcesTableName: string
  resourcesTableArn: string
  devicesTableName: string
  devicesTableArn: string
  lorawanGatewaysTableName: string
  lorawanDevicesTableName: string
}
3. Provisioning Nested Stack
File: lib/stacks/provisioning-stack.ts
Resources:
IAM Role: Atria-ProvisioningLambdaRole
Lambda: Atria-ProvisioningLambda
API Gateway: Atria-ProvisioningAPI
Props Interface:
interface ProvisioningNestedStackProps {
  lorawanDevicesTableName: string
  lorawanGatewaysTableName: string
}
Special Permissions:
iotwireless:* - Full IoT Wireless access for device provisioning
4. Device Health Nested Stack
File: lib/stacks/device-health-stack.ts
Resources:
IAM Role: Atria-DeviceHealthLambdaRole
Lambda: Atria-DeviceReadingsLambda
Lambda: Atria-DeviceReadingsWorkerLambda
Lambda: Atria-SiteHealthLambda
Lambda: Atria-UserDevicesLambda
Lambda: Atria-FeedbackLambda
EventBridge Rule: Worker trigger (every 5 minutes)
API Gateway: Atria-DeviceHealthAPI
Props Interface:
interface DeviceHealthNestedStackProps {
  userDevicesTableName: string
  userDevicesTableArn: string
  feedbackTableName: string
  feedbackTableArn: string
  timestreamDatabaseName: string
  timestreamTableName: string
  sharedS3BucketName: string
  sharedS3BucketArn: string
}
5. Site Survey Nested Stack
File: lib/stacks/site-survey-stack.ts
Resources:
IAM Role: Atria-SiteSurveyLambdaRole
Lambda: Atria-SiteSurveyCrudLambda
Lambda: Atria-SiteSurveyReportsLambda
API Gateway: Atria-SiteSurveyAPI
Props Interface:
interface SiteSurveyNestedStackProps {
  siteSurveyTableName: string
  siteSurveyTableArn: string
  lorawanSitesTableName: string
  sharedS3BucketName: string
  sharedS3BucketArn: string
  sharedEmailNotificationLambdaArn: string
}
6. Admin API Nested Stack
File: lib/stacks/admin-api-stack.ts
Resources:
IAM Role: Atria-AdminUserManagementLambdaRole
Lambda: Atria-AdminUserManagementLambda
API Gateway: Atria-AdminAPI
Props Interface:
interface AdminApiNestedStackProps {
  userPoolId: string
  userPoolArn: string
  deviceDashboardUserTagsTableName: string
}
Special Permissions:
cognito-idp:Admin* - Cognito admin operations for user management
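The wildcard permissions listed for the Provisioning and Admin API stacks (`iotwireless:*` and `cognito-idp:Admin*`) follow IAM's pattern rules: `*` matches any run of characters, `?` matches a single character, and action names are compared case-insensitively. The sketch below illustrates that matching in isolation. The helper names are invented, and the code only approximates how IAM evaluates the `Action` element; it is not a policy simulator.

```typescript
// Converts an IAM action pattern such as "cognito-idp:Admin*" to a RegExp.
function actionPatternToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters
    .replace(/\*/g, ".*") // IAM "*" matches any run of characters
    .replace(/\?/g, "."); // IAM "?" matches exactly one character
  return new RegExp("^" + escaped + "$", "i");
}

// True when at least one pattern covers the requested action.
function isActionCovered(patterns: string[], action: string): boolean {
  return patterns.some((pattern) => actionPatternToRegExp(pattern).test(action));
}

// Patterns quoted in this report.
const adminApiActions = ["cognito-idp:Admin*"];
const provisioningActions = ["iotwireless:*"];

const canCreateUser = isActionCovered(adminApiActions, "cognito-idp:AdminCreateUser");
const canListUsers = isActionCovered(adminApiActions, "cognito-idp:ListUsers");
const canCreateDevice = isActionCovered(provisioningActions, "iotwireless:CreateWirelessDevice");
```

Note that `cognito-idp:Admin*` does not cover `cognito-idp:ListUsers`, so a Lambda that lists users would need that action granted separately.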
7. Installer Nested Stack
File: lib/stacks/installer-stack.ts
Resources:
IAM Role: Atria-InstallerLambdaRole
IAM Role: Atria-InstallerAdminUserManagementLambdaRole
IAM Role: Atria-InstallerMigrationLambdaRole
Lambda: Atria-InstallerDeviceDetailsLambda
Lambda: Atria-InstallerEntityResourceLambda
Lambda: Atria-InstallerProcessImagesLambda (PythonFunction)
Lambda: Atria-InstallerInstallationSubmissionLambda
Lambda: Atria-InstallerAdminUserManagementLambda
Lambda: Atria-InstallerMigrationLambda
PythonLayerVersion: Image processing dependencies
Step Functions State Machine: Migration workflow
EventBridge Rule: Migration trigger (every 15 minutes)
API Gateway: Atria-InstallerAPI
Props Interface:
interface InstallerNestedStackProps {
  userPoolId: string
  userPoolArn: string
  sharedS3Bucket: s3.IBucket
  deviceDetailsTableName: string
  deviceDetailsTableArn: string
  entityResourceTableName: string
  entityResourceTableArn: string
  lorawanDevicesTableName: string
  lorawanGatewaysTableName: string
  siteIdIndexName: string
}
Special Features:
S3 event trigger (attached from main stack)
Binary media type support for file uploads
Step Functions for scheduled data migration
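Both scheduled rules in this report (the 5-minute Device Health worker trigger and the 15-minute Installer migration trigger) correspond to EventBridge rate expressions. The helper below is a hypothetical sketch of how such an expression is formed; the function name is invented and the Atria stacks may well build their schedules differently.

```typescript
// Builds an EventBridge rate expression for a whole number of minutes.
// EventBridge expects the singular unit ("minute") when the value is 1.
function rateInMinutes(minutes: number): string {
  if (minutes < 1 || Math.floor(minutes) !== minutes) {
    throw new Error("minutes must be a positive integer");
  }
  const unit = minutes === 1 ? "minute" : "minutes";
  return "rate(" + minutes + " " + unit + ")";
}

const workerSchedule = rateInMinutes(5); // Device Health worker trigger
const migrationSchedule = rateInMinutes(15); // Installer migration trigger
```

In CDK the same schedule is commonly expressed as `events.Schedule.rate(cdk.Duration.minutes(15))`, which produces an equivalent expression.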
Shared Utilities
SSM Parameters (lib/shared/ssm-parameters.ts)
Provides constants and helpers for SSM Parameter Store:
export const SSM_PATHS = {
  COGNITO: {
    USER_POOL_ID: "/atria/cognito/user-pool-id",
    USER_POOL_ARN: "/atria/cognito/user-pool-arn",
    // ...
  },
  S3: {
    SHARED_BUCKET_NAME: "/atria/s3/shared-bucket-name",
    // ...
  },
  // ...
}
export class SsmParameterWriter {
  /* ... */
}
export function readSsmParameter(scope, path) {
  /* ... */
}
API Utilities (lib/shared/api-utils.ts)
Reusable API Gateway configurations:
CORS configurations
Method options with Cognito authentication
Lambda integration options
Request/response templates
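As an illustration of what a shared CORS configuration has to produce, the sketch below builds the response headers for a single request. Everything in it is an assumption: the `CorsOptions` shape, the helper name, and the origins are placeholders, and the actual helpers in `api-utils.ts` are not shown in this report.

```typescript
// Hypothetical options shape; the real helpers in api-utils.ts may differ.
interface CorsOptions {
  allowOrigins: string[];
  allowMethods: string[];
  allowHeaders: string[];
}

// Builds CORS response headers for one request. Access-Control-Allow-Origin
// may carry only a single origin (or "*"), so the request origin is echoed
// back when it is on the allow list and the header is omitted otherwise.
function corsHeaders(
  options: CorsOptions,
  requestOrigin: string
): { [header: string]: string } {
  const headers: { [header: string]: string } = {
    "Access-Control-Allow-Methods": options.allowMethods.join(","),
    "Access-Control-Allow-Headers": options.allowHeaders.join(","),
  };
  if (options.allowOrigins.indexOf("*") !== -1) {
    headers["Access-Control-Allow-Origin"] = "*";
  } else if (options.allowOrigins.indexOf(requestOrigin) !== -1) {
    headers["Access-Control-Allow-Origin"] = requestOrigin;
    headers["Vary"] = "Origin"; // responses differ per origin, so caches must key on it
  }
  return headers;
}

const atriaCors: CorsOptions = {
  allowOrigins: ["https://app.example.com"], // placeholder origin
  allowMethods: ["GET", "POST", "DELETE", "OPTIONS"],
  allowHeaders: ["Content-Type", "Authorization"],
};

const allowedResponse = corsHeaders(atriaCors, "https://app.example.com");
const rejectedResponse = corsHeaders(atriaCors, "https://other.example.org");
```

Keeping this logic in one module means every nested stack's API answers preflight requests the same way.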
Migration Principles
1. Stateful Resource Preservation
All stateful resources remain in the main stack:
DynamoDB tables with RemovalPolicy.RETAIN
Cognito User Pool and Identity Pools
S3 Buckets
2. Lambda Naming Convention
New Lambda naming pattern to avoid CloudFormation conflicts:
Atria-{FeatureName}Lambda
Examples: Atria-SitePicturesLambda, Atria-DeviceReadingsLambda
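A small helper can keep the naming pattern consistent across stacks. The function below is a hypothetical sketch rather than code from the Atria repository. It applies the `Atria-{FeatureName}Lambda` pattern and guards against Lambda's 64-character function name limit; the PascalCase check is an added assumption.

```typescript
// Lambda function names are limited to 64 characters.
const LAMBDA_NAME_MAX_LENGTH = 64;

// Builds a function name following the Atria-{FeatureName}Lambda pattern.
function lambdaName(featureName: string): string {
  // Assumed rule: feature names are PascalCase letters and digits only.
  if (!/^[A-Z][A-Za-z0-9]*$/.test(featureName)) {
    throw new Error("feature name must be PascalCase: " + featureName);
  }
  const fullName = "Atria-" + featureName + "Lambda";
  if (fullName.length > LAMBDA_NAME_MAX_LENGTH) {
    throw new Error("Lambda name exceeds 64 characters: " + fullName);
  }
  return fullName;
}

const sitePicturesName = lambdaName("SitePictures");
const deviceReadingsName = lambdaName("DeviceReadings");
```

Called with `SitePictures` and `DeviceReadings`, the helper reproduces the two example names given above.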
3. API URL Handling
API URLs from nested stacks are:
Exposed via public properties on nested stack classes
Passed to Amplify environment variables
Included in FrontendEnvironmentVariables CfnOutput
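The hand-off from nested stack API URLs to frontend configuration can be sketched as a plain mapping step. The feature keys, URLs, and the `*_API_URL` variable naming below are all invented for illustration; the report does not state the actual variable names used by the Amplify app.

```typescript
// Hypothetical input: one API URL per nested stack, as exposed through
// public properties on the nested stack classes.
type ApiUrls = { [feature: string]: string };

// Converts a camelCase feature key to an env variable name,
// e.g. "sitePictures" becomes "SITE_PICTURES_API_URL".
function toEnvName(feature: string): string {
  return feature.replace(/([a-z0-9])([A-Z])/g, "$1_$2").toUpperCase() + "_API_URL";
}

// Builds the environment variable map handed to the frontend.
function toEnvironmentVariables(urls: ApiUrls): { [envName: string]: string } {
  const env: { [envName: string]: string } = {};
  Object.keys(urls).forEach((feature) => {
    env[toEnvName(feature)] = urls[feature];
  });
  return env;
}

// Placeholder URLs.
const apiUrls: ApiUrls = {
  sitePictures: "https://abc123.execute-api.us-east-1.amazonaws.com/prod/",
  deviceHealth: "https://def456.execute-api.us-east-1.amazonaws.com/prod/",
};

const frontendEnv = toEnvironmentVariables(apiUrls);

// A single stack output could carry the whole map as a JSON string.
const frontendEnvJson = JSON.stringify(frontendEnv);
```

Deriving both the Amplify variables and the stack output from one map keeps the two from drifting apart.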