Skill 3.2.1: Protected AI Environments
Task: Task 3.2
Goal: Ensure FM deployments operate inside tightly controlled network and identity boundaries.
User Story
As a Cloud Security Architect, I want MangaAssist components that access customer, order, and browsing data to run inside a protected AI environment so that sensitive information cannot leak through overly broad network paths, permissions, or data access patterns.
Grounded Scenarios
| Scenario | Why It Matters |
|---|---|
| The order-tracking flow needs access to customer order details and delivery status | This path carries the highest sensitivity in the chatbot |
| Recommendation quality analysis uses browsing and purchase history from curated datasets | Analysts need controlled access without opening broad data paths |
| A new Lambda tries to call both catalog and account systems with one oversized IAM role | Convenience-driven privilege creep is a common risk |
Deep-Dive Design
1. Network Isolation
For high-sensitivity workloads, keep the FM path inside private networking:
- VPC-enabled compute for orchestration and post-processing
- VPC endpoints for S3, Bedrock, CloudWatch, and other required AWS services
- private connectivity to internal order, profile, and catalog services
- no unmanaged outbound internet path from sensitive execution roles
The aim is to make data movement explicit and reviewable.
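One way to make that movement explicit is to attach a restrictive policy to the S3 VPC endpoint itself, so the private path can only reach approved buckets. A minimal sketch, assuming a hypothetical `mangaassist-transcripts` bucket; the real bucket and endpoint ARNs would come from your environment:

```python
import json

# Hypothetical bucket ARN for illustration only.
TRANSCRIPT_BUCKET = "arn:aws:s3:::mangaassist-transcripts"

def s3_endpoint_policy(bucket_arn: str) -> dict:
    """Build a VPC endpoint policy that limits the S3 endpoint to one bucket.

    Traffic through this endpoint can only reach the named bucket, so every
    other S3 destination is unreachable from the private subnets even if a
    role's IAM policy is broader than intended.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowOnlyTranscriptBucket",
                "Effect": "Allow",
                "Principal": "*",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            }
        ],
    }

# The serialized document would be supplied when creating the endpoint,
# e.g. as --policy-document to `aws ec2 create-vpc-endpoint`.
policy_json = json.dumps(s3_endpoint_policy(TRANSCRIPT_BUCKET))
```

Because the endpoint policy and the IAM role policy are evaluated together, this gives two independent boundaries that both have to be loosened before data can leave the approved path.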
2. Least-Privilege IAM by Function
Split permissions by responsibility:
- chat orchestrator role
- retrieval role
- order tool role
- logging role
- evaluation job role
For example, the recommendation flow might read pseudonymized profile signals, while the order-tracking flow can read only order status fields for the authenticated customer context.
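The order tool role above can be sketched as a policy document. This is a sketch under assumptions: the orders table lives in DynamoDB, its partition key is the customer ID, and the assumed-role session carries a `customer_id` principal tag (both the table name and tag key are hypothetical):

```python
def order_tool_policy(table_arn: str) -> dict:
    """Scoped policy for the order-tracking tool: read-only access to one
    table, with the dynamodb:LeadingKeys condition restricting reads to
    rows whose partition key matches the caller's customer_id session tag.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadOwnOrderStatusOnly",
                "Effect": "Allow",
                "Action": ["dynamodb:GetItem", "dynamodb:Query"],
                "Resource": table_arn,
                "Condition": {
                    "ForAllValues:StringEquals": {
                        # Row-level restriction: partition key must equal
                        # the authenticated customer's ID from the session tag.
                        "dynamodb:LeadingKeys": ["${aws:PrincipalTag/customer_id}"]
                    }
                },
            }
        ],
    }

policy = order_tool_policy(
    "arn:aws:dynamodb:us-east-1:123456789012:table/Orders"  # placeholder ARN
)
```

Note what the policy does not grant: no write actions, no other tables, and no access to rows outside the authenticated customer's partition. Each of the other roles in the list gets its own similarly narrow document.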
3. Fine-Grained Data Access
Where governed datasets are shared across teams:
- register datasets in Lake Formation
- define table- or column-level entitlements
- restrict who can query raw transcript stores versus masked analytics views
This is how we stop "one data lake" from becoming "one giant permission."
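Column-level entitlements map directly onto the Lake Formation `GrantPermissions` API. A minimal sketch of the request an admin workflow would send, with hypothetical database, table, principal, and column names; it grants SELECT on masked analytics columns only, so raw transcript columns stay out of reach:

```python
def column_grant(principal_arn: str, database: str, table: str,
                 columns: list[str]) -> dict:
    """Keyword arguments for lakeformation.grant_permissions granting
    SELECT on an explicit column list rather than the whole table."""
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": columns,  # everything not listed is denied
            }
        },
        "Permissions": ["SELECT"],
    }

# Hypothetical analyst role and masked analytics view.
grant = column_grant(
    "arn:aws:iam::123456789012:role/recs-analyst",
    "analytics",
    "session_summary",
    ["intent", "recommended_sku", "latency_ms"],
)
# In practice: boto3.client("lakeformation").grant_permissions(**grant)
```

Registering the dataset once and issuing narrow grants per team keeps the entitlement reviewable in one place, instead of scattered across ad hoc bucket policies.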
4. Monitoring and Access Visibility
CloudWatch and CloudTrail should capture:
- which service assumed which role
- which data resource was touched
- whether access was denied, or whether request volume spiked unexpectedly
- whether requests came from approved network paths
This turns environment security into an observable control, not a static diagram.
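The four questions above correspond to fields that are already present in CloudTrail records. A minimal sketch of pulling them out for alerting, using a synthetic record shaped like a real CloudTrail event:

```python
def summarize_event(record: dict) -> dict:
    """Extract the monitoring checklist fields from a CloudTrail record:
    which role acted, what it touched, whether it was denied, and whether
    the call arrived via a VPC endpoint (vpcEndpointId is present only
    for calls that traversed one)."""
    identity = record.get("userIdentity", {})
    return {
        "role": identity.get("arn"),
        "resources": [r.get("ARN") for r in record.get("resources", [])],
        "denied": record.get("errorCode") in ("AccessDenied", "UnauthorizedOperation"),
        "via_endpoint": record.get("vpcEndpointId"),
    }

# Synthetic record for illustration; field names follow the CloudTrail
# record format, values are placeholders.
sample = {
    "userIdentity": {"arn": "arn:aws:sts::123456789012:assumed-role/order-tool/sess1"},
    "resources": [{"ARN": "arn:aws:dynamodb:us-east-1:123456789012:table/Orders"}],
    "errorCode": "AccessDenied",
    "vpcEndpointId": "vpce-0abc123",
}
summary = summarize_event(sample)
```

Feeding these summaries into CloudWatch metrics (denied events by component, events missing `vpcEndpointId`) is what makes the "approved network paths" criterion measurable rather than asserted.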
5. Session-Scoped Access Patterns
Sensitive requests should be authorized using request context:
- authenticated customer identity
- allowed intent types
- tool-specific scopes
- short-lived credentials or signed calls where appropriate
This reduces the chance that one compromised component can replay broad access.
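Short-lived, session-scoped credentials can be obtained by attaching an inline session policy to an `sts:AssumeRole` call: the effective permissions become the intersection of the role policy and the session policy. A sketch of the request parameters, again assuming a DynamoDB orders table keyed by customer ID (the role ARN and key scheme are hypothetical):

```python
import json

def scoped_session_request(role_arn: str, customer_id: str) -> dict:
    """Keyword arguments for sts.assume_role yielding short-lived
    credentials further narrowed to one customer's rows. Even if the
    role policy were broader, the session policy caps what these
    particular credentials can do."""
    session_policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": "dynamodb:Query",
            "Resource": "*",
            "Condition": {"ForAllValues:StringEquals": {
                "dynamodb:LeadingKeys": [customer_id]}},
        }],
    }
    return {
        "RoleArn": role_arn,
        "RoleSessionName": f"order-tool-{customer_id}",
        "Policy": json.dumps(session_policy),
        "DurationSeconds": 900,  # minimum STS session lifetime
    }

request = scoped_session_request(
    "arn:aws:iam::123456789012:role/order-tool", "c-4711"
)
# In practice: boto3.client("sts").assume_role(**request)
```

A credential stolen from one session is useless for any other customer's data and expires within minutes, which is exactly the replay resistance this pattern targets.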
Acceptance Criteria
- Sensitive FM paths use private connectivity and approved service endpoints only.
- IAM roles are scoped by workflow and do not share broad cross-service permissions.
- Raw and masked datasets are separated by policy and access path.
- Data access is logged with enough detail to support investigations.
- New GenAI components cannot be promoted without an access review.
Signals and Metrics
- percentage of sensitive traffic traversing private endpoints
- count of IAM roles with wildcard permissions
- denied-access events by component
- time to identify which component accessed a sensitive dataset
- number of services with direct access to raw transcript data
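The first metric above can be computed directly from CloudTrail records, since `vpcEndpointId` appears only on calls that traversed a VPC endpoint. A minimal sketch:

```python
def private_path_ratio(events: list[dict]) -> float:
    """Share of sensitive-path requests that arrived through a VPC
    endpoint; anything without a vpcEndpointId took a non-private path
    and should trend toward zero."""
    if not events:
        return 0.0
    private = sum(1 for e in events if e.get("vpcEndpointId"))
    return private / len(events)

# Synthetic events: one private-path call, one that bypassed the endpoint.
ratio = private_path_ratio([
    {"vpcEndpointId": "vpce-0abc123"},
    {"eventName": "GetObject"},
])
```

Publishing this ratio per component makes regressions visible the moment a new service is wired up outside the approved path.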
Failure Modes and Tradeoffs
- Private networking can add operational friction. Mitigation: standardize secure service templates.
- Least privilege can be bypassed by shared roles. Mitigation: enforce role boundaries through infrastructure review.
- Monitoring gaps turn secure design into guesswork. Mitigation: require logging for every sensitive access path.
Interview Takeaway
Protected AI environments are about network boundaries, identity boundaries, and data boundaries working together. If any one of those is loose, the FM path is not truly protected.