
Skill 3.2.1: Protected AI Environments

Task: Task 3.2
Goal: Ensure FM deployments operate inside tightly controlled network and identity boundaries.

User Story

As a Cloud Security Architect, I want MangaAssist components that access customer, order, and browsing data to run inside a protected AI environment so that sensitive information cannot leak through overly broad network paths, permissions, or data access patterns.

Grounded Scenarios

  • Scenario: The order-tracking flow needs access to customer order details and delivery status. Why it matters: this path carries the highest sensitivity in the chatbot.
  • Scenario: Recommendation quality analysis uses browsing and purchase history from curated datasets. Why it matters: analysts need controlled access without opening broad data paths.
  • Scenario: A new Lambda tries to call both catalog and account systems with one oversized IAM role. Why it matters: convenience-driven privilege creep is a common risk.

Deep-Dive Design

1. Network Isolation

For high-sensitivity workloads, keep the FM path inside private networking:

  • VPC-enabled compute for orchestration and post-processing
  • VPC endpoints for S3, Bedrock, CloudWatch, and other required AWS services
  • private connectivity to internal order, profile, and catalog services
  • no unmanaged outbound internet path from sensitive execution roles

The aim is to make data movement explicit and reviewable.
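One way to make that movement explicit is to pin each VPC endpoint to the specific resources it exists for. A minimal sketch, expressed as a Python dict in standard IAM policy shape; the bucket name, account ID, and helper function are illustrative assumptions, not real resources:

```python
# Hypothetical VPC endpoint policy for S3: only the order-data bucket is
# reachable through this endpoint, so any other S3 data path must be a
# separate, explicitly reviewed endpoint. Names are illustrative.
S3_ENDPOINT_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowOrderDataBucketOnly",
            "Effect": "Allow",
            "Principal": "*",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": [
                "arn:aws:s3:::mangaassist-order-data",
                "arn:aws:s3:::mangaassist-order-data/*",
            ],
        }
    ],
}

def allowed_buckets(policy: dict) -> set:
    """List the bucket names an endpoint policy permits, for review."""
    buckets = set()
    for stmt in policy["Statement"]:
        if stmt["Effect"] != "Allow":
            continue
        for arn in stmt["Resource"]:
            # arn:aws:s3:::bucket or arn:aws:s3:::bucket/key
            buckets.add(arn.split(":::")[1].split("/")[0])
    return buckets
```

A check like `allowed_buckets` can run in an infrastructure review pipeline, so the "reviewable" part of the goal is mechanical rather than a diagram inspection.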

2. Least-Privilege IAM by Function

Split permissions by responsibility:

  • chat orchestrator role
  • retrieval role
  • order tool role
  • logging role
  • evaluation job role

For example, the recommendation flow might read pseudonymized profile signals, while the order-tracking flow can read only order status fields for the authenticated customer context.
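The split can be sketched as per-function policy documents plus an automated check for the oversized-role pattern from the Lambda scenario above. All ARNs, table names, and action lists here are assumptions for illustration:

```python
# Function-scoped policies: each role can reach exactly one workflow's data.
ORDER_TOOL_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["dynamodb:GetItem", "dynamodb:Query"],
        "Resource": "arn:aws:dynamodb:us-east-1:111122223333:table/orders",
    }],
}

RETRIEVAL_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        "Resource": "arn:aws:s3:::mangaassist-kb/*",
    }],
}

def has_wildcard_action(policy: dict) -> bool:
    """Flag convenience-driven privilege creep like Action: '*' or 's3:*'."""
    for stmt in policy["Statement"]:
        actions = stmt["Action"]
        if isinstance(actions, str):
            actions = [actions]
        if any(a == "*" or a.endswith(":*") for a in actions):
            return True
    return False
```

Running `has_wildcard_action` over every role in the account gives the "IAM roles with wildcard permissions" signal listed later in this document.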

3. Fine-Grained Data Access

Where governed datasets are shared across teams:

  • register datasets in Lake Formation
  • define table- or column-level entitlements
  • restrict who can query raw transcript stores versus masked analytics views

This is how we stop "one data lake" from becoming "one giant permission."
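The entitlement model Lake Formation enforces can be sketched in a few lines: a principal-and-table key maps to the exact columns that principal may see, so a query is allowed only if every requested column is entitled. Table, column, and principal names below are hypothetical:

```python
# Column-level entitlements: (principal, table) -> columns that principal
# may query. Analysts see pseudonymized signals, never raw transcripts.
ENTITLEMENTS = {
    ("analyst", "recommendation_signals"): {"customer_pseudo_id", "genre", "purchase_count"},
    ("support_lead", "order_status"): {"order_id", "status", "eta"},
}

def can_query(principal: str, table: str, columns: set) -> bool:
    """Allow a query only when every requested column is entitled."""
    allowed = ENTITLEMENTS.get((principal, table), set())
    return columns <= allowed
```

The default-deny shape (an unknown principal/table pair maps to the empty set) is what keeps "one data lake" from silently becoming "one giant permission."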

4. Monitoring and Access Visibility

CloudWatch and CloudTrail should capture:

  • which service assumed which role
  • which data resource was touched
  • denied-access events and unexpected spikes in access volume
  • whether requests came from approved network paths

This turns environment security into an observable control, not a static diagram.
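As a sketch of what "observable" means in practice, a CloudTrail record can be reduced to exactly the facts listed above. The field names (`eventName`, `userIdentity`, `sourceIPAddress`, `errorCode`) follow CloudTrail's documented record format; the specific ARN, bucket, and IP in the sample are made up:

```python
# Reduce a CloudTrail record to: which role acted, what it touched,
# whether it was denied, and where the request came from.
def summarize_event(record: dict) -> dict:
    return {
        "role": record.get("userIdentity", {}).get("arn", "unknown"),
        "action": record["eventName"],
        "resource": record.get("requestParameters", {}).get("bucketName", "n/a"),
        "denied": record.get("errorCode") == "AccessDenied",
        "source_ip": record.get("sourceIPAddress", "unknown"),
    }

# Sample record in CloudTrail's documented shape; values are illustrative.
SAMPLE_RECORD = {
    "eventName": "GetObject",
    "userIdentity": {"arn": "arn:aws:sts::111122223333:assumed-role/order-tool-role/session1"},
    "requestParameters": {"bucketName": "mangaassist-order-data"},
    "sourceIPAddress": "10.0.12.34",
}
```

Summaries like this can feed a metric filter so that denied-access spikes or requests from unapproved network ranges page someone, instead of sitting unread in a log group.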

5. Session-Scoped Access Patterns

Sensitive requests should be authorized using request context:

  • authenticated customer identity
  • allowed intent types
  • tool-specific scopes
  • short-lived credentials or signed calls where appropriate

This reduces the chance that one compromised component can replay broad access.
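The authorization step can be sketched as a pure function over the request context: the call is allowed only when the authenticated customer, the classified intent, and the tool's scope all line up. The intent names and scope strings are illustrative assumptions:

```python
from dataclasses import dataclass

# Hypothetical session context carried with each sensitive request.
@dataclass(frozen=True)
class RequestContext:
    customer_id: str
    intent: str
    tool_scopes: frozenset

ALLOWED_INTENTS = {"order_status", "recommendation"}
INTENT_SCOPES = {"order_status": "orders:read", "recommendation": "profile:read"}

def authorize(ctx: RequestContext, target_customer_id: str) -> bool:
    """Allow a tool call only when identity, intent, and scope all match."""
    if ctx.intent not in ALLOWED_INTENTS:
        return False
    if ctx.customer_id != target_customer_id:  # no cross-customer reads
        return False
    return INTENT_SCOPES[ctx.intent] in ctx.tool_scopes
```

Because the check depends on the session's own identity and intent rather than a standing role, a compromised component holding one context cannot replay it against another customer's data.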

Acceptance Criteria

  • Sensitive FM paths use private connectivity and approved service endpoints only.
  • IAM roles are scoped by workflow and do not share broad cross-service permissions.
  • Raw and masked datasets are separated by policy and access path.
  • Data access is logged with enough detail to support investigations.
  • New GenAI components cannot be promoted without an access review.

Signals and Metrics

  • percentage of sensitive traffic traversing private endpoints
  • count of IAM roles with wildcard permissions
  • denied-access events by component
  • time to identify which component accessed a sensitive dataset
  • number of services with direct access to raw transcript data
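Two of these signals can be computed directly from an inventory export, which keeps them cheap enough to track per release. The role and flow records below are toy data standing in for real inventory output:

```python
# Toy inventory data: IAM roles with their granted actions, and traffic
# flows tagged by sensitivity and network path. Values are illustrative.
roles = [
    {"name": "chat-orchestrator", "actions": ["bedrock:InvokeModel"]},
    {"name": "legacy-batch", "actions": ["s3:*"]},
]
flows = [
    {"sensitive": True, "via_private_endpoint": True},
    {"sensitive": True, "via_private_endpoint": False},
    {"sensitive": False, "via_private_endpoint": False},
]

# Signal 1: count of IAM roles with wildcard permissions.
wildcard_roles = sum(
    any(a == "*" or a.endswith(":*") for a in r["actions"]) for r in roles
)

# Signal 2: percentage of sensitive traffic traversing private endpoints.
sensitive = [f for f in flows if f["sensitive"]]
private_pct = 100 * sum(f["via_private_endpoint"] for f in sensitive) / len(sensitive)
```

Trending these two numbers release over release is usually more persuasive in a review than a one-time architecture diagram.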

Failure Modes and Tradeoffs

  • Private networking can add operational friction. Mitigation: standardize secure service templates.
  • Least privilege can be bypassed by shared roles. Mitigation: enforce role boundaries through infrastructure review.
  • Monitoring gaps turn secure design into guesswork. Mitigation: require logging for every sensitive access path.

Interview Takeaway

Protected AI environments are about network boundaries, identity boundaries, and data boundaries working together. If any one of those is loose, the FM path is not truly protected.