
Skill 3.2.1: Protected AI Environments

Task: Task 3.2
Goal: Ensure FM deployments operate inside tightly controlled network and identity boundaries.

User Story

As a Cloud Security Architect, I want MangaAssist components that access customer, order, and browsing data to run inside a protected AI environment so that sensitive information cannot leak through overly broad network paths, permissions, or data access patterns.

Grounded Scenarios

  • Scenario: The order-tracking flow needs access to customer order details and delivery status. Why it matters: this path carries the highest sensitivity in the chatbot.
  • Scenario: Recommendation quality analysis uses browsing and purchase history from curated datasets. Why it matters: analysts need controlled access without opening broad data paths.
  • Scenario: A new Lambda tries to call both catalog and account systems with one oversized IAM role. Why it matters: convenience-driven privilege creep is a common risk.

Deep-Dive Design

1. Network Isolation

For high-sensitivity workloads, keep the FM path inside private networking:

  • VPC-enabled compute for orchestration and post-processing
  • VPC endpoints for S3, Bedrock, CloudWatch, and other required AWS services
  • private connectivity to internal order, profile, and catalog services
  • no unmanaged outbound internet path from sensitive execution roles

The aim is to make data movement explicit and reviewable.
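One way to make that movement explicit is to pin each VPC endpoint to the specific resources it exists for. A minimal sketch, expressed as a Python dict in standard IAM policy shape; the bucket name, account ID, and helper function are illustrative assumptions, not real resources:

```python
# Hypothetical VPC endpoint policy for S3: only the order-data bucket is
# reachable through this endpoint, so any other S3 data path must be a
# separate, explicitly reviewed endpoint. Names are illustrative.
S3_ENDPOINT_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowOrderDataBucketOnly",
            "Effect": "Allow",
            "Principal": "*",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": [
                "arn:aws:s3:::mangaassist-order-data",
                "arn:aws:s3:::mangaassist-order-data/*",
            ],
        }
    ],
}

def allowed_buckets(policy: dict) -> set:
    """List the bucket names an endpoint policy permits, for review."""
    buckets = set()
    for stmt in policy["Statement"]:
        if stmt["Effect"] != "Allow":
            continue
        for arn in stmt["Resource"]:
            # arn:aws:s3:::bucket or arn:aws:s3:::bucket/key
            buckets.add(arn.split(":::")[1].split("/")[0])
    return buckets
```

A check like `allowed_buckets` can run in an infrastructure review pipeline, so the "reviewable" part of the goal is mechanical rather than a diagram inspection.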

2. Least-Privilege IAM by Function

Split permissions by responsibility:

  • chat orchestrator role
  • retrieval role
  • order tool role
  • logging role
  • evaluation job role

For example, the recommendation flow might read pseudonymized profile signals, while the order-tracking flow can read only order status fields for the authenticated customer context.
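The split can be sketched as per-function policy documents plus an automated check for the oversized-role pattern from the Lambda scenario above. All ARNs, table names, and action lists here are assumptions for illustration:

```python
# Function-scoped policies: each role can reach exactly one workflow's data.
ORDER_TOOL_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["dynamodb:GetItem", "dynamodb:Query"],
        "Resource": "arn:aws:dynamodb:us-east-1:111122223333:table/orders",
    }],
}

RETRIEVAL_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        "Resource": "arn:aws:s3:::mangaassist-kb/*",
    }],
}

def has_wildcard_action(policy: dict) -> bool:
    """Flag convenience-driven privilege creep like Action: '*' or 's3:*'."""
    for stmt in policy["Statement"]:
        actions = stmt["Action"]
        if isinstance(actions, str):
            actions = [actions]
        if any(a == "*" or a.endswith(":*") for a in actions):
            return True
    return False
```

Running `has_wildcard_action` over every role in the account gives the "IAM roles with wildcard permissions" signal listed later in this document.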

3. Fine-Grained Data Access

Where governed datasets are shared across teams:

  • register datasets in Lake Formation
  • define table- or column-level entitlements
  • restrict who can query raw transcript stores versus masked analytics views

This is how we stop "one data lake" from becoming "one giant permission."
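The entitlement model Lake Formation enforces can be sketched in a few lines: a principal-and-table key maps to the exact columns that principal may see, so a query is allowed only if every requested column is entitled. Table, column, and principal names below are hypothetical:

```python
# Column-level entitlements: (principal, table) -> columns that principal
# may query. Analysts see pseudonymized signals, never raw transcripts.
ENTITLEMENTS = {
    ("analyst", "recommendation_signals"): {"customer_pseudo_id", "genre", "purchase_count"},
    ("support_lead", "order_status"): {"order_id", "status", "eta"},
}

def can_query(principal: str, table: str, columns: set) -> bool:
    """Allow a query only when every requested column is entitled."""
    allowed = ENTITLEMENTS.get((principal, table), set())
    return columns <= allowed
```

The default-deny shape (an unknown principal/table pair maps to the empty set) is what keeps "one data lake" from silently becoming "one giant permission."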

4. Monitoring and Access Visibility

CloudWatch and CloudTrail should capture:

  • which service assumed which role
  • which data resource was touched
  • denied-access events and unexpected spikes in access volume
  • whether requests came from approved network paths

This turns environment security into an observable control, not a static diagram.
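As a sketch of what "observable" means in practice, a CloudTrail record can be reduced to exactly the facts listed above. The field names (`eventName`, `userIdentity`, `sourceIPAddress`, `errorCode`) follow CloudTrail's documented record format; the specific ARN, bucket, and IP in the sample are made up:

```python
# Reduce a CloudTrail record to: which role acted, what it touched,
# whether it was denied, and where the request came from.
def summarize_event(record: dict) -> dict:
    return {
        "role": record.get("userIdentity", {}).get("arn", "unknown"),
        "action": record["eventName"],
        "resource": record.get("requestParameters", {}).get("bucketName", "n/a"),
        "denied": record.get("errorCode") == "AccessDenied",
        "source_ip": record.get("sourceIPAddress", "unknown"),
    }

# Sample record in CloudTrail's documented shape; values are illustrative.
SAMPLE_RECORD = {
    "eventName": "GetObject",
    "userIdentity": {"arn": "arn:aws:sts::111122223333:assumed-role/order-tool-role/session1"},
    "requestParameters": {"bucketName": "mangaassist-order-data"},
    "sourceIPAddress": "10.0.12.34",
}
```

Summaries like this can feed a metric filter so that denied-access spikes or requests from unapproved network ranges page someone, instead of sitting unread in a log group.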

5. Session-Scoped Access Patterns

Sensitive requests should be authorized using request context:

  • authenticated customer identity
  • allowed intent types
  • tool-specific scopes
  • short-lived credentials or signed calls where appropriate

This reduces the chance that one compromised component can replay broad access.
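The authorization step can be sketched as a pure function over the request context: the call is allowed only when the authenticated customer, the classified intent, and the tool's scope all line up. The intent names and scope strings are illustrative assumptions:

```python
from dataclasses import dataclass

# Hypothetical session context carried with each sensitive request.
@dataclass(frozen=True)
class RequestContext:
    customer_id: str
    intent: str
    tool_scopes: frozenset

ALLOWED_INTENTS = {"order_status", "recommendation"}
INTENT_SCOPES = {"order_status": "orders:read", "recommendation": "profile:read"}

def authorize(ctx: RequestContext, target_customer_id: str) -> bool:
    """Allow a tool call only when identity, intent, and scope all match."""
    if ctx.intent not in ALLOWED_INTENTS:
        return False
    if ctx.customer_id != target_customer_id:  # no cross-customer reads
        return False
    return INTENT_SCOPES[ctx.intent] in ctx.tool_scopes
```

Because the check depends on the session's own identity and intent rather than a standing role, a compromised component holding one context cannot replay it against another customer's data.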

Acceptance Criteria

  • Sensitive FM paths use private connectivity and approved service endpoints only.
  • IAM roles are scoped by workflow and do not share broad cross-service permissions.
  • Raw and masked datasets are separated by policy and access path.
  • Data access is logged with enough detail to support investigations.
  • New GenAI components cannot be promoted without an access review.

Signals and Metrics

  • percentage of sensitive traffic traversing private endpoints
  • count of IAM roles with wildcard permissions
  • denied-access events by component
  • time to identify which component accessed a sensitive dataset
  • number of services with direct access to raw transcript data
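Two of these signals can be computed directly from an inventory export, which keeps them cheap enough to track per release. The role and flow records below are toy data standing in for real inventory output:

```python
# Toy inventory data: IAM roles with their granted actions, and traffic
# flows tagged by sensitivity and network path. Values are illustrative.
roles = [
    {"name": "chat-orchestrator", "actions": ["bedrock:InvokeModel"]},
    {"name": "legacy-batch", "actions": ["s3:*"]},
]
flows = [
    {"sensitive": True, "via_private_endpoint": True},
    {"sensitive": True, "via_private_endpoint": False},
    {"sensitive": False, "via_private_endpoint": False},
]

# Signal 1: count of IAM roles with wildcard permissions.
wildcard_roles = sum(
    any(a == "*" or a.endswith(":*") for a in r["actions"]) for r in roles
)

# Signal 2: percentage of sensitive traffic traversing private endpoints.
sensitive = [f for f in flows if f["sensitive"]]
private_pct = 100 * sum(f["via_private_endpoint"] for f in sensitive) / len(sensitive)
```

Trending these two numbers release over release is usually more persuasive in a review than a one-time architecture diagram.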

Failure Modes and Tradeoffs

  • Private networking can add operational friction. Mitigation: standardize secure service templates.
  • Least privilege can be bypassed by shared roles. Mitigation: enforce role boundaries through infrastructure review.
  • Monitoring gaps turn secure design into guesswork. Mitigation: require logging for every sensitive access path.

Interview Takeaway

Protected AI environments are about network boundaries, identity boundaries, and data boundaries working together. If any one of those is loose, the FM path is not truly protected.