Beginner Level (0–1 Years)
1. What is the difference between an IAM role and an IAM user?
Answer:
An IAM user represents a person or application with long-term credentials (a console password or access keys). An IAM role has no long-term credentials of its own; it issues temporary credentials to whoever assumes it, such as users, applications, or AWS services. Roles improve security by avoiding long-term credentials and are the standard mechanism for delegated access.
2. Can an EC2 instance in a public subnet access the internet if it doesn’t have an Elastic IP or a Public IP?
Answer:
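No. An EC2 instance in a public subnet needs a public IPv4 address or an Elastic IP for direct internet access, even when the subnet has a route to an Internet Gateway. An instance without a public IP can still reach the internet through a NAT Gateway, but only if its subnet's route table sends internet-bound traffic to that NAT Gateway, which in practice makes the subnet a private one.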
3. What happens if you stop and start an EC2 instance with an instance store-backed root volume?
Answer:
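An instance store-backed instance cannot be stopped; it can only be rebooted or terminated. Data on the root volume survives a reboot but is lost when the instance is terminated or the underlying hardware fails, because instance store volumes are ephemeral. On an EBS-backed instance that also has instance store volumes attached, stopping and starting the instance erases the instance store data.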
4. How does AWS ensure high availability in S3?
Answer:
Amazon S3 stores objects redundantly across multiple Availability Zones within a region (at least three, except for the One Zone storage classes). S3 Standard is designed for 99.999999999% (11 9’s) durability and 99.99% availability. Durability means data is not lost, while availability means data can be accessed when requested.
5. Can you attach an EBS volume to more than one EC2 instance at a time?
Answer:
Typically, no, as EBS volumes are designed for single EC2 instance attachment. However, EBS Multi-Attach (available for io1/io2 volume types) allows attachment to multiple instances in the same Availability Zone, provided the instances and file system support concurrent access.
6. What is the default behavior of a security group for inbound and outbound traffic?
Answer:
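By default, a new security group allows all outbound traffic and no inbound traffic. Security groups contain only allow rules, so you must add rules to permit specific inbound traffic. They are also stateful: responses to allowed traffic are permitted automatically.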
7. What’s the difference between a NAT Gateway and a NAT Instance?
Answer:
A NAT Gateway is a managed AWS service that provides high availability, scalability, and minimal maintenance. A NAT Instance is a manually configured EC2 instance used for routing traffic, requiring more management and offering less scalability.
8. Can Lambda functions run indefinitely if configured correctly?
Answer:
No, AWS Lambda functions have a maximum timeout of 15 minutes and are designed for short-lived tasks. For long-running processes, use EC2 or container services like ECS or EKS.
9. What happens to objects stored in S3 when a bucket is deleted?
Answer:
An S3 bucket cannot be deleted unless it is empty. You must first delete all objects and, if versioning was ever enabled, all object versions and delete markers as well. Versioning can only be suspended, not disabled, and suspending it does not remove existing versions.
10. Can you use RDS with SSH to connect to a database?
Answer:
No, RDS does not support direct SSH access. Connect using the database endpoint and a SQL client. For RDS in a private subnet, use an EC2 instance as a jump box. Publicly accessible RDS instances can be connected directly (though not recommended for security).
11. Why would an EC2 instance fail to access an S3 bucket even if the bucket policy allows it?
Answer:
The instance may lack an IAM role with S3 permissions, or there could be a conflicting bucket policy, ACL, VPC endpoint policy, or network ACL blocking access. Both IAM and network configurations must align.
12. What is the impact of enabling versioning on an S3 bucket?
Answer:
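Enabling versioning keeps every version of an object, which protects against accidental overwrites and deletions: a delete only adds a delete marker, and earlier versions remain retrievable. However, it increases storage costs, since every version is stored and billed as a full object.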
13. How can you securely store secrets (like API keys) in AWS?
Answer:
Use AWS Secrets Manager or AWS Systems Manager Parameter Store (SecureString parameters, encrypted with KMS) to store secrets securely. Avoid hardcoding secrets in code or plaintext environment variables to reduce security risks.
14. Can you run Docker containers on AWS Lambda?
Answer:
Yes, AWS Lambda supports container images up to 10 GB, allowing you to package functions as Docker containers (supported since 2020). This provides flexibility for custom runtimes and dependencies.
15. What’s the difference between spot and on-demand EC2 instances?
Answer:
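On-demand EC2 instances are billed by the hour or second at a fixed rate, with no commitment, and are not interrupted by AWS once running. Spot instances use spare EC2 capacity at a steep discount but can be reclaimed by AWS with a two-minute warning when that capacity is needed elsewhere.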
16. What is an IAM policy, and what does a policy with full S3 access allow?
Answer:
An IAM policy is a JSON document that defines permissions for AWS resources, specifying actions, resources, and effects (Allow or Deny). For example, the following policy grants full access to all S3 buckets and actions:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": "*"
    }
  ]
}
This is highly permissive and a significant security risk, as it allows all S3 operations on all buckets. For production, restrict actions and resources to the minimum required.
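As a rough illustration of restricting actions and resources, the sketch below builds a narrower, read-only policy for a single bucket. The helper name `read_only_bucket_policy` and the bucket name `example-bucket` are hypothetical and only serve the example.

```python
import json


def read_only_bucket_policy(bucket_name: str) -> dict:
    """Build an IAM policy document allowing read-only access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, so the resource is the bucket ARN.
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                # Reading is an object-level action, so the resource is the object ARN pattern.
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


# "example-bucket" is a placeholder name.
policy = read_only_bucket_policy("example-bucket")
print(json.dumps(policy, indent=2))
```

Compared with `s3:*` on `*`, this version names two actions and one bucket, so a leaked credential exposes far less.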
17. What AWS service can be used to automate infrastructure provisioning?
Answer:
AWS CloudFormation enables infrastructure as code using JSON or YAML templates to automate resource provisioning. Alternatives include third-party tools like Terraform.
18. What is an Elastic Load Balancer (ELB) and how does it work in AWS?
Answer:
An Elastic Load Balancer (ELB) is an AWS service that automatically distributes incoming application traffic across multiple EC2 instances, containers, or other targets to improve availability and scalability. It operates at the application layer (Application Load Balancer), transport layer (Network Load Balancer), or gateway level (Gateway Load Balancer). ELB monitors the health of targets and routes traffic only to healthy ones, balancing the load efficiently.
19. What is Amazon Route 53, and what is its primary function?
Answer:
Amazon Route 53 is a scalable Domain Name System (DNS) web service that translates domain names (e.g., example.com) into IP addresses. Its primary function is to route user requests to AWS resources like EC2 instances, S3 buckets, or CloudFront distributions. It also supports domain registration and health checks for routing traffic to healthy resources.
20. How can you use Amazon CloudWatch to monitor an EC2 instance?
Answer:
Amazon CloudWatch collects and tracks metrics, logs, and events from EC2 instances, such as CPU utilization, disk I/O, and network traffic. You can enable detailed monitoring for more granular data, set alarms to trigger notifications or actions (e.g., rebooting an instance) when thresholds are breached, and view logs in CloudWatch Logs for troubleshooting.
21. Does deleting a CloudWatch log group delete its stored logs immediately?
Answer:
Yes, deleting a CloudWatch log group permanently deletes all its log streams and data immediately, with no recovery option.
22. If an EC2 instance is terminated, what happens to its attached EBS volumes?
Answer:
If the “Delete on Termination” flag is enabled (default for root volumes), the EBS volume is deleted. If disabled, the volume persists as a standalone resource.
23. How does a VPC differ from a traditional network?
Answer:
A VPC is a virtual network in AWS where you can define subnets, route tables, and gateways programmatically. Unlike traditional networks, it’s fully software-defined and integrates seamlessly with AWS services.
24. Can you increase the size of an EBS volume without stopping the EC2 instance?
Answer:
Yes, you can increase an EBS volume’s size while it’s attached to a running EC2 instance. Afterward, the operating system must extend the file system to use the additional space.
25. What is the purpose of an AWS Availability Zone?
Answer:
An AWS Availability Zone is an isolated location within a region, consisting of one or more data centers with independent power, cooling, and networking. It enhances high availability and fault tolerance by allowing resources like EC2 instances or RDS databases to be deployed across multiple zones within a region.
Intermediate Level (1–3 Years)
1. What are the key differences between AWS CloudFormation and Terraform, and why might you choose one over the other?
Answer:
CloudFormation is AWS-native, tightly integrated with AWS services, and supports drift detection. Terraform is provider-agnostic, has a more flexible syntax (HCL), better modular support, and works well in multi-cloud environments. Choose CloudFormation for AWS-only projects with deep service integration; choose Terraform for multi-cloud or more complex dependency management.
2. How does AWS handle eventual consistency in services like S3 and DynamoDB?
Answer:
Since December 2020, S3 provides strong read-after-write consistency for all operations, including PUTs, overwrites, and DELETEs, across all regions. DynamoDB reads are eventually consistent by default for lower latency and cost; strongly consistent reads can be requested per operation (the ConsistentRead parameter) and consume twice the read capacity.
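The cost difference between the two DynamoDB read modes can be worked out with simple arithmetic. The sketch below assumes the standard metering rule (reads are counted in 4 KB blocks, and an eventually consistent read costs half as much as a strongly consistent one); the function name is hypothetical.

```python
import math


def read_capacity_units(item_size_bytes: int, strongly_consistent: bool) -> float:
    """Capacity units consumed by one read of an item of the given size."""
    # Reads are metered in 4 KB blocks, rounded up.
    blocks = math.ceil(item_size_bytes / 4096)
    # An eventually consistent read costs half of a strongly consistent one.
    return blocks * (1.0 if strongly_consistent else 0.5)


item_size = 6 * 1024  # a 6 KB item spans two 4 KB blocks
strong_cost = read_capacity_units(item_size, strongly_consistent=True)
eventual_cost = read_capacity_units(item_size, strongly_consistent=False)
print(strong_cost, eventual_cost)
```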
3. Why might you prefer an Application Load Balancer (ALB) over a Classic Load Balancer (CLB)?
Answer:
ALBs support layer 7 features like path-based routing, host-based routing, WebSocket support, and better metrics. CLBs are legacy and limited to basic layer 4/7 routing. ALBs are more efficient for modern microservices and containerized applications.
4. How do you troubleshoot a Lambda timeout error when everything appears correct in the code?
Answer:
Check for slow downstream services (e.g., RDS, S3 latency), increase timeout settings if justified, review cold start delays (especially with large deployment packages), and inspect VPC configuration (misconfigured subnets/NAT can block internet access).
5. What’s the difference between horizontal and vertical scaling in AWS, and which is preferable?
Answer:
Horizontal scaling adds more instances (e.g., via Auto Scaling Groups), while vertical scaling increases the size of a single instance. Horizontal is preferred for high availability and fault tolerance. Vertical is simpler but has upper limits and single points of failure.
6. What happens if you accidentally delete a CloudFormation stack managing live infrastructure?
Answer:
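By default, all resources in the stack are deleted. To guard against this, enable termination protection on the stack and set DeletionPolicy: Retain (or Snapshot, where supported) on critical resources. Stack policies complement this by protecting resources from unintended updates, but they do not prevent stack deletion.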
7. In a private subnet, how can resources access the internet securely?
Answer:
Through a NAT Gateway or NAT Instance in a public subnet. This allows outbound internet access while still blocking unsolicited inbound traffic. Ensure route tables are configured correctly.
8. How does AWS KMS protect encryption keys and support secure key rotation?
Answer:
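KMS keeps key material in hardware security modules (HSMs) validated under FIPS 140 and never exposes it in plaintext. Data is typically protected with envelope encryption, where a KMS key encrypts the data keys that encrypt your data. KMS supports automatic rotation of customer managed keys (yearly by default) as well as manual rotation. Key policies and IAM control access, and API calls are logged in CloudTrail.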
9. How can you secure access to an S3 bucket from an EC2 instance without embedding credentials?
Answer:
Attach an IAM role to the EC2 instance with the necessary S3 permissions. AWS automatically injects temporary credentials into the instance via the Instance Metadata Service (IMDSv2) at http://169.254.169.254, using a token-based approach for enhanced security.
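The token-based exchange can be sketched as two HTTP requests. The example below only builds the requests with the standard library so it can run anywhere; the function names, the token value, and the role name `example-role` are placeholders. In practice the AWS SDKs perform these steps for you.

```python
import urllib.request

IMDS_BASE = "http://169.254.169.254/latest"


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """Step 1: a PUT request that asks IMDSv2 for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_credentials_request(token: str, role_name: str) -> urllib.request.Request:
    """Step 2: a GET request that presents the token to read role credentials."""
    return urllib.request.Request(
        f"{IMDS_BASE}/meta-data/iam/security-credentials/{role_name}",
        headers={"X-aws-ec2-metadata-token": token},
    )


token_request = build_token_request()
# "example-token" and "example-role" are placeholder values.
credentials_request = build_credentials_request("example-token", "example-role")
print(token_request.get_method(), token_request.full_url)
print(credentials_request.get_method(), credentials_request.full_url)
```

On an actual instance, each request would be sent with `urllib.request.urlopen(request, timeout=2)`; the first response body is the token, and the second is a JSON document containing the temporary access key, secret key, and session token.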
10. What are the implications of enabling cross-zone load balancing in an AWS ELB?
Answer:
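With cross-zone load balancing, each load balancer node distributes traffic evenly across all registered targets in every enabled AZ. Without it, each node sends traffic only to targets in its own AZ, so when AZs hold different numbers of targets, targets in the smaller AZ each receive a larger share. Cross-zone balancing evens out the load but can incur inter-AZ data transfer charges on Network and Gateway Load Balancers; on Application Load Balancers it is enabled by default at no extra charge.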
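A small calculation makes the effect concrete. This is a simplified model, assuming client traffic is split evenly across the load balancer nodes (one per AZ) and every AZ has at least one target; the AZ names and target counts are made up.

```python
def per_target_share(targets_per_az: dict[str, int], cross_zone: bool) -> dict[str, float]:
    """Return the fraction of total traffic that a single target in each AZ receives."""
    if cross_zone:
        # Every node spreads requests over all targets in all AZs.
        total_targets = sum(targets_per_az.values())
        return {az: 1 / total_targets for az in targets_per_az}
    # Each node receives an equal share and splits it among its own AZ's targets only.
    az_share = 1 / len(targets_per_az)
    return {az: az_share / count for az, count in targets_per_az.items()}


targets = {"az-a": 2, "az-b": 8}  # hypothetical, unevenly sized AZs
without_cross_zone = per_target_share(targets, cross_zone=False)
with_cross_zone = per_target_share(targets, cross_zone=True)
print(without_cross_zone)
print(with_cross_zone)
```

In this model, each of the two targets in the smaller AZ carries four times the load of a target in the larger AZ until cross-zone balancing is turned on.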
11. How can you enforce MFA (Multi-Factor Authentication) for IAM users accessing the AWS Console?
Answer:
Create an IAM policy that denies all AWS actions unless MFA is present, and attach it to users or groups. For example:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Deny",
      "Action": "*",
      "Resource": "*",
      "Condition": {
        "BoolIfExists": {
          "aws:MultiFactorAuthPresent": "false"
        }
      }
    }
  ]
}
Combine with a separate “Allow” policy for specific actions to avoid unintended denials.
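The choice of BoolIfExists over Bool matters because the aws:MultiFactorAuthPresent key is absent from some requests, such as those signed with long-term access keys. The sketch below imitates how the condition evaluates in each case; it is a toy model of the operator, not AWS's policy engine, and the function name is hypothetical.

```python
from typing import Optional


def deny_statement_matches(mfa_present: Optional[bool]) -> bool:
    """Mimic BoolIfExists for {"aws:MultiFactorAuthPresent": "false"}.

    mfa_present is None when the key is missing from the request context.
    """
    if mfa_present is None:
        # "...IfExists" operators evaluate to true when the key is absent,
        # so requests without any MFA context are denied too.
        return True
    # Otherwise the condition matches only when the value is false.
    return mfa_present is False


for context in (True, False, None):
    print(context, "-> denied" if deny_statement_matches(context) else "-> not denied")
```

With plain Bool, the missing-key case would not match, and requests made with long-term access keys would slip past the deny.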
12. What is the purpose of a dead-letter queue (DLQ) in Amazon SQS or Lambda?
Answer:
A DLQ stores messages that can’t be processed successfully after a specified number of retries. It allows debugging failed events without losing them and prevents infinite retry loops.
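The retry-then-park behavior can be simulated in a few lines. This is an in-memory sketch rather than SQS itself: `max_receive_count` plays the role of the queue's maxReceiveCount redrive setting, and the handler and message values are invented for the example.

```python
from collections import deque


def process_with_dlq(messages, handler, max_receive_count=3):
    """Retry each message up to max_receive_count times, then move it to a DLQ."""
    queue = deque((message, 0) for message in messages)
    processed, dead_letters = [], []
    while queue:
        message, receive_count = queue.popleft()
        receive_count += 1
        try:
            handler(message)
            processed.append(message)
        except Exception:
            if receive_count >= max_receive_count:
                # Retries exhausted: park the message for later inspection.
                dead_letters.append(message)
            else:
                # Put the message back so it is retried later.
                queue.append((message, receive_count))
    return processed, dead_letters


def handler(message):
    # Hypothetical consumer that always fails on a malformed message.
    if message == "bad":
        raise ValueError("cannot parse message")


processed, dead_letters = process_with_dlq(["ok-1", "bad", "ok-2"], handler)
print(processed, dead_letters)
```

The failing message is retried a bounded number of times and then set aside, so the healthy messages are not blocked behind it.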
13. Why might your EC2 instances lose access to the internet even though they have public IPs?
Answer:
This could be due to missing or incorrect route table entries, network ACLs blocking outbound access, or the Internet Gateway not being attached to the VPC. Security groups should also allow outbound traffic.
14. How does AWS ensure data durability in EBS volumes?
Answer:
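EBS volumes are automatically replicated within their Availability Zone to protect against hardware failures. Most volume types are designed for 99.8% to 99.9% durability, while io2 volumes are designed for 99.999%. For protection beyond a single AZ, take snapshots, which are stored in S3 (designed for 11 9’s of durability), or create AMIs backed by them.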
15. What is the difference between an alias and a version in AWS Lambda?
Answer:
A version is a snapshot of a Lambda function’s code and configuration. An alias is a pointer to a version, useful for managing environments like dev/stage/prod. Aliases let you route traffic between versions.
16. How can you restrict a specific IAM user to access only a particular DynamoDB table?
Answer:
Create an IAM policy that limits specific actions to the table’s ARN, following the principle of least privilege:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:UpdateItem",
        "dynamodb:DeleteItem"
      ],
      "Resource": "arn:aws:dynamodb:region:account-id:table/TableName"
    }
  ]
}
17. What’s the difference between CloudWatch metrics and CloudWatch logs?
Answer:
Metrics are numeric time series data used for monitoring and alarms (e.g., CPUUtilization), while logs are raw text data generated by applications or services. Logs can be used to generate custom metrics.
18. How does Amazon RDS handle automatic failover?
Answer:
In a Multi-AZ setup, RDS automatically fails over to the standby instance in case of hardware, network, or storage failure. The DNS endpoint is updated to point to the standby instance.
19. How can you ensure that your Lambda function can access both the internet and private resources in a VPC?
Answer:
Place the Lambda in private subnets with NAT Gateway access for internet connectivity. Ensure VPC settings include correct route tables and security groups for both private and public traffic.
20. What’s the effect of enabling detailed monitoring on EC2 instances?
Answer:
Detailed monitoring provides 1-minute metrics instead of the default 5-minute intervals. This is useful for faster scaling and better visibility, but it incurs additional cost.
21. What’s the purpose of VPC peering, and what are its limitations?
Answer:
VPC peering allows private communication between two VPCs using AWS’ internal network. However, it does not support transitive peering, overlapping CIDR blocks, or automatic DNS resolution across VPCs unless configured explicitly.
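The overlapping-CIDR restriction is easy to check before requesting a peering connection. A minimal sketch using Python's standard ipaddress module follows; the function name and the CIDR ranges are illustrative.

```python
import ipaddress


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """Return True when two VPC CIDR blocks do not overlap."""
    net_a = ipaddress.ip_network(cidr_a)
    net_b = ipaddress.ip_network(cidr_b)
    return not net_a.overlaps(net_b)


# Hypothetical VPC ranges.
print(can_peer("10.0.0.0/16", "10.1.0.0/16"))    # disjoint ranges
print(can_peer("10.0.0.0/16", "10.0.128.0/17"))  # second range sits inside the first
```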
22. How can you reduce cold start times in AWS Lambda?
Answer:
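Use smaller deployment packages, keep initialization code lean (for example, lazy-load heavy libraries that only some invocations need), and configure provisioned concurrency for latency-sensitive functions. Attach Lambda to a VPC only when necessary; the improved VPC networking introduced in 2019 removed most of the former VPC cold start penalty.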
23. When would you choose Amazon Aurora over standard RDS engines like MySQL or PostgreSQL?
Answer:
Aurora offers better performance (up to 5x for MySQL, 3x for PostgreSQL), auto-scaling storage, higher availability, and automatic failover. It’s a good choice for high-throughput, mission-critical applications.
24. Can a Lambda function access another Lambda’s environment variables directly?
Answer:
No. Lambda environment variables are isolated per function. You must share data via secure storage like AWS Systems Manager Parameter Store or Secrets Manager.
25. How can you allow an EC2 instance in one VPC to access resources in another VPC without using the internet?
Answer:
Use VPC peering, AWS Transit Gateway, or a VPN connection between VPCs. Ensure appropriate route tables and security groups are configured to allow traffic.
26. What is an Elastic IP, and when should you avoid using it?
Answer:
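An Elastic IP is a static public IPv4 address that you allocate to your account and can remap between instances. Avoid it unless you truly need a fixed address; for scalable workloads, put instances behind a load balancer or rely on DNS names instead. Since February 2024, AWS charges for all public IPv4 addresses, including Elastic IPs, whether or not they are attached.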
27. What AWS service would you use to manage containerized microservices and why?
Answer:
Use Amazon ECS or EKS. ECS is simpler for AWS-only setups; EKS is more powerful for Kubernetes users. Both integrate with Fargate for serverless containers.
28. How does AWS Lambda pricing work?
Answer:
Lambda is billed based on number of invocations and duration (in GB-seconds). You also pay for provisioned concurrency if used. The first 1M invocations and 400,000 GB-seconds per month are free.
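The GB-second arithmetic can be worked through as below. The free-tier figures come from the answer above; the two price arguments are placeholder rates for illustration and should be replaced with the current published prices for your region and architecture. The function name and workload numbers are hypothetical.

```python
FREE_REQUESTS = 1_000_000
FREE_GB_SECONDS = 400_000


def monthly_lambda_cost(invocations, avg_duration_ms, memory_mb,
                        price_per_million_requests, price_per_gb_second):
    """Return (gb_seconds, estimated_cost) for one month of invocations."""
    # GB-seconds = invocations x duration in seconds x memory in GB.
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    billable_requests = max(0, invocations - FREE_REQUESTS)
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    request_cost = billable_requests / 1_000_000 * price_per_million_requests
    compute_cost = billable_gb_seconds * price_per_gb_second
    return gb_seconds, request_cost + compute_cost


# Hypothetical workload: 5M invocations, 200 ms each, 512 MB of memory.
gb_seconds, cost = monthly_lambda_cost(
    invocations=5_000_000,
    avg_duration_ms=200,
    memory_mb=512,
    price_per_million_requests=0.20,    # placeholder rate
    price_per_gb_second=0.0000166667,   # placeholder rate
)
print(f"{gb_seconds:,.0f} GB-seconds, estimated cost ${cost:.2f}")
```

Provisioned concurrency, if used, is billed separately and is not modeled here.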
29. What are placement groups in EC2 and when would you use them?
Answer:
Placement groups determine how instances are placed on underlying hardware. Use “cluster” for low-latency/high throughput, “spread” for fault tolerance, and “partition” for large-scale distributed apps.
30. Can you use CloudFormation to create non-AWS resources?
Answer:
Yes, using AWS CloudFormation custom resources or resource providers, which rely on Lambda functions or external HTTP endpoints. This adds complexity and potential latency but allows integration with third-party APIs or on-premises services.
31. What is AWS Global Accelerator, and how does it differ from CloudFront?
Answer:
Global Accelerator uses AWS’ global network to route TCP/UDP traffic to the optimal endpoint based on health and latency. Unlike CloudFront, which caches content, Global Accelerator improves performance for non-HTTP applications like gaming or VoIP.
32. How do lifecycle policies in S3 help with cost optimization?
Answer:
Lifecycle policies automatically transition objects to cheaper storage classes (like S3 Glacier or Intelligent-Tiering) or delete them after a period, reducing long-term storage costs.
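A lifecycle rule is just structured configuration. The sketch below builds one as a Python dictionary in the shape the S3 lifecycle API accepts (for example, as the LifecycleConfiguration argument of boto3's put_bucket_lifecycle_configuration). The helper name, rule ID, prefix, and day counts are illustrative, and nothing is sent to AWS.

```python
def build_lifecycle_configuration(prefix: str, ia_after_days: int,
                                  glacier_after_days: int,
                                  expire_after_days: int) -> dict:
    """Build a rule that tiers objects down over time and finally deletes them."""
    # Transitions must move forward in time; Standard-IA requires at least 30 days.
    if not (30 <= ia_after_days < glacier_after_days < expire_after_days):
        raise ValueError("expected 30 <= ia < glacier < expire (in days)")
    return {
        "Rules": [
            {
                "ID": f"tier-and-expire-{prefix.strip('/')}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


# Hypothetical policy for objects under "logs/".
lifecycle = build_lifecycle_configuration("logs/", 30, 90, 365)
print(lifecycle)
```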
33. What are the implications of enabling public read access on an S3 bucket?
Answer:
It exposes data to the internet and creates a security risk. AWS now blocks public access by default. To allow it, you must override the public access block settings and apply a permissive bucket policy.
34. How does AWS Step Functions improve serverless application design?
Answer:
Step Functions coordinates multiple AWS services in workflows defined as visual state machines. It handles retries, failures, and parallel execution, and simplifies orchestration without requiring complex custom code.
35. What’s the difference between AWS Config and CloudTrail?
Answer:
CloudTrail logs API calls and user activity across AWS. AWS Config tracks resource configurations over time and can evaluate compliance. Use Config for audit/state change tracking, and CloudTrail for action tracing.
36. How do you avoid circular dependencies in CloudFormation templates?
Answer:
Break templates into nested stacks or separate stacks using exports/imports. Avoid hardcoding resource references that depend on each other. Use DependsOn only when necessary.
37. What is the benefit of using AWS Organizations with Service Control Policies (SCPs)?
Answer:
SCPs allow central governance by enforcing permission boundaries across accounts in an organization. Even if a user has admin permissions in an account, SCPs can restrict access to certain services.
38. How can you enable high availability for applications deployed on EC2?
Answer:
Deploy instances across multiple AZs using an Auto Scaling Group behind a Load Balancer. Use health checks, EBS snapshots, and stateless design principles. Consider Multi-AZ RDS and S3 for shared storage.
39. What are AWS EventBridge rules used for?
Answer:
EventBridge rules filter and route events from AWS services, custom apps, or SaaS providers to targets like Lambda, SQS, Step Functions, or EC2. It’s useful for loosely-coupled, event-driven architectures.
40. How do AWS WAF and Shield differ?
Answer:
AWS WAF protects web apps from common exploits (e.g., SQL injection, XSS). Shield provides DDoS protection. Shield Standard is free and automatic; Shield Advanced offers enhanced protection and response.
41. What is the difference between a customer-managed CMK and an AWS-managed CMK in KMS?
Answer:
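AWS-managed keys are created and managed automatically by AWS services on your behalf; you cannot change their key policies or rotation schedule. Customer-managed keys (formerly called CMKs) offer more control, including key policies, rotation settings, grants, and the ability to disable or delete the key. Choose customer-managed keys when you need tighter security or compliance control.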
42. How can you ensure an RDS snapshot is encrypted?
Answer:
Snapshots of encrypted RDS instances are automatically encrypted. An unencrypted snapshot cannot be encrypted in place, but you can copy it with encryption enabled (specifying a KMS key) and then restore a new, encrypted instance from the encrypted copy.
43. What happens if your Lambda function exceeds the memory you allocated?
Answer:
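Lambda kills the execution environment and the invocation fails, typically with a Runtime.ExitError (for example, "signal: killed"); Java functions may instead surface an OutOfMemoryError. Monitor the Max Memory Used value reported in CloudWatch Logs and adjust memory settings accordingly.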
44. How can you enforce S3 data encryption without modifying applications?
Answer:
Apply a bucket policy that denies uploads lacking encryption headers (e.g., x-amz-server-side-encryption). For example:
{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Deny",
    "Principal": "*",
    "Action": "s3:PutObject",
    "Resource": "arn:aws:s3:::bucket-name/*",
    "Condition": {
      "StringNotEquals": {
        "s3:x-amz-server-side-encryption": "AES256"
      }
    }
  }
}
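This denies any PutObject request that does not explicitly ask for SSE-S3 (AES256), regardless of the source, including requests that omit the header or that request SSE-KMS. Since January 2023, S3 also encrypts all new objects with SSE-S3 by default, so applications that send no header still get encryption at rest; a policy like this mainly serves to enforce one specific encryption method.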
45. What’s the difference between an S3 presigned URL and a signed cookie?
Answer:
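A presigned URL grants temporary access to a specific S3 object. Signed cookies are a CloudFront feature rather than an S3 one: they grant access to multiple objects served through a CloudFront distribution (e.g., media streams) without changing the URLs. Use presigned URLs for APIs or limited downloads, and signed cookies for web app integration.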
46. How do you minimize downtime during a blue/green deployment in Elastic Beanstalk?
Answer:
Deploy the new version to a separate environment, test it, then swap CNAMEs to redirect traffic. This provides zero-downtime deployment with easy rollback.
47. Can you use Amazon Route 53 to route traffic based on the user’s device type?
Answer:
No. Route 53 supports routing based on geography, latency, health, etc., but not device type. Device-based routing must be handled at the application layer (e.g., user-agent inspection).
48. What are the key differences between Amazon MQ and Amazon SQS?
Answer:
Amazon MQ is a managed broker for protocols like AMQP, MQTT, and STOMP—ideal for migrating legacy apps. SQS is a fully managed message queue supporting only AWS APIs. SQS scales better but is less flexible protocol-wise.
49. How can you track changes made to AWS infrastructure over time?
Answer:
Use AWS Config to capture point-in-time snapshots and track configuration history. Combine with CloudTrail for auditing who made what changes.
50. Why might using the root AWS account for daily operations be dangerous?
Answer:
The root account has unrestricted access and cannot be restricted via IAM. Using it regularly increases risk of accidental deletion or compromise. It should be locked down with MFA and used only for critical account-level tasks.

Advanced Level (3+ Years)
1. How would you design a multi-region, active-active architecture using AWS services?
Answer:
Use Route 53 latency-based or geolocation routing to direct users to the nearest region. Deploy services (e.g., API Gateway, Lambda, ECS) in multiple regions. Use DynamoDB Global Tables for replication (noting they provide eventual consistency, requiring application-level conflict resolution for strong consistency). Employ S3 cross-region replication and configure health checks for failover. Monitor cross-region data transfer costs to optimize expenses.
2. What are the trade-offs between using Global Accelerator vs Route 53 for global traffic routing?
Answer:
Global Accelerator provides lower latency and static IP addresses by leveraging AWS’s global edge network and TCP/UDP protocol support. Route 53 offers DNS-based routing policies but is subject to DNS caching. Choose Global Accelerator for real-time apps or IP whitelisting; Route 53 for cost-effective routing and flexibility.
3. How do you secure and rotate credentials for apps running on containers in ECS?
Answer:
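Use IAM roles for tasks (ECS task roles) to provide short-lived credentials. The ECS agent exposes them to containers through the task credentials endpoint (referenced by the AWS_CONTAINER_CREDENTIALS_RELATIVE_URI environment variable) rather than the EC2 Instance Metadata Service, and the AWS SDKs pick them up automatically. For additional secrets, reference AWS Secrets Manager or SSM Parameter Store values in the task definition, or fetch them at runtime using SDKs or sidecars. Task role credentials are rotated automatically by AWS, while Secrets Manager supports automated rotation for stored secrets.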
4. How does VPC sharing work, and what are the implications for network management?
Answer:
VPC sharing allows multiple AWS accounts within an AWS Organization to share subnets in a centrally managed VPC, using AWS Resource Access Manager (RAM). Participant accounts can deploy resources into shared subnets but cannot modify core networking components like route tables or gateways. This supports separation of duties but requires strict governance and tagging.
5. How would you implement zero-trust networking on AWS?
Answer:
Enforce authentication and authorization at every layer. Use IAM policies, security groups, NACLs, PrivateLink, and service-level TLS. Combine with AWS Verified Access, identity providers (via Cognito or IAM Identity Center), and network segmentation to validate requests contextually.
6. How can you automate cross-account resource provisioning using CloudFormation?
Answer:
Use StackSets with service-managed or self-managed permissions. Grant target accounts access via IAM roles. StackSets can deploy templates across accounts and regions, and manage drift detection and updates centrally.
7. How would you troubleshoot intermittent 5xx errors in a high-traffic application behind an ALB?
Answer:
Analyze ALB access logs to identify patterns or failing targets. Check target health checks, memory/CPU pressure, connection timeouts, or bad deploys. Use CloudWatch metrics and X-Ray traces to correlate spikes with backend issues.
8. Describe a secure strategy to share an S3 bucket between two AWS accounts with minimal permissions.
Answer:
Use a bucket policy allowing the second account’s IAM role access to specific prefixes. For example:
{
  "Version": "2012-10-17",
  "Statement": {
    "Effect": "Allow",
    "Principal": {
      "AWS": "arn:aws:iam::account-id:role/role-name"
    },
    "Action": ["s3:GetObject", "s3:PutObject"],
    "Resource": "arn:aws:s3:::bucket-name/prefix/*",
    "Condition": {
      "StringEquals": {
        "aws:PrincipalArn": "arn:aws:iam::account-id:role/role-name"
      }
    }
  }
}
Enable logging and MFA delete for extra auditability and safety.
9. What are some challenges and best practices in designing event-driven architectures on AWS?
Answer:
Challenges include at-least-once delivery, ordering, idempotency, and failure handling. Best practices: use EventBridge or SNS for fan-out, SQS with DLQs for retry isolation, correlate events with unique IDs, and apply chaos testing to flows.
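Idempotency under at-least-once delivery usually comes down to remembering which event IDs have already been applied. The sketch below shows the idea with an in-memory set; a real consumer would need a durable store (for instance, a conditional write to a database), and the class, field names, and events are invented for illustration.

```python
class IdempotentConsumer:
    """Apply each event at most once, even if it is delivered several times."""

    def __init__(self):
        self._seen_ids = set()  # stand-in for a durable deduplication store
        self.balance = 0

    def handle(self, event: dict) -> bool:
        """Return True if the event was applied, False if it was a duplicate."""
        event_id = event["id"]
        if event_id in self._seen_ids:
            return False  # duplicate delivery: safely ignored
        self.balance += event["amount"]
        self._seen_ids.add(event_id)
        return True


consumer = IdempotentConsumer()
deliveries = [
    {"id": "evt-1", "amount": 50},
    {"id": "evt-2", "amount": 25},
    {"id": "evt-1", "amount": 50},  # redelivered by the broker
]
results = [consumer.handle(event) for event in deliveries]
print(results, consumer.balance)
```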
10. How does Control Tower help enforce governance in a multi-account AWS environment?
Answer:
Control Tower automates landing zone setup with account baselining, guardrails (SCPs), AWS Config rules, logging via centralized CloudTrail, and account vending. It ensures compliance and best practices across new and existing accounts.
11. How would you implement blue/green deployments for a stateful application on EC2?
Answer:
Deploy two Auto Scaling Groups (blue and green) behind an ALB. Use Route 53 or ALB target groups to shift traffic between them. For stateful data, use external shared storage (like RDS or EFS) to avoid losing data during switchover. Ensure DNS TTLs are low to support quick rollback.
12. What is the role of AWS Service Catalog in large enterprises?
Answer:
AWS Service Catalog enables centralized management of approved resources, including infrastructure-as-code templates. It restricts unauthorized modifications, supports cost control, and ensures compliance through curated service portfolios.
13. How would you mitigate noisy neighbor issues in multi-tenant EC2 environments?
Answer:
Use EC2 Dedicated Hosts or Dedicated Instances to isolate tenants. Apply resource limits via cgroups in containers, and monitor CPU credits (T-series). For consistent performance, choose instance types with dedicated hardware and use placement groups.
14. How does AWS Nitro Enclaves enhance security?
Answer:
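Nitro Enclaves create isolated compute environments carved out of a parent EC2 instance for secure data processing. An enclave has no persistent storage, no external networking, and no interactive access; it communicates with the parent instance only over a local vsock channel and can prove its identity through cryptographic attestation. This makes enclaves well suited to sensitive workloads like cryptographic operations.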
15. How would you handle the migration of a petabyte-scale on-premises data warehouse to Redshift?
Answer:
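Use AWS Snowball or Direct Connect for bulk transfer. Use the AWS Schema Conversion Tool (SCT) to convert the schema and AWS Database Migration Service (DMS) for data loading and ongoing replication. Optimize table design for columnar storage, choosing distribution and sort keys carefully. Load staged data from S3 with parallel COPY commands, and validate the results, using Redshift Spectrum to query data still in S3 where helpful.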
16. How would you secure cross-region API communication between microservices?
Answer:
Use private APIs with VPC endpoints, or authenticate via signed requests using IAM roles. For public APIs, use mutual TLS, WAF, and API Gateway with throttling and authorizers. Optionally, leverage AWS PrivateLink across regions with Transit Gateway attachments.
17. How would you respond to a security incident involving compromised IAM credentials?
Answer:
Immediately revoke credentials (delete keys, disable user). Use CloudTrail and AWS Config to assess the scope and configuration changes. Rotate secrets, enforce MFA, and use SCPs to isolate impact. Engage GuardDuty and Security Hub for forensic analysis, and apply IAM Access Analyzer for scope validation. Use Systems Manager for automated remediation.
18. How does Aurora Serverless v2 differ from Aurora Serverless v1?
Answer:
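Aurora Serverless v2 scales capacity in place, typically within a fraction of a second, without the disruptive scaling points of v1. It still measures capacity in ACUs (Aurora Capacity Units) but adjusts in increments as small as 0.5 ACU, whereas v1 scales by doubling or halving capacity. It also supports Multi-AZ deployments and global databases, and behaves more like provisioned Aurora with pay-per-use pricing.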
19. How would you reduce latency for a global user base using DynamoDB?
Answer:
Use DynamoDB Global Tables to replicate data across regions. Combine with Route 53 latency-based routing to direct users to the nearest region. Ensure applications handle eventual consistency and conflict resolution.
20. Describe a scenario where using Lambda would be a bad choice despite being serverless.
Answer:
For long-running or CPU-intensive tasks (e.g., video processing >15 minutes), Lambda is unsuitable due to timeout limits and cost. Also, cold start latency or VPC networking constraints may degrade performance. Consider ECS Fargate or EC2 instead.
21. How would you design a secure CI/CD pipeline in AWS?
Answer:
Use CodePipeline with CodeBuild and CodeDeploy. Store code in CodeCommit or GitHub with webhooks. Use IAM roles with least privilege, KMS to encrypt artifacts, and Secrets Manager for credentials. Enforce approval stages, static code analysis, and sign artifacts with AWS Signer.
22. What is the benefit of using AWS Macie in a compliance-focused environment?
Answer:
Macie automatically discovers and classifies sensitive data in S3 (like PII, PHI, and credentials). It helps meet GDPR, HIPAA, and other regulations by flagging policy violations and enabling audit trails.
23. How can you ensure DNS resolution across multiple VPCs?
Answer:
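Enable DNS hostnames and DNS resolution in each VPC's settings, and associate shared Route 53 private hosted zones with every VPC that needs to resolve them. For VPC peering, enable DNS resolution on the peering connection so that public DNS hostnames resolve to private IP addresses across the peer. For larger or hybrid topologies, use Route 53 Resolver inbound and outbound endpoints in a shared-services VPC reached through AWS Transit Gateway, and share Resolver forwarding rules across accounts with AWS Resource Access Manager.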
24. What’s the difference between inline and managed IAM policies, and when should you use each?
Answer:
Inline policies are embedded directly into a single user, group, or role. Managed policies are standalone and reusable. Use inline for tight, one-off permissions; use managed for standardization and auditability across multiple entities.
25. How would you identify and reduce unnecessary costs in an enterprise AWS account?
Answer:
Use AWS Cost Explorer and AWS Budgets to monitor spend. Identify idle resources (EBS, Elastic IPs), underutilized EC2 instances (via Trusted Advisor), orphaned snapshots, and unnecessary NAT Gateway usage. Employ Savings Plans and Reserved Instances where predictable.
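As a rough illustration of the idle-resource check, the sketch below filters volume records shaped like EC2 DescribeVolumes output for volumes in the available state (not attached to any instance). The sample data and the function name find_unattached_volumes are hypothetical; in practice the records would come from the API rather than a hard-coded list.

```python
def find_unattached_volumes(volumes):
    """Return (volume_id, size_gib) for volumes not attached to any instance."""
    return [
        (v["VolumeId"], v["Size"])
        for v in volumes
        if v.get("State") == "available" and not v.get("Attachments")
    ]

# Hypothetical sample shaped like DescribeVolumes output.
sample_volumes = [
    {"VolumeId": "vol-0aaa", "Size": 100, "State": "in-use",
     "Attachments": [{"InstanceId": "i-0123"}]},
    {"VolumeId": "vol-0bbb", "Size": 500, "State": "available",
     "Attachments": []},
]

idle = find_unattached_volumes(sample_volumes)
print(idle)
```

A list like this is only a starting point: an unattached volume may still be needed, so snapshot or tag-check it before deleting.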
26. How do Amazon S3 Object Lock and MFA Delete differ?
Answer:
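Object Lock prevents object versions from being deleted or overwritten for a specified retention period (or under a legal hold), using governance or compliance mode. MFA Delete requires an MFA code to permanently delete versions or change the bucket's versioning state, and can only be enabled by the bucket owner's root user through the CLI or API. Object Lock offers stronger compliance guarantees, since compliance-mode retention cannot be shortened or removed by any user.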
27. How do you enforce compliance for tagging AWS resources across accounts?
Answer:
Use Service Control Policies (SCPs) to deny resource creation without required tags. Implement tag policies in AWS Organizations to standardize tag keys and values. Use AWS Config rules to detect noncompliance and trigger remediation with Systems Manager Automation.
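One way to express the "deny creation without required tags" idea is an SCP that denies a create action when the request carries no value for the tag. The sketch below builds such a policy as a Python dict; the helper name require_tag_scp, the CostCenter tag, and the choice of ec2:RunInstances are illustrative assumptions, not a complete tagging guardrail.

```python
import json

def require_tag_scp(tag_key, actions, resources):
    """Build an SCP that denies the actions when the request lacks tag_key."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyCreateWithoutTag",
            "Effect": "Deny",
            "Action": actions,
            "Resource": resources,
            # "Null": "true" matches when the tag is absent from the request.
            "Condition": {"Null": {f"aws:RequestTag/{tag_key}": "true"}},
        }],
    }

# Hypothetical example: require a CostCenter tag when launching instances.
scp = require_tag_scp(
    "CostCenter",
    ["ec2:RunInstances"],
    ["arn:aws:ec2:*:*:instance/*"],
)
print(json.dumps(scp, indent=2))
```

Not every service supports aws:RequestTag on every create action, so a policy like this usually needs to be paired with the Config rules mentioned above.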
28. How would you prevent privilege escalation in a multi-user AWS environment?
Answer:
Avoid wildcard IAM permissions (e.g., "Action": "*"). Use IAM Access Analyzer to detect risky policies. Implement least privilege, conditional access (e.g., using tags), and disable inline policy editing for lower-trust users. Apply SCPs in AWS Organizations to enforce account-level boundaries.
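A quick local check for wildcard permissions can complement IAM Access Analyzer. The following is a minimal sketch, assuming the policy is already loaded as a Python dict; the function name find_wildcard_statements and the sample policy are hypothetical, and the check ignores NotAction and resource wildcards.

```python
def find_wildcard_statements(policy):
    """Return the Sids of Allow statements whose actions use '*' or 'service:*'."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    risky = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if any(a == "*" or a.endswith(":*") for a in actions):
            risky.append(stmt.get("Sid", "<no Sid>"))
    return risky

# Hypothetical policy with one scoped and one overly broad statement.
sample_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Sid": "ReadLogs", "Effect": "Allow",
         "Action": ["logs:GetLogEvents"], "Resource": "*"},
        {"Sid": "TooBroad", "Effect": "Allow",
         "Action": "iam:*", "Resource": "*"},
    ],
}

risky = find_wildcard_statements(sample_policy)
print(risky)
```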
29. How does AWS Backup integrate with services like RDS and DynamoDB?
Answer:
AWS Backup supports scheduled backups, cross-region backup vaults, and lifecycle policies for RDS and DynamoDB. It centralizes backup compliance and auditing. However, DynamoDB supports PITR (Point-in-Time Recovery) independently too.
30. What are the best practices for managing AWS credentials in a CI/CD environment?
Answer:
Use IAM roles with short-lived credentials (via OIDC federation or AssumeRole). Store secrets in AWS Secrets Manager. Avoid static keys in code. Audit with CloudTrail and rotate credentials frequently using pipelines.
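To make the OIDC federation option concrete, the sketch below builds a role trust policy that lets a CI job exchange its OIDC token for short-lived credentials. It assumes GitHub Actions as the CI system (issuer token.actions.githubusercontent.com); the account ID, repository name, and helper name are placeholders.

```python
import json

def github_oidc_trust_policy(account_id, repo, branch="main"):
    """Trust policy allowing one repo/branch to assume the role via OIDC."""
    provider = "token.actions.githubusercontent.com"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                # Token must be issued for AWS STS...
                "StringEquals": {f"{provider}:aud": "sts.amazonaws.com"},
                # ...and come from the expected repository and branch.
                "StringLike": {
                    f"{provider}:sub": f"repo:{repo}:ref:refs/heads/{branch}"
                },
            },
        }],
    }

# Placeholder account ID and repository.
trust = github_oidc_trust_policy("111111111111", "example-org/example-repo")
print(json.dumps(trust, indent=2))
```

Scoping the sub claim to a specific repository and branch is what keeps other repositories from assuming the same role.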
31. How would you implement hybrid DNS resolution between AWS and an on-premises network?
Answer:
Use Route 53 Resolver endpoints: create inbound endpoints so on-prem DNS servers can forward queries into AWS, and outbound endpoints so AWS resources can resolve on-prem names. Configure Resolver forwarding rules for the on-prem domain names and associate them with the relevant VPCs. Ensure the VPC and on-prem networks are connected (VPN/Direct Connect).
32. What is the difference between AWS EventBridge and SNS, and when should you use each?
Answer:
SNS is a simple pub/sub service with push-based fan-out. EventBridge supports event routing with filtering, schema discovery, and SaaS integration. Use SNS for simple fan-out; use EventBridge for complex routing, event-driven workflows, and integrations with third-party apps.
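The content-based filtering that sets EventBridge apart can be pictured with a toy matcher. This is a simplified sketch of how an event pattern selects events, supporting only exact-value lists and nested objects; it is not the EventBridge implementation, and the function name matches is hypothetical.

```python
def matches(pattern, event):
    """Return True if every field in pattern matches the event (exact values only)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: recurse into the nested event object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf pattern: a list of acceptable values.
            return False
    return True

pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped", "terminated"]},
}

stopped = {
    "source": "aws.ec2",
    "detail-type": "EC2 Instance State-change Notification",
    "detail": {"instance-id": "i-0123", "state": "stopped"},
}
running = {**stopped, "detail": {"instance-id": "i-0123", "state": "running"}}

print(matches(pattern, stopped), matches(pattern, running))
```

Real event patterns also support operators such as prefix and numeric matching, which this sketch leaves out.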
33. How would you architect a cost-effective high-throughput logging system in AWS?
Answer:
Ingest logs with Kinesis Data Streams or Firehose, buffer and batch them, and store them in S3 or OpenSearch. Compress and partition logs for storage efficiency. Use Athena or OpenSearch for querying. Apply lifecycle policies to archive or delete old data.
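The "compress and partition" step might look like the sketch below: it batches records into newline-delimited JSON, gzips the batch, and derives a Hive-style partitioned object key that Athena can prune on. The key layout, prefix, and function names are assumptions for illustration; uploading to S3 is left out.

```python
import gzip
import json
from datetime import datetime, timezone

def partitioned_key(prefix, ts, batch_id):
    """Build a Hive-style partitioned object key from a timestamp."""
    return (f"{prefix}/year={ts:%Y}/month={ts:%m}/day={ts:%d}/hour={ts:%H}/"
            f"{batch_id}.json.gz")

def compress_batch(records):
    """Serialize records as newline-delimited JSON and gzip the result."""
    body = "\n".join(json.dumps(r, separators=(",", ":")) for r in records)
    return gzip.compress(body.encode("utf-8"))

ts = datetime(2024, 3, 7, 9, 30, tzinfo=timezone.utc)
key = partitioned_key("logs", ts, "batch-0001")
blob = compress_batch([
    {"level": "INFO", "msg": "started"},
    {"level": "ERROR", "msg": "timeout"},
])
print(key, len(blob))
```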
34. How can you enforce region restrictions for users in a multi-region AWS account?
Answer:
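Use Service Control Policies (SCPs) in AWS Organizations to deny actions in unauthorized regions. Exempt global services (e.g., IAM, Route 53) with NotAction, because a blanket deny on "Action": "*" would also block them. For example (adjust the exempted services to your needs):
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyOutsideApprovedRegions",
      "Effect": "Deny",
      "NotAction": ["iam:*", "organizations:*", "route53:*", "cloudfront:*", "sts:*", "support:*"],
      "Resource": "*",
      "Condition": {
        "StringNotEquals": {
          "aws:RequestedRegion": ["us-east-1", "us-west-2"]
        }
      }
    }
  ]
}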
35. What are some limitations of AWS Lambda in high-throughput systems?
Answer:
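Cold starts, the 15-minute maximum execution time, account-level concurrency limits (1,000 concurrent executions per Region by default, adjustable on request), and delays in log delivery can affect high-throughput use cases. Use provisioned concurrency (optionally managed by Application Auto Scaling) and reserved concurrency, or offload to containers/Fargate when needed.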
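Whether the concurrency limit matters can be estimated up front: required concurrency is roughly the request rate multiplied by the average duration. The sketch below applies that rule of thumb; the traffic numbers are made up, and the 1,000 figure is the default limit mentioned above.

```python
import math

DEFAULT_CONCURRENCY_LIMIT = 1000  # default per-Region limit cited above

def required_concurrency(requests_per_second, avg_duration_ms):
    """Estimate concurrent executions: request rate x average duration."""
    return math.ceil(requests_per_second * avg_duration_ms / 1000)

# Hypothetical workload: 5,000 requests/s at 300 ms each.
need = required_concurrency(5000, 300)
exceeds_default = need > DEFAULT_CONCURRENCY_LIMIT
print(need, exceeds_default)
```

If the estimate is above the limit, the options are a quota increase, shorter invocations, batching, or moving the workload to containers.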
36. How do you maintain consistency in distributed state using DynamoDB and Lambda?
Answer:
Implement idempotency keys and conditional writes in DynamoDB (e.g., ConditionExpression). Use DynamoDB Streams to detect changes and propagate state. Handle retries and duplicate events carefully to avoid race conditions.
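The idempotency-key pattern can be sketched without AWS access. Below, a small in-memory class stands in for a DynamoDB table and mimics a put guarded by a condition such as attribute_not_exists on the key; FakeTable, handle_event, and the event shape are all hypothetical names for illustration.

```python
class ConditionalCheckFailed(Exception):
    """Raised when the write condition is not met (key already exists)."""

class FakeTable:
    """In-memory stand-in for a table keyed by a single attribute."""
    def __init__(self, key_name):
        self.key_name = key_name
        self.items = {}

    def put_if_absent(self, item):
        # Mimics a put with a condition like attribute_not_exists(<key>).
        key = item[self.key_name]
        if key in self.items:
            raise ConditionalCheckFailed(key)
        self.items[key] = item

def handle_event(table, event, side_effects):
    """Process an event at most once by claiming its idempotency key first."""
    try:
        table.put_if_absent({"idempotency_key": event["id"], "status": "claimed"})
    except ConditionalCheckFailed:
        return "duplicate"  # a retry or redelivery: skip the side effect
    side_effects.append(event["payload"])
    return "processed"

table = FakeTable("idempotency_key")
effects = []
event = {"id": "evt-1", "payload": "charge-42"}
results = [handle_event(table, event, effects), handle_event(table, event, effects)]
print(results, effects)
```

Claiming the key before the side effect means a crash in between leaves a claimed but unprocessed event, so a real handler also needs a status update or expiry on the claim.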
37. How would you detect and isolate a compromised EC2 instance?
Answer:
Use GuardDuty and CloudTrail to detect anomalies. Isolate the instance by modifying security groups or using VPC quarantine. Create an AMI for forensic analysis. Revoke credentials, notify teams, and redeploy from a clean source.
38. How does S3 Intelligent-Tiering differ from lifecycle policies?
Answer:
Intelligent-Tiering automatically moves objects between access tiers based on observed access patterns, without you defining rules. Lifecycle policies require explicit rules and move objects on a time basis regardless of access. Intelligent-Tiering is better for unpredictable access, though it adds a small per-object monitoring and automation charge.
39. How would you optimize performance for a high-frequency trading app on AWS?
Answer:
Use EC2 bare metal or Nitro-based instances in a cluster placement group. Use Enhanced Networking (ENA), local NVMe storage, and Provisioned IOPS volumes. Minimize network hops by keeping latency-sensitive components in the same Availability Zone, and apply FIFO processing and low-latency messaging (e.g., MQ or Redis).
40. How can you track and audit cross-account access in AWS?
Answer:
Use AWS CloudTrail with AssumeRole events to monitor cross-account activity. Enable organization-level trails, link them with AWS Config, and analyze logs for suspicious behavior. Tag temporary roles with session metadata to trace users across accounts.
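As an illustration of the log analysis step, the sketch below scans CloudTrail-style records for AssumeRole calls where the caller's account differs from the account that owns the assumed role. The sample records are fabricated placeholders that use standard CloudTrail field names; the function name is hypothetical.

```python
def cross_account_assume_roles(records):
    """Return (caller_account, role_arn) for AssumeRole calls into other accounts."""
    findings = []
    for rec in records:
        if (rec.get("eventSource") != "sts.amazonaws.com"
                or rec.get("eventName") != "AssumeRole"):
            continue
        caller = (rec.get("userIdentity") or {}).get("accountId")
        role_arn = (rec.get("requestParameters") or {}).get("roleArn", "")
        parts = role_arn.split(":")
        # ARN layout: arn:partition:service:region:account-id:resource
        target = parts[4] if len(parts) > 4 else None
        if caller and target and caller != target:
            findings.append((caller, role_arn))
    return findings

# Placeholder records with made-up account IDs.
sample_records = [
    {"eventSource": "sts.amazonaws.com", "eventName": "AssumeRole",
     "userIdentity": {"accountId": "111111111111"},
     "requestParameters": {"roleArn": "arn:aws:iam::222222222222:role/Audit"}},
    {"eventSource": "sts.amazonaws.com", "eventName": "AssumeRole",
     "userIdentity": {"accountId": "111111111111"},
     "requestParameters": {"roleArn": "arn:aws:iam::111111111111:role/Deploy"}},
    {"eventSource": "s3.amazonaws.com", "eventName": "GetObject",
     "userIdentity": {"accountId": "111111111111"}},
]

findings = cross_account_assume_roles(sample_records)
print(findings)
```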
41. How do you use AWS X-Ray to debug a serverless microservices architecture?
Answer:
Enable X-Ray tracing in API Gateway, Lambda, and downstream services. It captures segments and subsegments for each request, visualizing latencies, errors, and dependencies. Use annotations and metadata to filter traces and troubleshoot performance bottlenecks.
42. What is the difference between Transit Gateway and VPC Peering?
Answer:
VPC Peering is point-to-point and does not support transitive routing. Transit Gateway enables centralized connectivity among thousands of VPCs and on-prem networks, with transitive routing and better scalability. Transit Gateway is preferred in hub-and-spoke architectures.
43. How can you prevent data exfiltration from S3 using unintended IAM roles?
Answer:
Use S3 bucket policies with condition keys such as aws:PrincipalArn (to allow only specific IAM roles) and aws:SourceVpce (to allow only specific VPC endpoints). Enable GuardDuty for anomaly detection and Macie for sensitive data discovery. Use VPC endpoint policies to restrict access paths.
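A bucket policy along these lines could be built as shown below: one Deny statement rejects any principal other than the expected role, and a second rejects requests that do not arrive through the expected VPC endpoint (using the aws:SourceVpce condition key). The bucket name, role ARN, endpoint ID, and helper name are placeholders.

```python
import json

def restrict_bucket_policy(bucket, allowed_role_arn, vpc_endpoint_id):
    """Deny all S3 access except from one role and through one VPC endpoint."""
    resources = [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"]

    def deny(sid, condition):
        return {"Sid": sid, "Effect": "Deny", "Principal": "*",
                "Action": "s3:*", "Resource": resources, "Condition": condition}

    return {
        "Version": "2012-10-17",
        "Statement": [
            # Separate statements so each restriction applies on its own.
            deny("DenyOtherPrincipals",
                 {"ArnNotLike": {"aws:PrincipalArn": allowed_role_arn}}),
            deny("DenyOutsideVpcEndpoint",
                 {"StringNotEquals": {"aws:SourceVpce": vpc_endpoint_id}}),
        ],
    }

# Placeholder identifiers.
policy = restrict_bucket_policy(
    "example-bucket",
    "arn:aws:iam::111111111111:role/AppRole",
    "vpce-0abc1234",
)
print(json.dumps(policy, indent=2))
```

Broad Deny statements like these also apply to administrators, so test them on a non-critical bucket first to avoid locking yourself out.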
44. How would you build a scalable real-time data ingestion pipeline on AWS?
Answer:
Use Amazon Kinesis Data Streams or MSK (Managed Kafka) for ingesting events. Process with Lambda or Amazon Managed Service for Apache Flink (formerly Kinesis Data Analytics). Store in S3, Redshift, or OpenSearch. Apply partitioning and compression for cost efficiency. Use autoscaling consumer groups to handle variable load.
45. How do you achieve single sign-on (SSO) across multiple AWS accounts?
Answer:
Use AWS IAM Identity Center (formerly AWS SSO) to federate with identity providers (e.g., Okta, Azure AD). Assign permissions to users/groups across accounts via permission sets. Users log in once to access multiple accounts with federated roles.
46. What are the risks of overly permissive Lambda execution roles, and how do you mitigate them?
Answer:
Overly broad permissions can lead to privilege escalation, data leakage, or misuse. Mitigate by scoping IAM policies tightly, applying resource-level permissions, using permissions boundaries and SCPs, and analyzing policies with IAM Access Analyzer.
47. How can you ensure strong consistency for reads in a global DynamoDB table?
Answer:
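By default, DynamoDB Global Tables replicate asynchronously, so reads in one Region are only eventually consistent with writes made in another; a strongly consistent read reflects only writes made in the same Region. Use strongly consistent reads or DynamoDB Transactions where consistency within a single Region is enough, or route all writes for an item to one Region and implement versioning or conflict resolution at the application layer. Newer global table configurations also offer a multi-Region strong consistency mode, at the cost of higher write latency.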
48. What are the architectural implications of using AWS Outposts?
Answer:
Outposts brings AWS services on-premises for low-latency or compliance-sensitive workloads. It requires robust on-site infrastructure and connectivity to the AWS region. You must design for hybrid latency, update management, and failover between Outpost and cloud.
49. How would you enforce encryption at rest and in transit across all AWS services?
Answer:
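Use IAM policies, bucket policies, and service configurations to require SSE (e.g., SSE-KMS). For transit, use TLS: deny non-HTTPS requests with the aws:SecureTransport condition key and expose only HTTPS listeners on load balancers (security groups can restrict ports but cannot enforce TLS by themselves). AWS Config and Security Hub can monitor these controls centrally, with automated remediation where needed.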
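For S3 specifically, both requirements can be expressed in one bucket policy. The sketch below denies any request made without TLS and denies uploads that do not request SSE-KMS; the bucket name and helper name are placeholders, and other services need their own equivalent settings.

```python
import json

def encryption_bucket_policy(bucket):
    """Bucket policy requiring TLS for all requests and SSE-KMS for uploads."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    objects_arn = f"{bucket_arn}/*"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Sid": "DenyInsecureTransport", "Effect": "Deny", "Principal": "*",
             "Action": "s3:*", "Resource": [bucket_arn, objects_arn],
             # Matches requests that were not sent over HTTPS.
             "Condition": {"Bool": {"aws:SecureTransport": "false"}}},
            {"Sid": "DenyUnencryptedUploads", "Effect": "Deny", "Principal": "*",
             "Action": "s3:PutObject", "Resource": objects_arn,
             # Matches uploads that do not request SSE-KMS.
             "Condition": {"StringNotEquals": {
                 "s3:x-amz-server-side-encryption": "aws:kms"}}},
        ],
    }

enc_policy = encryption_bucket_policy("example-bucket")
print(json.dumps(enc_policy, indent=2))
```

The second statement rejects uploads that omit the encryption header even when the bucket has default encryption, so clients must send the header explicitly.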
50. How do you test infrastructure as code (IaC) for security and compliance before deployment?
Answer:
Use tools like cfn-nag, Checkov, or tfsec to scan CloudFormation or Terraform templates. Integrate into CI/CD pipelines. Combine with AWS Config rules and automated OPA (Open Policy Agent) policies to enforce guardrails. Use sandbox accounts for dry-run validations.