[AWS Certified Networking] Session 0.1: Compute Mastery
Table of contents
- Amazon EC2
- Deep dive into instance types (purpose-built, recent releases)
- Purpose-Built EC2 Instances
- Recent EC2 Instance Releases
- Choosing the Right Instance
- Managing EC2 Instances:
- Key Features of EC2 Instances
- Instance store vs. EBS-backed instances:
- Advanced use cases: Spot Instances for cost optimization, high-performance computing
- Custom AMIs and bootstrapping
- Notes on building an AMI from a domain-joined Windows Server EC2 instance
- Auto Scaling
- Autoscaling Basics
- Beyond Simple Scaling
- Predictive Scaling
- Practical Use-Case Scenario: Predictive Scaling
- Scheduled Scaling
- Real-World Examples: Companies Using Scheduled Scaling
- Combining Predictive and Scheduled Scaling
- Designing highly resilient architectures that leverage auto-scaling groups
- Netflix: A Case Study in Scalability & Resilience
- AWS Lambda
- AWS Lambda at a Glance
- Triggers: What Initiates Lambda Functions
- Integration Patterns
- How can I define a Lambda handler?
- Example event-driven architectures for common use cases
- Lambda Performance optimization: Layers, concurrency, provisioned concurrency
- Real-world examples of how companies have used Lambda optimization techniques
Amazon EC2
Deep dive into instance types (purpose-built, recent releases)
Amazon EC2 instances provide virtual servers with varying combinations of resources:
CPU: Different processor families (Intel, AMD, Graviton) and core counts.
Memory: RAM suited for diverse workload requirements.
Storage: Options ranging from fast SSDs to large capacity HDDs, and instance-attached or network-based storage (EBS).
Networking: Bandwidth and throughput variations.
Purpose-Built EC2 Instances
Amazon designs instance types for specific workload demands:
General Purpose Instances: https://docs.aws.amazon.com/ec2/latest/instancetypes/gp.html
Offer a balance of resources (compute, memory, network).
Examples: T-series (burstable), M-series
Uses: Web servers, application servers, development environments.
Compute-Optimized Instances: https://docs.aws.amazon.com/ec2/latest/instancetypes/co.html
Emphasize CPU power for compute-heavy workloads.
Examples: C-series
Uses: Scientific modeling, gaming servers, video encoding, batch processing.
Memory-Optimized Instances: https://docs.aws.amazon.com/ec2/latest/instancetypes/mo.html
Larger amounts of RAM per vCPU for memory-intensive tasks.
Examples: R-series, X-series, High Memory instances.
Uses: In-memory databases (Redis, Memcached), real-time analytics, big data.
Storage-Optimized Instances: https://docs.aws.amazon.com/ec2/latest/instancetypes/so.html
Feature high I/O performance and massive amounts of local storage.
Examples: I-series, D-series.
Uses: Data warehousing, distributed file systems (Hadoop), log processing.
Accelerated Computing Instances: https://docs.aws.amazon.com/ec2/latest/instancetypes/ac.html
Include hardware accelerators like GPUs and FPGAs.
Examples: P-series, G-series, F-series, Inf-series (for machine learning inference).
Uses: Machine learning, graphics processing, scientific simulations, computational fluid dynamics.
High-Performance Computing Instances: https://docs.aws.amazon.com/ec2/latest/instancetypes/hpc.html
Built to offer the best price-performance for running HPC workloads at scale on AWS.
Ideal for applications that benefit from high-performance processors, such as large, complex simulations and deep learning workloads.
Recent EC2 Instance Releases
AWS frequently introduces new instance types. Here are some notable recent additions:
Graviton Processors: AWS-designed ARM-based processors offering excellent price-performance. Examples: M7g, M7gd instances.
Specialized Compute:
C6i (Intel optimized for compute-intensive workloads)
C7g (latest generation Graviton with high performance)
Optimized Storage:
- Im4gn/Is4gen (larger storage capacity)
Network Optimization:
- Hpc6a (AMD-based, optimized for tightly coupled HPC workloads)
See more: https://instancetyp.es/
Choosing the Right Instance
Selecting the right EC2 instance relies on:
Workload Type: Determine core needs (compute-heavy vs memory-intensive).
Resource Ratios: Evaluate appropriate balance of CPU, memory, storage, and network.
Cost: Consider price-performance and options like On-Demand vs. Reserved instances
Experimentation: Test with different instance types, monitor performance, and adjust.
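As a starting point for that experimentation, you can pull instance specs straight from the EC2 API. Below is a minimal boto3 sketch (assuming credentials are configured; the region and instance types are arbitrary examples) that compares the vCPU, memory, and network ratings of a few candidates:
```python
import boto3

# Compare the published specs of a few candidate instance types
ec2 = boto3.client("ec2", region_name="us-east-1")  # example region
resp = ec2.describe_instance_types(
    InstanceTypes=["m7g.large", "c7g.large", "r7g.large"]
)
for it in resp["InstanceTypes"]:
    print(
        it["InstanceType"],
        f'{it["VCpuInfo"]["DefaultVCpus"]} vCPUs,',
        f'{it["MemoryInfo"]["SizeInMiB"]} MiB RAM,',
        it["NetworkInfo"]["NetworkPerformance"],
    )
```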
Additional Notes
You can use AWS Compute Optimizer (https://aws.amazon.com/compute-optimizer/) to find recommendations for instance types.
AWS Elastic Block Store (EBS): EBS is a persistent storage service that provides block storage volumes for your EC2 instances. You can attach EBS volumes to your instances to provide them with additional storage.
Pricing varies by instance type, region, and purchasing model.
Managing EC2 Instances:
Once you have launched your EC2 instances, you can manage them using the AWS Management Console, the AWS CLI, or an AWS SDK (see the boto3 sketch after this list). You can perform a variety of tasks, such as:
Starting and stopping instances
Monitoring instance performance
Terminating instances
Creating and managing instance images
Configuring security groups
Managing instance storage
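A minimal boto3 sketch of a few of these tasks (the instance ID is a placeholder; stop/start applies to EBS-backed instances):
```python
from datetime import datetime, timedelta, timezone
import boto3

ec2 = boto3.client("ec2")
instance_ids = ["i-0123456789abcdef0"]  # placeholder ID

ec2.stop_instances(InstanceIds=instance_ids)   # stop (EBS-backed instances only)
ec2.start_instances(InstanceIds=instance_ids)  # start again
# ec2.terminate_instances(InstanceIds=instance_ids)  # irreversible!

# Monitor performance: fetch the last hour of average CPU utilization
cw = boto3.client("cloudwatch")
end = datetime.now(timezone.utc)
stats = cw.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": instance_ids[0]}],
    StartTime=end - timedelta(hours=1),
    EndTime=end,
    Period=300,
    Statistics=["Average"],
)
print(stats["Datapoints"])
```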
Key Features of EC2 Instances
Scalability: You can easily scale your EC2 instances up or down to meet the changing demands of your applications.
Flexibility: You can choose from a wide variety of instance types to meet the specific needs of your applications.
Reliability: EC2 instances are highly reliable and backed by Amazon’s global infrastructure.
Cost-effectiveness: You only pay for the resources that you use.
Instance store vs. EBS-backed instances:
Instance Store-Backed Instances
Storage Type: Ephemeral storage physically attached to the host machine where your instance is running.
Data Persistence: Data is lost when the instance is stopped, terminated, or fails unexpectedly.
Use Cases:
Temporary storage for caching or buffers.
Workloads tolerant of data loss (e.g., replicated databases where data exists elsewhere).
Applications requiring extremely low-latency disk access.
EBS (Elastic Block Store)-Backed Instances
Storage Type: Persistent, network-based block storage volumes independent of the EC2 instance’s lifecycle.
Data Persistence: Data survives instance stops and failures; on termination, the root volume is deleted by default (unless DeleteOnTermination is disabled), while other attached volumes persist.
Use Cases:
Boot volumes for operating systems.
Databases or applications requiring reliable storage.
Scenarios needing data backups and snapshots.
Table Summary:
| Feature | Instance Store | EBS |
| --- | --- | --- |
| Storage Type | Ephemeral, local | Network-based, persistent |
| Data Persistence | Lost on stop/termination | Persists on stop; root volume deleted on termination by default |
| Performance | Generally faster | Network-dependent (still very fast) |
| Flexibility | Cannot resize | Volumes can be resized and reattached |
| Snapshots | Not supported | Supported |
| Pricing | Included in instance cost | Charged separately |
Choosing the Right Storage
Need very high IOPS and low latency? Instance store can be superior.
Is data persistence critical? EBS is essential.
Workloads with large amounts of data? EBS provides scalability.
Need boot volumes? EBS is required.
Frequent scaling or changes? EBS offers greater flexibility.
Cost-conscious about storage? Using the included instance store storage might be appealing.
Important Notes
Some modern instance types don’t offer instance store.
You can use both instance store and EBS volumes with one instance.
EBS offers various volume types (general-purpose, provisioned IOPS, etc.) for performance tailoring.
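To make the EBS side concrete, here is a hedged boto3 sketch (the AZ, instance ID, device name, and sizes are placeholders) that creates a gp3 volume, attaches it, and takes a snapshot — the persistence features instance store lacks:
```python
import boto3

ec2 = boto3.client("ec2")

# Create a 20 GiB gp3 volume in the instance's AZ (placeholder AZ)
vol = ec2.create_volume(AvailabilityZone="us-east-1a", Size=20, VolumeType="gp3")
ec2.get_waiter("volume_available").wait(VolumeIds=[vol["VolumeId"]])

# Attach it to a running instance (placeholder ID and device name)
ec2.attach_volume(
    VolumeId=vol["VolumeId"],
    InstanceId="i-0123456789abcdef0",
    Device="/dev/sdf",
)

# Take a point-in-time snapshot — not possible with instance store
snap = ec2.create_snapshot(VolumeId=vol["VolumeId"], Description="example backup")
print(snap["SnapshotId"])
```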
Advanced use cases: Spot Instances for cost optimization, high-performance computing
Cost Optimization with Spot Instances
What Are Amazon EC2 Spot Instances?
Spot Instances are EC2 virtual servers offered at a significantly discounted price compared to regular On-Demand Instances.
They use spare EC2 capacity, making them cost-effective for interruption-tolerant workloads.
Cost Optimization with Spot Instances:
Massive Cost Savings: Spot Instances are available at up to a 90% discount compared to on-demand prices.
Flexible Applications: You can use Spot Instances for various scenarios, including:
Big Data Processing: Analyzing large datasets efficiently.
Containerized Workloads: Running containers at a lower cost.
CI/CD Pipelines: Automating software delivery pipelines.
Web Servers: Hosting web applications.
Test & Development Workloads: Creating and testing applications.
Integration with AWS Services: Spot Instances work seamlessly with services like Auto Scaling, EMR, ECS, and more.
Predictable Pricing: While pricing changes gradually based on supply and demand, it remains lower than on-demand rates.
Think of it like this:
On-Demand instances are like renting a taxi – guaranteed availability at a fixed price.
Spot Instances are like ride-sharing – you might get a great deal, but availability and price can fluctuate.
Here’s how Spot Instances can save you money:
Suitable for non-critical workloads: Use them for tasks that can be restarted without impacting results (e.g., data analysis, simulations).
Optional price cap: You can set a maximum price you’re willing to pay per hour; your instance runs as long as the Spot price stays below that cap. (AWS retired the older bidding model in 2017 – you simply pay the current Spot price.)
Spot Fleet: Launch multiple Spot Instances with different configurations to maximize your chance of acquiring resources at a good price.
However, there are some trade-offs:
Interruptions: Spot Instances can be reclaimed by AWS with a two-minute warning when EC2 needs the capacity back or the Spot price rises above your maximum price.
Need to be adaptable: Your application should be able to handle interruptions gracefully (e.g., by saving progress periodically and resuming later).
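A minimal boto3 sketch of requesting a Spot Instance (the AMI ID is a placeholder; MaxPrice is optional, and capacity can still be reclaimed with the two-minute notice):
```python
import boto3

ec2 = boto3.client("ec2")
resp = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI
    InstanceType="c7g.large",
    MinCount=1,
    MaxCount=1,
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {
            "MaxPrice": "0.05",  # optional cap; omit to pay the current Spot price
            "SpotInstanceType": "one-time",
            "InstanceInterruptionBehavior": "terminate",
        },
    },
)
print(resp["Instances"][0]["InstanceId"])
```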
High-Performance Computing with Spot Instances
HPC workloads often require a lot of processing power, and Spot Instances can be surprisingly good for them too. Here’s why:
Cost-effective alternative: You can access powerful instances for demanding tasks like scientific simulations or large-scale data processing at a discounted price.
Many HPC workloads are fault-tolerant: They can be divided into smaller, independent tasks. If a Spot Instance is interrupted, the task can be restarted on another instance with minimal impact.
Tools to manage interruptions: Services like AWS Batch can help you schedule and manage HPC jobs on Spot Instances, automatically restarting interrupted tasks.
Things to consider for HPC with Spot Instances:
Choose fault-tolerant applications: Jobs should be able to be resumed after interruptions.
Optimize for parallel processing: Break down tasks into smaller, independent units.
Use tools like AWS Batch: They can handle scheduling and restarts for you.
Spot Instances are ideal for fault-tolerant, stateless applications, and they provide an excellent balance between cost savings and performance.
Custom AMIs and bootstrapping
Custom AMIs: Pre-Configured Blueprints for Your Instances
An AMI is a pre-configured virtual machine image that serves as a template for launching EC2 instances.
Custom AMIs are AMIs that you create by customizing an existing AMI or building one from scratch.
Here’s why you might use custom AMIs:
Application Customization: You can install your application, dependencies, and customizations on an instance, then create a custom AMI from that instance.
Performance Optimization: Custom AMIs allow you to fine-tune the operating system, install specific software, and configure settings.
Security Hardening: You can apply security patches, set up firewalls, and configure access controls in your custom AMIs.
Consistency: Custom AMIs ensure consistent environments across instances.
Think of it as a pre-configured snapshot you can use to launch identical instances quickly and efficiently.
Benefits of Custom AMIs
Standardization and Repeatability: Ensure all your instances have the same configuration, reducing errors and inconsistencies.
Faster Deployment: Launch new instances with your desired setup in minutes, saving you time.
Improved Security: Bake security best practices into the AMI, enhancing the overall security posture of your instances.
Version Control: Track changes to your AMIs, allowing you to roll back to previous configurations if needed.
Challenges of Custom AMIs:
Maintenance: Updating custom AMIs can be cumbersome if you need to make frequent changes.
Versioning: Managing different versions of custom AMIs can be complex.
Storage Costs: Custom AMIs consume storage space in your account.
Use Cases for Custom AMIs:
Application Stacks: Create custom AMIs with your application stack pre-installed.
Golden Images: Use custom AMIs as golden images for consistent deployments.
Compliance and Security: Customize AMIs to meet compliance requirements.
Prebaking vs. Bootstrapping: Consider whether to prebake components into the AMI or bootstrap them during instance launch (more on this next).
Creating a Custom AMI: A Step-by-Step Walkthrough
Launch a Reference Instance: Create an EC2 instance with the desired operating system and configuration.
Install and Configure Applications: Install and configure all necessary applications and settings on the reference instance.
Optimize the Instance: Remove unnecessary packages and optimize configurations for performance and security.
Create an AMI from the Instance: Use the AWS Management Console, AWS CLI, or AWS SDK to create an AMI from the reference instance.
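The final step can be scripted; a hedged boto3 sketch (the instance ID and names are placeholders):
```python
import boto3

ec2 = boto3.client("ec2")
image = ec2.create_image(
    InstanceId="i-0123456789abcdef0",  # the reference instance (placeholder)
    Name="web-base-2024-01",           # must be unique per account and region
    Description="Hardened web server base image",
    NoReboot=False,  # reboot the instance for a consistent filesystem image
)
print(image["ImageId"])
```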
Bootstrapping: Automating Instance Configuration
Bootstrapping refers to configuring an instance after it launches. Instead of embedding everything in the AMI, you perform setup tasks during instance initialization.
This eliminates the need for manual configuration tasks and ensures consistency across your instances.
Common Bootstrapping Techniques
User Data Scripts: Scripts provided during launch that run on the instance after it boots. Often written in Bash or PowerShell.
Chef/Puppet: Configuration management tools that automate provisioning and configuration of software on instances.
Cloud-Init: A service pre-installed on most AMIs that allows you to run scripts or configure packages at boot time.
Benefits of Bootstrapping
Reduced Errors: Automates repetitive tasks, minimizing configuration errors.
Increased Efficiency: Saves time by eliminating manual configuration for each instance.
Improved Consistency: Ensures all instances have the same configuration.
Challenges of Bootstrapping:
Instance Startup Time: Bootstrapping adds time to instance startup.
Dependencies: Ensure that all required dependencies are available during bootstrapping.
Idempotency: Bootstrapping scripts should be idempotent (safe to run multiple times).
Use Cases for Bootstrapping:
Dynamic Configuration: Set instance-specific parameters during launch.
Software Installation: Install software, packages, and dependencies.
Post-Deployment Tasks: Execute tasks after instance launch.
Scaling Events: Bootstrapping allows customization during auto-scaling events.
Example Bootstrapping with User Data Script
Here’s a simplified example of a user data script that installs and configures a web server on an instance:
#!/bin/bash
# Update package lists
sudo apt update
# Install Apache web server
sudo apt install apache2 -y
# Start the Apache service
sudo systemctl start apache2
# Enable Apache to start automatically on boot
sudo systemctl enable apache2
# Create a basic index.html file
echo "<h1>Welcome to your custom EC2 instance!</h1>" | sudo tee /var/www/html/index.html
Combining Custom AMIs and Bootstrapping:
Best Practice: Use a combination of both approaches.
Prebaking: Embed critical components (e.g., base software, security configurations) into the custom AMI.
Bootstrapping: Perform dynamic configuration, application-specific setup, and late customization during instance launch.
Benefits: Faster instance provisioning with a consistent environment and the ability to adapt to specific contexts.
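A sketch of the combined approach, assuming a prebaked custom AMI (the AMI ID, config path, and service name are hypothetical), with only the late, instance-specific steps left to user data:
```python
import boto3

# Late, instance-specific configuration; heavy installs are prebaked in the AMI.
# /etc/myapp/config and the myapp service are hypothetical examples.
user_data = """#!/bin/bash
echo "ENVIRONMENT=staging" >> /etc/myapp/config
systemctl restart myapp
"""

ec2 = boto3.client("ec2")
ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # custom (prebaked) AMI — placeholder
    InstanceType="t3.micro",
    MinCount=1,
    MaxCount=1,
    UserData=user_data,  # boto3 base64-encodes this for you
)
```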
More information: https://docs.aws.amazon.com/whitepapers/latest/overview-deployment-options/prebaking-vs.-bootstrapping-amis.html
Notes on building an AMI from a domain-joined Windows Server EC2 instance
When creating an Amazon Machine Image (AMI) from an EC2 instance running Windows Server that is joined to a domain, there are some important considerations to keep in mind. Let’s break down the steps:
Sysprep and Domain Join:
As a general rule for domain-joined Windows machines, you should not clone them directly. Instead, follow these steps:
Remove the instance from the domain.
Use Sysprep to generalize the instance. Sysprep resets the machine’s security identifier (SID), removes unique system information, and prepares it for cloning.
Once Sysprep is complete, create an AMI from the instance. This AMI will serve as your base image.
Each new clone created from this AMI will join the domain with a new SID, name, and IP address, creating a new computer account in Active Directory.
Automating Domain Join:
To automate domain join during instance launch, you can use a PowerShell script. Here’s how:
Create a PowerShell script that joins the server to the domain.
Secure the credentials by converting the PowerShell script to an executable (.exe) using tools like PS2EXE.
Upload the .exe file to an S3 bucket.
Create an IAM role with a policy that allows read access to the S3 bucket.
Attach this IAM role to your EC2 instances. During launch, the instance will execute the script and join the domain.
Creating the Custom AMI:
You can create an AMI using either the AWS Management Console or the command line.
The process involves creating an image from a running EC2 instance.
Follow the steps outlined in the official AWS documentation to create your custom Windows AMI.
Domain-joined instances require careful handling to avoid issues related to duplicate SIDs and computer accounts.
Auto Scaling
Autoscaling Basics
Core Concept: Autoscaling dynamically adjusts the number of EC2 instances within an Auto Scaling Group (ASG) to match your application’s fluctuating workload demand.
It ensures that your application can handle varying levels of traffic without manual intervention.
Here’s how it works:
Scale Up: When there’s a spike in web traffic, autoscaling adds more resources (instances) to your server farm.
Scale Down: During low traffic periods, it reduces the number of instances to save costs.
Benefits:
Maintains Availability: Ensures you have enough instances to handle traffic spikes.
Optimizes Costs: Keeps resource usage efficient by adding or removing instances as needed.
How Autoscaling Works (Simplified)
Define an Auto Scaling Group (ASG): A group of EC2 instances managed together.
Set Scaling Policies: Rules that define how and when to increase (scale out) or decrease (scale in) the number of instances in the ASG.
Metrics Monitoring: CloudWatch metrics (e.g., CPU utilization, network traffic) trigger scaling activities based on the policies you’ve defined.
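For example, a target-tracking policy (a minimal boto3 sketch; the group and policy names are placeholders) keeps average CPU near 50% by scaling out and in automatically:
```python
import boto3

autoscaling = boto3.client("autoscaling")
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",      # placeholder ASG name
    PolicyName="cpu-target-tracking",
    PolicyType="TargetTrackingScaling",
    TargetTrackingConfiguration={
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "ASGAverageCPUUtilization"
        },
        "TargetValue": 50.0,
    },
)
```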
Beyond Simple Scaling
While basic autoscaling is reactive and based on current metrics, these advanced techniques add more control and proactiveness:
Predictive Scaling
How it Works: AWS analyzes historical data to anticipate future demand using machine learning, scaling your capacity in advance.
Use Cases:
Cyclical Traffic: Regular business hours vs. evenings/weekends.
On-and-Off Workloads: Batch processing, testing, periodic data analysis.
Long Initialization Times: Applications taking time to start up.
Benefits:
Proactive: Adds capacity in advance of forecasted load.
Cost Savings: Avoids over-provisioning.
High Availability: Maintains performance during load transitions.
Implementation: Analyze historical patterns and adjust capacity accordingly.
Scheduled Scaling
How it Works: Scale your capacity based on predefined schedules.
Use Cases:
Anticipated traffic spikes due to events
Scaling down development environments during non-working hours
Configuration: Create scheduled actions to perform scaling at specific times.
Lifecycle Hooks
How it Works: Pauses instances during launch or termination, allowing you to perform custom actions (e.g., warm-up tasks on new instances) before the transition completes.
Use Cases:
Ensuring applications are ready before being put into service.
Downloading data or logs before an instance is terminated.
Examples:
Launch Event: Wait for instance readiness (e.g., software installation) before registering with Elastic Load Balancing.
Termination Event: Pause instance before termination for data retrieval (e.g., logs).
Benefits:
Control: Fine-tune instance transitions.
Efficiency: Optimize actions before or after lifecycle events.
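A hedged boto3 sketch of a launch lifecycle hook (names and IDs are placeholders): the first call pauses new instances in the Pending:Wait state, and the second is what the instance (or your automation) calls once setup finishes:
```python
import boto3

autoscaling = boto3.client("autoscaling")

# Pause newly launched instances for up to 5 minutes while bootstrap runs
autoscaling.put_lifecycle_hook(
    LifecycleHookName="wait-for-app-ready",
    AutoScalingGroupName="web-asg",  # placeholder
    LifecycleTransition="autoscaling:EC2_INSTANCE_LAUNCHING",
    HeartbeatTimeout=300,
    DefaultResult="ABANDON",  # fail closed if readiness is never reported
)

# Called from the instance when it is ready to receive traffic
autoscaling.complete_lifecycle_action(
    LifecycleHookName="wait-for-app-ready",
    AutoScalingGroupName="web-asg",
    InstanceId="i-0123456789abcdef0",  # placeholder
    LifecycleActionResult="CONTINUE",
)
```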
Key Considerations
Cooldown periods: Prevent rapid fluctuations after scaling actions.
Metrics: Choose metrics relevant to your application’s workload.
Combination of techniques: Use these strategies together for maximum flexibility.
Example: Combining Predictive, Scheduled, and Lifecycle Hooks
Predictive scaling handles regular workday traffic fluctuations.
Scheduled scaling increases capacity in advance of a known sales event.
Lifecycle hooks ensure new instances are fully configured before accepting traffic.
Predictive Scaling
Type of scaling: Proactive scaling based on machine learning analysis of historical usage patterns to forecast future demand.
How it Works:
AWS collects and analyzes your Auto Scaling Group’s CloudWatch metrics.
Machine learning models identify recurring patterns and predict upcoming workload demand.
Autoscaling adjusts capacity ahead of time to align with these demand forecasts.
Best Suited for:
Applications with regular, recurring traffic fluctuations (daily, weekly, etc.).
Workloads with slow instance initialization times. Predictive scaling lets you have capacity ready in advance.
Optimizing responsiveness by reducing scaling lag compared to purely reactive methods.
Considerations:
May require a substantial set of historical data to build accurate models.
Might not be ideal for sudden, unexpected traffic spikes.
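A minimal boto3 sketch of enabling predictive scaling on an ASG (names are placeholders); starting in forecast-only mode lets you review the forecasts before they act on capacity:
```python
import boto3

autoscaling = boto3.client("autoscaling")
autoscaling.put_scaling_policy(
    AutoScalingGroupName="web-asg",  # placeholder
    PolicyName="predictive-cpu",
    PolicyType="PredictiveScaling",
    PredictiveScalingConfiguration={
        "MetricSpecifications": [{
            "TargetValue": 50.0,
            "PredefinedMetricPairSpecification": {
                "PredefinedMetricType": "ASGCPUUtilization"
            },
        }],
        "Mode": "ForecastOnly",  # switch to "ForecastAndScale" once validated
    },
)
```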
Practical Use-Case Scenario: Predictive Scaling
Problem Statement:
Imagine you’re developing a web application for an e-commerce platform. During regular business hours (9:00 AM to 6:00 PM), the website experiences a significant increase in traffic due to users shopping, browsing products, and making purchases. However, during evenings and weekends, the traffic decreases substantially.
Solution with Predictive Scaling:
Understanding the Context:
The goal is to ensure optimal performance and cost efficiency throughout the day.
We want to scale up the application during peak hours and scale down during off-peak hours.
Predictive Scaling Approach:
Data Collection:
- Gather historical data on user traffic patterns. Monitor the number of concurrent users, page views, and transactions during different time slots.
Forecasting:
Use statistical methods (such as moving averages, exponential smoothing, or machine learning models) to predict future traffic based on historical data.
For example, if the last five Mondays showed increased traffic from 10:00 AM to 2:00 PM, we can expect a similar pattern next Monday.
Thresholds and Triggers:
Set thresholds for scaling actions. For instance:
If the predicted traffic exceeds a certain threshold (e.g., 80% of server capacity), trigger scaling up.
If the predicted traffic drops below another threshold (e.g., 30% of server capacity), trigger scaling down.
Automated Scaling Actions:
During peak hours (9:00 AM to 6:00 PM):
Autoscale by adding more instances to handle increased traffic.
Ensure that the system can handle sudden spikes (e.g., flash sales or promotions).
During off-peak hours (evenings and weekends):
Autoscale down by removing unnecessary instances to save costs.
Maintain a minimum baseline capacity to handle essential tasks (e.g., background jobs, maintenance).
Benefits:
Performance: Predictive scaling ensures optimal performance during peak hours.
Cost Efficiency: By scaling down during off-peak hours, you save on infrastructure costs.
User Experience: Users experience consistent responsiveness regardless of the time of day.
Implementation:
Use cloud services like Amazon EC2 Auto Scaling or similar tools to automate scaling based on predictions.
Monitor actual traffic and adjust predictions periodically to adapt to changing user behavior.
Scheduled Scaling
Type of scaling: Proactive scaling based on predefined time intervals and capacity needs.
How it Works:
You configure scaling schedules that include the desired instance count, start time, and end time.
Autoscaling adheres to the schedule, adjusting the number of instances accordingly.
Best Suited for:
Predictable traffic patterns: If your application experiences regular fluctuations based on time of day, day of the week, etc.
Scheduled capacity changes: Need to scale up for specific events or promotions in advance.
Cost optimization: Scale down non-production environments during off-hours.
Configuring Scheduled Scaling
AWS Management Console: Create scheduled actions within your Auto Scaling Group.
AWS CLI/SDK: Use commands or API calls to create and manage scheduled actions.
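For instance, two scheduled actions via boto3 (a sketch; names, times, and sizes are placeholders) that scale a group up on weekday mornings and back down in the evening:
```python
import boto3

autoscaling = boto3.client("autoscaling")

# Scale up at 08:00 UTC on weekdays (cron syntax, UTC by default)
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="web-asg",  # placeholder
    ScheduledActionName="weekday-scale-up",
    Recurrence="0 8 * * 1-5",
    MinSize=4, MaxSize=20, DesiredCapacity=8,
)

# Scale back down at 20:00 UTC
autoscaling.put_scheduled_update_group_action(
    AutoScalingGroupName="web-asg",
    ScheduledActionName="weekday-scale-down",
    Recurrence="0 20 * * 1-5",
    MinSize=1, MaxSize=4, DesiredCapacity=2,
)
```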
Additional Considerations
Combine with other scaling types: Use scheduled scaling as a foundation and then layer reactive scaling (based on metrics) to handle unexpected spikes within the event period itself.
Cooldown periods: Factor in cooldown periods to prevent unnecessary fluctuations.
Testing: Test your scaling strategy in a staging environment before implementing it in production.
Use Case Where Scheduled Scaling Might Struggle
Scenario: You have a news website with unpredictable traffic surges related to breaking news events.
Why Scheduled Scaling is a Challenge:
You cannot anticipate when a news event will occur.
Traffic spikes happen quickly and require rapid scaling.
Better Alternatives:
Reactive scaling: Metric-based scaling for immediate response to sudden traffic increases.
Large minimum capacity: Maintain a baseline to handle an initial influx of traffic while scaling up continues.
Real-World Examples: Companies Using Scheduled Scaling
Retailers: Scaling up for holiday shopping seasons, flash sales, or marketing events well in advance.
Financial Services: Ensuring capacity during predictable periods of stock market activity.
Development Teams: Scaling down non-production environments outside of working hours to optimize costs.
Practical Use Case: E-commerce Sales Event
Problem: You anticipate a significant traffic surge during a seasonal sale. Proactive scaling is needed as reactive scaling might not keep up.
Scheduled Scaling Solution:
Analyze historical sales data to pinpoint peak traffic periods during the sales event.
Create multiple scheduled scaling actions for the Auto Scaling Group:
Increase capacity a few hours before the sale starts.
Maintain peak capacity during the sale period.
Gradually decrease capacity based on anticipated traffic decline afterward.
Benefits:
Ensures application availability by having resources already provisioned in advance of the traffic surge.
Provides a smooth user experience throughout the event.
Potentially reduces costs compared to overprovisioning with reactive scaling alone.
Combining Predictive and Scheduled Scaling
- Complementary Roles: Scheduled scaling provides a baseline and handles known events, while predictive scaling adds responsiveness to recurring fluctuations that are hard to pin to exact times within that framework.
Example:
Scheduled Scaling:
Increase capacity before peak sales event hours
Decrease capacity after sales event
Predictive Scaling:
Manages daily traffic fluctuations within the event period
Handles smaller, unexpected traffic surges within the higher allocated capacity.
Designing highly resilient architectures that leverage auto-scaling groups
Key Principles
Redundancy: Build redundancy into your architecture by spreading resources across multiple Availability Zones (AZs). This ensures that a single AZ failure won’t take down your entire application.
Loose Coupling: Design components of your application to be loosely coupled. This minimizes the impact of a failure on other parts of the system.
Monitoring: Implement robust monitoring with CloudWatch. Set up alarms to trigger scaling activities or alert you about issues.
Automation: Automate as much as possible, from infrastructure deployment to self-healing mechanisms. This reduces the time to recover from failures.
Integrating Auto Scaling Groups for Resilience
Redundancy and Availability:
Redundancy: Design your architecture with redundancy at every layer. This means having backup components or resources to prevent a single point of failure.
Load Balancing: Use load balancers to distribute traffic across multiple instances. This ensures that if one instance fails, others can handle the load.
Auto-Scaling Groups (ASGs): ASGs automatically adjust the number of instances based on demand. They add or remove instances as needed, maintaining availability during traffic spikes or failures.
Fault Tolerance:
Design for Failure: Assume that components will fail and plan accordingly. Distribute workloads across multiple availability zones (AZs) to survive AZ-level failures.
Multi-AZ Deployment: Deploy resources in multiple AZs to ensure high availability. ASGs can span AZs, allowing seamless failover.
Disaster Recovery:
Cross-Region Replication: Leverage multi-region architectures. If one region experiences a disaster, fail over to another region.
Backup and Restore Mechanisms: Regularly back up data and configurations. Use automated backup solutions to restore services quickly.
Monitoring and Testing:
- Continuous Monitoring: Monitor resource utilization, performance, and health. Set up alerts for anomalies.
- Regular Testing: Conduct failover tests to ensure that your architecture behaves as expected during failures.
Auto-Scaling Groups (ASGs):
Dynamic Scaling: ASGs automatically adjust the number of instances based on metrics (CPU utilization, network traffic, etc.).
Scaling Policies: Define scaling policies (e.g., target tracking, step scaling) to control scaling behavior.
Health Checks: ASGs monitor instance health and replace unhealthy instances.
Lifecycle Hooks:
Custom Actions: Use lifecycle hooks to perform custom actions during instance launch or termination.
Graceful Shutdown: Pause instances during termination to allow data retrieval or cleanup.
Security and Compliance:
Security Groups: Define security groups to control inbound and outbound traffic.
IAM Roles: Assign appropriate permissions to instances using IAM roles.
Compliance: Ensure compliance with industry standards (e.g., HIPAA, PCI-DSS).
Example Architecture
Here’s a simplified example of a resilient architecture for a web application:
Presentation Tier:
An Elastic Load Balancer (ELB) in a public subnet, spanning two or more AZs.
An Auto Scaling Group of EC2 instances running a web server, spread across multiple AZs.
Application Tier
Another Auto Scaling Group with EC2 instances running your application logic, across AZs.
Consider a microservices architecture here for granular scaling and resilience.
Data Tier
Use a managed database service with Multi-AZ support (e.g., Amazon RDS or Aurora).
Optionally, leverage read replicas and database caching to enhance performance.
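A sketch of how the web-tier ASG in this architecture might be defined with boto3 (the launch template, subnets, and target group ARN are placeholders) — spanning two AZs and using ELB health checks for self-healing:
```python
import boto3

autoscaling = boto3.client("autoscaling")
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="web-asg",
    LaunchTemplate={"LaunchTemplateName": "web-server", "Version": "$Latest"},
    MinSize=2,
    MaxSize=10,
    DesiredCapacity=4,
    VPCZoneIdentifier="subnet-aaa111,subnet-bbb222",  # one subnet per AZ
    TargetGroupARNs=[
        "arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/web/0123456789abcdef"
    ],
    HealthCheckType="ELB",       # replace instances the load balancer marks unhealthy
    HealthCheckGracePeriod=300,
)
```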
Best Practices
Test Failure Scenarios: Simulate failures to validate the resilience of your architecture. Use tools like Netflix’s Chaos Monkey for controlled failure testing.
Immutable Infrastructure: Treat your EC2 instances as immutable. Bake configuration details into your AMIs and rely on the ASG to replace instances, rather than modifying existing ones.
Graceful Degradation: Design your application to degrade gracefully in the event of component failures, preventing cascading failures.
Additional Considerations
Amazon Route 53: Use Route 53 health checks and DNS failover to route traffic away from unhealthy endpoints (e.g., between load balancers or regions).
Security Groups: Configure your security groups to allow necessary traffic flow while maintaining a strict security posture.
Netflix: A Case Study in Scalability & Resilience
Netflix is known for its streaming platform that experiences massive fluctuations in traffic at a global scale. Their core architectural principles heavily emphasize autoscaling to maintain seamless service. Here’s how:
Microservices Architecture: Dividing their applications into smaller, independently deployable services allows for granular scaling and fault isolation.
Diverse Instance Types: Using a mix of EC2 instance types lets them optimize for performance and cost based on the specific needs of each microservice.
Multi-Region and Multi-AZ: Their infrastructure spans multiple regions and availability zones worldwide, providing geographic redundancy and fault tolerance.
Heavy Use of Auto Scaling Groups: Netflix deploys extensive use of ASGs, relying on them for:
Workload-based scaling: Automatically adjusting capacity to meet demand.
Self-healing: Replacing unhealthy instances to maintain service availability.
Rolling Deployments: Updating software across their services gradually, ensuring a safe rollout with zero downtime.
Specific Scaling Strategies
Predictive and Scheduled Scaling: Anticipating demand based on historical data and schedule-based increases during known peak times (evenings, weekends).
Chaos Engineering: Proactive testing of resilience by deliberately introducing failures. Netflix’s Chaos Monkey is a well-known tool for this purpose.
Custom Autoscaling Logic: Netflix has developed sophisticated in-house autoscaling tools tailored to their complex needs.
Outcomes
Seamless User Experience: Viewers rarely experience disruptions, even during extreme traffic events.
Global Availability: The combination of geographic redundancy, multi-AZ infrastructure, and autoscaling helps Netflix remain highly accessible worldwide.
Operational Efficiency: Automation and autoscaling reduce the burden on their operations teams, allowing them to focus on innovation.
Lessons Learned
Autoscaling is Not a Silver Bullet: It requires meticulous architecture, proper metrics, and continuous testing.
Start Simple, Iterate: Begin with basic autoscaling and gradually add sophisticated techniques (scheduled, predictive) as your understanding of workload patterns and system behavior grows.
Resilience as a First-Class Goal: Build your systems with resilience in mind from the start, considering fault tolerance, and graceful degradation.
AWS Lambda
AWS Lambda at a Glance
Serverless Function Service: AWS Lambda lets you run code without worrying about provisioning or managing servers. It’s the core pillar of serverless computing on AWS.
Pay-Per-Use: You’re charged based on the number of requests and the duration your code executes, making it highly cost-effective.
Scaling Built-in: Lambda can scale automatically in response to incoming events, handling massive spikes in traffic effortlessly.
Language Support: Write your functions in popular languages like Python, Node.js, Java, Go, and more.
Event-Driven Architectures
Core Concept: Systems designed to react to specific events (e.g., a file uploaded to S3, a new database entry, or an API call). This model allows for decoupled components that respond autonomously.
Advantages:
Scalability: Each event can trigger concurrent Lambda instances for seamless scaling.
Agility: Makes it easy to add new features and react to events quickly without modifying a large codebase.
Cost-Effectiveness: With Lambda’s pricing model, you often pay less compared to running always-on servers.
Key components of AWS Lambda:
1. Function
The Core: This is where your application logic resides—the code that does the actual work. AWS Lambda supports many popular programming languages like Python, Java, Node.js, C#, Go, and more.
Handler: The handler is the designated entry point for your code. It’s a function within your code that Lambda executes when the function is triggered.
2. Configuration
Runtime: The runtime environment determines which programming language your Lambda function will use (e.g., Python 3.9, Node.js 16, etc.).
Memory: You allocate a specific amount of memory to your function. More memory can lead to faster execution but incurs a higher cost.
Timeout: The maximum duration your function is allowed to run. If it exceeds this limit, Lambda will terminate it.
Execution Role: An IAM (Identity and Access Management) role that defines the permissions your Lambda function has when interacting with other AWS services.
3. Event Source (Trigger)
What sets it off: An event source is what triggers your Lambda function to run. This could be a wide variety of things:
Changes in an S3 bucket (a file is uploaded)
An HTTP request to an API Gateway endpoint
Messages arriving in an SQS queue
Scheduled events (e.g., run a function every hour)
Many other AWS services
Other Important Components
Layers: Lambda Layers allow you to package up libraries, dependencies, or custom runtimes and share them across multiple Lambda functions.
Container Images: You can package and deploy Lambda functions as container images. This provides more flexibility in terms of dependencies and the runtime.
Lambda SnapStart: Optimized execution environments for certain runtimes (like Java) that can significantly reduce cold start times.
Triggers: What Initiates Lambda Functions
AWS services can directly trigger Lambda functions in various ways:
S3 Events: A file upload, deletion, or change in your S3 bucket.
API Gateway: Invoke Lambda functions as backends for your REST APIs.
DynamoDB Streams: Lambda can process updates to your DynamoDB tables in real-time.
SNS (Simple Notification Service): React to messages published to an SNS topic.
Scheduled Events: Functions can be triggered on a regular schedule (e.g., run data transformations daily).
Integration Patterns
Lambda integrates seamlessly with other AWS services, facilitating common use cases:
Data Processing:
- Triggered by changes in S3, Lambda can perform image resizing, video transcoding, or data analysis.
Web & Mobile Backends:
- Create REST APIs with API Gateway and Lambda, avoiding the need to manage web servers.
Microservices:
- Build small, interconnected services using Lambda functions, often paired with API Gateway for communication.
Stream Processing:
- Combine Lambda with Kinesis Data Streams for real-time processing of streaming data.
Key Considerations
Statelessness: Lambda functions are stateless; any data you need to persist should be stored in services like S3 or DynamoDB.
Concurrency Limits: Be aware of default concurrency limits, adjustable if needed.
Cold Starts: The first invocation of a new Lambda function may incur some extra latency. This is usually mitigated by design and keeping functions warm.
How can I define a Lambda handler?
Understanding the Handler
The handler is the function within your Lambda codebase that AWS Lambda directly calls when the function is triggered. It has two main responsibilities:
Receiving the Event: The event object contains data relevant to the trigger (e.g., an S3 file upload event, an API request).
Returning a Response: Your handler function processes the event data and typically provides a response or result.
Defining the Handler
The way you define a handler depends on your chosen programming language:
- Python:
def lambda_handler(event, context):
    # Your code to process the event
    return {"statusCode": 200, "body": "OK"}
- Node.js:
exports.handler = async (event, context) => {
  // Your code to process the event
  return { statusCode: 200, body: "OK" };
};
- Java:
// EventType and ResponseType are placeholders for your concrete types
public class MyHandler implements RequestHandler<EventType, ResponseType> {
    public ResponseType handleRequest(EventType event, Context context) {
        // Your code to process the event
        return someResponse;
    }
}
Where to Specify the Handler:
- AWS Console:
When creating your Lambda function, you’ll find a “Runtime settings” section.
Enter the handler in the format `filename.function_name` (e.g., `lambda_function.lambda_handler` for a Python code file named `lambda_function.py`).
Deployment Tools (SAM, Serverless Framework, etc.):
- Configuration files are used to specify the handler along with other function details.
Important Notes:
The `event` and `context` objects provide information about the trigger and the Lambda runtime environment.
The code file’s name can be whatever you like, but the handler setting must match the actual file name and function name.
Example event-driven architectures for common use cases
1. Real-Time Image and Video Processing
Scenario: Users upload images or videos to a website or application that require processing (resizing, format conversion, thumbnail generation, etc.).
Event-Driven Flow:
User uploads a file to an S3 bucket.
The S3 object creation event triggers an AWS Lambda function.
The Lambda function performs the necessary image/video processing tasks.
Processed files are saved back into S3 or delivered to a designated location.
Benefits:
Highly scalable to handle bursts of uploads.
Cost-effective—you don’t pay for idle servers.
Responsive experience for users as processing happens in the background.
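A hypothetical handler for this flow — triggered by the S3 object-created event, it writes a thumbnail to a separate output bucket. Pillow is assumed to be available (e.g., packaged in a Lambda Layer), and the output bucket name is a placeholder:
```python
import io

import boto3
from PIL import Image  # assumed packaged with the function

s3 = boto3.client("s3")

def lambda_handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Download the uploaded image and create a 256px thumbnail
        obj = s3.get_object(Bucket=bucket, Key=key)
        img = Image.open(io.BytesIO(obj["Body"].read()))
        img.thumbnail((256, 256))

        out = io.BytesIO()
        img.save(out, format="JPEG")
        out.seek(0)

        # Write to a different bucket to avoid re-triggering this function
        s3.put_object(Bucket="my-thumbnails-bucket",  # placeholder
                      Key=f"thumb-{key}", Body=out)
    return {"processed": len(event["Records"])}
```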
2. Data Pipelines and Analytics
Scenario: You need a system to collect, transform, and analyze data from various sources like IoT sensors or logs.
Event-Driven Flow:
Data from sensors or logs is sent continuously to an Amazon Kinesis Data Stream.
Kinesis triggers a Lambda function to process each data record in real-time.
The Lambda function cleans, transforms, and potentially enriches the data.
Processed data is sent to a data warehouse (e.g., Redshift) or a real-time dashboard for analysis.
Benefits:
Real-time insights into operational metrics or user behavior.
Flexibility to adjust data processing logic independently.
Scalability to handle varying volumes of data.
3. Serverless Web Applications
Scenario: You want to build a dynamic website or API without managing traditional web servers.
Event-Driven Flow:
HTTP requests from users reach AWS API Gateway.
API Gateway routes requests to corresponding Lambda functions.
Lambda functions interact with databases (e.g., DynamoDB), external APIs, or other services as needed.
Lambda functions generate the HTTP response sent back to the user.
Benefits:
No server management, enhancing focus on your application logic.
Rapid scaling to handle traffic peaks.
Can be highly cost-efficient, especially for variable workloads.
4. Automated IT and DevOps Tasks
Scenario: You want to automate tasks like responding to alerts, provisioning resources, or running scheduled cleanup jobs.
Event-Driven Flow:
A CloudWatch Event can trigger a Lambda function based on a schedule, an alarm, or changes in AWS resources.
The Lambda function executes the desired automation tasks. Examples include:
Snapshots of important data or databases.
Starting or stopping EC2 instances on a schedule.
Sending notifications to a Slack channel based on triggered alarms.
Benefits:
Improves efficiency by automating routine tasks.
Saves resources and cost by starting/stopping resources when needed.
Lambda Performance optimization: Layers, concurrency, provisioned concurrency
1. Layers: Streamlining Code and Dependencies
Concept: Lambda Layers are ZIP archives containing libraries, custom runtime code, or other dependencies that can be shared across multiple functions.
Benefits:
Reduction of Deployment Package Size: Pre-packaging dependencies reduces your function’s code bundle, improving cold start times.
Reusability: Share common libraries and code across multiple functions, easing maintenance.
Versioning: Layers can be versioned, enabling controlled updates.
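A hedged boto3 sketch of publishing a layer from a prebuilt dependency ZIP and attaching it to a function (the file, layer, and function names are placeholders):
```python
import boto3

lambda_client = boto3.client("lambda")

# Publish a layer version from a prebuilt ZIP of shared dependencies
with open("python-deps-layer.zip", "rb") as f:  # placeholder file
    layer = lambda_client.publish_layer_version(
        LayerName="shared-image-libs",
        Content={"ZipFile": f.read()},
        CompatibleRuntimes=["python3.12"],
    )

# Attach the published layer version to a function
lambda_client.update_function_configuration(
    FunctionName="resize-images",  # placeholder
    Layers=[layer["LayerVersionArn"]],
)
```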
2. Concurrency: Fine-Tuning Parallelism
Basics: Concurrency refers to the number of instances of your Lambda function that can be processing requests simultaneously.
Optimization Considerations:
Function Duration: If your functions execute quickly, higher concurrency allows for serving more requests with lower latency.
Downstream Service Limits: If your Lambda function interacts with a database, you must consider the maximum concurrent connections your database can handle.
Error Handling: Ensure your code is robust when handling concurrent execution.
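One common lever here is reserved concurrency, which caps a function so it cannot overwhelm a downstream dependency. A minimal boto3 sketch (the function name and limit are placeholders):
```python
import boto3

lambda_client = boto3.client("lambda")

# Cap this function at 50 concurrent executions so it cannot exhaust
# the database connection pool it writes to
lambda_client.put_function_concurrency(
    FunctionName="orders-writer",       # placeholder
    ReservedConcurrentExecutions=50,
)
```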
3. Provisioned Concurrency: Combating Cold Starts
Concept: Provisioned concurrency keeps Lambda functions initialized and ready to serve requests, virtually eliminating cold starts.
When to Use:
Applications with strict latency requirements where cold starts are unacceptable.
Workloads with predictable traffic patterns and where you understand the necessary level of pre-warmed instances.
Important Notes:
There is a cost associated with provisioned concurrency even when the functions aren’t actively executing.
Provisioned concurrency needs careful calibration to find the optimal balance between cost and elimination of cold starts.
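A minimal boto3 sketch of enabling provisioned concurrency (the function name, alias, and count are placeholders; the qualifier must be a published version or alias, not $LATEST):
```python
import boto3

lambda_client = boto3.client("lambda")

# Keep 25 execution environments initialized and ready; billed while provisioned
lambda_client.put_provisioned_concurrency_config(
    FunctionName="checkout-api",  # placeholder
    Qualifier="prod",             # a published version or alias, not $LATEST
    ProvisionedConcurrentExecutions=25,
)
```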
4. Lambda SnapStart:
See more: https://shinchan.asia/2023/02/03/gioi-thieu-so-luoc-mot-so-dich-vu-cua-aws-trong-reinvent-2022/
Additional Performance Tips
Choose the Right Memory Allocation: Lambda charges based on memory and execution time. Finding the right memory allocation can lead to more efficient execution.
Language Choice: Languages like Node.js or Python often have faster cold start times compared to Java or .NET. Consider this if cold starts are a major concern.
Code Optimization: Standard software performance best practices apply to Lambda as well. Write efficient algorithms and avoid unnecessary operations.
Real-world examples of how companies have used Lambda optimization techniques
Example 1: Image Processing at SmugMug
Challenge: SmugMug, a photo-sharing platform, needed to optimize its image resizing process, which was experiencing increasing latency.
Solutions:
Layers: Common image processing libraries were packaged as Lambda Layers, resulting in smaller deployment packages and faster cold starts.
Memory Optimization: By experimenting with memory allocation, SmugMug found the ideal configuration that balanced performance and cost.
Results: Improved processing speed and dramatically reduced costs due to smaller Lambda packages and efficient memory usage.
Example 2: Serverless API at Nordstrom
Challenge: Nordstrom wanted to build highly scalable APIs with minimal latency for its online retail applications.
Solutions:
Provisioned Concurrency: Used for critical API endpoints to ensure consistent low-latency responses and eliminate cold starts.
Concurrency Tuning: Nordstrom carefully adjusted concurrency levels based on the nature of each API endpoint to balance performance and cost.
Results: Significantly faster backend response times, resulting in a smoother and more responsive user experience on their e-commerce platform.
Example 3: Real-Time Analytics at Coca-Cola
Challenge: Coca-Cola needed to process vast amounts of IoT data from their vending machines for near real-time analytics.
Solutions:
Concurrency Management: Lambda’s built-in scaling handled spikes in data, with concurrency levels carefully adjusted for different parts of the data processing pipeline.
Optimized Code: Optimized algorithms for data transformation and aggregation further improved the speed of the Lambda functions.
Results: A real-time dashboard provided Coca-Cola with insights on vending machine performance, optimizing inventory and maintenance.
Key Takeaways
The Right Tool for the Job: Each optimization technique has ideal use cases. Picking the right ones is as important as implementing them well.
Test and Measure: Profiling your Lambda functions and load testing are essential to understand real-world performance impact.
Optimization is Iterative: Continuous monitoring and improvement are important. As usage patterns or code change, your best settings might change as well.