
AZ-104: Microsoft Azure Administrator Comprehensive Study Guide

Complete Learning Path for Certification Success

Overview

This study guide provides a structured learning path from fundamentals to exam readiness for the Microsoft Azure Administrator Associate (AZ-104) certification. Designed for complete novices, it teaches all concepts progressively while focusing exclusively on exam-relevant content. Extensive diagrams and visual aids are integrated throughout to enhance understanding and retention.

Target Audience: Complete beginners with little to no Azure experience who need to learn everything from scratch.

Study Commitment: 6-10 weeks of dedicated study (2-3 hours per day)

What Makes This Guide Different:

  • Self-sufficient: You should NOT need external resources to understand concepts
  • Comprehensive: Explains WHY and HOW, not just WHAT
  • Novice-friendly: Assumes no prior Azure knowledge, builds up progressively
  • Example-rich: Multiple practical examples for every concept (3+ examples per topic)
  • Visually detailed: 120-200 diagrams with detailed explanations (200-800 words each)

File Organization

Study Chapters (Read in Order)

00_overview (this file) - How to use the guide and study plan

01_fundamentals - Chapter 0: Essential Background & Prerequisites

  • Azure cloud computing basics
  • Core Azure concepts and terminology
  • Azure hierarchy (subscriptions, resource groups, resources)
  • Azure Portal, CLI, and PowerShell fundamentals
  • Mental model of Azure ecosystem

02_domain_1_identities_governance - Chapter 1: Manage Azure Identities and Governance (24% of exam)

  • Microsoft Entra ID (formerly Azure AD) fundamentals
  • User and group management
  • Role-Based Access Control (RBAC)
  • Azure Policy and governance
  • Subscriptions and management groups
  • Cost management and tagging

03_domain_2_storage - Chapter 2: Implement and Manage Storage (18% of exam)

  • Storage account types and configuration
  • Blob storage and containers
  • Azure Files and file shares
  • Storage security (SAS, access keys, firewalls)
  • Storage redundancy and replication
  • Data management tools (Storage Explorer, AzCopy)

04_domain_3_compute - Chapter 3: Deploy and Manage Azure Compute Resources (24% of exam)

  • Virtual machines (VMs) creation and configuration
  • VM availability (availability sets, availability zones)
  • VM disks and encryption
  • Azure Virtual Machine Scale Sets
  • Containers (Azure Container Instances, Container Apps, Container Registry)
  • Azure App Service and deployment
  • ARM templates and Bicep

05_domain_4_networking - Chapter 4: Implement and Manage Virtual Networking (18% of exam)

  • Virtual networks (VNets) and subnets
  • Network security groups (NSGs) and application security groups
  • Virtual network peering
  • Azure Bastion and secure access
  • Service endpoints and private endpoints
  • Azure DNS configuration
  • Load balancing solutions

06_domain_5_monitoring - Chapter 5: Monitor and Maintain Azure Resources (16% of exam)

  • Azure Monitor fundamentals
  • Metrics and logs
  • Log Analytics and KQL queries
  • Alerts and action groups
  • Azure Backup configuration
  • Azure Site Recovery
  • Network Watcher tools

07_integration - Integration & Cross-Domain Scenarios

  • Multi-domain architecture patterns
  • Common exam scenario types
  • Decision frameworks for complex scenarios
  • Real-world integration examples

08_study_strategies - Study Techniques & Test-Taking Strategies

  • Effective study methods
  • Memory aids and mnemonics
  • Time management during exam
  • Question analysis techniques
  • Handling difficult questions

09_final_checklist - Final Week Preparation Checklist

  • 7-day countdown plan
  • Knowledge audit checklist
  • Practice test marathon schedule
  • Exam day preparation

99_appendices - Quick Reference & Resources

  • Service comparison matrices
  • Important limits and quotas
  • Common formulas and calculations
  • Glossary of terms
  • Additional resources

Diagrams Folder

diagrams/ - Contains all Mermaid diagram files (.mmd)

  • 120-200 visual diagrams organized by chapter
  • Architecture diagrams, sequence flows, decision trees
  • Each diagram referenced in the corresponding chapter

Study Plan Overview

Recommended Timeline: 8 Weeks (2-3 hours daily)

Week 1: Foundations

  • Read: 00_overview (this file)
  • Read: 01_fundamentals
  • Practice: Set up free Azure account, explore Portal
  • Goal: Understand Azure basics and terminology

Week 2: Identity & Governance (Part 1)

  • Read: 02_domain_1_identities_governance (first half)
  • Focus: Microsoft Entra ID, users, groups, RBAC
  • Practice: Create users, assign roles, test permissions
  • Goal: Master identity management

Week 3: Identity & Governance (Part 2)

  • Read: 02_domain_1_identities_governance (second half)
  • Focus: Azure Policy, subscriptions, cost management
  • Practice: Create policies, configure budgets, apply tags
  • Goal: Understand governance and cost control

Week 4: Storage

  • Read: 03_domain_2_storage
  • Focus: Storage accounts, blobs, files, security
  • Practice: Create storage accounts, upload files, configure SAS
  • Goal: Master Azure storage solutions

Week 5: Compute

  • Read: 04_domain_3_compute
  • Focus: VMs, scale sets, containers, App Service
  • Practice: Deploy VMs, create scale sets, deploy containers
  • Goal: Understand compute options and deployment

Week 6: Networking

  • Read: 05_domain_4_networking
  • Focus: VNets, NSGs, peering, load balancing
  • Practice: Create VNets, configure NSGs, set up peering
  • Goal: Master virtual networking

Week 7: Monitoring & Integration

  • Read: 06_domain_5_monitoring
  • Read: 07_integration
  • Focus: Azure Monitor, backup, cross-domain scenarios
  • Practice: Configure monitoring, set up backups, test recovery
  • Goal: Understand monitoring and integration patterns

Week 8: Exam Preparation

  • Read: 08_study_strategies
  • Read: 09_final_checklist
  • Practice: Full practice tests, review weak areas
  • Goal: Achieve 75%+ on practice tests

Alternative Timeline: 6 Weeks (Accelerated)

For those with some IT background:

  • Weeks 1-2: Fundamentals + Domain 1 (files 01-02)
  • Week 3: Domain 2 + Domain 3 (files 03-04)
  • Week 4: Domain 4 + Domain 5 (files 05-06)
  • Week 5: Integration + Practice tests (file 07)
  • Week 6: Final prep + exam strategies (files 08-09)

Alternative Timeline: 10 Weeks (Thorough)

For complete beginners:

  • Week 1: Fundamentals only (file 01)
  • Weeks 2-3: Domain 1 (file 02)
  • Week 4: Domain 2 (file 03)
  • Weeks 5-6: Domain 3 (file 04)
  • Week 7: Domain 4 (file 05)
  • Week 8: Domain 5 + Integration (files 06-07)
  • Week 9: Practice tests and review
  • Week 10: Final prep (files 08-09)

Learning Approach

The 3-Pass Method

Pass 1: Understanding (First Read)

  • Read each chapter thoroughly from start to finish
  • Take detailed notes on ⭐ Must Know items
  • Study all diagrams and their explanations
  • Complete practice exercises after each section
  • Don't worry about memorization yet - focus on understanding

Pass 2: Application (Second Read)

  • Review chapter summaries and key concepts
  • Focus on decision frameworks and comparison tables
  • Practice hands-on labs in Azure Portal
  • Take domain-focused practice tests
  • Identify weak areas for additional study

Pass 3: Reinforcement (Final Review)

  • Review only flagged items and weak areas
  • Memorize critical facts, limits, and formulas
  • Take full-length practice tests
  • Review all diagrams for visual recall
  • Use 99_appendices as quick reference

Active Learning Techniques

  1. Teach Someone: Explain concepts out loud to a friend, colleague, or even a rubber duck
  2. Draw Diagrams: Recreate architecture diagrams from memory
  3. Write Scenarios: Create your own exam-style questions
  4. Compare Options: Use comparison tables to understand differences
  5. Hands-On Practice: Actually configure services in Azure Portal (use free tier)

Progress Tracking

Use checkboxes to track your completion:

Chapter Completion:

  • 01_fundamentals completed
  • 02_domain_1_identities_governance completed
  • 03_domain_2_storage completed
  • 04_domain_3_compute completed
  • 05_domain_4_networking completed
  • 06_domain_5_monitoring completed
  • 07_integration completed
  • 08_study_strategies completed
  • 09_final_checklist completed
  • 99_appendices reviewed

Practice Test Scores (Target: 75%+ to pass):

  • Beginner Bundle 1: ____%
  • Beginner Bundle 2: ____%
  • Intermediate Bundle 1: ____%
  • Intermediate Bundle 2: ____%
  • Advanced Bundle 1: ____%
  • Advanced Bundle 2: ____%
  • Full Practice Test 1: ____%
  • Full Practice Test 2: ____%
  • Full Practice Test 3: ____%

Self-Assessment by Domain (Rate 1-5, target 4+):

  • Domain 1 (Identities & Governance): ___/5
  • Domain 2 (Storage): ___/5
  • Domain 3 (Compute): ___/5
  • Domain 4 (Networking): ___/5
  • Domain 5 (Monitoring): ___/5

Legend & Visual Markers

Throughout this guide, you'll see these markers:

  • ⭐ Must Know: Critical information for the exam - memorize this
  • 💡 Tip: Helpful insight, shortcut, or best practice
  • ⚠️ Warning: Common mistake or misconception to avoid
  • 🔗 Connection: Related to other topics in the guide
  • 📝 Practice: Hands-on exercise or lab suggestion
  • 🎯 Exam Focus: Frequently tested on the actual exam
  • 📊 Diagram: Visual representation available (see diagrams folder)

How to Navigate This Guide

For Complete Beginners

  1. Start with 01_fundamentals - don't skip this
  2. Read chapters sequentially (02 → 03 → 04 → 05 → 06)
  3. Spend extra time on diagrams - they're your best friend
  4. Do hands-on practice after each chapter
  5. Use 99_appendices as a quick reference when needed

For IT Professionals with Some Cloud Experience

  1. Skim 01_fundamentals to fill knowledge gaps
  2. Focus on Azure-specific concepts in each domain chapter
  3. Pay attention to ⭐ Must Know and 🎯 Exam Focus items
  4. Use comparison tables to understand service differences
  5. Jump to 07_integration for advanced scenarios

For Visual Learners

  1. Start by reviewing all diagrams in the diagrams/ folder
  2. Read the diagram explanations in each chapter
  3. Recreate diagrams from memory as a study technique
  4. Use diagrams to understand relationships between services
  5. Create your own diagrams for complex scenarios

For Hands-On Learners

  1. Set up a free Azure account immediately
  2. Follow 📝 Practice exercises after each section
  3. Build real resources in Azure Portal as you learn
  4. Break things and troubleshoot (best way to learn)
  5. Use Azure sandbox environments for safe experimentation

Study Resources

Included in This Package

  • This comprehensive study guide (60,000-120,000 words)
  • 120-200 Mermaid diagrams for visual learning
  • 600 practice questions organized by domain and difficulty
  • Practice test bundles (difficulty-based, domain-focused, full tests)
  • Quick reference cheat sheet (separate file)

External Resources (Optional)

Tools You'll Need

  • Web Browser: For Azure Portal access
  • Azure CLI (optional): Command-line tool for Azure management
  • Azure PowerShell (optional): PowerShell module for Azure
  • Azure Storage Explorer (optional): GUI tool for storage management
  • Text Editor: For viewing ARM templates and scripts

Exam Details

Exam Information

  • Exam Code: AZ-104
  • Exam Name: Microsoft Azure Administrator
  • Duration: 120 minutes (150 minutes for non-native English speakers)
  • Number of Questions: 40-60 questions
  • Passing Score: 700 (out of 1000)
  • Question Types: Multiple choice, multiple select, drag-and-drop, hot area, case studies
  • Cost: $165 USD (varies by region)
  • Languages: English, Japanese, Chinese (Simplified), Korean, German, French, Spanish, Portuguese (Brazil), Russian, Arabic (Saudi Arabia), Chinese (Traditional), Italian

What to Expect

  • Case Studies: 1-3 case studies with multiple questions each
  • Scenario-Based: Most questions present real-world scenarios
  • Multiple Correct Answers: Some questions require selecting 2-3 correct answers
  • Performance-Based: Some questions involve configuring settings or ordering steps
  • No Partial Credit: Must select ALL correct answers to get points

Scoring

  • Scored on a scale of 100-1000
  • Passing score is 700
  • Some questions are experimental and don't count toward your score
  • You won't know which questions are experimental
  • Results available immediately after exam

Renewal

  • Certification valid for 12 months
  • Renewal required annually through Microsoft Learn
  • Free renewal assessment (no exam fee)
  • Renewal window opens 6 months before expiration - complete it before the certification expires

Tips for Success

Before You Start

  1. Set realistic goals: 2-3 hours of study per day for 6-10 weeks
  2. Create a schedule: Block out study time on your calendar
  3. Get hands-on access: Sign up for Azure free account immediately
  4. Eliminate distractions: Find a quiet study space
  5. Join a community: Consider study groups or online forums

During Your Study

  1. Follow the plan: Don't skip chapters or rush through content
  2. Take notes: Write down key concepts in your own words
  3. Practice regularly: Hands-on experience is crucial
  4. Test yourself: Use practice questions after each chapter
  5. Review mistakes: Understand WHY you got questions wrong
  6. Use diagrams: Visual learning enhances retention
  7. Teach others: Explaining concepts solidifies understanding

Final Week

  1. Review weak areas: Focus on domains where you scored <70%
  2. Take full practice tests: Simulate exam conditions
  3. Review cheat sheet: Quick refresher of key facts
  4. Don't cram: Light review only, trust your preparation
  5. Get rest: 8 hours of sleep before exam day

Exam Day

  1. Arrive early: 30 minutes before scheduled time
  2. Bring ID: Government-issued photo ID required
  3. Brain dump: Write down key facts on scratch paper immediately
  4. Read carefully: Don't rush through questions
  5. Flag and move on: Don't get stuck on difficult questions
  6. Review flagged: Use remaining time to review marked questions

Common Pitfalls to Avoid

Study Mistakes

  • ❌ Skipping fundamentals chapter (01_fundamentals)
  • ❌ Reading without hands-on practice
  • ❌ Memorizing without understanding
  • ❌ Ignoring diagrams and visual aids
  • ❌ Not taking practice tests until the end
  • ❌ Studying only theory without practical application
  • ❌ Rushing through chapters to "finish faster"

Exam Mistakes

  • ❌ Not reading questions carefully
  • ❌ Spending too much time on one question
  • ❌ Changing answers without good reason
  • ❌ Ignoring keywords in questions (e.g., "least cost", "most secure")
  • ❌ Not flagging difficult questions for review
  • ❌ Panicking when encountering unfamiliar topics

Motivation & Mindset

Why This Certification Matters

  • Career advancement: Azure administrators are in high demand
  • Salary increase: Certified professionals earn 15-20% more on average
  • Skill validation: Proves your Azure administration expertise
  • Foundation: Stepping stone to advanced Azure certifications
  • Cloud expertise: Essential skill in modern IT infrastructure

Staying Motivated

  • Set milestones: Celebrate completing each chapter
  • Track progress: Use the checkboxes in this guide
  • Visualize success: Imagine yourself passing the exam
  • Remember your why: Keep your career goals in mind
  • Take breaks: Rest is part of effective learning
  • Join communities: Connect with other learners for support

Growth Mindset

  • Embrace challenges: Difficult topics are opportunities to learn
  • Learn from mistakes: Every wrong answer teaches something
  • Persist through frustration: Confusion is part of the learning process
  • Ask for help: Use forums, communities, and study groups
  • Celebrate progress: Acknowledge how far you've come

Ready to Begin?

You're about to embark on a comprehensive learning journey. This guide contains everything you need to pass the AZ-104 exam and become a confident Azure Administrator.

Your next step: Open 01_fundamentals and start Chapter 0.

Remember:

  • Quality over speed: Understanding is more important than rushing
  • Practice makes perfect: Hands-on experience is essential
  • Consistency wins: 2 hours daily beats 14 hours on weekends
  • You can do this: Thousands have passed this exam, and so will you

Good luck on your certification journey! 🚀


Last Updated: October 2025
Study Guide Version: 1.0
Exam Version: April 18, 2025 skills measured


Chapter 0: Essential Background & Prerequisites

File: 01_fundamentals

What You Need to Know First

This chapter builds the foundation for your Azure Administrator journey. Before diving into specific Azure services, you need to understand the fundamental concepts that underpin everything in Azure. Think of this as learning the alphabet before reading books.

Prerequisites for this certification:

  • Basic understanding of computer networks (IP addresses, DNS, HTTP/HTTPS)
  • Familiarity with operating systems (Windows or Linux basics)
  • Understanding of basic IT concepts (servers, storage, databases)
  • Comfort using a web browser and basic computer operations

If you're missing any: Don't worry! We'll explain Azure-specific concepts from scratch. For general IT concepts, consider reviewing basic networking and operating system tutorials online before continuing.

Time to complete this chapter: 3-4 hours


Core Concepts Foundation

What is Cloud Computing?

What it is: Cloud computing is the delivery of computing services (servers, storage, databases, networking, software) over the internet ("the cloud") instead of owning and maintaining physical hardware in your own data center.

Why it exists: Traditional IT infrastructure requires significant upfront investment in hardware, physical space, cooling, power, and maintenance staff. When your business grows, you need to buy more servers (which takes weeks or months). When demand decreases, those servers sit idle, wasting money. Cloud computing solves these problems by letting you rent computing resources on-demand, paying only for what you use.

Real-world analogy: Think of cloud computing like electricity. You don't build your own power plant to get electricity - you plug into the power grid and pay for what you use. Similarly, you don't build your own data center - you "plug into" Azure and pay for the computing resources you consume.

How it works (Detailed step-by-step):

  1. You sign up for a cloud provider (like Microsoft Azure) and create an account. This gives you access to their global network of data centers.
  2. You select the services you need through a web portal or command-line tools. For example, you might create a virtual machine (a computer in the cloud) or a storage account (disk space in the cloud).
  3. The cloud provider provisions these resources in their data centers within seconds or minutes. Behind the scenes, they're allocating physical hardware, configuring networking, and setting up security.
  4. You access and manage your resources over the internet using web browsers, mobile apps, or command-line tools. You can start, stop, resize, or delete resources at any time.
  5. You're billed based on usage - typically by the hour or minute for compute resources, and by the gigabyte for storage. When you delete resources, billing stops.
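The lifecycle above can be sketched with the Azure CLI. This is a minimal illustration, not a prescribed setup: the resource names (demo-rg, demo-vm) are hypothetical, it assumes you have an active subscription and have already signed in with `az login`, and a small VM size is chosen to keep costs low.

```shell
# 1-2. Create a resource group (a logical container) in a region,
# then provision a small virtual machine inside it.
az group create --name demo-rg --location eastus

az vm create \
  --resource-group demo-rg \
  --name demo-vm \
  --image Ubuntu2204 \
  --size Standard_B1s \
  --generate-ssh-keys

# 3-4. Manage the resource over the internet: deallocating stops
# compute billing while keeping the VM's configuration and disks.
az vm deallocate --resource-group demo-rg --name demo-vm

# 5. Delete the resource group to remove everything in it and stop all charges.
az group delete --name demo-rg --yes
```

Note how step 5 mirrors the "billing stops when you delete resources" point: deleting the resource group tears down the VM, its disks, and its network resources in one operation.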

💡 Tip: The key advantage of cloud computing is elasticity - the ability to quickly scale resources up or down based on demand. During Black Friday, an e-commerce site can add 100 servers in minutes, then remove them when traffic returns to normal.

Cloud Service Models

Cloud computing offers three main service models, each providing different levels of control and management:

Infrastructure as a Service (IaaS)

What it is: IaaS provides virtualized computing resources over the internet. You rent virtual machines, storage, and networks, but you're responsible for managing the operating system, applications, and data.

Why it exists: IaaS gives you maximum control and flexibility. It's like renting an empty apartment - you get the space and utilities, but you furnish it and maintain it yourself.

Real-world analogy: Renting a car. The rental company provides the vehicle (infrastructure), but you're responsible for driving it, putting gas in it, and returning it in good condition.

When to use:

  • ✅ Use when: You need full control over the operating system and installed software
  • ✅ Use when: You're migrating existing applications to the cloud ("lift and shift")
  • ✅ Use when: You have specific compliance requirements that require OS-level control
  • ❌ Don't use when: You want to minimize management overhead (consider PaaS instead)

Azure IaaS examples: Virtual Machines, Virtual Networks, Storage Accounts

Platform as a Service (PaaS)

What it is: PaaS provides a complete development and deployment environment in the cloud. The cloud provider manages the infrastructure, operating system, and runtime environment. You focus only on your application and data.

Why it exists: PaaS eliminates the complexity of managing infrastructure, allowing developers to focus on writing code. It's like renting a furnished apartment - you get everything you need to move in and start living.

Real-world analogy: Taking an Uber. You don't worry about the car, maintenance, or driving - you just specify your destination and the service handles everything else.

When to use:

  • ✅ Use when: You want to focus on application development, not infrastructure management
  • ✅ Use when: You need built-in scalability and high availability
  • ✅ Use when: Multiple developers need to collaborate on the same project
  • ❌ Don't use when: You need OS-level customization or specific software installations

Azure PaaS examples: Azure App Service, Azure SQL Database, Azure Functions
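The PaaS idea of "you bring only code" can be illustrated with a single hedged Azure CLI sketch. The app name here is hypothetical (App Service names must be globally unique), and the command assumes you run it from a folder containing your application code after `az login`:

```shell
# Deploy the code in the current folder to Azure App Service.
# Azure creates the App Service plan, the web app, and the runtime
# for you - the provider manages the OS, patching, and scaling.
az webapp up --name demo-webapp-12345 --runtime "PYTHON:3.11" --sku F1
```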

Software as a Service (SaaS)

What it is: SaaS provides complete, ready-to-use applications over the internet. You simply use the software through a web browser or app - the provider manages everything else.

Why it exists: SaaS eliminates all infrastructure and application management. It's like staying in a hotel - everything is provided and maintained for you.

Real-world analogy: Netflix. You don't install software, manage servers, or worry about updates. You just log in and watch movies.

When to use:

  • ✅ Use when: You need standard business applications (email, CRM, collaboration tools)
  • ✅ Use when: You want zero management overhead
  • ✅ Use when: You need quick deployment with no setup time
  • ❌ Don't use when: You need to customize the application significantly

Azure SaaS examples: Microsoft 365, Dynamics 365, Azure DevOps

Comparison Table:

| Aspect | IaaS | PaaS | SaaS |
| --- | --- | --- | --- |
| You manage | Applications, Data, Runtime, OS | Applications, Data | Nothing (just use it) |
| Provider manages | Virtualization, Servers, Storage, Networking | Everything except apps & data | Everything |
| Control level | High | Medium | Low |
| Management effort | High | Medium | Low |
| Flexibility | Maximum | Moderate | Minimal |
| Example | Azure VM | Azure App Service | Microsoft 365 |
| Best for | Custom infrastructure needs | Application development | End-user applications |

⭐ Must Know: For the AZ-104 exam, you'll primarily work with IaaS (Virtual Machines, Storage, Networking) and some PaaS services (App Service, Azure SQL Database). Understanding the difference is crucial for choosing the right service.

Understanding the Azure Hierarchy

Azure organizes resources in a hierarchical structure with four levels. Understanding this hierarchy is absolutely critical for the AZ-104 exam because it affects permissions, billing, policies, and resource management.

📊 Azure Organization Hierarchy Diagram:

graph TB
    subgraph "Azure Organization Hierarchy"
        ROOT[Root Management Group<br/>Tenant Level]
        
        subgraph "Management Groups Layer"
            MG1[Management Group:<br/>Production]
            MG2[Management Group:<br/>Development]
            MG3[Management Group:<br/>Corporate]
        end
        
        subgraph "Subscriptions Layer"
            SUB1[Subscription:<br/>Prod-App1<br/>Billing Boundary]
            SUB2[Subscription:<br/>Prod-App2<br/>Billing Boundary]
            SUB3[Subscription:<br/>Dev-Testing<br/>Billing Boundary]
        end
        
        subgraph "Resource Groups Layer"
            RG1[Resource Group:<br/>WebApp-RG<br/>Lifecycle Container]
            RG2[Resource Group:<br/>Database-RG<br/>Lifecycle Container]
            RG3[Resource Group:<br/>Network-RG<br/>Lifecycle Container]
        end
        
        subgraph "Resources Layer"
            R1[Virtual Machine]
            R2[Storage Account]
            R3[SQL Database]
            R4[Virtual Network]
        end
    end
    
    ROOT --> MG1
    ROOT --> MG2
    ROOT --> MG3
    
    MG1 --> SUB1
    MG1 --> SUB2
    MG2 --> SUB3
    
    SUB1 --> RG1
    SUB1 --> RG2
    SUB2 --> RG3
    
    RG1 --> R1
    RG1 --> R2
    RG2 --> R3
    RG3 --> R4
    
    style ROOT fill:#e1f5fe
    style MG1 fill:#fff3e0
    style MG2 fill:#fff3e0
    style MG3 fill:#fff3e0
    style SUB1 fill:#f3e5f5
    style SUB2 fill:#f3e5f5
    style SUB3 fill:#f3e5f5
    style RG1 fill:#e8f5e9
    style RG2 fill:#e8f5e9
    style RG3 fill:#e8f5e9
    style R1 fill:#ffebee
    style R2 fill:#ffebee
    style R3 fill:#ffebee
    style R4 fill:#ffebee

See: diagrams/01_fundamentals_azure_hierarchy.mmd

Diagram Explanation (Detailed):

This diagram shows the complete Azure organizational hierarchy from top to bottom. At the very top (blue) is the Root Management Group, which represents your entire Azure tenant (your organization's Azure account). This is automatically created when you first use Azure and cannot be deleted.

Below that are Management Groups (orange), which are containers for organizing multiple subscriptions. In this example, we have three management groups: Production, Development, and Corporate. Management groups allow you to apply governance policies and access controls across multiple subscriptions at once. For instance, you might apply a policy at the Production management group level that requires all resources to be tagged with a cost center - this policy would automatically apply to all subscriptions under that management group.

The next layer shows Subscriptions (purple), which are the billing boundaries in Azure. Each subscription has its own billing account and spending limits. In the diagram, Prod-App1 and Prod-App2 are production subscriptions under the Production management group, while Dev-Testing is a development subscription under the Development management group. Organizations typically use separate subscriptions to isolate environments (production vs. development) or to separate billing for different departments or projects.

Below subscriptions are Resource Groups (green), which are logical containers for resources that share the same lifecycle. In the example, WebApp-RG contains resources for a web application, Database-RG contains database resources, and Network-RG contains networking resources. When you delete a resource group, all resources inside it are deleted together - this makes cleanup easy and prevents orphaned resources.

At the bottom are the actual Resources (red) - the Azure services you create and use. These include Virtual Machines, Storage Accounts, SQL Databases, Virtual Networks, and hundreds of other Azure services. Each resource must belong to exactly one resource group and cannot exist outside of a resource group.

The arrows show the inheritance flow: policies, permissions, and tags applied at higher levels automatically flow down to lower levels. For example, if you assign someone the "Reader" role at the Management Group level, they can read all resources in all subscriptions under that management group.

⭐ Must Know: This hierarchy is fundamental to Azure administration. Permissions, policies, and tags flow downward (inheritance), but billing is tracked at the subscription level.
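This downward inheritance can be modeled with a small, self-contained Python sketch. This is a toy model for building intuition only - it is not the Azure SDK, and all names are hypothetical (taken from the diagram above):

```python
# Toy model of Azure scope inheritance: an assignment made at a scope
# applies to that scope and everything beneath it, mirroring how Azure
# Policy and RBAC role assignments cascade down the hierarchy.

class Scope:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.assignments = []  # policies/roles assigned directly at this scope

    def assign(self, policy):
        self.assignments.append(policy)

    def effective_policies(self):
        # Walk up the chain: inherited assignments first, then direct ones.
        inherited = self.parent.effective_policies() if self.parent else []
        return inherited + self.assignments

# Build a miniature hierarchy matching the diagram.
root = Scope("Root Management Group")
prod_mg = Scope("Production", parent=root)
sub1 = Scope("Prod-App1", parent=prod_mg)
rg = Scope("WebApp-RG", parent=sub1)

root.assign("Require CostCenter tag")
prod_mg.assign("Require disk encryption")

print(rg.effective_policies())
# → ['Require CostCenter tag', 'Require disk encryption']
```

Even though nothing was assigned directly to WebApp-RG, both policies apply to it - exactly the behavior the exam expects you to reason about when a question asks "which policies affect this resource?"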

Level 1: Management Groups

What it is: Management groups are containers that help you manage access, policies, and compliance across multiple Azure subscriptions. They sit above subscriptions in the hierarchy.

Why it exists: Large organizations often have dozens or hundreds of Azure subscriptions. Without management groups, you'd have to apply the same policies and permissions to each subscription individually, which is time-consuming and error-prone. Management groups let you apply governance once at a higher level, and it automatically cascades down to all child subscriptions.

Real-world analogy: Think of management groups like corporate divisions in a company. The CEO (root management group) sets company-wide policies, then each division (management group) can have additional policies, and finally each department (subscription) operates within those constraints.

How it works (Detailed step-by-step):

  1. Azure automatically creates a root management group when you first use management groups. This root group represents your entire organization and cannot be deleted.
  2. You create child management groups under the root to organize your subscriptions. For example, you might create "Production", "Development", and "Sandbox" management groups.
  3. You move subscriptions into management groups by assigning them as children. A subscription can only be in one management group at a time.
  4. You apply Azure Policies at the management group level. For example, you might create a policy that requires all resources to be in specific Azure regions. This policy automatically applies to all subscriptions in that management group.
  5. You assign permissions (RBAC roles) at the management group level. For example, you might give your security team "Reader" access at the root management group level, allowing them to view all resources across all subscriptions.
  6. Inheritance flows downward automatically. Any policy or permission assigned at a management group level is inherited by all child management groups and subscriptions.
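The steps above map onto a few Azure CLI commands. This is a hedged sketch: the management-group name, subscription name, and principal are hypothetical, and you need sufficient permissions (e.g., Owner at the root management group) for these to succeed.

```shell
# Step 2: Create a child management group under the root.
az account management-group create --name Production --display-name "Production"

# Step 3: Move an existing subscription into it; the subscription
# immediately inherits the new parent's policies and role assignments.
az account management-group subscription add \
  --name Production --subscription "Prod-App1"

# Step 5: Grant Reader access at management-group scope; this
# cascades to every subscription and resource beneath it.
az role assignment create \
  --assignee "security-team@contoso.com" \
  --role "Reader" \
  --scope "/providers/Microsoft.Management/managementGroups/Production"
```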

Detailed Example 1: Multi-Environment Organization

Contoso Corporation has 50 Azure subscriptions across different environments and departments. Without management groups, their IT team had to manually apply the same security policies to each subscription, which took hours and often resulted in inconsistencies.

They implement this management group structure:

  • Root Management Group (Contoso)
    • Production Management Group (contains 20 production subscriptions)
    • Non-Production Management Group (contains 30 dev/test subscriptions)

At the Production management group level, they apply:

  • Azure Policy requiring all VMs to have disk encryption enabled
  • Azure Policy restricting resource creation to specific regions (East US, West US)
  • RBAC role assignment giving the security team "Security Reader" access

These policies and permissions automatically apply to all 20 production subscriptions. When they add a new production subscription, it automatically inherits all these settings - no manual configuration needed. This saves the IT team approximately 40 hours per month and ensures consistent governance.

Detailed Example 2: Department-Based Organization

Fabrikam Inc. organizes their Azure environment by department:

  • Root Management Group (Fabrikam)
    • Finance Management Group (3 subscriptions)
    • Marketing Management Group (5 subscriptions)
    • Engineering Management Group (12 subscriptions)

At the root level, they apply company-wide policies:

  • All resources must be tagged with "CostCenter" and "Owner"
  • All resources must be in North America or Europe regions
  • Multi-factor authentication required for all users

At the Finance management group level, they add additional policies:

  • All data must be encrypted at rest
  • Audit logging must be enabled for all resources
  • Only specific VM sizes allowed (to control costs)

The Finance subscriptions inherit both the root-level policies AND the Finance-specific policies. This layered approach allows for company-wide governance while still enabling department-specific requirements.

Detailed Example 3: Acquisition Integration

Northwind Traders acquires a smaller company that already has 10 Azure subscriptions. Instead of migrating everything immediately, they create a new management group called "Acquired-Company" under their root management group and move the acquired subscriptions there.

This allows them to:

  • Apply Northwind's core security policies to the acquired subscriptions immediately
  • Keep the acquired company's existing resource organization intact temporarily
  • Gradually migrate resources to Northwind's standard subscription structure
  • Track costs separately for the acquired company during the integration period

Must Know (Critical Facts):

  • Maximum depth: Management group hierarchies can be up to 6 levels deep (not including root and subscription levels)
  • Root management group: Automatically created, cannot be deleted, represents your entire Azure AD tenant
  • Inheritance: Policies and RBAC roles assigned at a management group apply to all child management groups and subscriptions
  • Subscription limit: A subscription can only be in ONE management group at a time
  • Moving subscriptions: You can move subscriptions between management groups, and they immediately inherit the new parent's policies

When to use:

  • ✅ Use when: You have multiple subscriptions that need consistent governance
  • ✅ Use when: You want to apply policies or permissions across many subscriptions at once
  • ✅ Use when: You need to organize subscriptions by environment, department, or business unit
  • ❌ Don't use when: You only have 1-2 subscriptions (overhead not worth it)
  • ❌ Don't use when: Each subscription needs completely different policies (defeats the purpose)

Limitations & Constraints:

  • Maximum 10,000 management groups per Azure AD tenant
  • Maximum 6 levels of depth (not counting root and subscriptions)
  • Each management group can have multiple children but only one parent
  • Moving a subscription between management groups requires specific permissions
  • Some Azure services don't support management group-level operations

💡 Tips for Understanding:

  • Think of management groups as folders on your computer - they organize things but don't contain the actual files (resources)
  • The root management group is like the C:\ drive - it's always there and everything else is inside it
  • Inheritance flows downward like water - policies "flow down" from parent to child

⚠️ Common Mistakes & Misconceptions:

  • Mistake 1: Thinking management groups are required for Azure

    • Why it's wrong: Management groups are optional. You can use Azure with just subscriptions and resource groups.
    • Correct understanding: Management groups are a governance tool for organizations with multiple subscriptions. Small organizations with 1-2 subscriptions don't need them.
  • Mistake 2: Trying to put resources directly in management groups

    • Why it's wrong: Management groups can only contain other management groups or subscriptions, not resources.
    • Correct understanding: The hierarchy is: Management Groups → Subscriptions → Resource Groups → Resources. You can't skip levels.
  • Mistake 3: Assuming you can override inherited policies at lower levels

    • Why it's wrong: By default, policies inherited from parent management groups cannot be overridden or removed at child levels.
    • Correct understanding: Inheritance is enforced. If a parent management group requires encryption, child subscriptions cannot disable that requirement. This is by design for security and compliance.

🔗 Connections to Other Topics:

  • Relates to Azure Policy (Chapter 1) because: Management groups are the primary scope for applying policies across multiple subscriptions
  • Builds on Azure AD tenants by: Providing organizational structure within a single tenant
  • Often used with RBAC (Chapter 1) to: Assign permissions across multiple subscriptions efficiently

Level 2: Subscriptions

What it is: An Azure subscription is a logical container that serves as a billing boundary and access control boundary for Azure resources. It's the agreement between you (or your organization) and Microsoft to use Azure services.

Why it exists: Subscriptions solve several critical problems: (1) They provide a clear billing boundary - each subscription gets its own invoice, making cost tracking easy. (2) They provide isolation - resources in different subscriptions are separated for security and management. (3) They enforce quotas and limits - each subscription has limits on how many resources you can create, preventing runaway costs or resource exhaustion.

Real-world analogy: Think of a subscription like a credit card account. You can have multiple credit cards (subscriptions) for different purposes - one for business expenses, one for personal use, one for a specific project. Each card has its own statement (bill), spending limit (quota), and you can control who has access to each card.

How it works (Detailed step-by-step):

  1. You create or are assigned an Azure subscription through the Azure portal, Enterprise Agreement, or Cloud Solution Provider. Each subscription has a unique subscription ID (a GUID).
  2. The subscription is linked to an Azure AD tenant (your organization's identity directory). This tenant authenticates users who want to access resources in the subscription.
  3. You assign permissions to users using Role-Based Access Control (RBAC). For example, you might give developers "Contributor" access (can create/modify resources) and give managers "Reader" access (can only view resources).
  4. You create resource groups within the subscription to organize related resources. Each resource group belongs to exactly one subscription.
  5. Azure tracks all resource usage within the subscription and generates a monthly bill. The bill shows costs broken down by resource group, resource type, and region.
  6. Quotas and limits are enforced at the subscription level. For example, by default you can create up to 250 storage accounts per subscription per region. If you need more, you must request a quota increase.
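Step 6 above - quota enforcement per subscription, per region - can be modeled in a few lines. This is an illustrative sketch only (real quotas are enforced by Azure Resource Manager, and many can be raised via a support request); the subscription ID is a placeholder, and 250 is the documented default for storage accounts.

```python
# Sketch of per-subscription, per-region quota enforcement (illustrative only;
# real quotas are checked by Azure Resource Manager and many can be increased
# via a support request).
from collections import defaultdict

STORAGE_ACCOUNT_QUOTA = 250  # default per subscription, per region

class Subscription:
    def __init__(self, sub_id):
        self.sub_id = sub_id
        self.storage_accounts = defaultdict(int)  # region -> current count

    def create_storage_account(self, region):
        if self.storage_accounts[region] >= STORAGE_ACCOUNT_QUOTA:
            raise RuntimeError(f"Quota exceeded for storage accounts in {region}")
        self.storage_accounts[region] += 1

sub = Subscription("11111111-2222-3333-4444-555555555555")  # placeholder GUID
for _ in range(250):
    sub.create_storage_account("eastus")
# The 251st create in eastus would raise, but another region still has headroom:
sub.create_storage_account("westus")
```

The key point the model captures: the quota is scoped to the subscription *and* the region, so hitting the limit in East US does not stop you from creating storage accounts in West US.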

Detailed Example 1: Environment Separation

Tailspin Toys uses three subscriptions to separate their environments:

Production Subscription (Prod-TailspinToys):

  • Contains all production workloads serving real customers
  • Strict access controls: Only senior engineers and operations team have Contributor access
  • High spending limit: $50,000/month budget
  • Policies enforced: All VMs must have backup enabled, all data must be encrypted
  • Billing: Charged to the Operations department budget

Development Subscription (Dev-TailspinToys):

  • Contains development and testing environments
  • Relaxed access: All developers have Contributor access to experiment
  • Lower spending limit: $5,000/month budget with alerts at $4,000
  • Fewer policies: Developers can create resources more freely
  • Billing: Charged to the Engineering department budget

Sandbox Subscription (Sandbox-TailspinToys):

  • Used for learning, experimentation, and proof-of-concepts
  • Open access: Anyone in the company can request access
  • Very low spending limit: $500/month, automatically shuts down resources at limit
  • No policies: Complete freedom to experiment
  • Billing: Charged to the Training budget

This separation provides several benefits:

  • Cost control: Each environment has its own budget and billing, making it easy to track spending
  • Security: Production resources are isolated from development experiments
  • Compliance: Production can have strict policies without hindering development agility
  • Risk management: If someone accidentally deletes resources in Dev, Production is unaffected

Detailed Example 2: Department-Based Subscriptions

Contoso Corporation has 5,000 employees across multiple departments. They create separate subscriptions for each major department:

Finance Subscription:

  • Contains financial applications and databases
  • Access: Only Finance IT team and approved finance staff
  • Special requirements: All data must remain in specific regions for compliance
  • Cost: $15,000/month, billed to Finance department
  • Resources: 50 VMs, 20 databases, 100 storage accounts

Marketing Subscription:

  • Contains marketing websites, campaign management tools, analytics platforms
  • Access: Marketing IT team and marketing staff
  • Special requirements: High bandwidth for video content, CDN for global reach
  • Cost: $8,000/month, billed to Marketing department
  • Resources: 30 VMs, 5 databases, 200 storage accounts (lots of media files)

HR Subscription:

  • Contains HR systems, employee portals, recruitment platforms
  • Access: HR IT team and HR staff only (highly sensitive data)
  • Special requirements: Extra security controls, audit logging for all access
  • Cost: $3,000/month, billed to HR department
  • Resources: 10 VMs, 5 databases, 20 storage accounts

Benefits of this approach:

  • Clear cost allocation: Each department sees exactly what they're spending on Azure
  • Appropriate access control: HR data is completely isolated from Marketing
  • Customized policies: Each department can have policies matching their compliance needs
  • Simplified billing: Finance can easily charge back costs to the correct department

Detailed Example 3: Project-Based Subscriptions

Fabrikam Inc. creates temporary subscriptions for large projects:

Project Phoenix Subscription (6-month project):

  • Created specifically for a major application modernization project
  • Access: Project team members only (15 people)
  • Budget: $30,000 total for 6 months
  • Resources: Development, testing, and staging environments for the new application
  • Lifecycle: When project completes, resources are migrated to production subscription, then this subscription is deleted

This approach provides:

  • Clear project cost tracking: All Project Phoenix costs are in one place
  • Temporary access: Project team members automatically lose access when subscription is deleted
  • Clean separation: Project resources don't clutter the main production subscription
  • Easy cleanup: Delete the entire subscription when project ends, ensuring no orphaned resources

Must Know (Critical Facts):

  • Billing boundary: Each subscription gets its own monthly invoice
  • Access boundary: RBAC permissions are assigned at subscription level (or lower)
  • Quota boundary: Resource limits (like max VMs) are enforced per subscription
  • One tenant: Each subscription is linked to exactly ONE Azure AD tenant
  • Multiple subscriptions: A tenant can contain many subscriptions; the maximum depends on your agreement type
  • Subscription types: Pay-As-You-Go, Enterprise Agreement, CSP, Free Trial, Student, etc.

When to use multiple subscriptions:

  • ✅ Use when: You need to separate billing for different departments or projects
  • ✅ Use when: You need to isolate environments (production vs. development)
  • ✅ Use when: You're approaching subscription quotas/limits
  • ✅ Use when: You need different access controls for different groups
  • ✅ Use when: You have compliance requirements for data isolation
  • ❌ Don't use when: You just want to organize resources (use resource groups instead)

Limitations & Constraints:

  • Default limit: 250 storage accounts per subscription per region (can be increased)
  • Default limit: 25,000 VMs per subscription per region (can be increased)
  • Default limit: 980 resource groups per subscription
  • Some resources have subscription-wide limits that cannot be increased
  • Moving resources between subscriptions requires downtime for some resource types

💡 Tips for Understanding:

  • Think of subscriptions as separate "accounts" within Azure - each has its own bill and access controls
  • Use subscriptions to separate things that should be billed separately or managed by different teams
  • Don't create too many subscriptions - they add management overhead. Start with 2-3 (prod, dev, sandbox)

⚠️ Common Mistakes & Misconceptions:

  • Mistake 1: Creating a new subscription for every project or application

    • Why it's wrong: Too many subscriptions create management overhead. You'll spend more time managing subscriptions than resources.
    • Correct understanding: Use resource groups to organize applications within a subscription. Only create new subscriptions when you need billing separation or different access controls.
  • Mistake 2: Thinking you can move resources freely between subscriptions

    • Why it's wrong: While many resources can be moved, some cannot, and moving often requires downtime.
    • Correct understanding: Plan your subscription structure carefully upfront. Moving resources between subscriptions is possible but should be avoided when possible.
  • Mistake 3: Assuming subscription limits are hard caps

    • Why it's wrong: Many subscription limits can be increased by submitting a support request.
    • Correct understanding: Default limits are soft limits for most resources. If you're approaching a limit, contact Azure support to request an increase. However, some limits (like max subscriptions per tenant) are hard limits.

🔗 Connections to Other Topics:

  • Relates to Cost Management (Chapter 1) because: Subscriptions are the primary unit for cost tracking and budgeting
  • Builds on Azure AD by: Each subscription trusts one Azure AD tenant for authentication
  • Often used with Management Groups (above) to: Apply consistent policies across multiple subscriptions
  • Connected to Resource Groups (below) because: Resource groups exist within subscriptions

Troubleshooting Common Issues:

  • Issue 1: "Cannot create resource - quota exceeded"
    • Solution: Check subscription quotas in Azure Portal → Subscriptions → Usage + quotas. Request increase if needed.
  • Issue 2: "User cannot access resources in subscription"
    • Solution: Verify user has appropriate RBAC role assigned at subscription or resource group level.
  • Issue 3: "Unexpected high costs in subscription"
    • Solution: Use Cost Management + Billing to analyze costs by resource group and resource type. Set up budget alerts.

Level 3: Resource Groups

What it is: A resource group is a logical container that holds related Azure resources. It's like a folder that groups resources that share the same lifecycle, permissions, or purpose.

Why it exists: Without resource groups, you'd have thousands of individual resources scattered across your subscription with no organization. Resource groups solve this by: (1) Grouping related resources together for easier management. (2) Allowing you to apply permissions to multiple resources at once. (3) Enabling you to delete multiple resources together. (4) Providing a way to track costs for a group of resources. (5) Organizing resources by application, environment, or any criteria that makes sense for your organization.

Real-world analogy: Think of resource groups like folders on your computer. Just as you create folders to organize related files (Documents/Work/Project1), you create resource groups to organize related Azure resources (RG-WebApp-Prod). When you delete a folder, all files inside are deleted - same with resource groups.

How it works (Detailed step-by-step):

  1. You create a resource group by specifying a name and an Azure region (location). The region determines where the resource group's metadata is stored, but resources inside can be in any region.
  2. You create resources within the resource group. For example, you might create a virtual machine, a storage account, and a virtual network all in the same resource group.
  3. Azure tracks all resources in the group and maintains relationships between them. If resources depend on each other, Azure knows about these dependencies.
  4. You can apply permissions (RBAC) at the resource group level. For example, giving a developer "Contributor" access to a resource group allows them to manage all resources in that group.
  5. You can apply tags at the resource group level for cost tracking and organization. Tags can be inherited by resources (optional).
  6. When you delete a resource group, Azure deletes all resources inside it in the correct order (respecting dependencies). This is a powerful cleanup mechanism.
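Step 6 - the resource group as a "delete boundary" - is worth internalizing, and a tiny model makes it concrete. This is an illustrative Python sketch, not the Azure SDK; the group and resource names are hypothetical.

```python
# Sketch of a resource group as a lifecycle ("delete") boundary: removing the
# group removes everything inside it. Illustrative only, not the Azure SDK.
subscription = {
    "RG-WebApp-Prod": ["vm-web-001", "stwebappprod001", "vnet-webapp-prod"],
    "RG-WebApp-Dev":  ["vm-web-dev-001"],
}

def delete_resource_group(sub, rg_name):
    """Deleting a resource group deletes all resources it contains."""
    return sub.pop(rg_name)

gone = delete_resource_group(subscription, "RG-WebApp-Prod")
print(gone)                 # all three resources go with the group
print(list(subscription))   # only RG-WebApp-Dev remains
```

This is both the power and the danger: one delete operation cleans up an entire environment, so resources with different lifecycles belong in different groups.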

Detailed Example 1: Application-Based Organization

Contoso runs an e-commerce web application with multiple components. They organize resources by application tier:

RG-WebApp-Frontend-Prod:

  • Contains: 3 web server VMs, 1 load balancer, 1 public IP address, 1 application gateway
  • Purpose: All resources needed to run the web frontend
  • Permissions: Web team has Contributor access
  • Tags: Application=ECommerce, Tier=Frontend, Environment=Production, CostCenter=IT-001
  • Location: East US (metadata), but VMs are in East US and West US for redundancy

RG-WebApp-Backend-Prod:

  • Contains: 5 application server VMs, 1 internal load balancer, 1 virtual network
  • Purpose: All resources for the application backend/API layer
  • Permissions: Backend team has Contributor access, Web team has Reader access
  • Tags: Application=ECommerce, Tier=Backend, Environment=Production, CostCenter=IT-001
  • Location: East US

RG-WebApp-Database-Prod:

  • Contains: 1 Azure SQL Database, 1 storage account (for backups), 1 Key Vault (for secrets)
  • Purpose: All data storage resources
  • Permissions: DBA team has Contributor access, Backend team has Reader access
  • Tags: Application=ECommerce, Tier=Database, Environment=Production, CostCenter=IT-001
  • Location: East US

Benefits of this organization:

  • Clear separation of concerns: Each team manages their tier independently
  • Granular permissions: DBAs can't accidentally modify web servers
  • Easy cost tracking: Can see costs per tier using tags
  • Simplified deployment: Can deploy/update each tier independently
  • Disaster recovery: Can restore just the database tier if needed

Detailed Example 2: Environment-Based Organization

Fabrikam organizes the same application by environment instead of tier:

RG-ECommerce-Production:

  • Contains: ALL production resources (web VMs, app VMs, database, load balancers, storage)
  • Purpose: Complete production environment in one group
  • Permissions: Only senior engineers have Contributor access, everyone else has Reader
  • Tags: Application=ECommerce, Environment=Production
  • Policies: Backup required, encryption required, no resource deletion without approval

RG-ECommerce-Staging:

  • Contains: ALL staging resources (mirrors production but smaller scale)
  • Purpose: Pre-production testing environment
  • Permissions: QA team and senior engineers have Contributor access
  • Tags: Application=ECommerce, Environment=Staging
  • Policies: Backup optional, can be deleted/recreated freely

RG-ECommerce-Development:

  • Contains: ALL development resources (smaller scale, may not include all components)
  • Purpose: Active development and testing
  • Permissions: All developers have Contributor access
  • Tags: Application=ECommerce, Environment=Development
  • Policies: Minimal restrictions, auto-shutdown at night to save costs

Benefits of this organization:

  • Environment isolation: Development changes can't affect production
  • Easy environment cloning: Can copy entire environment by copying resource group
  • Simple cleanup: Can delete entire dev environment and recreate it fresh
  • Clear lifecycle management: All resources in an environment have the same lifecycle

Detailed Example 3: Project-Based Organization

Northwind Traders uses resource groups for temporary projects:

RG-Migration-Phase1:

  • Contains: Resources being migrated from on-premises in Phase 1
  • Purpose: Temporary holding area during migration
  • Lifecycle: Created at project start, deleted after migration completes
  • Contains: 20 VMs, 10 storage accounts, 5 databases
  • Tags: Project=DataCenterMigration, Phase=1, Owner=MigrationTeam

RG-Migration-Phase2:

  • Contains: Resources being migrated in Phase 2
  • Purpose: Second wave of migration
  • Lifecycle: Created after Phase 1 completes
  • Contains: 30 VMs, 15 storage accounts, 8 databases
  • Tags: Project=DataCenterMigration, Phase=2, Owner=MigrationTeam

After migration completes, resources are moved to permanent resource groups (RG-Production-App1, RG-Production-App2, etc.), and the migration resource groups are deleted. This keeps the migration organized and makes cleanup easy.

Must Know (Critical Facts):

  • Lifecycle container: Deleting a resource group deletes ALL resources inside it
  • One resource group: Each resource can only be in ONE resource group at a time
  • Any region: Resources in a resource group can be in different Azure regions
  • Metadata location: The resource group's location only affects where its metadata is stored
  • Cannot nest: You cannot put resource groups inside other resource groups
  • Can move resources: Resources can be moved between resource groups (with some limitations)
  • Permissions scope: RBAC roles can be assigned at resource group level

When to use:

  • ✅ Use when: You have resources that share the same lifecycle (created/deleted together)
  • ✅ Use when: You want to apply the same permissions to multiple resources
  • ✅ Use when: You need to organize resources by application, environment, or project
  • ✅ Use when: You want to track costs for a group of related resources
  • ❌ Don't use when: Resources have completely different lifecycles (put in separate groups)

Limitations & Constraints:

  • Maximum 980 resource groups per subscription
  • Resource group names must be unique within a subscription
  • Resource group names can be 1-90 characters: letters, numbers, hyphens, underscores, periods, parentheses (but cannot end with a period)
  • Some resources cannot be moved between resource groups (e.g., some networking resources)
  • Moving resources between resource groups may cause downtime
  • Cannot rename a resource group (must create new one and move resources)

💡 Tips for Understanding:

  • Think of resource groups as "delete boundaries" - everything inside gets deleted together
  • Use consistent naming conventions: RG-{Application}-{Environment} or RG-{Project}-{Component}
  • Group resources by lifecycle, not by type (don't create "RG-AllVMs" and "RG-AllDatabases")
  • Use tags for cross-cutting concerns (cost center, owner, project) that span multiple resource groups

⚠️ Common Mistakes & Misconceptions:

  • Mistake 1: Creating one giant resource group for everything

    • Why it's wrong: You lose the ability to manage resources independently. Deleting the resource group deletes everything.
    • Correct understanding: Create separate resource groups for resources with different lifecycles or management needs.
  • Mistake 2: Organizing by resource type (RG-VMs, RG-Databases, RG-Storage)

    • Why it's wrong: Resources that work together are scattered across multiple groups, making management difficult.
    • Correct understanding: Organize by application, environment, or project. Keep related resources together.
  • Mistake 3: Thinking resource group location affects resource location

    • Why it's wrong: Resource group location only determines where metadata is stored.
    • Correct understanding: You can create a resource group in East US and put resources in West Europe. The resource group location doesn't restrict resource locations.
  • Mistake 4: Trying to nest resource groups

    • Why it's wrong: Azure doesn't support resource group nesting.
    • Correct understanding: Use management groups to organize subscriptions, and use resource groups to organize resources. These are different organizational levels.

🔗 Connections to Other Topics:

  • Relates to RBAC (Chapter 1) because: Permissions can be assigned at resource group level
  • Builds on Subscriptions by: Existing within subscriptions as organizational containers
  • Connected to Azure Policy (Chapter 1) because: Policies can be applied at resource group scope
  • Used with Tags (Chapter 1) to: Organize and track costs across resource groups

Troubleshooting Common Issues:

  • Issue 1: "Cannot delete resource group - resources still exist"
    • Solution: Some resources have locks preventing deletion. Check for resource locks and remove them first.
  • Issue 2: "Cannot move resource to another resource group"
    • Solution: Not all resource types support moving. Check Azure documentation for move support. Some resources require downtime during move.
  • Issue 3: "Resource group deletion taking a long time"
    • Solution: Azure deletes resources in dependency order. Large resource groups with many dependencies can take 10-30 minutes to delete.

Level 4: Resources

What it is: Resources are the actual Azure services you create and use - virtual machines, storage accounts, databases, virtual networks, and hundreds of other services. Resources are the "things" that do the work in Azure.

Why it exists: Resources are the fundamental building blocks of your Azure solutions. Each resource provides specific functionality: VMs run your applications, storage accounts store your data, databases manage your structured data, virtual networks connect your resources, etc.

Real-world analogy: If Azure is a city, resources are the individual buildings - houses (VMs), warehouses (storage), banks (databases), roads (networks). Each building serves a specific purpose and can be used independently or connected to others.

How it works (Detailed step-by-step):

  1. You create a resource by specifying its type (VM, storage account, etc.), name, location, and configuration settings.
  2. Azure provisions the resource in the specified region, allocating physical hardware and configuring it according to your specifications.
  3. The resource is assigned a unique Resource ID - a long string that uniquely identifies it across all of Azure.
  4. You configure the resource by setting properties, connecting it to other resources, and applying security settings.
  5. The resource runs and provides its service - a VM runs your application, a storage account stores your files, etc.
  6. Azure meters usage and bills you based on the resource's pricing model (per hour, per GB, per transaction, etc.).
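Step 6 - metering and billing - can be approximated for per-hour resources with simple arithmetic. The rates below are hypothetical pay-as-you-go figures for illustration (real prices vary by region, OS, and agreement - always check the Azure pricing calculator); 730 is the hours-per-month convention Azure uses for monthly estimates.

```python
# Rough monthly-cost sketch for per-hour billed resources. Rates are
# illustrative assumptions, not official prices.
HOURS_PER_MONTH = 730  # Azure's convention for monthly estimates

hourly_rates = {                      # hypothetical USD/hour rates
    "vm-webserver-prod-001": 0.096,   # roughly a Standard_D2s_v3
    "lb-webapp-prod": 0.025,          # roughly a Standard Load Balancer
}

def monthly_estimate(rates):
    return {name: round(rate * HOURS_PER_MONTH, 2) for name, rate in rates.items()}

print(monthly_estimate(hourly_rates))
# → {'vm-webserver-prod-001': 70.08, 'lb-webapp-prod': 18.25}
```

This is why a "cheap" $0.10/hour VM shows up as a ~$70 line item on the monthly invoice - per-hour rates compound quickly across a full month.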

Detailed Example 1: Web Application Resources

A typical web application might consist of these resources:

Virtual Machine (VM): "vm-webserver-prod-001"

  • Type: Standard_D2s_v3 (2 vCPUs, 8 GB RAM)
  • Location: East US
  • Purpose: Runs the web server software (IIS or Apache)
  • Cost: ~$70/month (pay per hour)
  • Dependencies: Requires virtual network, network interface, disk

Storage Account: "stwebappprod001"

  • Type: Standard_LRS (Locally Redundant Storage)
  • Location: East US
  • Purpose: Stores static website content (images, CSS, JavaScript)
  • Cost: ~$20/month (pay per GB stored + transactions)
  • Dependencies: None (standalone resource)

SQL Database: "sqldb-webapp-prod"

  • Type: Standard S2 (50 DTUs)
  • Location: East US
  • Purpose: Stores application data (user accounts, orders, products)
  • Cost: ~$75/month (pay per hour based on tier)
  • Dependencies: Requires SQL Server (logical server)

Virtual Network: "vnet-webapp-prod"

  • Type: Virtual Network
  • Location: East US
  • Purpose: Provides network connectivity between resources
  • Cost: Free (only pay for data transfer)
  • Dependencies: None

Load Balancer: "lb-webapp-prod"

  • Type: Standard Load Balancer
  • Location: East US
  • Purpose: Distributes traffic across multiple web servers
  • Cost: ~$20/month (pay per hour + data processed)
  • Dependencies: Requires public IP address

All these resources work together to deliver the web application. The VM runs the web server, which reads data from the SQL Database and serves static files from the Storage Account. The Load Balancer distributes incoming traffic, and the Virtual Network connects everything securely.

Detailed Example 2: Resource Dependencies

When you create a Virtual Machine, Azure automatically creates several dependent resources:

Primary Resource: Virtual Machine "vm-app-001"

Automatically Created Resources:

  1. Network Interface (NIC): "vm-app-001-nic"

    • Connects the VM to the virtual network
    • Has a private IP address
    • Cannot be deleted while VM exists
  2. OS Disk: "vm-app-001-osdisk"

    • Stores the operating system
    • Typically 127 GB for Windows, 30 GB for Linux
    • Deleted when VM is deleted (by default)
  3. Network Security Group (NSG): "vm-app-001-nsg"

    • Firewall rules for the VM
    • Controls inbound and outbound traffic
    • Can be shared across multiple VMs
  4. Public IP Address (if requested): "vm-app-001-ip"

    • Allows internet access to the VM
    • Can be static or dynamic
    • Costs extra (~$3/month)

Understanding these dependencies is crucial because:

  • Deleting the VM doesn't automatically delete all dependent resources (except OS disk)
  • You must delete dependent resources separately to avoid ongoing costs
  • Some resources (like NICs) cannot be deleted while the VM exists
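The dependency rules above imply a deletion order: a dependent resource must go before the things it depends on (you cannot delete a NIC while its VM exists). That ordering is essentially a post-order walk of the dependency graph, sketched below with hypothetical resource names - Azure Resource Manager computes this for you when you delete a whole resource group.

```python
# Sketch of deleting resources in dependency order: each resource is deleted
# before anything it depends on. Illustrative only; names are hypothetical.
depends_on = {
    "vm-app-001":        ["vm-app-001-nic", "vm-app-001-osdisk"],
    "vm-app-001-nic":    ["vm-app-001-nsg", "vm-app-001-ip"],
    "vm-app-001-osdisk": [],
    "vm-app-001-nsg":    [],
    "vm-app-001-ip":     [],
}

def deletion_order(resource, seen=None):
    """Return a valid deletion order: dependents first, dependencies after."""
    if seen is None:
        seen = set()
    if resource in seen:
        return []
    seen.add(resource)
    order = [resource]                # delete the dependent first...
    for dep in depends_on[resource]:  # ...then whatever it depended on
        order.extend(deletion_order(dep, seen))
    return order

print(deletion_order("vm-app-001"))
# VM first, then its NIC, then the NIC's NSG and public IP, then the OS disk
```

This also explains the troubleshooting note later in this chapter: large resource groups take time to delete because Azure must work through this ordering rather than deleting everything at once.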

Detailed Example 3: Resource Naming and Organization

Contoso uses a consistent naming convention for all resources:

Naming Pattern: {resource-type}-{application}-{environment}-{region}-{instance}

Examples:

  • vm-ecommerce-prod-eastus-001 (Virtual Machine)
  • stecommerceprodeastus001 (Storage Account - note: storage names can't have hyphens, only lowercase letters and numbers)
  • sql-ecommerce-prod-eastus-001 (SQL Database)
  • vnet-ecommerce-prod-eastus-001 (Virtual Network)
  • nsg-ecommerce-prod-eastus-001 (Network Security Group)
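The pattern above is mechanical enough to generate programmatically, which also forces you to handle the storage-account exception (storage names allow only lowercase letters and digits, maximum 24 characters). A minimal sketch, with the same hypothetical names as the examples:

```python
# Sketch of Contoso's naming pattern, including the storage-account special
# case (lowercase letters and digits only, max 24 characters). Illustrative.
def resource_name(rtype, app, env, region, instance):
    name = f"{rtype}-{app}-{env}-{region}-{instance:03d}"
    if rtype == "st":  # storage accounts: strip hyphens, force lowercase
        name = name.replace("-", "").lower()
        assert len(name) <= 24, "storage account names are limited to 24 chars"
    return name

print(resource_name("vm", "ecommerce", "prod", "eastus", 1))
# → vm-ecommerce-prod-eastus-001
print(resource_name("st", "ecommerce", "prod", "eastus", 1))
# → stecommerceprodeastus001
```

Encoding the convention in a helper like this is a common real-world practice: it prevents ad-hoc names and surfaces naming-rule violations (like the 24-character storage limit) before deployment.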

Benefits:

  • Instantly recognizable: Anyone can tell what a resource is and where it belongs
  • Prevents conflicts: Instance numbers prevent duplicate names
  • Easier troubleshooting: Can quickly identify related resources
  • Better cost tracking: Can filter costs by naming patterns

Must Know (Critical Facts):

  • Unique Resource ID: Every resource has a unique ID in the format: /subscriptions/{subscription-id}/resourceGroups/{rg-name}/providers/{provider}/{resource-type}/{resource-name}
  • One resource group: Each resource belongs to exactly ONE resource group
  • Location matters: Most resources must be created in a specific Azure region
  • Cannot rename: Most resources cannot be renamed after creation (must recreate)
  • Dependencies: Some resources depend on others and cannot be deleted independently
  • Resource providers: Each resource type belongs to a resource provider (e.g., Microsoft.Compute for VMs)
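Because the Resource ID format above is fixed, you can pull out the subscription, resource group, provider, type, and name by position. The sketch below uses simple string splitting with a placeholder GUID; production code would typically use the Azure SDK's ID parser instead.

```python
# Sketch: parsing an Azure resource ID into its components. The format is
# fixed, so positional splitting works; the GUID below is a placeholder.
def parse_resource_id(resource_id):
    parts = resource_id.strip("/").split("/")
    # /subscriptions/{id}/resourceGroups/{rg}/providers/{provider}/{type}/{name}
    return {
        "subscription":   parts[1],
        "resource_group": parts[3],
        "provider":       parts[5],
        "type":           parts[6],
        "name":           parts[7],
    }

rid = ("/subscriptions/11111111-2222-3333-4444-555555555555"
       "/resourceGroups/RG-WebApp-Prod"
       "/providers/Microsoft.Compute/virtualMachines/vm-web-001")
print(parse_resource_id(rid)["name"])  # → vm-web-001
```

Recognizing these segments pays off on the exam: RBAC scopes, policy assignments, and `az` CLI `--ids` arguments all use this same ID structure.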

When to use:

  • ✅ Use when: You need specific functionality (compute, storage, networking, etc.)
  • ✅ Use when: You've planned your architecture and know what services you need
  • ✅ Use when: You understand the costs and have budget approval
  • ❌ Don't use when: You haven't planned your architecture (avoid creating resources randomly)

Limitations & Constraints:

  • Each resource type has specific limits (e.g., max VM size, max storage account size)
  • Some resources can only be created in specific regions
  • Some resources have minimum billing periods (e.g., reserved instances)
  • Resource names must be unique within their scope (resource group or globally)
  • Some resource types have naming restrictions (length, allowed characters)

💡 Tips for Understanding:

  • Think of resources as the actual "things" you're paying for in Azure
  • Always use consistent naming conventions - it saves hours of confusion later
  • Tag resources with metadata (owner, cost center, environment) for better organization
  • Document dependencies between resources to avoid accidental deletions

⚠️ Common Mistakes & Misconceptions:

  • Mistake 1: Deleting a VM and assuming all related resources are deleted

    • Why it's wrong: NICs, public IPs, and NSGs often remain after VM deletion, continuing to incur costs.
    • Correct understanding: When deleting a VM, manually check for and delete related resources (or delete the entire resource group).
  • Mistake 2: Creating resources without a naming convention

    • Why it's wrong: You end up with resources named "VM1", "VM2", "test", "new-vm", making management impossible.
    • Correct understanding: Establish a naming convention before creating any resources and stick to it religiously.
  • Mistake 3: Assuming all resources can be in any region

    • Why it's wrong: Some resources are region-specific, and some regions don't support all resource types.
    • Correct understanding: Check region availability before creating resources. Some services are only available in specific regions.

🔗 Connections to Other Topics:

  • Relates to Resource Groups (above) because: Resources must be created within resource groups
  • Connected to Azure Resource Manager (below) because: ARM manages all resource operations
  • Used with Tags (Chapter 1) to: Add metadata for organization and cost tracking
  • Managed through Azure Portal, CLI, PowerShell (below) for: Creation, configuration, and deletion

Azure Management Tools

Azure provides multiple ways to create and manage resources. Understanding these tools is essential for the AZ-104 exam because you'll need to use all of them in different scenarios.

Azure Portal

What it is: The Azure Portal is a web-based graphical user interface (GUI) for managing Azure resources. It's accessible at https://portal.azure.com from any web browser.

Why it exists: The Portal provides an intuitive, visual way to manage Azure resources without needing to learn command-line syntax. It's perfect for beginners, one-time tasks, and exploring Azure services.

Real-world analogy: The Azure Portal is like using a smartphone app - you tap buttons, fill in forms, and see visual representations of your resources. It's user-friendly but not ideal for automation.

How it works (Detailed step-by-step):

  1. You navigate to portal.azure.com and sign in with your Azure AD credentials.
  2. The Portal displays a dashboard showing your resources, recent activity, and quick access to common services.
  3. You navigate using the left sidebar or search bar to find the service you want to manage.
  4. You create resources using wizards - step-by-step forms that guide you through configuration.
  5. You manage existing resources by clicking on them to view properties, metrics, and configuration options.
  6. The Portal sends API calls to Azure Resource Manager behind the scenes to execute your actions.

Detailed Example 1: Creating a Virtual Machine

Using the Azure Portal to create a VM:

  1. Click "Create a resource" → "Compute" → "Virtual Machine"
  2. Fill in the "Basics" tab:
    • Subscription: Select your subscription
    • Resource group: Create new or select existing
    • VM name: "vm-webserver-prod-001"
    • Region: "East US"
    • Image: "Windows Server 2022"
    • Size: "Standard_D2s_v3"
    • Username/Password: Set administrator credentials
  3. Configure "Disks" tab:
    • OS disk type: "Premium SSD"
    • Data disks: Add if needed
  4. Configure "Networking" tab:
    • Virtual network: Create new or select existing
    • Subnet: Select subnet
    • Public IP: Create new
    • NIC network security group: "Basic"
  5. Review and create
  6. Wait 2-5 minutes for deployment

The Portal guides you through every option with helpful tooltips and validation.

When to use:

  • ✅ Use when: You're learning Azure and exploring services
  • ✅ Use when: You need to create a single resource with complex configuration
  • ✅ Use when: You want to visualize resource relationships and metrics
  • ✅ Use when: You're troubleshooting and need to see resource properties
  • ❌ Don't use when: You need to create many resources (too slow and repetitive)
  • ❌ Don't use when: You need to automate tasks (use CLI or PowerShell instead)

💡 Tips:

  • Use the search bar (top center) to quickly find any resource or service
  • Pin frequently used resources to your dashboard for quick access
  • Use Cloud Shell (icon in top toolbar) to run CLI or PowerShell commands without leaving the Portal
  • Customize your dashboard by adding tiles for resources you monitor frequently

Azure CLI

What it is: Azure CLI (Command-Line Interface) is a cross-platform command-line tool for managing Azure resources. It runs on Windows, macOS, and Linux, and uses simple, consistent commands.

Why it exists: Azure CLI provides a fast, scriptable way to manage Azure resources. It's perfect for automation, repetitive tasks, and scenarios where you need to create many resources quickly. Unlike the Portal, CLI commands can be saved in scripts and reused.

Real-world analogy: Azure CLI is like using keyboard shortcuts instead of clicking through menus. It's faster once you learn it, and you can record your actions (scripts) to repeat them later.

How it works (Detailed step-by-step):

  1. You install Azure CLI on your local machine or use Azure Cloud Shell (built into the Portal).
  2. You authenticate by running az login, which opens a browser for you to sign in.
  3. You run commands using the format: az <service> <subservice> <action> --parameters
  4. Azure CLI sends API calls to Azure Resource Manager to execute your commands.
  5. Results are returned in JSON format by default (can be changed to table or YAML).

Detailed Example 1: Creating a Resource Group

# Create a resource group
az group create \
  --name RG-WebApp-Prod \
  --location eastus \
  --tags Environment=Production Application=WebApp CostCenter=IT-001

# Output (JSON):
{
  "id": "/subscriptions/xxxx/resourceGroups/RG-WebApp-Prod",
  "location": "eastus",
  "name": "RG-WebApp-Prod",
  "properties": {
    "provisioningState": "Succeeded"
  },
  "tags": {
    "Application": "WebApp",
    "CostCenter": "IT-001",
    "Environment": "Production"
  },
  "type": "Microsoft.Resources/resourceGroups"
}

This single command creates a resource group with tags in about 2 seconds. In the Portal, this would require clicking through multiple screens.

Detailed Example 2: Creating a Virtual Machine

# Create a VM with one command
az vm create \
  --resource-group RG-WebApp-Prod \
  --name vm-webserver-prod-001 \
  --image Win2022Datacenter \
  --size Standard_D2s_v3 \
  --admin-username azureuser \
  --admin-password 'ComplexP@ssw0rd!' \
  --location eastus \
  --vnet-name vnet-webapp-prod \
  --subnet default \
  --public-ip-address vm-webserver-prod-001-ip \
  --nsg vm-webserver-prod-001-nsg \
  --tags Environment=Production Application=WebApp

# This creates:
# - The VM
# - OS disk
# - Network interface
# - Public IP address
# - Network security group
# - Connects to virtual network
# All in one command!

Detailed Example 3: Scripting Multiple Resources

#!/bin/bash
# Script to create 10 VMs for a web farm

RESOURCE_GROUP="RG-WebFarm-Prod"
LOCATION="eastus"
VM_SIZE="Standard_B2s"
IMAGE="UbuntuLTS"

# Create resource group
az group create --name $RESOURCE_GROUP --location $LOCATION

# Create virtual network
az network vnet create \
  --resource-group $RESOURCE_GROUP \
  --name vnet-webfarm \
  --address-prefix 10.0.0.0/16 \
  --subnet-name subnet-web \
  --subnet-prefix 10.0.1.0/24

# Create 10 VMs in a loop
for i in {1..10}
do
  VM_NAME="vm-web-$(printf "%03d" $i)"
  echo "Creating $VM_NAME..."
  
  az vm create \
    --resource-group $RESOURCE_GROUP \
    --name $VM_NAME \
    --image $IMAGE \
    --size $VM_SIZE \
    --admin-username azureuser \
    --generate-ssh-keys \
    --vnet-name vnet-webfarm \
    --subnet subnet-web \
    --public-ip-address "" \
    --nsg "" \
    --tags Environment=Production Tier=Web Instance=$i
done

echo "Created 10 VMs successfully!"

This script creates 10 VMs in about 10 minutes. Doing this in the Portal would take hours and be error-prone.
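
The zero-padded instance numbers in the script above come from printf formatting, which keeps VM names aligned with the naming convention and sortable:

```shell
#!/bin/bash
# %03d pads the counter to three digits: 1 -> 001, 10 -> 010
for i in 1 7 10; do
  printf 'vm-web-%03d\n' "$i"
done
# Prints:
# vm-web-001
# vm-web-007
# vm-web-010
```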

Common Azure CLI Commands:

# Authentication
az login                          # Sign in to Azure
az account list                   # List subscriptions
az account set --subscription "My Subscription"  # Set active subscription

# Resource Groups
az group create --name RG-Test --location eastus
az group list                     # List all resource groups
az group delete --name RG-Test    # Delete resource group

# Virtual Machines
az vm list                        # List all VMs
az vm start --name MyVM --resource-group RG-Test
az vm stop --name MyVM --resource-group RG-Test
az vm deallocate --name MyVM --resource-group RG-Test  # Stop and release compute
az vm delete --name MyVM --resource-group RG-Test

# Storage Accounts
az storage account create --name mystorageacct --resource-group RG-Test
az storage account list
az storage account delete --name mystorageacct --resource-group RG-Test

# Getting Help
az --help                         # General help
az vm --help                      # Help for VM commands
az vm create --help               # Help for specific command

When to use:

  • ✅ Use when: You need to create multiple similar resources
  • ✅ Use when: You want to automate tasks with scripts
  • ✅ Use when: You need to integrate Azure management into CI/CD pipelines
  • ✅ Use when: You prefer command-line interfaces
  • ❌ Don't use when: You're just exploring and learning (Portal is better for this)
  • ❌ Don't use when: You need complex conditional logic (PowerShell is better)

💡 Tips:

  • Use --output table for human-readable output: az vm list --output table
  • Use --query to filter results: az vm list --query "[?location=='eastus']"
  • Use az deployment group what-if to preview what a template deployment would change before actually executing it
  • Save common commands in shell scripts for reuse

⚠️ Common Mistakes:

  • Mistake: Forgetting to set the correct subscription before running commands
    • Solution: Always run az account show to verify you're in the right subscription
  • Mistake: Not quoting parameters with spaces or special characters
    • Solution: Use single or double quotes: --name "My VM Name"

Azure PowerShell

What it is: Azure PowerShell is a set of PowerShell modules (cmdlets) for managing Azure resources. It integrates with the PowerShell scripting language and runs on Windows, macOS, and Linux.

Why it exists: PowerShell provides powerful scripting capabilities with object-oriented programming, error handling, and complex logic. It's ideal for Windows administrators familiar with PowerShell and for scenarios requiring sophisticated automation.

Real-world analogy: If Azure CLI is like using a calculator, Azure PowerShell is like using Excel with formulas. Both can do math, but Excel (PowerShell) can handle much more complex scenarios with variables, conditions, and loops.

How it works (Detailed step-by-step):

  1. You install the Az PowerShell module using Install-Module -Name Az or use Azure Cloud Shell.
  2. You authenticate by running Connect-AzAccount, which opens a browser for sign-in.
  3. You run cmdlets using the format: Verb-AzNoun -Parameter Value
  4. PowerShell returns objects (not just text), which you can manipulate with PowerShell commands.
  5. You can pipe objects between cmdlets for complex operations.

Detailed Example 1: Creating a Resource Group

# Create a resource group
New-AzResourceGroup `
  -Name "RG-WebApp-Prod" `
  -Location "East US" `
  -Tag @{Environment="Production"; Application="WebApp"; CostCenter="IT-001"}

# Output (PowerShell object):
ResourceGroupName : RG-WebApp-Prod
Location          : eastus
ProvisioningState : Succeeded
Tags              : 
                    Name         Value
                    ===========  ===========
                    Environment  Production
                    Application  WebApp
                    CostCenter   IT-001

Detailed Example 2: Creating a Virtual Machine

# Create a VM with PowerShell
$resourceGroup = "RG-WebApp-Prod"
$location = "East US"
$vmName = "vm-webserver-prod-001"

# Create credential object
$securePassword = ConvertTo-SecureString 'ComplexP@ssw0rd!' -AsPlainText -Force
$credential = New-Object System.Management.Automation.PSCredential ('azureuser', $securePassword)

# Get the existing virtual network and create a network interface for the VM
$vnet = Get-AzVirtualNetwork -Name "vnet-webapp-prod" -ResourceGroupName $resourceGroup
$nic = New-AzNetworkInterface -Name "$vmName-nic" -ResourceGroupName $resourceGroup `
  -Location $location -SubnetId $vnet.Subnets[0].Id

# Create VM configuration
$vmConfig = New-AzVMConfig -VMName $vmName -VMSize "Standard_D2s_v3" |
  Set-AzVMOperatingSystem -Windows -ComputerName $vmName -Credential $credential |
  Set-AzVMSourceImage -PublisherName "MicrosoftWindowsServer" `
                      -Offer "WindowsServer" `
                      -Skus "2022-Datacenter" `
                      -Version "latest" |
  Add-AzVMNetworkInterface -Id $nic.Id

# Create the VM
New-AzVM -ResourceGroupName $resourceGroup -Location $location -VM $vmConfig

Detailed Example 3: Advanced Scripting with Logic

# Script to resize all VMs in a resource group based on time of day
$resourceGroup = "RG-WebApp-Prod"
$currentHour = (Get-Date).Hour

# Define sizes based on time
if ($currentHour -ge 8 -and $currentHour -lt 18) {
    # Business hours: Use larger VMs
    $targetSize = "Standard_D4s_v3"
    Write-Host "Business hours detected. Scaling UP to $targetSize"
} else {
    # Off hours: Use smaller VMs to save costs
    $targetSize = "Standard_D2s_v3"
    Write-Host "Off hours detected. Scaling DOWN to $targetSize"
}

# Get all VMs in the resource group
$vms = Get-AzVM -ResourceGroupName $resourceGroup

foreach ($vm in $vms) {
    $currentSize = $vm.HardwareProfile.VmSize
    
    if ($currentSize -ne $targetSize) {
        Write-Host "Resizing $($vm.Name) from $currentSize to $targetSize..."
        
        # Stop the VM
        Stop-AzVM -ResourceGroupName $resourceGroup -Name $vm.Name -Force
        
        # Resize the VM
        $vm.HardwareProfile.VmSize = $targetSize
        Update-AzVM -ResourceGroupName $resourceGroup -VM $vm
        
        # Start the VM
        Start-AzVM -ResourceGroupName $resourceGroup -Name $vm.Name
        
        Write-Host "$($vm.Name) resized successfully!"
    } else {
        Write-Host "$($vm.Name) is already the correct size."
    }
}

Write-Host "All VMs processed."

This script demonstrates PowerShell's power: conditional logic, loops, object manipulation, and error handling - all in one script.

Common Azure PowerShell Cmdlets:

# Authentication
Connect-AzAccount                 # Sign in to Azure
Get-AzSubscription                # List subscriptions
Set-AzContext -Subscription "My Subscription"  # Set active subscription

# Resource Groups
New-AzResourceGroup -Name "RG-Test" -Location "East US"
Get-AzResourceGroup               # List all resource groups
Remove-AzResourceGroup -Name "RG-Test" -Force

# Virtual Machines
Get-AzVM                          # List all VMs
Start-AzVM -Name "MyVM" -ResourceGroupName "RG-Test"
Stop-AzVM -Name "MyVM" -ResourceGroupName "RG-Test" -Force
Remove-AzVM -Name "MyVM" -ResourceGroupName "RG-Test" -Force

# Storage Accounts
New-AzStorageAccount -Name "mystorageacct" -ResourceGroupName "RG-Test" `
                     -Location "East US" -SkuName "Standard_LRS"
Get-AzStorageAccount
Remove-AzStorageAccount -Name "mystorageacct" -ResourceGroupName "RG-Test"

# Working with Objects
Get-AzVM | Where-Object {$_.Location -eq "eastus"}  # Filter VMs by location
Get-AzVM | Select-Object Name, Location, VmSize     # Select specific properties
Get-AzVM | Export-Csv -Path "vms.csv"               # Export to CSV

# Getting Help
Get-Help Get-AzVM                 # Get help for cmdlet
Get-Help Get-AzVM -Examples       # Show examples
Get-Command -Module Az.Compute    # List all cmdlets in a module

Comparison: Azure CLI vs Azure PowerShell:

| Aspect | Azure CLI | Azure PowerShell |
|--------|-----------|------------------|
| Syntax | az vm create --name MyVM | New-AzVM -Name "MyVM" |
| Output | JSON (default) | PowerShell objects |
| Platform | Cross-platform (Python-based) | Cross-platform (PowerShell Core) |
| Learning curve | Easier for beginners | Steeper (requires PowerShell knowledge) |
| Scripting | Bash/shell scripts | PowerShell scripts (.ps1) |
| Object manipulation | Limited (text/JSON parsing) | Powerful (native object handling) |
| Best for | Simple automation, Linux users | Complex automation, Windows admins |
| Integration | CI/CD pipelines, Linux environments | Windows environments, complex logic |

When to use Azure PowerShell:

  • ✅ Use when: You need complex scripting with conditional logic
  • ✅ Use when: You're familiar with PowerShell
  • ✅ Use when: You need to manipulate objects and properties
  • ✅ Use when: You're working in a Windows-centric environment
  • ❌ Don't use when: You prefer simpler, more concise syntax (use CLI)
  • ❌ Don't use when: You're working primarily in Linux (CLI is more natural)

💡 Tips:

  • Use Get-Help extensively - PowerShell has excellent built-in documentation
  • Use tab completion to discover cmdlet parameters
  • Use Get-Command -Module Az.* to discover available cmdlets
  • Save common scripts as .ps1 files for reuse

⚠️ Common Mistakes:

  • Mistake: Forgetting to use -Force when deleting resources, causing interactive prompts in scripts
    • Solution: Always use -Force in automated scripts to skip confirmation prompts
  • Mistake: Not handling errors in scripts
    • Solution: Use try/catch blocks with -ErrorAction Stop (so errors become terminating and can be caught), and check $? (the last command's success status)

Azure Cloud Shell

What it is: Azure Cloud Shell is a browser-based shell environment hosted in Azure. It provides both Bash and PowerShell environments with Azure CLI and Azure PowerShell pre-installed.

Why it exists: Cloud Shell eliminates the need to install tools locally. You can manage Azure from any device with a web browser - no installation, no configuration, no maintenance. It's perfect for quick tasks, learning, and managing Azure from devices you don't own.

Real-world analogy: Cloud Shell is like using Google Docs instead of Microsoft Word. You don't install anything - just open a browser and start working. Your files are saved in the cloud and accessible from anywhere.

How it works (Detailed step-by-step):

  1. You click the Cloud Shell icon in the Azure Portal (top toolbar, looks like >_).
  2. Azure provisions a temporary Linux container for your session (takes 5-10 seconds).
  3. You choose Bash or PowerShell environment (can switch anytime).
  4. You get a command prompt with Azure CLI and PowerShell pre-installed and authenticated.
  5. Your session stays active while you work, but times out after 20 minutes of inactivity, and the container is discarded.
  6. Your files are stored in an Azure Storage account (automatically created on first use).

Detailed Example 1: First-Time Setup

When you first use Cloud Shell:

  1. Click the Cloud Shell icon (>_) in Azure Portal
  2. Choose "Bash" or "PowerShell"
  3. Azure prompts: "You have no storage mounted"
  4. Click "Create storage" - Azure creates:
    • Storage account (name: cs{random-id})
    • File share (name: cloudshell)
    • Resource group (name: cloud-shell-storage-{region})
  5. Cloud Shell opens with a prompt: user@Azure:~$
  6. You're authenticated and ready to run commands

Detailed Example 2: Using Cloud Shell for Quick Tasks

# Scenario: You're at a coffee shop and need to check VM status

# Open Cloud Shell in Azure Portal (no installation needed)
# Already authenticated - no need to run 'az login'

# List all VMs
az vm list --output table

# Check specific VM status
az vm get-instance-view \
  --name vm-webserver-prod-001 \
  --resource-group RG-WebApp-Prod \
  --query "instanceView.statuses[1].displayStatus"

# Output: "VM running"

# Start a stopped VM
az vm start \
  --name vm-database-prod-001 \
  --resource-group RG-Database-Prod

# Done! Close browser and leave coffee shop.

Detailed Example 3: Persistent Storage

Cloud Shell mounts an Azure File Share to persist your files:

# Your home directory is persistent
cd ~
pwd
# Output: /home/user

# Create a script that will persist across sessions
cat > deploy-vm.sh << 'EOF'
#!/bin/bash
az vm create \
  --resource-group $1 \
  --name $2 \
  --image UbuntuLTS \
  --size Standard_B2s \
  --generate-ssh-keys
EOF

chmod +x deploy-vm.sh

# In a later session (even from a different device), the script is still there:
./deploy-vm.sh RG-Test vm-test-001

Features of Cloud Shell:

  • Pre-installed tools: Azure CLI, Azure PowerShell, kubectl, Terraform, Ansible, Git, and more
  • Persistent storage: 5 GB file share for your scripts and files
  • Integrated editor: Built-in code editor for editing scripts in the browser
  • No cost: Cloud Shell itself is free (you pay only for the storage account)
  • Authenticated: Automatically signed in with your Azure credentials
  • Secure: Runs in an isolated container that is destroyed when the session ends

When to use:

  • ✅ Use when: You don't want to install tools locally
  • ✅ Use when: You're on a device you don't own
  • ✅ Use when: You need quick access to Azure CLI or PowerShell
  • ✅ Use when: You're learning and don't want to set up a local environment
  • ❌ Don't use when: You need to run long-running scripts (sessions time out after 20 minutes of inactivity)

---

## Azure Resource Manager (ARM)

**What it is**: Azure Resource Manager (ARM) is the deployment and management service for Azure. It's the underlying engine that processes all requests to create, update, or delete Azure resources.

**Why it exists**: ARM provides a consistent management layer that ensures all Azure tools work the same way. It handles authentication, authorization, resource deployment, and dependency management.

**Key ARM Concepts**:

**Resource Providers**: Every Azure service is implemented by a resource provider:
- Microsoft.Compute - Virtual Machines, Scale Sets
- Microsoft.Storage - Storage Accounts
- Microsoft.Network - Virtual Networks, Load Balancers
- Microsoft.Sql - SQL Databases

**ARM Templates**: JSON files that define infrastructure declaratively. They enable Infrastructure as Code (IaC) for consistent, repeatable deployments.

⭐ **Must Know**: ARM templates are declarative (describe desired state) and idempotent (running multiple times produces same result).
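
A minimal ARM template sketch deploying a single storage account (the parameter name is illustrative, and the `apiVersion` shown is one of several valid versions for `Microsoft.Storage` - check the current schema before using):

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "storageAccountName": { "type": "string" }
  },
  "resources": [
    {
      "type": "Microsoft.Storage/storageAccounts",
      "apiVersion": "2023-01-01",
      "name": "[parameters('storageAccountName')]",
      "location": "[resourceGroup().location]",
      "sku": { "name": "Standard_LRS" },
      "kind": "StorageV2"
    }
  ]
}
```

Deploying this template a second time leaves the storage account unchanged - that is the idempotency described above.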

---

## Chapter Summary

### What We Covered

✅ **Cloud Computing Fundamentals**: IaaS, PaaS, SaaS service models
✅ **Azure Hierarchy**: Management Groups, Subscriptions, Resource Groups, Resources
✅ **Management Tools**: Portal, CLI, PowerShell, Cloud Shell
✅ **Azure Resource Manager**: The engine behind all Azure operations

### Critical Takeaways

1. Azure hierarchy flows downward - permissions and policies inherit from parent to child
2. Subscriptions are billing boundaries - use them to separate costs
3. Resource groups are lifecycle containers - deleting a group deletes all resources inside
4. Choose the right tool: Portal for learning, CLI for automation, PowerShell for complex scripts
5. ARM is the foundation - all tools ultimately use ARM APIs

### Self-Assessment Checklist

- [ ] I can explain IaaS, PaaS, and SaaS with examples
- [ ] I understand the four levels of Azure hierarchy
- [ ] I know when to use management groups vs subscriptions vs resource groups
- [ ] I can use Azure Portal, CLI, and PowerShell
- [ ] I understand what Azure Resource Manager does

### Quick Reference Card

**Azure Hierarchy**:
1. Management Groups → Organize subscriptions
2. Subscriptions → Billing boundaries
3. Resource Groups → Lifecycle containers
4. Resources → Actual Azure services

**Management Tools**:
- Portal: Visual, easy to learn
- CLI: Fast, scriptable, cross-platform
- PowerShell: Powerful, object-oriented
- Cloud Shell: Browser-based, no installation

### What's Next?

Chapter 1 covers **Manage Azure Identities and Governance**:
- Microsoft Entra ID for identity management
- Role-Based Access Control (RBAC)
- Azure Policy for governance
- Cost management and tagging

**Next file**: 02_domain_1_identities_governance

---

**Chapter 0 Complete!** ✅


---

# Chapter 1: Manage Azure Identities and Governance (24% of exam)
**File**: 02_domain_1_identities_governance

## Chapter Overview

This domain represents 24% of the AZ-104 exam - the largest single domain. It covers how to manage identities, control access to resources, and implement governance policies across your Azure environment.

**What you'll learn**:
- Microsoft Entra ID (formerly Azure AD) for identity management
- Creating and managing users and groups
- Role-Based Access Control (RBAC) for permissions
- Azure Policy for governance and compliance
- Subscriptions and management groups organization
- Cost management and resource tagging strategies

**Time to complete**: 8-10 hours

**Prerequisites**: Chapter 0 (Fundamentals) - understanding of Azure hierarchy

---

## Section 1: Microsoft Entra ID Fundamentals

### Introduction

**The problem**: In traditional on-premises environments, Active Directory manages user identities and access to resources. But in the cloud, you need a different approach - one that works across the internet, supports modern authentication protocols, and integrates with cloud services.

**The solution**: Microsoft Entra ID (formerly Azure Active Directory) is Microsoft's cloud-based identity and access management service. It authenticates users, manages their identities, and controls access to Azure resources and Microsoft 365 services.

**Why it's tested**: Identity is the foundation of cloud security. The AZ-104 exam heavily tests your understanding of how to create users, manage groups, assign permissions, and secure access to Azure resources.

### What is Microsoft Entra ID?

**What it is**: Microsoft Entra ID is a cloud-based identity and access management service that helps your employees sign in and access resources. It's the identity provider for Azure, Microsoft 365, and thousands of other SaaS applications.

**Why it exists**: Traditional Active Directory was designed for on-premises networks with domain controllers and Kerberos authentication. Cloud services need an identity system that works over the internet, supports modern protocols (OAuth 2.0, SAML, OpenID Connect), and scales globally. Microsoft Entra ID solves these problems.

**Real-world analogy**: Think of Microsoft Entra ID as a digital security guard for your organization. Just as a security guard checks IDs before letting people into a building, Entra ID verifies identities before granting access to cloud resources. It also keeps a log of who entered when (audit logs).

**How it works** (Detailed step-by-step):
1. **Your organization creates a Microsoft Entra tenant** when you sign up for Azure or Microsoft 365. This tenant is your organization's dedicated instance of Entra ID.
2. **You add users to the tenant** - either by creating them directly in Entra ID (cloud-only users) or by synchronizing them from on-premises Active Directory (hybrid users).
3. **Users authenticate** by entering their username and password (or using multi-factor authentication, passwordless methods, etc.).
4. **Entra ID verifies the credentials** and issues an access token if authentication succeeds.
5. **The user presents the token** to access Azure resources, Microsoft 365, or other applications.
6. **The resource validates the token** with Entra ID and grants or denies access based on the user's permissions.

⭐ **Must Know**: Microsoft Entra ID is NOT the same as Windows Server Active Directory. They serve similar purposes but are different technologies. Entra ID is cloud-native and designed for internet-based authentication.

**Key Differences: Active Directory vs Microsoft Entra ID**:

| Aspect | Windows Server AD | Microsoft Entra ID |
|--------|-------------------|-------------------|
| **Deployment** | On-premises servers | Cloud service |
| **Protocol** | Kerberos, NTLM, LDAP | OAuth 2.0, SAML, OpenID Connect |
| **Structure** | Forests, domains, OUs | Flat structure (tenant) |
| **Authentication** | Domain controllers | Cloud-based authentication |
| **Primary use** | On-premises resources | Cloud resources and SaaS apps |
| **Management** | Group Policy | Azure Policy, Conditional Access |
| **Scope** | Local network | Global (internet-based) |

**Detailed Example 1: Cloud-Only User Authentication**

Contoso Corporation is a startup with no on-premises infrastructure. They create users directly in Microsoft Entra ID:

1. **Admin creates a user**: john.doe@contoso.com in the Azure Portal
2. **John receives a welcome email** with temporary password
3. **John signs in** to portal.azure.com for the first time
4. **Entra ID prompts** John to change his password and set up MFA
5. **John completes setup** and is now authenticated
6. **John accesses Azure resources** - Entra ID issues tokens for each resource
7. **Audit logs record** all of John's sign-ins and activities

This entire process happens in the cloud with no on-premises infrastructure needed.

**Detailed Example 2: Hybrid Identity with Azure AD Connect**

Fabrikam Inc. has 5,000 employees with existing Active Directory accounts on-premises. They want to use Azure but don't want to recreate all user accounts:

1. **Fabrikam installs Azure AD Connect** on a server in their datacenter
2. **Azure AD Connect synchronizes** user accounts from on-premises AD to Entra ID every 30 minutes
3. **Users keep their existing credentials** - same username and password work for both on-premises and cloud resources
4. **Password Hash Synchronization** sends a hash of user passwords to Entra ID (not the actual password)
5. **Users can sign in** to both on-premises resources (using domain controllers) and cloud resources (using Entra ID)
6. **Changes sync automatically** - directory changes replicate on the 30-minute sync cycle, and password changes sync even faster (Password Hash Synchronization runs on its own roughly two-minute cycle)

This hybrid approach allows Fabrikam to leverage their existing identity infrastructure while moving to the cloud.

**Detailed Example 3: External User Collaboration (B2B)**

Northwind Traders needs to collaborate with external consultants from partner companies:

1. **Northwind invites** consultant@partnercorp.com as a guest user
2. **Consultant receives email invitation** with a link to accept
3. **Consultant clicks link** and authenticates with their own organization's credentials (partnercorp.com)
4. **Entra ID creates a guest user object** in Northwind's tenant
5. **Northwind assigns permissions** to the guest user (e.g., access to specific SharePoint sites)
6. **Consultant accesses Northwind's resources** using their own credentials - no need for a separate Northwind account
7. **When project ends**, Northwind removes the guest user, revoking all access

This B2B collaboration allows secure external access without creating and managing separate accounts.

### Microsoft Entra ID Concepts

#### Tenants

**What it is**: A tenant is a dedicated instance of Microsoft Entra ID that your organization receives when you sign up for a Microsoft cloud service. Each tenant represents a single organization and is completely isolated from other tenants.

**Why it exists**: Tenants provide security boundaries. Data and identities in one tenant are completely separate from another tenant. This ensures that Company A's users cannot access Company B's resources, even though both use Microsoft Entra ID.

**Real-world analogy**: Think of a tenant like an apartment building. Each tenant (organization) has their own apartment (Entra ID instance) with their own locks and keys. Tenants share the building infrastructure (Microsoft's cloud) but cannot access each other's apartments.

**Key Facts**:
- Each tenant has a unique domain name: yourcompany.onmicrosoft.com
- A tenant can have custom domains: yourcompany.com
- Tenant ID is a GUID that uniquely identifies your organization
- One organization typically has one tenant (but can have multiple for specific scenarios)

#### Users

**What it is**: A user is an identity in Microsoft Entra ID that represents a person who needs to access resources. Each user has a unique username (User Principal Name or UPN) and authentication credentials.

**Types of Users**:

**1. Cloud-Only Users**:
- Created directly in Microsoft Entra ID
- Credentials stored only in the cloud
- Example: john.doe@contoso.onmicrosoft.com
- Best for: Cloud-first organizations, external contractors

**2. Synchronized Users**:
- Synced from on-premises Active Directory using Microsoft Entra Connect (formerly Azure AD Connect)
- Credentials can be synchronized (Password Hash Sync) or federated (AD FS)
- Example: john.doe@contoso.com (exists both on-premises and in cloud)
- Best for: Hybrid organizations with existing AD infrastructure

**3. Guest Users (B2B)**:
- External users from other organizations
- Authenticate with their home organization's credentials
- Example: consultant@partnercorp.com invited to your tenant
- Best for: External collaboration, partners, vendors

**User Properties**:
- Display Name: "John Doe"
- User Principal Name (UPN): john.doe@contoso.com
- Job Title, Department, Manager (optional metadata)
- Contact Information: email, phone
- Licenses: Microsoft 365, Azure AD Premium, etc.

**Detailed Example 1: Creating a Cloud User**

Using Azure Portal:
1. Navigate to Microsoft Entra ID → Users
2. Click "New user" → "Create new user"
3. Fill in details:
   - User name: jane.smith@contoso.com
   - Name: Jane Smith
   - First name: Jane
   - Last name: Smith
   - Job title: Marketing Manager
   - Department: Marketing
4. Set initial password (user must change on first sign-in)
5. Assign licenses (if needed)
6. Click "Create"

Using Azure CLI:
```bash
az ad user create \
  --display-name "Jane Smith" \
  --user-principal-name jane.smith@contoso.com \
  --password "TempP@ssw0rd123!" \
  --force-change-password-next-sign-in true \
  --job-title "Marketing Manager" \
  --department "Marketing"

```

Using PowerShell:

```powershell
New-AzADUser `
  -DisplayName "Jane Smith" `
  -UserPrincipalName "jane.smith@contoso.com" `
  -MailNickname "jane.smith" `
  -Password (ConvertTo-SecureString "TempP@ssw0rd123!" -AsPlainText -Force) `
  -ForceChangePasswordNextSignIn $true `
  -JobTitle "Marketing Manager" `
  -Department "Marketing"
```
**Detailed Example 2: Bulk User Creation**

For creating many users at once, use CSV import:

1. Download the CSV template from the Azure Portal
2. Fill in user details:

```csv
name,userPrincipalName,initialPassword,jobTitle,department
John Doe,john.doe@contoso.com,TempPass1!,Developer,Engineering
Jane Smith,jane.smith@contoso.com,TempPass2!,Manager,Marketing
Bob Johnson,bob.johnson@contoso.com,TempPass3!,Analyst,Finance
```

3. Upload the CSV in the Portal: Microsoft Entra ID → Users → Bulk operations → Bulk create
4. Azure creates all users and provides a results report

This is much faster than creating users one by one when onboarding many employees.
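If you want to sanity-check a CSV before uploading it, a few lines of scripting catch obvious problems early. Here is a Python sketch; the column names follow the simplified example above (the real Portal template uses its own headers), and the validation rules are illustrative:

```python
import csv
import io

# Hypothetical pre-upload check for a bulk-user CSV. The headers below match
# the simplified example in this guide, not the official Portal template.
csv_text = """name,userPrincipalName,initialPassword,jobTitle,department
John Doe,john.doe@contoso.com,TempPass1!,Developer,Engineering
Jane Smith,jane.smith@contoso.com,TempPass2!,Manager,Marketing
"""

def validate_rows(text):
    """Return (valid, errors): rows with a missing name or malformed UPN are rejected."""
    valid, errors = [], []
    for line_no, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        if not row["name"] or "@" not in row["userPrincipalName"]:
            errors.append(f"line {line_no}: missing name or invalid UPN")
        else:
            valid.append(row)
    return valid, errors

valid, errors = validate_rows(csv_text)
print(len(valid), len(errors))  # → 2 0
```

Catching a malformed row locally is cheaper than deciphering a partial-failure report after the bulk operation runs.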

#### Groups

**What it is**: A group is a collection of users (and sometimes other groups or devices) that you can manage as a single unit. Groups simplify permission management - instead of assigning permissions to 50 individual users, you assign permissions to one group containing those 50 users.

**Why it exists**: Managing permissions for individual users doesn't scale. Imagine a company with 10,000 employees and 500 applications. Without groups, you'd need to manage millions of individual permission assignments. Groups reduce this to thousands of group assignments.

**Real-world analogy**: Groups are like mailing lists. Instead of sending an email to 50 people individually, you send one email to the "Marketing Team" group, and everyone in that group receives it.

**Types of Groups**:

**1. Security Groups**:
- Used for assigning permissions to Azure resources
- Can contain users, devices, and other groups
- Membership can be assigned or dynamic
- Example: "Finance-Team" group with access to finance applications

**2. Microsoft 365 Groups**:
- Used for collaboration (shared mailbox, calendar, files, SharePoint site)
- Can only contain users (not devices or other groups)
- Automatically creates associated resources (mailbox, SharePoint site, Teams team)
- Example: "Project-Phoenix" group for project collaboration

**Group Membership Types**:

**Assigned Membership**:
- Administrators manually add/remove members
- Full control over who is in the group
- Best for: Small groups, groups with specific membership criteria
- Example: "Executives" group with manually selected senior leaders

**Dynamic Membership**:
- Members automatically added/removed based on rules
- Rules use user attributes (department, job title, location, etc.)
- Requires Azure AD Premium P1 license
- Best for: Large groups, groups based on organizational structure
- Example: "All-Marketing-Users" with rule: department equals "Marketing"

**Dynamic Group Rule Examples**:

```
# All users in Marketing department
(user.department -eq "Marketing")

# All users in Seattle office
(user.city -eq "Seattle")

# All managers
(user.jobTitle -contains "Manager")

# All users in Marketing OR Sales
(user.department -eq "Marketing") -or (user.department -eq "Sales")

# All users in Marketing AND in Seattle
(user.department -eq "Marketing") -and (user.city -eq "Seattle")

# All devices running Windows
(device.deviceOSType -eq "Windows")
```
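These rules read almost like code, and that is a good way to reason about them: each rule is a predicate evaluated against every user's attributes. Here is a minimal Python sketch of that idea; the real service parses its own rule syntax, so the predicate form and sample users below are purely for illustration:

```python
# Conceptual model of dynamic membership evaluation: a rule is a predicate
# over a user-attribute dictionary. Users and attributes are made up.
users = [
    {"name": "Alice", "department": "Marketing", "city": "Seattle"},
    {"name": "Bob", "department": "Sales", "city": "Portland"},
    {"name": "Carol", "department": "Marketing", "city": "Austin"},
]

# Equivalent of: (user.department -eq "Marketing") -and (user.city -eq "Seattle")
def rule(u):
    return u["department"] == "Marketing" and u["city"] == "Seattle"

# Membership is simply "every user for whom the rule is true"
members = [u["name"] for u in users if rule(u)]
print(members)  # → ['Alice']
```

When Carol's city attribute later changes to "Seattle", re-evaluating the rule adds her automatically, which is exactly the behavior dynamic groups give you.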

**Detailed Example 1: Creating a Security Group**

Using Azure Portal:

1. Navigate to Microsoft Entra ID → Groups
2. Click "New group"
3. Select:
   - Group type: Security
   - Group name: "Finance-Team"
   - Group description: "All finance department employees"
   - Membership type: Assigned
4. Click "No members selected" and add users
5. Click "Create"

Using Azure CLI:

```bash
# Create group
az ad group create \
  --display-name "Finance-Team" \
  --mail-nickname "finance-team" \
  --description "All finance department employees"

# Add members
az ad group member add \
  --group "Finance-Team" \
  --member-id <user-object-id>
```

**Detailed Example 2: Creating a Dynamic Group**

Dynamic groups automatically maintain membership based on rules:

1. Navigate to Microsoft Entra ID → Groups → New group
2. Select:
   - Group type: Security
   - Group name: "All-Marketing-Users"
   - Membership type: Dynamic User
3. Click "Add dynamic query"
4. Build rule:
   - Property: department
   - Operator: Equals
   - Value: Marketing
5. Validate rule (shows how many users match)
6. Click "Create"

Now, whenever a user's department is set to "Marketing", they're automatically added to this group. When they leave Marketing, they're automatically removed.

**Detailed Example 3: Nested Groups**

Groups can contain other groups (nesting):

```
Group: "All-Employees"
├── Group: "Engineering"
│   ├── User: John (Developer)
│   ├── User: Jane (Developer)
│   └── Group: "Engineering-Managers"
│       └── User: Bob (Engineering Manager)
├── Group: "Marketing"
│   ├── User: Alice (Marketing Specialist)
│   └── User: Charlie (Marketing Manager)
└── Group: "Finance"
    ├── User: David (Accountant)
    └── User: Eve (CFO)
```

If you assign permissions to "All-Employees", everyone in all nested groups gets those permissions. This hierarchical structure mirrors organizational structure.
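The reason a permission on "All-Employees" reaches Bob is transitive membership. Here is a short Python sketch of how nested groups expand into a flat set of users; the group and user names are the ones from the tree above, and the recursive walk is a conceptual model, not how Entra ID stores membership internally:

```python
# Each group maps to its direct members; a member that is itself a key in
# the dict is a nested group, anything else is a user.
groups = {
    "All-Employees": ["Engineering", "Marketing", "Finance"],
    "Engineering": ["John", "Jane", "Engineering-Managers"],
    "Engineering-Managers": ["Bob"],
    "Marketing": ["Alice", "Charlie"],
    "Finance": ["David", "Eve"],
}

def effective_members(group, groups):
    """Recursively expand nested groups into a flat set of users."""
    users = set()
    for member in groups.get(group, []):
        if member in groups:               # member is itself a group: recurse
            users |= effective_members(member, groups)
        else:                              # member is a user: collect it
            users.add(member)
    return users

print(sorted(effective_members("All-Employees", groups)))
# → ['Alice', 'Bob', 'Charlie', 'David', 'Eve', 'Jane', 'John']
```

The expansion shows why deep nesting gets hard to audit: to answer "who can do X", every level must be walked, which is a practical reason to keep nesting to 2-3 levels.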

**Must Know (Critical Facts)**:

- Security groups are for permissions, Microsoft 365 groups are for collaboration
- Dynamic groups require Azure AD Premium P1 license
- Group nesting is supported (groups can contain other groups)
- Maximum nesting depth: No hard limit, but keep it simple (2-3 levels max for manageability)
- Dynamic group rules are evaluated continuously - membership updates automatically
- Assigned groups give you full control but require manual management

**When to use Security Groups vs Microsoft 365 Groups**:

| Use Case | Security Group | Microsoft 365 Group |
|----------|----------------|---------------------|
| Assign Azure resource permissions | ✅ Yes | ❌ No |
| Assign app permissions | ✅ Yes | ❌ No |
| Email distribution list | ❌ No | ✅ Yes |
| Shared mailbox | ❌ No | ✅ Yes |
| Team collaboration (Teams) | ❌ No | ✅ Yes |
| SharePoint site | ❌ No | ✅ Yes (auto-created) |
| Can contain devices | ✅ Yes | ❌ No |
| Can contain other groups | ✅ Yes | ❌ No |
| Dynamic membership | ✅ Yes (Premium) | ✅ Yes (Premium) |

**💡 Tips for Understanding**:

- Think of security groups as "permission containers" - they hold users who need the same permissions
- Think of Microsoft 365 groups as "collaboration containers" - they hold users who work together
- Use dynamic groups for large organizations where manual management doesn't scale
- Use assigned groups when you need precise control over membership

**⚠️ Common Mistakes & Misconceptions**:

- **Mistake 1**: Using Microsoft 365 groups for Azure resource permissions
  - Why it's wrong: Microsoft 365 groups cannot be assigned Azure RBAC roles
  - Correct understanding: Use security groups for Azure permissions, Microsoft 365 groups for collaboration
- **Mistake 2**: Creating too many assigned groups instead of using dynamic groups
  - Why it's wrong: Manual management becomes overwhelming as the organization grows
  - Correct understanding: Use dynamic groups based on user attributes (department, location) for automatic management
- **Mistake 3**: Not planning group structure before creating groups
  - Why it's wrong: You end up with inconsistent naming, duplicate groups, and confusion
  - Correct understanding: Design your group structure to mirror your organizational structure and permission needs

**🔗 Connections to Other Topics**:

- Relates to **RBAC** (next section) because: Groups are the primary way to assign permissions at scale
- Builds on **Microsoft Entra ID** by: Providing organizational structure for users
- Connected to **Azure Policy** because: Groups can be used in policy assignments
- Used with **Conditional Access** (advanced topic) to: Apply access policies to groups of users

## Section 2: Role-Based Access Control (RBAC)

### Introduction

**The problem**: You have hundreds of users who need different levels of access to Azure resources. Some need to create VMs, others only need to view costs, and some need full administrative access. Managing individual permissions for each user on each resource would be impossible.

**The solution**: Azure Role-Based Access Control (RBAC) allows you to assign roles to users, groups, or service principals at different scopes. A role defines what actions are allowed (read, write, delete, etc.), and the scope defines where those actions apply (subscription, resource group, or individual resource).

**Why it's tested**: RBAC is fundamental to Azure security. The AZ-104 exam tests your ability to assign appropriate roles, understand role inheritance, and troubleshoot access issues.

### What is Azure RBAC?

**What it is**: Azure RBAC is an authorization system built on Azure Resource Manager that provides fine-grained access management for Azure resources. It allows you to grant users only the access they need to do their jobs, following the principle of least privilege.

**Why it exists**: Without RBAC, you'd have only two options: full admin access or no access. RBAC provides granular control - you can give someone permission to manage VMs but not delete them, or view costs but not create resources. This granularity is essential for security and compliance.

**Real-world analogy**: RBAC is like different levels of access cards in an office building. A janitor's card opens cleaning closets and common areas. An employee's card opens their office and meeting rooms. A manager's card opens all employee offices plus executive areas. The CEO's card opens everything. Each person has exactly the access they need - no more, no less.

**How it works (Detailed step-by-step)**:

1. You create a role assignment by selecting three things: Security Principal (who), Role Definition (what they can do), and Scope (where they can do it).
2. Azure stores the role assignment in Azure Resource Manager.
3. When a user tries to perform an action (e.g., create a VM), Azure checks all role assignments for that user.
4. Azure evaluates permissions at all scopes (management group, subscription, resource group, resource) and combines them.
5. If any role assignment grants the permission, the action is allowed (unless explicitly denied).
6. Azure logs the action in Activity Logs for auditing.

### RBAC Components

#### 1. Security Principal (Who)

The "who" in RBAC - the identity that needs access:

**User**: An individual person with a profile in Microsoft Entra ID

**Group**: A collection of users
- Example: "Engineering-Team" group containing all engineers

**Service Principal**: An identity for applications or services
- Example: An application that needs to read from Azure Storage

**Managed Identity**: A special type of service principal managed by Azure
- Example: A VM that needs to access Key Vault

💡 **Tip**: Always assign roles to groups, not individual users. This makes management much easier as people join/leave teams.

#### 2. Role Definition (What)

The "what" in RBAC - the set of permissions:

**Built-in Roles** (Azure provides 100+ built-in roles):

**Owner**:
- Full access to all resources
- Can manage access (assign roles to others)
- Can delete resources
- Use case: Subscription administrators, project leads

**Contributor**:
- Full access to all resources
- CANNOT manage access (cannot assign roles)
- Can create and delete resources
- Use case: Developers, engineers who need to manage resources but not permissions

**Reader**:
- View all resources
- CANNOT make any changes
- CANNOT see sensitive data (like storage account keys)
- Use case: Auditors, managers who need visibility but not control

**User Access Administrator**:
- Manage user access to Azure resources
- CANNOT manage the resources themselves
- Can assign roles to others
- Use case: Security team members who manage permissions

**Resource-Specific Roles**:
- Virtual Machine Contributor: Manage VMs but not the network or storage
- Storage Account Contributor: Manage storage accounts
- SQL DB Contributor: Manage SQL databases
- Network Contributor: Manage networks
- And 100+ more...

**Custom Roles**:
- Create your own roles with specific permissions
- No additional license is required; you need permission to create role definitions (for example, Owner or User Access Administrator at the target scope)
- Example: "VM Operator" role that can start/stop VMs but not create/delete them

**Role Definition Structure**:

```json
{
  "Name": "Virtual Machine Contributor",
  "Id": "9980e02c-c2be-4d73-94e8-173b1dc7cf3c",
  "IsCustom": false,
  "Description": "Lets you manage virtual machines, but not access to them, and not the virtual network or storage account they're connected to.",
  "Actions": [
    "Microsoft.Compute/virtualMachines/*",
    "Microsoft.Network/networkInterfaces/read",
    "Microsoft.Storage/storageAccounts/read"
  ],
  "NotActions": [],
  "DataActions": [],
  "NotDataActions": [],
  "AssignableScopes": [
    "/"
  ]
}
```

**Actions**: What operations are allowed
- `*` means all operations
- `Microsoft.Compute/virtualMachines/*` means all VM operations
- `Microsoft.Compute/virtualMachines/read` means only read VMs

**NotActions**: Exceptions to Actions (exclusions, not denies)
- Used to exclude specific operations from a wildcard
- Example: Allow all VM operations EXCEPT delete

**DataActions**: Operations on data within a resource
- Example: Read blobs in a storage account
- Example: Read/write data in a database
#### 3. Scope (Where)

The "where" in RBAC - where the permissions apply:

**Management Group Scope**:
- Applies to all subscriptions in the management group
- Example: `/providers/Microsoft.Management/managementGroups/Production`
- Use case: Apply permissions across multiple subscriptions

**Subscription Scope**:
- Applies to all resources in the subscription
- Example: `/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`
- Use case: Give someone access to everything in a subscription

**Resource Group Scope**:
- Applies to all resources in the resource group
- Example: `/subscriptions/.../resourceGroups/RG-WebApp-Prod`
- Use case: Give developers access to all resources in their project's resource group

**Resource Scope**:
- Applies to a single resource only
- Example: `/subscriptions/.../resourceGroups/RG-WebApp-Prod/providers/Microsoft.Compute/virtualMachines/vm-web-001`
- Use case: Give someone access to a specific VM only

**Scope Inheritance**:
Permissions assigned at a higher scope automatically apply to all child scopes:

```
Management Group (Owner)
└── Subscription (inherits Owner)
    └── Resource Group (inherits Owner)
        └── Resource (inherits Owner)
```

If you assign someone "Reader" at the subscription level, they can read ALL resources in ALL resource groups in that subscription.
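Because scopes are path-like resource IDs, inheritance can be modeled as a prefix check: an assignment covers every resource whose ID sits at or below the assignment's scope. Here is a small Python sketch of that idea; the subscription ID and resource names are made up, and the prefix test is a conceptual model of inheritance, not the real authorization code:

```python
def in_scope(resource_id, assignment_scope):
    """A role assignment covers a resource at or below its scope (prefix match)."""
    # Append "/" so that "/subscriptions/1" does not accidentally match "/subscriptions/11"
    return (resource_id + "/").startswith(assignment_scope.rstrip("/") + "/")

# Illustrative IDs following the scope format shown above
sub = "/subscriptions/1111"
rg = sub + "/resourceGroups/RG-WebApp-Prod"
vm = rg + "/providers/Microsoft.Compute/virtualMachines/vm-web-001"

print(in_scope(vm, sub))  # → True  (subscription-level Reader sees the VM)
print(in_scope(vm, rg))   # → True  (resource-group assignment also covers it)
print(in_scope(sub, rg))  # → False (permissions never flow upward)
```

The third check is the one that trips people up on the exam: an assignment at a resource group grants nothing at the subscription level above it.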

### RBAC in Action: Detailed Examples

**Detailed Example 1: Developer Access to Project Resources**

**Scenario**: You have a development team working on a web application. They need to manage all resources in their project's resource group but shouldn't access production resources.

**Solution**:

1. Create a security group: "WebApp-Dev-Team"
2. Add all developers to the group
3. Assign "Contributor" role to the group at resource group scope:
   - Security Principal: "WebApp-Dev-Team" group
   - Role: Contributor
   - Scope: `/subscriptions/.../resourceGroups/RG-WebApp-Dev`

**Result**:

- Developers can create, modify, and delete resources in RG-WebApp-Dev
- Developers CANNOT assign permissions (Contributor doesn't allow this)
- Developers CANNOT access resources in other resource groups
- When new developers join, just add them to the group - they automatically get access

**Detailed Example 2: Read-Only Access for Managers**

**Scenario**: Department managers need to view all resources and costs in their department's subscription but shouldn't be able to make changes.

**Solution**:

1. Create a security group: "Department-Managers"
2. Add all managers to the group
3. Assign "Reader" role at subscription scope:
   - Security Principal: "Department-Managers" group
   - Role: Reader
   - Scope: `/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`

**Result**:

- Managers can view all resources in the subscription
- Managers can view costs and usage
- Managers CANNOT create, modify, or delete anything
- Managers CANNOT see sensitive data like storage account keys

**Detailed Example 3: VM Operator Custom Role**

**Scenario**: You have operations staff who need to start/stop VMs for maintenance but shouldn't be able to create or delete VMs.

**Solution**:

1. Create a custom role "VM Operator":

```json
{
  "Name": "VM Operator",
  "Description": "Can start, stop, and restart VMs but cannot create or delete them",
  "Actions": [
    "Microsoft.Compute/virtualMachines/read",
    "Microsoft.Compute/virtualMachines/start/action",
    "Microsoft.Compute/virtualMachines/restart/action",
    "Microsoft.Compute/virtualMachines/deallocate/action"
  ],
  "NotActions": [],
  "AssignableScopes": [
    "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
  ]
}
```

2. Create a security group: "VM-Operators"
3. Assign the "VM Operator" role to the group at subscription scope

**Result**:

- Operators can start, stop, and restart any VM in the subscription
- Operators CANNOT create new VMs
- Operators CANNOT delete VMs
- Operators CANNOT modify VM configuration

**Detailed Example 4: Combining Multiple Role Assignments**

**Scenario**: Alice is a developer who needs different levels of access to different environments.

**Role Assignments**:

1. "Contributor" on RG-WebApp-Dev (development resource group)
2. "Reader" on RG-WebApp-Staging (staging resource group)
3. "Reader" on RG-WebApp-Prod (production resource group)

**Result**:

- Alice can fully manage resources in development
- Alice can view but not modify staging resources
- Alice can view but not modify production resources
- Azure combines all role assignments - Alice gets the union of all permissions
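The additive model means Alice's effective access at any scope is computed as a set union over everything her assignments grant there. Here is a toy Python sketch of that combination; the role-to-permission sets are drastically simplified, and the exact-scope match is an illustration (real evaluation also walks parent scopes, as covered above):

```python
# Simplified permission sets per role - illustration only, not real Azure operations
ROLE_PERMISSIONS = {
    "Contributor": {"read", "write", "delete"},
    "Reader": {"read"},
}

# Alice's assignments from the example
alice_assignments = [
    ("Contributor", "RG-WebApp-Dev"),
    ("Reader", "RG-WebApp-Staging"),
    ("Reader", "RG-WebApp-Prod"),
]

def effective_permissions(assignments, scope):
    """Union of all permissions granted at the requested scope (exact match only)."""
    perms = set()
    for role, assigned_scope in assignments:
        if assigned_scope == scope:
            perms |= ROLE_PERMISSIONS[role]
    return perms

print(sorted(effective_permissions(alice_assignments, "RG-WebApp-Dev")))   # → ['delete', 'read', 'write']
print(sorted(effective_permissions(alice_assignments, "RG-WebApp-Prod")))  # → ['read']
```

Because the combination is a union, adding a second, broader assignment can only widen access; there is no way to shrink it except by removing assignments (or, outside RBAC itself, using deny assignments).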

**Must Know (Critical Facts)**:

- **Least Privilege**: Always assign the minimum permissions needed
- **Inheritance**: Permissions flow down from parent scopes to child scopes
- **Additive**: Multiple role assignments are combined (union of permissions)
- **No Deny**: Azure RBAC is allow-only (except NotActions, which are exclusions, not denies)
- **Contributor vs Owner**: Contributor can manage resources but NOT permissions; Owner can do both
- **Reader limitations**: Reader cannot see sensitive data like keys, passwords, connection strings

**Common Built-in Roles Summary**:

| Role | Can Manage Resources | Can Assign Roles | Can Delete | Use Case |
|------|----------------------|------------------|------------|----------|
| Owner | ✅ Yes | ✅ Yes | ✅ Yes | Full admin |
| Contributor | ✅ Yes | ❌ No | ✅ Yes | Developers |
| Reader | ❌ No (view only) | ❌ No | ❌ No | Auditors, managers |
| User Access Administrator | ❌ No | ✅ Yes | ❌ No | Security team |

**When to use each scope**:

- **Management Group**: Apply permissions across multiple subscriptions (e.g., all production subscriptions)
- **Subscription**: Give someone access to everything in a subscription (e.g., subscription admin)
- **Resource Group**: Most common - give a team access to their project resources
- **Resource**: Rare - only when someone needs access to a single specific resource

**💡 Tips for Understanding**:

- Think of RBAC as "Who can do What on Where"
- Always use groups for role assignments, not individual users
- Start with built-in roles - only create custom roles when absolutely necessary
- Document your RBAC strategy - it gets complex quickly in large organizations

**⚠️ Common Mistakes & Misconceptions**:

- **Mistake 1**: Assigning the Owner role to everyone "just to be safe"
  - Why it's wrong: Violates the least privilege principle, creates security risks
  - Correct understanding: Use Contributor for most users, Owner only for admins who need to manage permissions
- **Mistake 2**: Assigning roles to individual users instead of groups
  - Why it's wrong: Becomes unmanageable as the organization grows, hard to audit
  - Correct understanding: Create groups based on job functions, assign roles to groups
- **Mistake 3**: Thinking the Reader role gives access to everything
  - Why it's wrong: Reader cannot see sensitive data like storage account keys
  - Correct understanding: Reader is view-only for resource properties, not data or secrets
- **Mistake 4**: Not understanding scope inheritance
  - Why it's wrong: Assigning permissions at subscription level gives access to ALL resource groups
  - Correct understanding: Permissions flow downward - be careful with high-level assignments

**🔗 Connections to Other Topics**:

- Relates to **Microsoft Entra ID** because: RBAC uses Entra ID identities (users, groups)
- Builds on **Azure Hierarchy** by: Using scopes (management groups, subscriptions, resource groups)
- Connected to **Azure Policy** because: Both are governance tools (RBAC for access, Policy for compliance)
- Used with **Managed Identities** (later topic) to: Give Azure resources permissions to access other resources

**Troubleshooting Common RBAC Issues**:

**Issue 1: "User cannot access resource even though they have Contributor role"**
- Check: Is the role assigned at the correct scope? Check parent scopes too.
- Check: Has the role assignment propagated? It can take up to 30 minutes.
- Check: Is there a resource lock preventing changes?

**Issue 2: "User can view resources but cannot see storage account keys"**
- Explanation: The Reader role doesn't include permission to list keys (a sensitive operation)
- Solution: Assign "Storage Account Contributor" or "Storage Account Key Operator Service Role"

**Issue 3: "User has Owner role but cannot assign roles to others"**
- Check: Is the Owner role assigned at the correct scope?
- Check: Does the user have `Microsoft.Authorization/roleAssignments/write` permission?
- Check: Is there a policy blocking role assignments?

## Section 3: Azure Policy

### Introduction

**The problem**: You have hundreds of developers creating resources in Azure. Without controls, they might create resources in the wrong regions (violating data residency requirements), use expensive VM sizes (blowing the budget), or forget to apply required tags (making cost tracking impossible). Manually reviewing every resource creation is not scalable.

**The solution**: Azure Policy allows you to create rules that automatically enforce organizational standards and assess compliance at scale. Policies can prevent non-compliant resources from being created (deny), automatically fix non-compliant resources (modify), or simply report on compliance (audit).

**Why it's tested**: Azure Policy is a core governance tool. The AZ-104 exam tests your ability to create policy definitions, assign policies at appropriate scopes, and understand policy effects.

### What is Azure Policy?

**What it is**: Azure Policy is a service in Azure that you use to create, assign, and manage policies. These policies enforce different rules and effects over your resources, ensuring those resources stay compliant with your corporate standards and service level agreements.

**Why it exists**: Organizations need to ensure consistency and compliance across their Azure environment. Without Azure Policy, you'd rely on manual processes, documentation, and hope that everyone follows the rules. Azure Policy automates compliance enforcement and provides visibility into compliance status.

**Real-world analogy**: Azure Policy is like building codes and inspections. Just as a city has building codes (policies) that all construction must follow, and inspectors (Azure Policy) check compliance and can stop non-compliant construction, Azure Policy enforces rules and checks compliance for Azure resources.

**How it works (Detailed step-by-step)**:

1. You create or use a policy definition that describes the compliance condition and the effect to take
2. You assign the policy to a scope (management group, subscription, or resource group)
3. Azure Policy evaluates resources within that scope against the policy definition
4. For new/updated resources, Azure Policy evaluates them during creation/update and applies the effect
5. For existing resources, Azure Policy scans them periodically (about every 24 hours) and marks compliance status
6. You view compliance results in the Azure Portal to see which resources are compliant or non-compliant

### Policy Components

#### 1. Policy Definition

**What it is**: A policy definition describes the compliance condition (what to check) and the effect (what to do if the condition is met). It's written in JSON format.

**Built-in Policy Definitions** (Azure provides 300+ built-in policies):

Common built-in policies:

- "Allowed locations": Restrict which Azure regions resources can be created in
- "Allowed virtual machine size SKUs": Restrict which VM sizes can be used
- "Require a tag on resources": Ensure all resources have a specific tag
- "Not allowed resource types": Prevent creation of specific resource types
- "Audit VMs that do not use managed disks": Find VMs using unmanaged disks

**Policy Definition Structure**:

```json
{
  "properties": {
    "displayName": "Allowed locations",
    "description": "This policy enables you to restrict the locations your organization can specify when deploying resources.",
    "mode": "Indexed",
    "parameters": {
      "listOfAllowedLocations": {
        "type": "Array",
        "metadata": {
          "description": "The list of locations that can be specified when deploying resources.",
          "displayName": "Allowed locations"
        }
      }
    },
    "policyRule": {
      "if": {
        "not": {
          "field": "location",
          "in": "[parameters('listOfAllowedLocations')]"
        }
      },
      "then": {
        "effect": "deny"
      }
    }
  }
}
```

**Key Elements**:

- `mode`: "Indexed" (for most resources) or "All" (includes resource groups)
- `parameters`: Variables that make the policy reusable
- `policyRule`: The logic (if condition, then effect)
- `if`: The condition to evaluate
- `then`: The effect to apply if the condition is true
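The if/then structure maps directly onto ordinary conditional logic, which makes policy rules easier to read than their JSON suggests. Here is a Python sketch of how the "Allowed locations" rule above evaluates a resource; it is a conceptual model of the `if`/`not`/`in` logic, not the policy engine itself, and the resource objects are illustrative:

```python
# Parameter value for listOfAllowedLocations
allowed_locations = ["eastus", "westus", "centralus"]

def evaluate(resource):
    """Mirror the policyRule: if location is NOT in the allowed list, the effect fires."""
    if resource["location"] not in allowed_locations:   # the "if" + "not" + "in" condition
        return "deny"                                   # the "then" effect
    return "allow"                                      # rule did not match: no effect

print(evaluate({"name": "vm-web-001", "location": "eastus"}))     # → allow
print(evaluate({"name": "vm-eu-001", "location": "westeurope"}))  # → deny
```

Reading rules this way also explains the double negative in the JSON: the condition describes the *non-compliant* case, and the effect is what happens when that case is matched.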

#### 2. Policy Effects

**What it is**: The effect determines what happens when a policy rule is matched. Different effects provide different levels of enforcement.

**Common Policy Effects**:

**Deny**:
- What it does: Prevents the resource from being created or updated
- When to use: When you want to block non-compliant resources completely
- Example: Deny creation of resources outside allowed regions
- Impact: Blocks deployment, user sees an error message

**Audit**:
- What it does: Creates a warning event in the activity log but allows the resource
- When to use: When you want visibility into non-compliance without blocking
- Example: Audit VMs without backup enabled
- Impact: No blocking, just logging for review

**AuditIfNotExists**:
- What it does: Audits if a related resource doesn't exist
- When to use: When compliance depends on a child or extension resource
- Example: Audit VMs that don't have the antimalware extension installed
- Impact: Marks the resource as non-compliant if the related resource is missing

**DeployIfNotExists**:
- What it does: Automatically deploys a related resource if it doesn't exist
- When to use: When you want to automatically remediate non-compliance
- Example: Automatically deploy the antimalware extension to VMs that don't have it
- Impact: Automatically creates missing resources

**Modify**:
- What it does: Adds, updates, or removes tags or properties on resources
- When to use: When you want to automatically fix resource properties
- Example: Automatically add required tags to resources
- Impact: Modifies resource properties during creation/update

**Append**:
- What it does: Adds additional fields to a resource during creation/update
- When to use: When you want to add properties that users might forget
- Example: Add specific network security rules to NSGs
- Impact: Adds properties to resources

**Disabled**:
- What it does: Turns off the policy without deleting the assignment
- When to use: Temporarily disable a policy for testing or troubleshooting
- Impact: Policy is not evaluated

**Effect Comparison**:

| Effect | Blocks Creation | Modifies Resource | Audits Only | Auto-Remediation |
|--------|-----------------|-------------------|-------------|------------------|
| Deny | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Audit | ❌ No | ❌ No | ✅ Yes | ❌ No |
| AuditIfNotExists | ❌ No | ❌ No | ✅ Yes | ❌ No |
| DeployIfNotExists | ❌ No | ✅ Yes | ❌ No | ✅ Yes |
| Modify | ❌ No | ✅ Yes | ❌ No | ✅ Yes |
| Append | ❌ No | ✅ Yes | ❌ No | ❌ No |

#### 3. Policy Assignment

**What it is**: A policy assignment is the act of applying a policy definition to a specific scope. The assignment determines where the policy is enforced.

**Assignment Scopes**:

- Management Group: Policy applies to all subscriptions in the management group
- Subscription: Policy applies to all resource groups and resources in the subscription
- Resource Group: Policy applies to all resources in the resource group

**Assignment Properties**:

- Scope: Where the policy applies
- Exclusions: Specific scopes to exclude from the policy
- Parameters: Values for policy parameters
- Enforcement Mode: Enabled (enforces policy) or Disabled (audit only)

Detailed Example 1: Restrict Resource Locations

Scenario: Your organization has data residency requirements - all resources must be in US regions only.

Solution:

  1. Use built-in policy: "Allowed locations"
  2. Assign at subscription scope
  3. Set parameters: ["eastus", "westus", "centralus"]
  4. Effect: Deny

Result:

  • Users can create resources in East US, West US, or Central US
  • Attempts to create resources in other regions (e.g., West Europe) are blocked
  • User sees error: "Resource creation failed. Policy 'Allowed locations' denied the request."

Detailed Example 2: Require Tags on Resources

Scenario: All resources must have "CostCenter" and "Owner" tags for cost tracking.

Solution:

  1. Use built-in policy: "Require a tag on resources"
  2. Create two policy assignments:
    • Assignment 1: Require "CostCenter" tag
    • Assignment 2: Require "Owner" tag
  3. Assign at subscription scope
  4. Effect: Deny

Result:

  • Users must specify CostCenter and Owner tags when creating resources
  • Resources without these tags are blocked
  • Existing resources without tags are marked as non-compliant

Detailed Example 3: Auto-Apply Tags

Scenario: Automatically add "Environment" tag to all resources in a resource group.

Solution:

  1. Use built-in policy: "Add a tag to resources"
  2. Assign at resource group scope (RG-WebApp-Dev)
  3. Set parameters:
    • Tag name: "Environment"
    • Tag value: "Development"
  4. Effect: Modify

Result:

  • All new resources in RG-WebApp-Dev automatically get Environment=Development tag
  • Users don't need to remember to add the tag
  • Existing resources can be remediated to add the tag
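Under the hood, a Modify policy like this adds the tag with an `addOrReplace` operation. A simplified sketch of the policy rule (the GUID is the well-known Contributor role ID, which the policy's managed identity needs in order to remediate resources):

```json
{
  "if": {
    "field": "[concat('tags[', parameters('tagName'), ']')]",
    "exists": "false"
  },
  "then": {
    "effect": "modify",
    "details": {
      "roleDefinitionIds": [
        "/providers/Microsoft.Authorization/roleDefinitions/b24988ac-6180-42a0-ab88-20f7382dd24c"
      ],
      "operations": [
        {
          "operation": "addOrReplace",
          "field": "[concat('tags[', parameters('tagName'), ']')]",
          "value": "[parameters('tagValue')]"
        }
      ]
    }
  }
}
```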

Detailed Example 4: Audit VM Backup Compliance

Scenario: All production VMs should have Azure Backup enabled, but you don't want to block VM creation.

Solution:

  1. Use built-in policy: "Azure Backup should be enabled for Virtual Machines"
  2. Assign at resource group scope (RG-Production)
  3. Effect: AuditIfNotExists

Result:

  • VMs can be created without backup (not blocked)
  • VMs without backup are marked as non-compliant
  • Compliance dashboard shows which VMs need backup configured
  • Operations team can review and remediate non-compliant VMs

4. Initiative Definitions (Policy Sets)

What it is: An initiative definition (also called a policy set) is a collection of policy definitions grouped together to achieve a larger compliance goal.

Why it exists: Instead of assigning 20 individual policies, you can group them into one initiative and assign the initiative. This simplifies management and ensures related policies are applied together.

Real-world analogy: An initiative is like a checklist. Instead of remembering 20 individual tasks, you have one checklist that contains all 20 tasks. Completing the checklist means completing all tasks.

Example Initiative: "CIS Microsoft Azure Foundations Benchmark"

  • Contains 100+ policy definitions
  • Implements security recommendations from CIS
  • Assign once to get all CIS policies

Custom Initiative Example:

{
  "properties": {
    "displayName": "Production Environment Standards",
    "description": "Policies required for all production resources",
    "policyDefinitions": [
      {
        "policyDefinitionId": "/providers/Microsoft.Authorization/policyDefinitions/allowed-locations",
        "parameters": {
          "listOfAllowedLocations": {
            "value": ["eastus", "westus"]
          }
        }
      },
      {
        "policyDefinitionId": "/providers/Microsoft.Authorization/policyDefinitions/require-tag",
        "parameters": {
          "tagName": {
            "value": "CostCenter"
          }
        }
      },
      {
        "policyDefinitionId": "/providers/Microsoft.Authorization/policyDefinitions/allowed-vm-sizes",
        "parameters": {
          "listOfAllowedSKUs": {
            "value": ["Standard_D2s_v3", "Standard_D4s_v3"]
          }
        }
      }
    ]
  }
}

This initiative groups three policies together. Assigning this initiative applies all three policies at once.
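An initiative like this can also be created and assigned from PowerShell. A sketch, assuming the `policyDefinitions` array above has been saved to a local file (the path is a placeholder):

```powershell
# Create the initiative from the policyDefinitions array saved to a file
New-AzPolicySetDefinition -Name 'prod-standards' `
    -DisplayName 'Production Environment Standards' `
    -PolicyDefinition (Get-Content -Path './prod-policies.json' -Raw)

# Assign the initiative just like a single policy
$initiative = Get-AzPolicySetDefinition -Name 'prod-standards'
New-AzPolicyAssignment -Name 'prod-standards-assignment' `
    -Scope "/subscriptions/$((Get-AzContext).Subscription.Id)" `
    -PolicySetDefinition $initiative
```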

Must Know (Critical Facts):

  • Policy vs Initiative: Policy is a single rule, Initiative is a group of policies
  • Evaluation timing: New/updated resources evaluated immediately, existing resources scanned every 24 hours
  • Deny vs Audit: Deny blocks non-compliant resources, Audit only reports them
  • Scope inheritance: Policies assigned at higher scopes apply to all child scopes
  • Exclusions: You can exclude specific scopes from a policy assignment
  • Remediation: DeployIfNotExists and Modify policies can automatically fix non-compliant resources
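Remediation of existing resources does not happen by itself — you start a remediation task explicitly. A sketch using the Az.PolicyInsights module (the assignment ID is a placeholder):

```powershell
# Create a remediation task for an existing policy assignment
# (only DeployIfNotExists and Modify assignments support remediation)
Start-AzPolicyRemediation -Name 'remediate-env-tag' `
    -PolicyAssignmentId '/subscriptions/<subscription-id>/providers/Microsoft.Authorization/policyAssignments/add-env-tag'

# Check remediation progress
Get-AzPolicyRemediation -Name 'remediate-env-tag'
```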

When to use each effect:

  • Deny: When non-compliance is unacceptable (security, compliance requirements)
  • Audit: When you want visibility before enforcing (testing, gradual rollout)
  • Modify: When you want to automatically fix simple issues (tags, properties)
  • DeployIfNotExists: When you want to automatically deploy missing resources (extensions, configurations)

💡 Tips for Understanding:

  • Start with Audit effect to understand impact before using Deny
  • Use initiatives to group related policies for easier management
  • Test policies in a dev subscription before applying to production
  • Use parameters to make policies reusable across different environments

⚠️ Common Mistakes & Misconceptions:

  • Mistake 1: Assigning Deny policies without testing first

    • Why it's wrong: Can break existing deployments and workflows
    • Correct understanding: Start with Audit to see impact, then switch to Deny after validation
  • Mistake 2: Thinking policies apply retroactively to existing resources

    • Why it's wrong: Deny and Modify only affect new/updated resources
    • Correct understanding: Existing resources are marked non-compliant but not automatically fixed (unless you run remediation)
  • Mistake 3: Creating too many individual policy assignments

    • Why it's wrong: Becomes unmanageable, hard to track what's assigned where
    • Correct understanding: Group related policies into initiatives for easier management

🔗 Connections to Other Topics:

  • Relates to RBAC because: Both are governance tools (RBAC for access, Policy for compliance)
  • Builds on Azure Hierarchy by: Using scopes (management groups, subscriptions, resource groups)
  • Connected to Tags (next section) because: Policies can enforce tagging standards
  • Used with Cost Management because: Policies can control costs by restricting expensive resources

Section 4: Tags and Resource Organization

Introduction

The problem: You have 500 resources across multiple subscriptions. Finance asks: "How much are we spending on the Marketing department?" Operations asks: "Which resources belong to Project Phoenix?" Security asks: "Who owns this VM?" Without proper organization, answering these questions requires manually reviewing each resource.

The solution: Tags are name-value pairs that you attach to resources for organization, cost tracking, and management. Tags allow you to categorize resources by any criteria that makes sense for your organization.

Why it's tested: Tags are fundamental to Azure resource management. The AZ-104 exam tests your ability to apply tags, use tags for cost tracking, and implement tagging strategies.

What are Tags?

What it is: A tag is a name-value pair (like "Environment: Production" or "CostCenter: IT-001") that you apply to Azure resources, resource groups, and subscriptions. Tags provide metadata that helps you organize and manage resources.

Why it exists: Azure resources don't have built-in organizational structure beyond resource groups. Tags provide flexible, custom categorization that can span multiple resource groups and subscriptions. They're essential for cost tracking, automation, and resource management.

Real-world analogy: Tags are like labels on file folders. You might label folders by project, department, or date. Similarly, you tag Azure resources by environment, cost center, owner, or any criteria you choose.

Tag Structure:

  • Tag name: The category (e.g., "Environment", "CostCenter", "Owner")
  • Tag value: The specific value (e.g., "Production", "IT-001", "john.doe@contoso.com")
  • Format: Case-insensitive for names, case-sensitive for values
  • Limits: Up to 50 tags per resource, 512 characters for name, 256 characters for value

Common Tagging Strategies:

1. Cost Tracking Tags:

  • CostCenter: IT-001, Marketing-002, Finance-003
  • Department: IT, Marketing, Finance, HR
  • Project: ProjectPhoenix, Migration2024
  • BudgetOwner: john.doe@contoso.com

2. Operational Tags:

  • Environment: Production, Staging, Development, Test
  • Tier: Frontend, Backend, Database, Network
  • MaintenanceWindow: Weekend, Weeknight, 24x7
  • BackupRequired: Yes, No

3. Organizational Tags:

  • Owner: john.doe@contoso.com
  • BusinessUnit: Retail, Enterprise, SMB
  • Application: ECommerce, CRM, ERP
  • Criticality: Mission-Critical, High, Medium, Low

4. Automation Tags:

  • AutoShutdown: Yes, No
  • AutoStart: Yes, No
  • PatchGroup: Group1, Group2, Group3
  • MonitoringEnabled: Yes, No

Detailed Example 1: Comprehensive Tagging Strategy

Contoso implements this tagging strategy for all resources:

Required Tags (enforced by Azure Policy):

  • Environment: Production | Staging | Development | Test
  • CostCenter: IT-001 | Marketing-002 | Finance-003 | etc.
  • Owner: email address of responsible person
  • Application: name of the application

Optional Tags:

  • Project: project name (if applicable)
  • MaintenanceWindow: when resource can be updated
  • DataClassification: Public | Internal | Confidential | Restricted

Example Resource Tags:

Resource: vm-web-prod-001
Tags:
  Environment: Production
  CostCenter: IT-001
  Owner: john.doe@contoso.com
  Application: ECommerce
  Tier: Frontend
  MaintenanceWindow: Weekend
  DataClassification: Internal
  AutoShutdown: No

Benefits:

  • Finance can filter costs by CostCenter to charge back departments
  • Operations can find all Production resources by filtering Environment tag
  • Security can identify resource owners using Owner tag
  • Automation can shut down non-production resources using AutoShutdown tag
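Tags like these can be applied from PowerShell with `Update-AzTag`. A sketch — `-Operation Merge` adds the new tags without removing any tags the resource already has:

```powershell
# Look up the VM and merge the required tags onto it
$vm = Get-AzResource -Name 'vm-web-prod-001' -ResourceType 'Microsoft.Compute/virtualMachines'

Update-AzTag -ResourceId $vm.ResourceId -Operation Merge -Tag @{
    Environment = 'Production'
    CostCenter  = 'IT-001'
    Owner       = 'john.doe@contoso.com'
    Application = 'ECommerce'
}
```

Using `-Operation Replace` instead would overwrite all existing tags, so Merge is usually the safer choice for incremental tagging.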

Detailed Example 2: Using Tags for Cost Allocation

Fabrikam wants to track Azure costs by department:

  1. Apply CostCenter tags to all resources:

    • IT resources: CostCenter=IT-001
    • Marketing resources: CostCenter=Marketing-002
    • Finance resources: CostCenter=Finance-003
  2. View costs by tag in Cost Management:

    • Navigate to Cost Management + Billing
    • Select "Cost analysis"
    • Group by: Tag → CostCenter
    • Result: See costs broken down by department
  3. Create budgets per department:

    • Create budget for IT-001: $10,000/month
    • Create budget for Marketing-002: $5,000/month
    • Set alerts at 80% and 100% of budget
  4. Generate cost reports:

    • Export monthly costs grouped by CostCenter
    • Send to department managers for review

Detailed Example 3: Automation with Tags

Northwind uses tags to automate resource management:

Auto-Shutdown Script:

# Requires the Az PowerShell module and an authenticated session (Connect-AzAccount)

# Find all VMs with AutoShutdown=Yes tag
$vms = Get-AzVM | Where-Object { $_.Tags['AutoShutdown'] -eq 'Yes' }

# Check current time
$currentHour = (Get-Date).Hour

# Shut down tagged VMs after business hours (6 PM)
if ($currentHour -ge 18) {
    foreach ($vm in $vms) {
        Write-Host "Shutting down $($vm.Name)..."
        Stop-AzVM -ResourceGroupName $vm.ResourceGroupName -Name $vm.Name -Force
    }
}

This script runs nightly and automatically shuts down non-production VMs to save costs.

Tag Inheritance

What it is: Tag inheritance means tags applied at a higher scope (subscription or resource group) can be automatically applied to resources within that scope.

Important: By default, tags do NOT inherit. Resources do not automatically get tags from their resource group or subscription.

How to enable inheritance: Use Azure Policy with the "Inherit a tag from the resource group" or "Inherit a tag from the subscription" built-in policies.

Example:

  1. Apply "Environment: Production" tag to resource group RG-WebApp-Prod
  2. Assign policy: "Inherit a tag from the resource group if missing"
  3. New resources in RG-WebApp-Prod automatically get "Environment: Production" tag
  4. Existing resources can be remediated to add the tag
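The inheritance policy is assigned like any other built-in. A sketch — the definition is found by display name, and because it uses the Modify effect, the assignment needs a managed identity (created here with `-IdentityType`) to perform remediation:

```powershell
# Find the built-in inherit-tag policy by display name
$definition = Get-AzPolicyDefinition -Builtin |
    Where-Object { $_.Properties.DisplayName -eq 'Inherit a tag from the resource group if missing' }

# Modify policies need a managed identity to remediate resources
New-AzPolicyAssignment -Name 'inherit-env-tag' `
    -Scope (Get-AzResourceGroup -Name 'RG-WebApp-Prod').ResourceId `
    -PolicyDefinition $definition `
    -PolicyParameterObject @{ tagName = 'Environment' } `
    -IdentityType 'SystemAssigned' `
    -Location 'eastus'
```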

Must Know (Critical Facts):

  • No automatic inheritance: Tags don't automatically flow from resource groups to resources
  • 50 tag limit: Each resource can have up to 50 tags
  • Case sensitivity: Tag names are case-insensitive, values are case-sensitive
  • Not all resources support tags: Some resource types don't support tagging
  • Subscription and RG tags: You can tag subscriptions and resource groups themselves
  • Cost tracking: Tags are the primary way to track costs across resource groups

When to use tags:

  • ✅ Cost allocation and chargeback
  • ✅ Resource organization across resource groups
  • ✅ Automation and operations management
  • ✅ Compliance and governance tracking
  • ❌ Don't use for sensitive data (tags are visible to all users with read access)

💡 Tips for Understanding:

  • Plan your tagging strategy before creating resources
  • Use Azure Policy to enforce required tags
  • Keep tag names consistent (use naming conventions)
  • Document your tagging strategy for the organization

⚠️ Common Mistakes & Misconceptions:

  • Mistake 1: Assuming tags automatically inherit from resource groups

    • Why it's wrong: Inheritance must be explicitly configured with Azure Policy
    • Correct understanding: Tags are independent unless you use policies to enforce inheritance
  • Mistake 2: Using inconsistent tag names (Environment vs Env vs environment)

    • Why it's wrong: Makes filtering and reporting difficult
    • Correct understanding: Standardize tag names across the organization
  • Mistake 3: Not enforcing required tags

    • Why it's wrong: Users forget to add tags, making cost tracking incomplete
    • Correct understanding: Use Azure Policy to require tags on all resources

Section 5: Cost Management

Introduction

The problem: Cloud costs can spiral out of control quickly. A developer creates a large VM for testing and forgets to delete it - $500/month wasted. Multiple teams create resources without tracking costs - the monthly bill is $50,000 but nobody knows where the money went.

The solution: Azure Cost Management provides tools to monitor, allocate, and optimize Azure spending. It helps you understand where money is going, set budgets, and receive alerts before overspending.

Why it's tested: Cost management is a critical responsibility for Azure administrators. The AZ-104 exam tests your ability to monitor costs, create budgets, and use cost management tools.

Azure Cost Management Tools

Cost Analysis

What it is: Cost Analysis is a tool in the Azure Portal that shows your Azure spending with various filters, groupings, and visualizations.

Key Features:

  • View costs by subscription, resource group, resource, service, location, or tags
  • Compare current month to previous months
  • Forecast future costs based on current usage
  • Export cost data to CSV or Excel
  • Create custom views and save them

Detailed Example 1: Analyzing Costs by Service

Scenario: You want to see which Azure services are costing the most.

Steps:

  1. Navigate to Cost Management + Billing → Cost analysis
  2. Select scope: Subscription
  3. Group by: Service name
  4. View: Column chart
  5. Result: See costs broken down by service (VMs, Storage, Networking, etc.)

Insights:

  • Virtual Machines: $15,000 (60% of costs)
  • Storage: $5,000 (20%)
  • Networking: $3,000 (12%)
  • Databases: $2,000 (8%)

Action: Focus optimization efforts on VMs (largest cost driver).

Detailed Example 2: Analyzing Costs by Tag

Scenario: You want to see costs by department for chargeback.

Steps:

  1. Navigate to Cost Management + Billing → Cost analysis
  2. Select scope: Subscription
  3. Group by: Tag → CostCenter
  4. View: Table
  5. Result: See costs by department

Results:

  • IT-001: $12,000
  • Marketing-002: $8,000
  • Finance-003: $5,000

Action: Send cost reports to department managers for review.

Budgets

What it is: Budgets allow you to set spending limits and receive alerts when costs approach or exceed those limits.

Budget Types:

  • Cost budget: Based on actual spending
  • Usage budget: Based on resource usage (e.g., GB of storage)

Alert Thresholds:

  • Set multiple thresholds (e.g., 50%, 80%, 100%, 120%)
  • Receive email notifications when thresholds are reached
  • Trigger action groups for automation (e.g., shut down resources)

Detailed Example 1: Creating a Monthly Budget

Scenario: Set a $10,000 monthly budget for the IT department.

Steps:

  1. Navigate to Cost Management + Billing → Budgets
  2. Click "Add"
  3. Configure:
    • Name: "IT Department Monthly Budget"
    • Scope: Subscription (filtered by CostCenter=IT-001 tag)
    • Amount: $10,000
    • Reset period: Monthly
    • Start date: First day of current month
  4. Set alert conditions:
    • 50% ($5,000): Email to IT manager
    • 80% ($8,000): Email to IT manager and director
    • 100% ($10,000): Email to IT manager, director, and CFO
    • 120% ($12,000): Email + trigger action group to shut down dev resources
  5. Click "Create"

Result:

  • Automatic email alerts as spending approaches limits
  • Proactive cost management before overspending
  • Visibility into spending trends
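A budget like this can also be created with the Az.Billing module. A sketch with a single 80% notification — parameter names may vary across module versions, and tag-based filtering (CostCenter=IT-001) is easier to configure in the portal, so this example sets a subscription-wide budget:

```powershell
# Create a $10,000 monthly budget with an 80% email alert
New-AzConsumptionBudget -Name 'IT-Department-Monthly-Budget' `
    -Amount 10000 `
    -Category Cost `
    -TimeGrain Monthly `
    -StartDate (Get-Date -Day 1).Date `
    -EndDate (Get-Date).Date.AddYears(1) `
    -ContactEmail 'it-manager@contoso.com' `
    -NotificationKey 'alert-80-percent' `
    -NotificationEnabled `
    -NotificationThreshold 80
```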

Detailed Example 2: Budget with Action Groups

Scenario: Automatically shut down development VMs when budget reaches 100%.

Steps:

  1. Create action group:
    • Name: "Shutdown-Dev-VMs"
    • Action: Run Azure Automation runbook
    • Runbook: Script to stop all VMs tagged with Environment=Development
  2. Create budget with 100% threshold
  3. Configure threshold to trigger action group
  4. Result: When budget hits $10,000, dev VMs automatically shut down

Azure Advisor Cost Recommendations

What it is: Azure Advisor analyzes your resource usage and provides personalized recommendations to reduce costs.

Common Recommendations:

  • Right-size VMs: Downsize underutilized VMs
  • Shut down idle VMs: Stop VMs with low CPU usage
  • Delete unattached disks: Remove disks not attached to any VM
  • Use reserved instances: Save up to 72% by committing to 1 or 3 years
  • Delete unused resources: Remove public IPs, NICs, NSGs not in use

Detailed Example: Following Advisor Recommendations

Advisor shows:

  1. VM vm-web-dev-001 is underutilized

    • Current: Standard_D4s_v3 (4 vCPUs, 16 GB RAM) - $140/month
    • Recommendation: Downsize to Standard_D2s_v3 (2 vCPUs, 8 GB RAM) - $70/month
    • Savings: $70/month ($840/year)
    • CPU usage: Average 5%, Max 15%
  2. 10 unattached disks found

    • Total cost: $50/month
    • Recommendation: Delete unused disks
    • Savings: $50/month ($600/year)
  3. 5 VMs eligible for reserved instances

    • Current pay-as-you-go: $500/month
    • 3-year reserved instance: $200/month
    • Savings: $300/month ($10,800 over 3 years)

Total potential savings: $420/month ($5,040/year)

Cost Optimization Best Practices

1. Right-Sizing:

  • Monitor VM CPU and memory usage
  • Downsize underutilized VMs
  • Use burstable B-series VMs for variable workloads

2. Auto-Shutdown:

  • Shut down dev/test VMs outside business hours
  • Use Azure Automation or DevTest Labs auto-shutdown
  • Savings: 50-70% for non-production VMs

3. Reserved Instances:

  • Commit to 1 or 3 years for predictable workloads
  • Save up to 72% vs pay-as-you-go
  • Best for: Production VMs, databases running 24/7

4. Spot VMs:

  • Use spare Azure capacity at up to 90% discount
  • Best for: Batch processing, dev/test, interruptible workloads
  • Risk: Can be evicted with 30-second notice

5. Storage Optimization:

  • Use appropriate storage tiers (Hot, Cool, Archive)
  • Enable lifecycle management to move old data to cheaper tiers
  • Delete old snapshots and backups

Must Know (Critical Facts):

  • Cost Analysis: View and analyze Azure spending
  • Budgets: Set spending limits and receive alerts
  • Azure Advisor: Get personalized cost optimization recommendations
  • Tags: Essential for cost allocation and chargeback
  • Reserved Instances: Save up to 72% for predictable workloads
  • Spot VMs: Save up to 90% for interruptible workloads

When to use each tool:

  • Cost Analysis: Daily/weekly cost monitoring and reporting
  • Budgets: Proactive cost control with alerts
  • Azure Advisor: Monthly review for optimization opportunities
  • Tags: Continuous cost allocation and tracking

💡 Tips for Understanding:

  • Review costs weekly to catch issues early
  • Set budgets with multiple alert thresholds
  • Act on Advisor recommendations monthly
  • Use tags consistently for accurate cost tracking

⚠️ Common Mistakes & Misconceptions:

  • Mistake 1: Only checking costs at month-end

    • Why it's wrong: By then it's too late to prevent overspending
    • Correct understanding: Monitor costs weekly or daily for proactive management
  • Mistake 2: Not setting budgets

    • Why it's wrong: No early warning system for overspending
    • Correct understanding: Set budgets for all subscriptions and major cost centers
  • Mistake 3: Ignoring Azure Advisor recommendations

    • Why it's wrong: Missing easy cost savings opportunities
    • Correct understanding: Review and act on Advisor recommendations monthly

Chapter Summary

What We Covered

In this chapter, you learned the fundamentals of Azure identity and governance:

Microsoft Entra ID:

  • Cloud-based identity and access management service
  • User types: Cloud-only, Synchronized, Guest (B2B)
  • Group types: Security groups, Microsoft 365 groups
  • Group membership: Assigned vs Dynamic

Role-Based Access Control (RBAC):

  • Security Principal (who): Users, groups, service principals, managed identities
  • Role Definition (what): Owner, Contributor, Reader, custom roles
  • Scope (where): Management group, subscription, resource group, resource
  • Inheritance: Permissions flow downward from parent to child scopes

Azure Policy:

  • Policy definitions: Rules for compliance
  • Policy effects: Deny, Audit, Modify, DeployIfNotExists, etc.
  • Policy assignments: Applying policies to scopes
  • Initiatives: Groups of related policies

Tags and Organization:

  • Name-value pairs for resource categorization
  • Common strategies: Cost tracking, operational, organizational, automation
  • No automatic inheritance (use policies to enforce)
  • Essential for cost allocation and management

Cost Management:

  • Cost Analysis: View and analyze spending
  • Budgets: Set limits and receive alerts
  • Azure Advisor: Get optimization recommendations
  • Best practices: Right-sizing, auto-shutdown, reserved instances

Critical Takeaways

  1. Microsoft Entra ID is the foundation: All Azure access starts with identity. Understanding users, groups, and authentication is essential.

  2. RBAC follows least privilege: Always assign the minimum permissions needed. Use groups for role assignments, not individual users.

  3. Azure Policy enforces compliance: Use Audit effect first to understand impact, then switch to Deny for enforcement. Group related policies into initiatives.

  4. Tags enable cost management: Without tags, cost tracking is nearly impossible. Enforce required tags with Azure Policy.

  5. Proactive cost management saves money: Monitor costs weekly, set budgets with alerts, and act on Azure Advisor recommendations monthly.

Self-Assessment Checklist

Test yourself before moving to the next chapter:

Microsoft Entra ID:

  • I can explain the difference between cloud-only, synchronized, and guest users
  • I can create users and groups in the Azure Portal
  • I understand when to use security groups vs Microsoft 365 groups
  • I can create dynamic groups with membership rules

RBAC:

  • I can explain the three components of RBAC (who, what, where)
  • I understand the difference between Owner, Contributor, and Reader roles
  • I can assign roles at different scopes
  • I understand how scope inheritance works

Azure Policy:

  • I can explain what Azure Policy does
  • I understand the difference between Deny, Audit, and Modify effects
  • I can assign a built-in policy to a subscription
  • I understand what initiatives are and when to use them

Tags and Cost Management:

  • I can apply tags to resources
  • I understand how to use tags for cost tracking
  • I can create a budget with alert thresholds
  • I can use Cost Analysis to view spending by service or tag

Practice Exercises

📝 Exercise 1: Create Users and Groups

  1. Create 3 cloud users in Microsoft Entra ID
  2. Create a security group called "IT-Team"
  3. Add the 3 users to the group
  4. Assign "Reader" role to the group at subscription scope
  5. Verify users can view resources but not modify them

📝 Exercise 2: Implement RBAC

  1. Create a resource group: RG-Exercise
  2. Assign "Contributor" role to yourself at RG-Exercise scope
  3. Create a VM in RG-Exercise (should succeed)
  4. Try to assign roles to others (should fail - Contributor can't manage permissions)
  5. Have someone with Owner role assign you "Owner" on RG-Exercise
  6. Now try to assign roles (should succeed)

📝 Exercise 3: Apply Azure Policy

  1. Assign built-in policy "Allowed locations" to a resource group
  2. Set allowed locations to your current region only
  3. Try to create a storage account in a different region (should be denied)
  4. Try to create a storage account in the allowed region (should succeed)
  5. Change policy effect to "Audit" and try again (should succeed but be marked non-compliant)

📝 Exercise 4: Implement Tagging Strategy

  1. Create a tagging strategy with 3 required tags: Environment, CostCenter, Owner
  2. Apply tags to 5 existing resources
  3. Create an Azure Policy to require these tags on new resources
  4. Try to create a resource without tags (should be denied)
  5. Create a resource with all required tags (should succeed)

📝 Exercise 5: Cost Management

  1. Navigate to Cost Analysis and view costs by service
  2. Create a budget for $100 with alerts at 50%, 80%, and 100%
  3. Review Azure Advisor cost recommendations
  4. Identify one resource you can optimize (right-size, delete, or shut down)

Quick Reference Card

Microsoft Entra ID:

  • Cloud user: Created in Entra ID, credentials in cloud
  • Synced user: From on-premises AD, synced to cloud
  • Guest user: External user from another organization
  • Security group: For permissions
  • M365 group: For collaboration

RBAC Roles:

  • Owner: Full access + manage permissions
  • Contributor: Full access, no permission management
  • Reader: View only
  • User Access Administrator: Manage permissions only

RBAC Scopes (high to low):

  1. Management Group
  2. Subscription
  3. Resource Group
  4. Resource

Policy Effects:

  • Deny: Block non-compliant resources
  • Audit: Report non-compliance only
  • Modify: Auto-fix properties
  • DeployIfNotExists: Auto-deploy missing resources

Cost Management:

  • Cost Analysis: View spending
  • Budgets: Set limits and alerts
  • Advisor: Get recommendations
  • Tags: Track costs by category

Common Exam Question Patterns

Pattern 1: RBAC Scope Selection

  • Question: "User needs to manage VMs in RG-WebApp but not other resource groups"
  • Answer: Assign Contributor role at RG-WebApp scope (not subscription)

Pattern 2: Policy Effect Selection

  • Question: "Prevent users from creating resources in certain regions"
  • Answer: Use "Allowed locations" policy with Deny effect

Pattern 3: Cost Tracking

  • Question: "Track costs by department for chargeback"
  • Answer: Apply CostCenter tags to resources, use Cost Analysis grouped by tag

Pattern 4: Group Membership

  • Question: "Automatically add users to group when they join Marketing department"
  • Answer: Create dynamic group with rule: user.department -eq "Marketing"

What's Next?

Now that you understand identity and governance, you're ready to learn about storage. The next chapter covers Domain 2: Implement and Manage Storage, where you'll learn about:

  • Storage account types and configuration
  • Blob storage and containers
  • Azure Files and file shares
  • Storage security (SAS tokens, access keys, firewalls)
  • Storage redundancy and replication
  • Data management tools

Recommended next step: Open file 03_domain_2_storage and begin Chapter 2.


Chapter 1 Complete!

You've mastered Azure identity and governance - the foundation of Azure administration. Take a break, review your notes, and when you're ready, move on to Chapter 2.


Chapter 2: Implement and Manage Storage (18% of exam)

File: 03_domain_2_storage

Chapter Overview

This domain represents 18% of the AZ-104 exam. It covers Azure Storage services, which provide durable, highly available, and massively scalable cloud storage for data objects, files, and disks.

What you'll learn:

  • Storage account types and configuration
  • Blob storage and containers
  • Azure Files and file shares
  • Storage security (SAS tokens, access keys, firewalls)
  • Storage redundancy and replication options
  • Data management tools (Storage Explorer, AzCopy)

Time to complete: 6-8 hours

Prerequisites: Chapter 0 (Fundamentals), Chapter 1 (Identities & Governance)


Section 1: Storage Account Fundamentals

Introduction

The problem: Applications need to store data - files, images, videos, backups, logs, and more. Traditional on-premises storage requires buying hardware, managing capacity, ensuring redundancy, and handling backups. Scaling up requires purchasing more hardware, which takes weeks or months.

The solution: Azure Storage provides cloud-based storage that scales instantly, offers multiple redundancy options, and charges only for what you use. You can store petabytes of data without managing any hardware.

Why it's tested: Storage is fundamental to almost every Azure solution. The AZ-104 exam tests your ability to create storage accounts, configure security, manage data, and choose appropriate redundancy options.

What is a Storage Account?

What it is: A storage account is a container that provides a unique namespace for your Azure Storage data. Every object you store in Azure Storage has an address that includes your unique account name. The combination of the account name and the Azure Storage service endpoint forms the endpoints for your storage account.

Why it exists: Storage accounts provide a management boundary for storage resources. All storage services (blobs, files, queues, tables) within a storage account share the same configuration, security settings, and billing. This simplifies management and cost tracking.

Real-world analogy: A storage account is like a warehouse. The warehouse (storage account) has a unique address, and inside you can store different types of items (blobs, files, queues, tables) in different sections. All items in the warehouse share the same security system and are billed to the same account.

Storage Account Naming Rules:

  • Must be globally unique across all of Azure
  • 3-24 characters long
  • Lowercase letters and numbers only
  • No hyphens, underscores, or special characters
  • Example: mystorageacct001 (valid), MyStorageAcct (invalid - uppercase), my-storage (invalid - hyphen)
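You can check a candidate name before trying to create the account. A sketch — the regex enforces the local rules (`-cmatch` is case-sensitive, so uppercase letters fail), and `Get-AzStorageAccountNameAvailability` confirms global uniqueness against Azure:

```powershell
$name = 'mystorageacct001'

# Local check: 3-24 lowercase letters and digits only
if ($name -cmatch '^[a-z0-9]{3,24}$') {
    # Remote check: the name must be unique across all of Azure
    Get-AzStorageAccountNameAvailability -Name $name
}
```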

Storage Account Endpoints:
Each storage account has unique endpoints for each service:

  • Blob storage: https://{account-name}.blob.core.windows.net
  • File storage: https://{account-name}.file.core.windows.net
  • Queue storage: https://{account-name}.queue.core.windows.net
  • Table storage: https://{account-name}.table.core.windows.net

Example for account "mystorageacct001":

  • Blob: https://mystorageacct001.blob.core.windows.net
  • File: https://mystorageacct001.file.core.windows.net
  • Queue: https://mystorageacct001.queue.core.windows.net
  • Table: https://mystorageacct001.table.core.windows.net

Storage Account Types

Azure offers several types of storage accounts, each optimized for different scenarios:

Standard General-Purpose v2 (StorageV2)

What it is: The most common and recommended storage account type. Supports all storage services (blobs, files, queues, tables) with standard performance.

Why it exists: Provides a balance of features, performance, and cost for most scenarios. It's the default choice unless you have specific requirements for premium performance.

Performance: Standard (HDD-based)
Supported services: Blobs, Files, Queues, Tables
Redundancy options: LRS, ZRS, GRS, RA-GRS, GZRS, RA-GZRS (all options)
Use cases: General-purpose storage, web applications, backups, archives

Detailed Example 1: Web Application Storage

Contoso runs an e-commerce website that needs to store:

  • Product images (blobs)
  • User-uploaded files (blobs)
  • Application logs (blobs)
  • Session data (tables)
  • Background job queue (queues)

Solution: Create one Standard General-Purpose v2 storage account

  • Account name: contosoecommerceprod
  • Performance: Standard
  • Redundancy: GRS (geo-redundant for disaster recovery)
  • Cost: ~$0.02/GB/month for hot tier storage

Benefits:

  • All storage needs in one account
  • Simplified management and billing
  • Geo-redundancy protects against regional disasters
  • Cost-effective for large amounts of data

Premium Block Blobs (BlockBlobStorage)

What it is: Premium storage account optimized for block blobs and append blobs with SSD-based storage for low latency and high transaction rates.

Why it exists: Some applications require consistently low latency (single-digit milliseconds) and high throughput. Standard storage uses HDDs which can't meet these requirements.

Performance: Premium (SSD-based)
Supported services: Block blobs, append blobs only
Redundancy options: LRS, ZRS only (no geo-redundancy)
Use cases: Interactive applications, IoT data, real-time analytics, media streaming

Detailed Example 2: IoT Data Ingestion

Fabrikam has 10,000 IoT devices sending telemetry data every second:

  • Data rate: 100 MB/second
  • Latency requirement: <10ms write latency
  • Transaction rate: 10,000 writes/second

Solution: Create Premium Block Blob storage account

  • Account name: fabrikamiotpremium
  • Performance: Premium
  • Redundancy: ZRS (zone-redundant for high availability)
  • Cost: ~$0.15/GB/month + transaction costs

Benefits:

  • Consistent low latency (<10ms)
  • High throughput for massive data ingestion
  • SSD performance for real-time processing
  • Zone redundancy for high availability

Premium File Shares (FileStorage)

What it is: Premium storage account optimized for Azure Files with SSD-based storage for enterprise file shares requiring high performance.

Why it exists: Enterprise applications often require file shares with low latency and high IOPS that standard file shares can't provide.

Performance: Premium (SSD-based)
Supported services: Azure Files only
Redundancy options: LRS, ZRS only
Use cases: Enterprise file shares, databases, high-performance applications

Detailed Example 3: SQL Server File Shares

Northwind runs SQL Server on Azure VMs and needs shared storage for database files:

  • IOPS requirement: 100,000 IOPS
  • Latency requirement: <1ms
  • Throughput: 10 GB/second
  • Size: 10 TB

Solution: Create Premium File Share storage account

  • Account name: northwindsqlpremium
  • Performance: Premium
  • Redundancy: ZRS
  • Provisioned size: 10 TB
  • Cost: ~$0.16/GB/month (billed on provisioned capacity, not used capacity)

Benefits:

  • Consistent sub-millisecond latency
  • High IOPS for database workloads
  • SMB 3.0 protocol support
  • Zone redundancy for high availability

Premium Page Blobs (StorageV2 with Premium)

What it is: Premium storage account optimized for page blobs, primarily used for Azure VM disks (managed disks).

Why it exists: Virtual machine disks require high IOPS and low latency. Premium page blobs provide SSD-based storage for VM disks.

Performance: Premium (SSD-based)
Supported services: Page blobs only
Redundancy options: LRS, ZRS only
Use cases: Azure VM disks, databases requiring high IOPS

Note: Most users should use Azure Managed Disks instead of page blobs directly. Managed Disks provide better management and features.

Storage Account Comparison

| Feature | Standard v2 | Premium Block Blobs | Premium Files | Premium Page Blobs |
|---|---|---|---|---|
| Performance | Standard (HDD) | Premium (SSD) | Premium (SSD) | Premium (SSD) |
| Latency | 10-20ms | <10ms | <1ms | <10ms |
| IOPS | Up to 20,000 | Up to 100,000 | Up to 100,000 | Up to 80,000 |
| Blobs | ✅ Yes | ✅ Yes (block/append) | ❌ No | ✅ Yes (page only) |
| Files | ✅ Yes | ❌ No | ✅ Yes | ❌ No |
| Queues | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Tables | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Geo-redundancy | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Cost | $ | $$$ | $$$ | $$$ |
| Use case | General purpose | High transactions | Enterprise files | VM disks |

Must Know (Critical Facts):

  • Standard General-Purpose v2: Default choice for most scenarios, supports all services
  • Premium accounts: Use SSDs, provide low latency, cost more, no geo-redundancy
  • Account names: Must be globally unique, 3-24 characters, lowercase letters and numbers only
  • Endpoints: Each service has a unique endpoint based on account name
  • One account type: Cannot change account type after creation (must create new account and migrate data)

When to use each type:

  • Standard v2: Web apps, backups, archives, general storage (90% of scenarios)
  • Premium Block Blobs: IoT data, real-time analytics, high transaction rates
  • Premium Files: Enterprise file shares, databases, high-performance apps
  • Premium Page Blobs: Rarely used directly (use Managed Disks instead)

Storage Redundancy Options

What it is: Storage redundancy determines how many copies of your data Azure maintains and where those copies are located. Different redundancy options provide different levels of durability and availability.

Why it exists: Hardware fails, data centers experience outages, and natural disasters happen. Redundancy ensures your data remains available and durable even when failures occur. Different applications have different requirements for availability and cost.

Real-world analogy: Redundancy is like having backup copies of important documents. You might keep one copy at home (LRS), copies in different rooms (ZRS), a copy in a safe deposit box in another city (GRS), or copies in multiple cities with access to all of them (RA-GZRS).

Locally Redundant Storage (LRS)

What it is: LRS replicates your data three times within a single data center in the primary region. All three copies are in the same physical location.

How it works:

  1. When you write data, Azure creates three copies
  2. All three copies are stored in the same data center
  3. Copies are on different storage nodes (different racks, different fault domains)
  4. Write operation succeeds only after all three copies are written
  5. If one copy fails, Azure automatically creates a new copy

Durability: 99.999999999% (11 nines) over a year
Availability: 99.9% (standard), 99.99% (premium)
Cost: Lowest cost option (~$0.02/GB/month)

Protects against:

  • ✅ Individual disk failures
  • ✅ Server rack failures
  • ✅ Individual server failures

Does NOT protect against:

  • ❌ Data center-wide failures (fire, flood, power outage)
  • ❌ Regional disasters

When to use:

  • ✅ Non-critical data that can be easily reconstructed
  • ✅ Data that changes frequently (logs, temporary files)
  • ✅ Cost is the primary concern
  • ✅ Data residency requirements restrict data to single location
  • ❌ Don't use for critical production data

Detailed Example 1: Application Logs

Scenario: Store application logs that are analyzed daily and deleted after 30 days.

Solution: LRS storage account

  • Data: Application logs (100 GB/day)
  • Retention: 30 days
  • Total storage: 3 TB
  • Cost: 3,000 GB × $0.02 = $60/month
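The arithmetic behind this example (steady-state volume of a rolling retention window times the per-GB price) can be reproduced with a small Python sketch; the function name and the $0.02/GB price are illustrative:

```python
def monthly_storage_cost(gb_per_day: float, retention_days: int,
                         price_per_gb: float) -> tuple:
    """Steady-state stored volume (GB) and monthly cost for a rolling
    retention window: each day's logs stay for `retention_days` days."""
    stored_gb = gb_per_day * retention_days
    return stored_gb, stored_gb * price_per_gb

# 100 GB/day kept 30 days at an illustrative LRS hot-tier price of $0.02/GB/month
stored, cost = monthly_storage_cost(100, 30, 0.02)
print(f"{stored:.0f} GB stored, ${cost:.2f}/month")  # 3000 GB stored, $60.00/month
```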

Reasoning:

  • Logs can be regenerated if lost (not critical)
  • Logs are temporary (30-day retention)
  • Cost savings significant (LRS at $60/month vs GRS at $120/month saves $60/month)
  • If data center fails, logs for that day are lost but not critical

Zone-Redundant Storage (ZRS)

What it is: ZRS replicates your data synchronously across three Azure availability zones in the primary region. Each availability zone is a separate physical location with independent power, cooling, and networking.

How it works:

  1. When you write data, Azure creates three copies
  2. Each copy is stored in a different availability zone
  3. Zones are physically separated (different buildings, different power grids)
  4. Write operation succeeds only after all three copies are written
  5. If one zone fails, data remains available from other zones

Durability: 99.9999999999% (12 nines) over a year
Availability: 99.99% (standard), 99.999% (premium)
Cost: Medium (~$0.025/GB/month, 25% more than LRS)

Protects against:

  • ✅ Individual disk/server/rack failures
  • ✅ Data center-wide failures
  • ✅ Availability zone failures

Does NOT protect against:

  • ❌ Regional disasters (entire region unavailable)

When to use:

  • ✅ Production data requiring high availability
  • ✅ Applications that need to stay online during data center failures
  • ✅ Compliance requirements for zone redundancy
  • ✅ Balance between cost and availability
  • ❌ Don't use if geo-redundancy is required

Detailed Example 2: E-Commerce Database

Scenario: E-commerce application database that must stay online 24/7.

Solution: ZRS storage account for database backups

  • Data: Database backups (500 GB)
  • Requirement: Must survive data center failures
  • Cost: 500 GB × $0.025 = $12.50/month

Reasoning:

  • Database is critical (must be available 24/7)
  • ZRS ensures availability during data center failures
  • If one availability zone fails, database remains accessible
  • Cost increase (25% over LRS) is acceptable for high availability

Geo-Redundant Storage (GRS)

What it is: GRS replicates your data to a secondary region hundreds of miles away from the primary region. Data is replicated three times in the primary region (LRS) and three times in the secondary region (LRS).

How it works:

  1. Data is written to primary region (3 copies with LRS)
  2. Data is asynchronously replicated to secondary region
  3. Secondary region stores 3 copies (LRS)
  4. Total: 6 copies (3 in primary, 3 in secondary)
  5. Secondary region is paired with primary (e.g., East US → West US)
  6. Data in secondary region is NOT accessible unless you initiate failover

Durability: 99.99999999999999% (16 nines) over a year
Availability: 99.9% (standard), 99.99% (premium)
Cost: Higher (~$0.04/GB/month, 2x LRS cost)

Protects against:

  • ✅ Individual disk/server/rack failures
  • ✅ Data center-wide failures
  • ✅ Regional disasters

Does NOT protect against:

  • ❌ Simultaneous failures in both regions (extremely rare)

When to use:

  • ✅ Critical production data
  • ✅ Disaster recovery requirements
  • ✅ Compliance requirements for geo-redundancy
  • ✅ Data must survive regional disasters
  • ❌ Don't use if you need read access to secondary region (use RA-GRS instead)

Detailed Example 3: Financial Records

Scenario: Store financial records that must be retained for 7 years and survive any disaster.

Solution: GRS storage account

  • Data: Financial records (10 TB)
  • Retention: 7 years
  • Requirement: Must survive regional disasters
  • Cost: 10,000 GB × $0.04 = $400/month

Reasoning:

  • Financial records are critical and irreplaceable
  • Regulatory requirements mandate disaster recovery
  • GRS ensures data survives even if entire region fails
  • Cost is justified by criticality of data

Read-Access Geo-Redundant Storage (RA-GRS)

What it is: Same as GRS, but with read access to the secondary region. You can read data from the secondary region at any time, even when the primary region is available.

How it works:

  1. Same replication as GRS (6 copies total)
  2. Secondary region endpoint is accessible for reads
  3. Primary endpoint: https://{account}.blob.core.windows.net
  4. Secondary endpoint: https://{account}-secondary.blob.core.windows.net
  5. Applications can read from secondary for load distribution or disaster recovery testing
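The secondary read endpoint is derived from the account name by appending a "-secondary" suffix. A minimal Python sketch (the helper name is illustrative) shows the pattern:

```python
def blob_endpoints(account: str, read_access_geo: bool) -> dict:
    """Primary blob endpoint, plus the secondary read endpoint that
    RA-GRS/RA-GZRS expose (account name with a '-secondary' suffix)."""
    endpoints = {"primary": f"https://{account}.blob.core.windows.net"}
    if read_access_geo:
        endpoints["secondary"] = f"https://{account}-secondary.blob.core.windows.net"
    return endpoints

print(blob_endpoints("mystorageacct001", read_access_geo=True))
```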

Durability: 99.99999999999999% (16 nines) over a year
Availability: 99.99% (standard), 99.999% (premium)
Cost: Slightly higher than GRS (read access to the secondary carries a small price premium)

When to use:

  • ✅ Applications that can tolerate slightly stale data (replication lag)
  • ✅ Read-heavy workloads that need geographic distribution
  • ✅ Disaster recovery testing without failover
  • ✅ Load balancing across regions

Detailed Example 4: Global Content Delivery

Scenario: Serve product images to users worldwide with low latency.

Solution: RA-GRS storage account

  • Data: Product images (5 TB)
  • Users: Global (US, Europe, Asia)
  • Requirement: Low latency for all users

Architecture:

  • Primary region: East US (serves US users)
  • Secondary region: West Europe (serves European users)
  • Application reads from nearest region
  • If primary fails, all traffic goes to secondary

Benefits:

  • Reduced latency for European users (read from West Europe)
  • Disaster recovery (automatic failover to secondary)
  • Load distribution across regions

Geo-Zone-Redundant Storage (GZRS)

What it is: Combines ZRS in the primary region with GRS to the secondary region. Data is replicated across three availability zones in the primary region and three times (LRS) in the secondary region.

How it works:

  1. Data written to primary region (3 copies across 3 zones)
  2. Data asynchronously replicated to secondary region (3 copies with LRS)
  3. Total: 6 copies (3 in primary zones, 3 in secondary region)
  4. Highest level of redundancy and availability

Durability: 99.99999999999999% (16 nines) over a year
Availability: 99.99% (standard), 99.999% (premium)
Cost: Highest (~$0.05/GB/month, 2.5x LRS cost)

Protects against:

  • ✅ Individual disk/server/rack failures
  • ✅ Data center-wide failures
  • ✅ Availability zone failures
  • ✅ Regional disasters

When to use:

  • ✅ Mission-critical data requiring maximum availability and durability
  • ✅ Applications that cannot tolerate any downtime
  • ✅ Compliance requirements for both zone and geo redundancy
  • ❌ Don't use if cost is a concern (most expensive option)

Detailed Example 5: Healthcare Patient Records

Scenario: Store electronic health records (EHR) that must be available 24/7 with maximum durability.

Solution: GZRS storage account

  • Data: Patient records (50 TB)
  • Requirement: 99.999% availability, maximum durability
  • Compliance: HIPAA, must survive any disaster
  • Cost: 50,000 GB × $0.05 = $2,500/month

Reasoning:

  • Patient records are mission-critical (lives depend on access)
  • GZRS provides highest availability (survives zone and region failures)
  • Compliance requirements mandate maximum protection
  • Cost is justified by criticality and regulatory requirements

Read-Access Geo-Zone-Redundant Storage (RA-GZRS)

What it is: Same as GZRS, but with read access to the secondary region.

When to use: Same as RA-GRS, but when you also need zone redundancy in the primary region.

Redundancy Comparison

| Redundancy | Copies | Locations | Durability (9s) | Availability | Cost | Use Case |
|---|---|---|---|---|---|---|
| LRS | 3 | 1 data center | 11 | 99.9% | $ | Non-critical, temporary |
| ZRS | 3 | 3 zones | 12 | 99.99% | $$ | Production, high availability |
| GRS | 6 | 2 regions | 16 | 99.9% | $$$ | Critical, disaster recovery |
| RA-GRS | 6 | 2 regions | 16 | 99.99% | $$$ | Critical + read distribution |
| GZRS | 6 | 2 regions (3 zones in primary) | 16 | 99.99% | $$$$ | Mission-critical |
| RA-GZRS | 6 | 2 regions (3 zones in primary) | 16 | 99.99% | $$$$ | Mission-critical + reads |

Must Know (Critical Facts):

  • LRS: Cheapest, 3 copies in one data center, doesn't protect against data center failures
  • ZRS: 3 copies across 3 availability zones, protects against data center failures
  • GRS: 6 copies in 2 regions, protects against regional disasters, no read access to secondary
  • RA-GRS: Same as GRS but with read access to secondary region
  • GZRS: Combines ZRS + GRS, highest availability and durability
  • RA-GZRS: GZRS with read access to secondary
  • Replication lag: GRS/GZRS replication to secondary is asynchronous (typically <15 minutes)
  • Failover: Only GRS/RA-GRS/GZRS/RA-GZRS support failover to secondary region

Decision Framework:

Start: What are your requirements?

Can you afford to lose data if data center fails?
├─ Yes → LRS (cheapest)
└─ No → Continue

Need protection against regional disasters?
├─ No → ZRS (zone redundancy only)
└─ Yes → Continue

Need read access to secondary region?
├─ No → GRS or GZRS
└─ Yes → RA-GRS or RA-GZRS

Need zone redundancy in primary region?
├─ No → GRS or RA-GRS
└─ Yes → GZRS or RA-GZRS (most expensive, highest availability)
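The decision framework above maps directly to a few conditionals. This Python sketch mirrors the tree (the function and parameter names are illustrative, not from any SDK):

```python
def choose_redundancy(tolerate_datacenter_loss: bool,
                      need_regional_protection: bool,
                      need_secondary_reads: bool,
                      need_zonal_primary: bool) -> str:
    """Walk the redundancy decision tree from the study guide."""
    if tolerate_datacenter_loss:
        return "LRS"                 # cheapest: data loss on DC failure acceptable
    if not need_regional_protection:
        return "ZRS"                 # zone redundancy only
    base = "GZRS" if need_zonal_primary else "GRS"
    return f"RA-{base}" if need_secondary_reads else base

print(choose_redundancy(True, False, False, False))   # LRS
print(choose_redundancy(False, True, True, False))    # RA-GRS
```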

💡 Tips for Understanding:

  • Think of redundancy as insurance - more coverage costs more
  • LRS = basic insurance, GZRS = comprehensive insurance
  • Use LRS for temporary/non-critical data to save costs
  • Use GRS/GZRS for production/critical data
  • RA-GRS/RA-GZRS add read access to the secondary region for a small storage price premium

⚠️ Common Mistakes & Misconceptions:

  • Mistake 1: Using LRS for critical production data

    • Why it's wrong: Data center failure means data loss
    • Correct understanding: Use at least ZRS for production, GRS for critical data
  • Mistake 2: Thinking GRS provides instant failover

    • Why it's wrong: Failover is manual and takes time (hours)
    • Correct understanding: GRS is for disaster recovery, not high availability
  • Mistake 3: Not understanding replication lag

    • Why it's wrong: Secondary region data may be 15 minutes behind primary
    • Correct understanding: GRS/GZRS use asynchronous replication, some data loss possible during failover

Section 2: Blob Storage

Introduction

The problem: Applications need to store unstructured data like images, videos, documents, backups, and logs. Traditional file systems don't scale well to petabytes of data, and managing storage infrastructure is complex and expensive.

The solution: Azure Blob Storage is a massively scalable object storage service for unstructured data. It can store any type of text or binary data and scale to petabytes without managing any infrastructure.

Why it's tested: Blob storage is one of the most commonly used Azure services. The AZ-104 exam tests your ability to create containers, upload blobs, configure access tiers, and manage blob lifecycle.

What is Blob Storage?

What it is: Blob (Binary Large Object) Storage is designed for storing massive amounts of unstructured data. Each blob is stored in a container, and containers are stored in storage accounts.

Why it exists: Traditional file systems have limitations on file size, number of files, and scalability. Blob storage removes these limitations and provides HTTP/HTTPS access to data from anywhere in the world.

Real-world analogy: Blob storage is like a massive digital warehouse with unlimited shelves. You can store any type of item (file), organize items in boxes (containers), and access items from anywhere using their address (URL).

Blob Storage Hierarchy:

Storage Account (mystorageacct001)
├── Container (images)
│   ├── Blob (product1.jpg)
│   ├── Blob (product2.jpg)
│   └── Blob (logo.png)
└── Container (documents)
    ├── Blob (report.pdf)
    └── Blob (invoice.docx)

Blob Types

Block Blobs

What it is: Block blobs are optimized for uploading large amounts of data efficiently. Data is uploaded in blocks, and blocks are assembled into a blob.

How it works:

  1. Large files are divided into blocks (up to 50,000 blocks per blob)
  2. Each block can be up to 4,000 MB
  3. Blocks are uploaded in parallel for faster uploads
  4. After all blocks are uploaded, they're committed to create the blob
  5. Maximum blob size: ~190.7 TiB (50,000 blocks × 4,000 MiB)

Use cases:

  • ✅ Storing files (documents, images, videos)
  • ✅ Streaming media
  • ✅ Backup and archive
  • ✅ Data for analysis
  • ✅ Most common blob type (90% of scenarios)

Detailed Example 1: Video Upload

Scenario: Upload a 10 GB video file to blob storage.

Process:

  1. Divide video into 100 MB blocks (100 blocks total)
  2. Upload blocks in parallel (10 blocks at a time)
  3. Each block gets a unique ID
  4. After all blocks uploaded, commit the blob
  5. Total time: ~5 minutes (vs 30 minutes for sequential upload)

Benefits:

  • Parallel upload is much faster
  • Can resume failed uploads (only re-upload failed blocks)
  • Can upload blocks in any order
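The upload planning above is simple division against the block blob limits. A Python sketch (the function name is illustrative; real uploads would go through an Azure SDK or azcopy):

```python
import math

MAX_BLOCKS_PER_BLOB = 50_000   # blocks per block blob
MAX_BLOCK_SIZE_MB = 4_000      # per-block size limit

def plan_block_upload(file_size_mb: int, block_size_mb: int = 100) -> int:
    """Number of blocks needed to upload a file of the given size,
    checked against the block blob limits."""
    if block_size_mb > MAX_BLOCK_SIZE_MB:
        raise ValueError("block size exceeds the 4,000 MB per-block limit")
    blocks = math.ceil(file_size_mb / block_size_mb)
    if blocks > MAX_BLOCKS_PER_BLOB:
        raise ValueError("too many blocks: use a larger block size")
    return blocks

print(plan_block_upload(10_000))  # 10 GB video in 100 MB blocks -> 100 blocks
```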

Append Blobs

What it is: Append blobs are optimized for append operations. You can only add data to the end of an append blob, not modify existing data.

How it works:

  1. Create an append blob
  2. Append data to the end (append operations only)
  3. Cannot modify or delete existing data
  4. Maximum blob size: 195 GB
  5. Maximum append size: 4 MB per append operation
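Because each append operation carries at most 4 MB, larger payloads must be split into multiple appends. A minimal Python sketch of that bookkeeping (illustrative only; a real client would call Append Block through an Azure SDK):

```python
import math

MAX_APPEND_MB = 4  # maximum data per append operation

def appends_needed(payload_mb: float) -> int:
    """How many append operations a payload of the given size requires,
    given the 4 MB per-append limit."""
    return max(1, math.ceil(payload_mb / MAX_APPEND_MB))

print(appends_needed(10))   # 10 MB payload -> 3 append operations
print(appends_needed(0.5))  # small log line -> 1 operation
```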

Use cases:

  • ✅ Log files
  • ✅ Audit trails
  • ✅ Streaming data
  • ✅ Any append-only scenario

Detailed Example 2: Application Logging

Scenario: Application writes logs continuously throughout the day.

Process:

  1. Create append blob: logs/app-2025-10-12.log
  2. Application appends log entries every second
  3. Each append adds new log line to end of blob
  4. At end of day, blob contains complete day's logs
  5. Next day, create new append blob for new day

Benefits:

  • Efficient for continuous appending
  • No need to download, modify, and re-upload entire file
  • Optimized for write-heavy workloads

Page Blobs

What it is: Page blobs are optimized for random read/write operations. They're divided into 512-byte pages and are primarily used for Azure VM disks.

How it works:

  1. Blob is divided into 512-byte pages
  2. Can read/write individual pages
  3. Maximum blob size: 8 TB
  4. Optimized for random access patterns

Use cases:

  • ✅ Azure VM disks (managed disks use page blobs internally)
  • ✅ Databases requiring random access
  • ✅ Rarely used directly (use managed disks instead)

Note: Most users should use Azure Managed Disks for VM disks rather than page blobs directly.

Blob Access Tiers

What it is: Access tiers allow you to optimize storage costs based on how frequently you access data. Different tiers have different storage costs and access costs.

Hot Tier

What it is: Optimized for data that is accessed frequently.

Characteristics:

  • Highest storage cost (~$0.02/GB/month)
  • Lowest access cost
  • Immediate access (no retrieval delay)
  • Default tier for new blobs

Use cases:

  • ✅ Active data accessed daily
  • ✅ Website images and content
  • ✅ Application data
  • ✅ Data being actively processed

Detailed Example 1: E-Commerce Product Images

Scenario: Store product images for an e-commerce website.

Data characteristics:

  • Size: 500 GB
  • Access: Thousands of views per day
  • Requirement: Immediate access

Solution: Hot tier

  • Storage cost: 500 GB × $0.02 = $10/month
  • Access cost: Minimal (included in storage cost)
  • Total: ~$10/month

Reasoning: Images are accessed constantly, hot tier provides best performance and lowest total cost.

Cool Tier

What it is: Optimized for data that is infrequently accessed and stored for at least 30 days.

Characteristics:

  • Lower storage cost (~$0.01/GB/month, 50% less than hot)
  • Higher access cost
  • Immediate access (no retrieval delay)
  • Minimum storage duration: 30 days (early deletion fee applies)

Use cases:

  • ✅ Short-term backup
  • ✅ Older data accessed occasionally
  • ✅ Data stored for 30-90 days
  • ✅ Disaster recovery data

Detailed Example 2: Monthly Backups

Scenario: Store monthly database backups that are rarely accessed.

Data characteristics:

  • Size: 1 TB per month
  • Access: Only if restore needed (rare)
  • Retention: 90 days

Solution: Cool tier

  • Storage cost: 1,000 GB × $0.01 = $10/month
  • Access cost: Only if accessed (rare)
  • Total: ~$10/month (vs $20/month in hot tier)

Savings: 50% cost reduction compared to hot tier.

Cold Tier

What it is: Optimized for data that is rarely accessed and stored for at least 90 days.

Characteristics:

  • Even lower storage cost (~$0.004/GB/month, 80% less than hot)
  • Higher access cost than cool
  • Immediate access (no retrieval delay)
  • Minimum storage duration: 90 days

Use cases:

  • ✅ Long-term backup
  • ✅ Compliance data
  • ✅ Data stored for 90-180 days
  • ✅ Rarely accessed archives

Archive Tier

What it is: Optimized for data that is rarely accessed and stored for at least 180 days. Data must be rehydrated before access.

Characteristics:

  • Lowest storage cost (~$0.002/GB/month, 90% less than hot)
  • Highest access cost
  • Requires rehydration before access (hours to retrieve)
  • Minimum storage duration: 180 days
  • Offline storage (not immediately accessible)

Rehydration:

  • Standard priority: Up to 15 hours
  • High priority: Less than 1 hour (costs more)
  • Must rehydrate to an online tier (hot, cool, or cold) before accessing

Use cases:

  • ✅ Long-term archives (7+ years)
  • ✅ Compliance data rarely accessed
  • ✅ Historical data for regulatory requirements
  • ✅ Data that can tolerate hours of retrieval time

Detailed Example 3: Financial Records Archive

Scenario: Store financial records for 7 years (regulatory requirement).

Data characteristics:

  • Size: 10 TB
  • Access: Almost never (only for audits)
  • Retention: 7 years
  • Retrieval time: Can wait hours if needed

Solution: Archive tier

  • Storage cost: 10,000 GB × $0.002 = $20/month
  • Total over 7 years: $20 × 84 months = $1,680

Comparison:

  • Hot tier: $16,800 over 7 years ($200/month)
  • Cool tier: $8,400 over 7 years ($100/month)
  • Archive tier: $1,680 over 7 years ($20/month)

Savings: $15,120 (90% cost reduction) compared to hot tier.
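These totals follow directly from the per-GB prices quoted earlier in this chapter (size × price × months). A Python sketch of the calculation, using those illustrative prices:

```python
# Illustrative per-GB monthly prices from the examples in this chapter
PRICE_PER_GB = {"hot": 0.02, "cool": 0.01, "cold": 0.004, "archive": 0.002}

def retention_cost(size_gb: int, months: int) -> dict:
    """Storage-only cost of keeping data in each tier for the whole
    retention period (access and rehydration charges excluded)."""
    return {tier: round(size_gb * price * months)
            for tier, price in PRICE_PER_GB.items()}

print(retention_cost(10_000, 84))  # 10 TB kept for 7 years (84 months)
```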

Access Tier Comparison

| Tier | Storage Cost | Access Cost | Retrieval Time | Min Duration | Use Case |
|---|---|---|---|---|---|
| Hot | $$$ | $ | Immediate | None | Active data |
| Cool | $$ | $$ | Immediate | 30 days | Infrequent access |
| Cold | $ | $$$ | Immediate | 90 days | Rare access |
| Archive | $ | $$$$ | Hours | 180 days | Long-term archive |

Blob Lifecycle Management

What it is: Lifecycle management automatically transitions blobs between access tiers or deletes them based on rules you define.

Why it exists: Manually moving blobs between tiers is time-consuming and error-prone. Lifecycle management automates this based on age or last access time.

How it works:

  1. Define rules based on conditions (age, last access time)
  2. Specify actions (move to cool/cold/archive, delete)
  3. Azure automatically applies rules daily
  4. Reduces costs by moving old data to cheaper tiers

Detailed Example 1: Automated Backup Lifecycle

Scenario: Manage backup retention automatically.

Requirements:

  • Keep backups in hot tier for 7 days (quick restore)
  • Move to cool tier after 7 days (occasional restore)
  • Move to archive tier after 90 days (compliance)
  • Delete after 7 years (end of retention)

Lifecycle Policy:

{
  "rules": [
    {
      "name": "move-to-cool",
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["backups/"]
        },
        "actions": {
          "baseBlob": {
            "tierToCool": {
              "daysAfterModificationGreaterThan": 7
            }
          }
        }
      }
    },
    {
      "name": "move-to-archive",
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["backups/"]
        },
        "actions": {
          "baseBlob": {
            "tierToArchive": {
              "daysAfterModificationGreaterThan": 90
            }
          }
        }
      }
    },
    {
      "name": "delete-old-backups",
      "type": "Lifecycle",
      "definition": {
        "filters": {
          "blobTypes": ["blockBlob"],
          "prefixMatch": ["backups/"]
        },
        "actions": {
          "baseBlob": {
            "delete": {
              "daysAfterModificationGreaterThan": 2555
            }
          }
        }
      }
    }
  ]
}
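To see how the policy above plays out, this Python sketch (illustrative; the real evaluation is done by Azure, roughly daily) maps a blob's age to the state the three rules would leave it in, using the same thresholds as the JSON:

```python
def lifecycle_state(days_since_modified: int) -> str:
    """State of a 'backups/' blob under the policy above; thresholds
    mirror the JSON rules (7, 90, and 2555 days)."""
    if days_since_modified > 2555:   # ~7 years: delete-old-backups rule
        return "deleted"
    if days_since_modified > 90:     # move-to-archive rule
        return "archive"
    if days_since_modified > 7:      # move-to-cool rule
        return "cool"
    return "hot"                     # default tier for fresh backups

for age in (1, 30, 180, 3000):
    print(age, "->", lifecycle_state(age))
```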

Benefits:

  • Automatic cost optimization
  • No manual intervention needed
  • Consistent policy enforcement
  • Significant cost savings over time

Must Know (Critical Facts):

  • Block blobs: Most common, for files and documents, up to 190.7 TB
  • Append blobs: For logs and append-only data, up to 195 GB
  • Page blobs: For VM disks, up to 8 TB (use managed disks instead)
  • Hot tier: Frequent access, highest storage cost, lowest access cost
  • Cool tier: Infrequent access, 30-day minimum, 50% cheaper storage
  • Archive tier: Rare access, 180-day minimum, 90% cheaper, requires rehydration
  • Lifecycle management: Automates tier transitions and deletions
  • Early deletion fees: Deleting blobs before minimum duration incurs fees

Decision Framework for Access Tiers:

How often is data accessed?
├─ Daily/Weekly → Hot tier
├─ Monthly → Cool tier
├─ Quarterly → Cold tier
└─ Rarely/Never → Archive tier

How long will data be stored?
├─ <30 days → Hot tier only
├─ 30-90 days → Hot or Cool
├─ 90-180 days → Hot, Cool, or Cold
└─ >180 days → Any tier including Archive

Can you wait hours for data retrieval?
├─ No → Hot, Cool, or Cold
└─ Yes → Archive tier acceptable
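The three questions above combine into one decision function. This Python sketch mirrors the framework (the access-frequency categories and function name are illustrative):

```python
def choose_access_tier(access_frequency: str, retention_days: int,
                       can_wait_hours: bool) -> str:
    """Map the tier decision framework to code. `access_frequency` is one
    of 'daily', 'weekly', 'monthly', 'quarterly', 'rarely'."""
    if retention_days < 30 or access_frequency in ("daily", "weekly"):
        return "hot"       # short retention or frequent access
    if access_frequency == "monthly":
        return "cool"
    if access_frequency == "quarterly" or retention_days < 180:
        return "cold" if retention_days >= 90 else "cool"
    return "archive" if can_wait_hours else "cold"

print(choose_access_tier("daily", 365, False))    # hot
print(choose_access_tier("rarely", 2555, True))   # archive
```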

💡 Tips for Understanding:

  • Use hot tier for active data, archive tier for long-term storage
  • Lifecycle management saves money by automatically moving old data to cheaper tiers
  • Archive tier requires rehydration (hours) before access
  • Early deletion fees apply if you delete before minimum duration

⚠️ Common Mistakes & Misconceptions:

  • Mistake 1: Using hot tier for all data

    • Why it's wrong: Wastes money on infrequently accessed data
    • Correct understanding: Use appropriate tier based on access frequency
  • Mistake 2: Not understanding archive tier rehydration

    • Why it's wrong: Expecting immediate access to archived data
    • Correct understanding: Archive tier requires hours to rehydrate before access
  • Mistake 3: Deleting cool/cold/archive blobs before minimum duration

    • Why it's wrong: Incurs early deletion fees
    • Correct understanding: Plan retention periods to avoid early deletion fees

🔗 Connections to Other Topics:

  • Relates to Cost Management because: Tier selection directly impacts storage costs
  • Builds on Storage Accounts by: Providing different performance/cost options for blob data
  • Often used with Lifecycle Management to: Automatically transition blobs between tiers

Section 3: Storage Redundancy and Data Protection

Introduction

The problem: Hardware failures, data center outages, and regional disasters can cause data loss and service interruptions.
The solution: Azure Storage provides multiple redundancy options that replicate your data to protect against failures at different scales.
Why it's tested: Understanding redundancy options is critical for designing resilient storage solutions and is heavily tested on AZ-104.

Core Concepts

Storage Redundancy Options Overview

What it is: Azure Storage redundancy determines how many copies of your data are maintained and where those copies are stored to protect against failures.

Why it exists: Different applications have different durability and availability requirements. A development environment might tolerate some data loss, while a production financial system cannot. Azure provides multiple redundancy options so you can choose the right balance of cost, durability, and availability for your needs.

Real-world analogy: Think of redundancy like backup strategies for important documents. You might keep one copy in your desk (LRS), copies in different filing cabinets in the same office (ZRS), or copies in a completely different office building across town (GRS). Each approach protects against different types of disasters.

How it works (Detailed step-by-step):

  1. When you create a storage account, you select a redundancy option that determines the replication strategy
  2. Azure automatically replicates your data according to the selected option without any additional configuration
  3. For local redundancy (LRS), Azure maintains 3 copies within a single data center in a single region
  4. For zone redundancy (ZRS), Azure maintains 3 copies across 3 separate availability zones in the primary region
  5. For geo-redundancy (GRS/GZRS), Azure first replicates in the primary region (using LRS or ZRS), then asynchronously replicates to a paired secondary region hundreds of miles away
  6. Write operations complete only after data is written to all required replicas in the primary region
  7. Read operations can access data from the primary region (or secondary region if using RA-GRS/RA-GZRS)

📊 Storage Redundancy Options Diagram:

graph TB
    subgraph "Primary Region: East US"
        subgraph "LRS - Single Zone"
            LRS1[Copy 1]
            LRS2[Copy 2]
            LRS3[Copy 3]
        end
        
        subgraph "ZRS - Three Zones"
            Z1[Zone 1<br/>Copy 1]
            Z2[Zone 2<br/>Copy 2]
            Z3[Zone 3<br/>Copy 3]
        end
    end
    
    subgraph "Secondary Region: West US"
        subgraph "GRS/GZRS Secondary"
            SEC1[Copy 1]
            SEC2[Copy 2]
            SEC3[Copy 3]
        end
    end
    
    LRS1 -.Synchronous.-> LRS2
    LRS2 -.Synchronous.-> LRS3
    
    Z1 -.Synchronous.-> Z2
    Z2 -.Synchronous.-> Z3
    
    Z3 -.Asynchronous<br/>Geo-Replication.-> SEC1
    SEC1 -.Synchronous.-> SEC2
    SEC2 -.Synchronous.-> SEC3
    
    style LRS1 fill:#e1f5fe
    style LRS2 fill:#e1f5fe
    style LRS3 fill:#e1f5fe
    style Z1 fill:#c8e6c9
    style Z2 fill:#c8e6c9
    style Z3 fill:#c8e6c9
    style SEC1 fill:#fff3e0
    style SEC2 fill:#fff3e0
    style SEC3 fill:#fff3e0

See: diagrams/03_domain_2_storage_redundancy_overview.mmd

Diagram Explanation (detailed):

This diagram illustrates the three main categories of Azure Storage redundancy. On the left, Locally Redundant Storage (LRS) maintains three synchronous copies within a single data center in a single region (shown in blue). This protects against individual hardware failures (a failed disk or server rack) but not against data-center, zone, or region-level disasters. In the middle, Zone-Redundant Storage (ZRS) maintains three synchronous copies across three separate availability zones within the primary region (shown in green). Each zone is a physically separate data center with independent power, cooling, and networking, protecting against zone-level failures. On the right, Geo-Redundant Storage (GRS/GZRS) adds a secondary region (shown in orange) hundreds of miles away. Data is first replicated in the primary region (using LRS or ZRS), then asynchronously replicated to the secondary region where it's stored using LRS. The asynchronous replication means there's a small delay (typically under 15 minutes) between writes to primary and secondary regions. This protects against complete regional disasters but introduces a potential Recovery Point Objective (RPO) of up to 15 minutes.

Detailed Example 1: LRS for Development Environment

A software development team needs storage for their test environment where they store application logs, test data, and temporary build artifacts. They choose Locally Redundant Storage (LRS) for their storage account. Here's what happens: (1) When they upload a 100 MB log file, Azure immediately creates 3 copies within a single data center in the East US region. (2) The write operation completes in milliseconds because all copies are in the same facility. (3) One day, a server rack experiences a power failure. Azure automatically serves data from one of the other two copies without any interruption. (4) The cost is minimal - only $0.018 per GB per month for hot tier storage. (5) If the data center (or the entire East US region) experiences an outage, their data would be unavailable until it recovers. This is acceptable for a development environment where data can be regenerated and brief outages are tolerable. Total cost for 1 TB of storage: approximately $18/month.

Detailed Example 2: ZRS for Production Application

An e-commerce company runs a production web application that stores product images and customer uploads in Azure Blob Storage. They choose Zone-Redundant Storage (ZRS) for high availability. Here's the scenario: (1) When a customer uploads a product review photo (5 MB), Azure synchronously writes the data to three separate availability zones in the West Europe region. (2) Each zone is a physically separate data center with independent infrastructure. (3) During a planned maintenance event, Zone 1 is taken offline. The application continues serving images from Zones 2 and 3 without any downtime or performance degradation. (4) A few months later, Zone 2 experiences a cooling system failure and goes offline temporarily. The application still operates normally using Zones 1 and 3. (5) The cost is slightly higher than LRS - approximately $0.0225 per GB per month for hot tier. (6) The company achieves 99.9999999999% (12 nines) durability and 99.9% availability SLA. For 1 TB of storage, the cost is approximately $22.50/month - only $4.50 more than LRS but with significantly better availability.

Detailed Example 3: GZRS for Mission-Critical Financial Data

A financial services company stores transaction records and audit logs that must survive regional disasters. They choose Geo-Zone-Redundant Storage (GZRS) with Read-Access (RA-GZRS). Here's how it works: (1) When a transaction record is written (10 KB), Azure first synchronously replicates it across three availability zones in the primary region (East US 2) using ZRS. (2) The write operation completes and returns success to the application. (3) Within seconds to minutes, Azure asynchronously replicates the data to the paired secondary region (Central US) where it's stored using LRS (3 copies). (4) The application can read from either the primary region (for lowest latency) or the secondary region (for disaster recovery testing or geographic load distribution). (5) During a catastrophic hurricane that takes down the entire East US 2 region, the company initiates a failover to Central US. (6) After failover (typically 1-2 hours), Central US becomes the new primary region and the application resumes full read/write operations. (7) The RPO (Recovery Point Objective) is typically under 15 minutes, meaning they might lose up to 15 minutes of recent writes. (8) The cost is highest - approximately $0.045 per GB per month for hot tier. For 1 TB of storage, the cost is approximately $45/month, but this provides 99.99999999999999% (16 nines) durability and protection against regional disasters.
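
The 15-minute RPO in this example translates into a concrete worst-case exposure: everything written during the replication lag window could be lost on failover. A small sketch of that arithmetic (the function name and the 50 writes/second rate are hypothetical illustration, not figures from the scenario):

```python
def worst_case_loss(writes_per_sec: float, rpo_minutes: float) -> float:
    """Worst-case number of writes lost on a geo-failover: everything
    written during the asynchronous replication lag (the RPO window)."""
    return writes_per_sec * rpo_minutes * 60

# e.g. a system recording 50 transaction records/second, 15-minute RPO:
print(worst_case_loss(50, 15))  # 45000.0 records potentially unreplicated
```

This is why "can tolerate up to 15 minutes of data loss" is the key qualifying question before choosing GRS/GZRS over a synchronous replication solution.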

Must Know (Critical Facts):

  • LRS: 3 copies in one region, 11 nines durability, lowest cost ($0.018/GB/month hot tier)
  • ZRS: 3 copies across 3 zones in one region, 12 nines durability, protects against zone failures
  • GRS: LRS in primary + LRS in secondary region, 16 nines durability, asynchronous replication
  • GZRS: ZRS in primary + LRS in secondary region, 16 nines durability, best protection
  • RA-GRS/RA-GZRS: Adds read access to secondary region (without failover)
  • Synchronous replication: Primary region (LRS/ZRS) - no data loss
  • Asynchronous replication: Primary to secondary - potential 15-minute RPO
  • Failover: Required to write to secondary region (except RA-GRS/RA-GZRS for reads)
  • Paired regions: Secondary region is predetermined based on primary region
  • Archive tier: Only supports LRS, GRS, RA-GRS (not ZRS, GZRS, RA-GZRS)

When to use (Comprehensive):

  • Use LRS when: Development/test environments, easily reconstructible data, cost is primary concern, single-zone failures are acceptable
  • Use ZRS when: Production applications, high availability required, zone-level protection needed, data cannot be easily reconstructed
  • Use GRS when: Business-critical data, regional disaster protection required, can tolerate up to 15-minute data loss, read access to secondary not needed
  • Use GZRS when: Mission-critical data, need both zone AND regional protection, highest durability required, can tolerate up to 15-minute data loss
  • Use RA-GRS/RA-GZRS when: Need to read from secondary region for disaster recovery testing, geographic load distribution, or low-latency reads from multiple regions
  • Don't use LRS when: Data is irreplaceable and zone/region failures are unacceptable
  • Don't use ZRS when: Budget is extremely limited and single-zone failures are acceptable
  • Don't use GRS/GZRS when: Cannot tolerate any data loss (use synchronous replication solutions instead)

Limitations & Constraints:

  • Cannot change redundancy for some account types (must create new account and migrate data)
  • Conversion time: LRS to ZRS or GRS to GZRS can take hours to days depending on data size
  • Archive tier: Not supported with ZRS, GZRS, or RA-GZRS
  • Unmanaged disks: Don't support ZRS or GZRS (use managed disks instead)
  • Premium storage: Limited redundancy options (typically LRS or ZRS only)
  • Secondary region: Cannot choose - automatically paired based on primary region
  • Failover time: 1-2 hours for geo-redundant storage failover
  • RPO: Up to 15 minutes for GRS/GZRS (no SLA on exact time)

💡 Tips for Understanding:

  • Remember the pattern: More redundancy = higher durability + higher cost
  • Durability nines: LRS (11), ZRS (12), GRS/GZRS (16) - more nines = less likely to lose data
  • Synchronous vs Asynchronous: Synchronous = no data loss, Asynchronous = potential data loss
  • RA prefix: "Read-Access" means you can read from secondary without failover
  • Zone vs Geo: Zone protects against data center failures, Geo protects against regional disasters

⚠️ Common Mistakes & Misconceptions:

  • Mistake 1: Assuming GRS provides instant failover

    • Why it's wrong: Failover takes 1-2 hours and must be manually initiated (or Microsoft-initiated in major disasters)
    • Correct understanding: GRS protects against data loss but doesn't provide instant failover. Plan for 1-2 hour RTO (Recovery Time Objective)
  • Mistake 2: Thinking RA-GRS allows writes to secondary region

    • Why it's wrong: RA-GRS only provides READ access to secondary; writes still go to primary
    • Correct understanding: RA-GRS is for read-only scenarios like disaster recovery testing or geographic load distribution. To write to secondary, you must initiate failover
  • Mistake 3: Believing ZRS protects against regional disasters

    • Why it's wrong: ZRS only replicates within a single region across zones
    • Correct understanding: ZRS protects against zone failures (data center outages) but not regional disasters. Use GRS/GZRS for regional protection
  • Mistake 4: Assuming all storage services support all redundancy options

    • Why it's wrong: Some services have limitations (e.g., Premium Files only supports LRS/ZRS, Archive tier doesn't support ZRS/GZRS)
    • Correct understanding: Check redundancy support for your specific storage service and tier before designing your solution

🔗 Connections to Other Topics:

  • Relates to Disaster Recovery because: Redundancy options determine your RPO and RTO
  • Builds on Storage Accounts by: Defining how data is protected within the account
  • Often used with Azure Site Recovery to: Provide comprehensive disaster recovery for VMs and applications
  • Connects to Cost Management because: Higher redundancy = higher costs, must balance protection vs budget

Section 3: Storage Security and Access Control

Introduction

The problem: Storage accounts contain sensitive data that must be protected from unauthorized access while still allowing legitimate users and applications to access it.
The solution: Azure Storage provides multiple security mechanisms including access keys, Shared Access Signatures (SAS), Azure AD authentication, and network security controls.
Why it's tested: Security is a critical aspect of Azure administration and is heavily tested on AZ-104, especially SAS tokens and access control.

Core Concepts

Shared Access Signatures (SAS)

What it is: A Shared Access Signature (SAS) is a URI that grants restricted access rights to Azure Storage resources without exposing your account keys. It's a secure way to grant temporary, limited access to storage resources.

Why it exists: You often need to give clients access to storage resources without giving them your storage account keys (which would grant full access to everything). For example, a mobile app needs to upload photos to blob storage, or a partner needs to download specific files. SAS tokens solve this by providing time-limited, permission-specific access to exactly the resources needed.

Real-world analogy: Think of SAS tokens like temporary visitor badges at an office building. Instead of giving someone your employee badge (account key) which grants access to everything, you give them a visitor badge that only works for specific floors, during specific hours, and expires at the end of the day. If the badge is lost or stolen, it's only valid for a limited time and limited areas.

How it works (Detailed step-by-step):

  1. You create a SAS token by specifying: the resource (container, blob, file), permissions (read, write, delete, list), start time, expiry time, and optionally IP restrictions
  2. Azure generates a signed URI that includes all these parameters plus a cryptographic signature created using your storage account key
  3. You give the SAS URI to the client (application, user, partner) who needs access
  4. The client makes requests to Azure Storage using the SAS URI
  5. Azure validates the SAS by checking: the signature is valid, the current time is within the start/expiry window, the requested operation matches the granted permissions, and the client IP is allowed (if specified)
  6. If validation passes, Azure processes the request; if it fails, Azure returns a 403 Forbidden error
  7. The SAS automatically expires at the specified expiry time without any action needed
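
Under the hood, the signature in step 2 is an HMAC-SHA256 of a "string-to-sign" computed with the storage account key. The sketch below illustrates just that signing mechanism; the real Azure string-to-sign has a fixed multi-field format defined by the Storage REST API, and the `sign_sas` helper, the demo key, and the field values here are all hypothetical simplifications:

```python
import base64
import hashlib
import hmac

def sign_sas(account_key_b64: str, string_to_sign: str) -> str:
    """Sign a SAS string-to-sign: HMAC-SHA256 with the (base64-decoded)
    account key, then base64-encode the digest. This digest becomes the
    sig= query parameter of the SAS URI."""
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("utf-8")

# Simplified string-to-sign; the real format is a newline-joined list of
# many fields (permissions, start, expiry, canonicalized resource, ...).
fields = ["r", "2024-01-01T00:00Z", "2024-01-01T01:00Z",
          "/blob/myaccount/photos/photo456.jpg"]
demo_key = base64.b64encode(b"demo-account-key").decode()
sig = sign_sas(demo_key, "\n".join(fields))
print(sig)
```

Because the signature covers every parameter, tampering with any field of the URI (say, changing `r` to `rw`) invalidates it, which is how Azure enforces step 5's validation without storing the token anywhere.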

📊 SAS Token Types and Flow Diagram:

graph TB
    subgraph "SAS Token Types"
        UD[User Delegation SAS<br/>Signed with Azure AD credentials<br/>Most secure]
        SVC[Service SAS<br/>Signed with account key<br/>Container/blob/file/queue level]
        ACC[Account SAS<br/>Signed with account key<br/>Account-wide access]
    end
    
    subgraph "SAS Creation Flow"
        A[Administrator] -->|1. Creates SAS| B[Azure Storage]
        B -->|2. Returns SAS URI| A
        A -->|3. Shares SAS URI| C[Client Application]
    end
    
    subgraph "SAS Usage Flow"
        C -->|4. Request with SAS| D[Azure Storage]
        D -->|5. Validates| E{Valid?}
        E -->|Yes| F[Grant Access]
        E -->|No| G[403 Forbidden]
    end
    
    subgraph "Stored Access Policy"
        SAP[Stored Access Policy<br/>on Container]
        SAP -->|Defines| SAP1[Start Time]
        SAP -->|Defines| SAP2[Expiry Time]
        SAP -->|Defines| SAP3[Permissions]
        SAP -->|Can be| SAP4[Modified/Revoked]
    end
    
    style UD fill:#c8e6c9
    style SVC fill:#e1f5fe
    style ACC fill:#fff3e0
    style F fill:#c8e6c9
    style G fill:#ffebee

See: diagrams/03_domain_2_sas_overview.mmd

Diagram Explanation (detailed):

This diagram illustrates the three types of SAS tokens and their lifecycle. At the top, we see User Delegation SAS (green) which is the most secure option because it's signed with Azure AD credentials instead of the account key. This means even if the SAS is compromised, the attacker doesn't have access to your account key. Service SAS (blue) is signed with the account key and provides access to specific services like blob containers or file shares. Account SAS (orange) is also signed with the account key but provides broader access across multiple services in the storage account.

The middle section shows the creation flow: (1) An administrator creates a SAS token by specifying permissions, expiry time, and resources. (2) Azure Storage generates a signed URI containing all these parameters plus a cryptographic signature. (3) The administrator shares this URI with the client application that needs access.

The bottom left shows the usage flow: (4) The client makes a request to Azure Storage including the SAS token in the URI. (5) Azure validates the token by checking the signature, expiry time, permissions, and IP restrictions. If valid, access is granted (green); if invalid, a 403 Forbidden error is returned (red).

The bottom right shows Stored Access Policies, which are optional but recommended. A stored access policy is defined on a container and specifies start time, expiry time, and permissions. When you create a SAS associated with a stored access policy, the SAS inherits these constraints. The key benefit is that you can modify or revoke the policy later, which immediately affects all SAS tokens associated with it. Without a stored access policy, the only way to revoke a SAS is to regenerate the account key, which breaks all SAS tokens and applications using that key.

Detailed Example 1: Mobile App Photo Upload with Service SAS

A photo-sharing mobile app needs to allow users to upload photos directly to Azure Blob Storage without routing through your web server. Here's how you implement this with SAS: (1) When a user wants to upload a photo, your web API generates a Service SAS token for a specific blob in the user's container. (2) The SAS grants only "write" permission (not read or delete) and expires in 1 hour. (3) The SAS is scoped to a specific blob path like /users/user123/photos/photo456.jpg. (4) Your API returns the SAS URI to the mobile app. (5) The mobile app uploads the photo directly to Azure Storage using the SAS URI, bypassing your web server entirely. (6) After 1 hour, the SAS expires automatically. If the user tries to use it again, they get a 403 Forbidden error. (7) This approach saves bandwidth on your web server and provides better performance since uploads go directly to Azure Storage. (8) Even if an attacker intercepts the SAS token, they can only write to that specific blob for 1 hour - they can't read other users' photos or delete anything.

Detailed Example 2: Partner File Download with Stored Access Policy

Your company needs to share monthly reports with a business partner. You want to give them access for 30 days but retain the ability to revoke access if needed. Here's the solution: (1) Create a blob container called "partner-reports" and upload the monthly report files. (2) Create a Stored Access Policy on the container named "partner-access" with: start time = today, expiry time = 30 days from now, permissions = read + list. (3) Generate a Service SAS token associated with this stored access policy. (4) Share the SAS URI with your partner. (5) The partner can list files in the container and download them for 30 days. (6) After 2 weeks, you discover a security concern and need to revoke access immediately. (7) You modify the stored access policy to set the expiry time to "now" (or delete the policy entirely). (8) Within 30 seconds, the partner's SAS token stops working - they get 403 Forbidden errors. (9) You didn't need to regenerate your storage account key, so all your other applications continue working normally. (10) This demonstrates the key advantage of stored access policies: you can revoke SAS tokens without regenerating account keys.

Detailed Example 3: User Delegation SAS for Maximum Security

A healthcare application stores patient medical records in blob storage and needs to provide secure, temporary access to doctors. Here's the most secure approach using User Delegation SAS: (1) The application uses Azure AD authentication (not account keys). (2) When Dr. Smith logs in with her Azure AD credentials, the application requests a user delegation key from Azure Storage. (3) The application creates a User Delegation SAS signed with this key (not the account key) that grants Dr. Smith read access to her patients' records for 8 hours. (4) Dr. Smith uses the SAS to access patient records throughout her shift. (5) The key advantage: even if the SAS token is compromised, the attacker doesn't have the storage account key. (6) The user delegation key automatically expires after 7 days maximum (usually set much shorter). (7) If Dr. Smith's Azure AD account is disabled or her permissions are revoked, new SAS tokens can't be generated. (8) This provides defense-in-depth: Azure AD controls who can generate SAS tokens, and the SAS tokens themselves are time-limited and permission-specific. (9) For compliance requirements (HIPAA, GDPR), this approach provides better audit trails since access is tied to Azure AD identities, not anonymous account keys.

Must Know (Critical Facts):

  • Three SAS types: User Delegation (most secure, Azure AD), Service (container/blob level), Account (account-wide)
  • User Delegation SAS: Signed with Azure AD credentials, maximum 7-day expiry, most secure option
  • Service SAS: Signed with account key, can use stored access policy for revocation
  • Account SAS: Signed with account key, provides access across multiple services, no stored access policy support
  • Stored Access Policy: Allows modifying/revoking SAS without regenerating account key, max 5 policies per container
  • SAS permissions: Read (r), Write (w), Delete (d), List (l), Add (a), Create (c), Update (u), Process (p)
  • SAS expiry: Always set expiry time, short duration recommended (hours/days, not months)
  • Revocation: Ad hoc SAS can only be revoked by regenerating account key; stored access policy SAS can be revoked by modifying/deleting policy
  • IP restrictions: Can limit SAS to specific IP addresses or ranges
  • Protocol restrictions: Can require HTTPS only for SAS requests
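
The validation Azure performs (step 5 of "How it works" above) can be sketched as a pure function over these facts. This is a hypothetical simplification - the real service also verifies the cryptographic signature, IP range, and protocol - but it captures the time-window and permission-subset checks:

```python
from datetime import datetime, timedelta, timezone

# The eight SAS permission flags listed above.
PERMISSION_FLAGS = set("rwdlacup")  # read, write, delete, list, add, create, update, process

def validate_sas(granted: str, requested: str,
                 start: datetime, expiry: datetime, now: datetime,
                 skew: timedelta = timedelta(minutes=15)) -> bool:
    """Sketch of SAS request validation: every requested permission must
    have been granted, and `now` must fall inside the validity window.
    Clock skew is tolerated on the start time only (Azure guidance is to
    back-date the start time ~15 minutes for this reason)."""
    if not (set(requested) <= set(granted) <= PERMISSION_FLAGS):
        return False
    return (start - skew) <= now <= expiry

now = datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc)
ok = validate_sas("rl", "r", now - timedelta(hours=1), now + timedelta(hours=1), now)
print(ok)  # True: read requested, read+list granted, inside the window
```

Note the asymmetry: a write request against a read-only token, or any request after expiry, fails with 403 Forbidden regardless of how valid the signature is.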

When to use (Comprehensive):

  • Use User Delegation SAS when: Maximum security required, Azure AD authentication available, compliance requirements demand identity-based access
  • Use Service SAS with stored access policy when: Need ability to revoke access without regenerating keys, multiple clients share same access pattern, long-term access needed
  • Use Service SAS (ad hoc) when: One-time access needed, very short duration (minutes/hours), stored access policy overhead not justified
  • Use Account SAS when: Need access across multiple services (blobs + files + queues), service-level SAS too restrictive
  • Use stored access policies when: Need to revoke access without key regeneration, multiple SAS tokens share same constraints, need to modify expiry/permissions after SAS creation
  • Don't use SAS when: Internal applications can use Azure AD authentication directly, managed identities available, permanent access needed (use RBAC instead)
  • Don't use Account SAS when: Service SAS provides sufficient access (principle of least privilege)
  • Don't use long expiry times when: Shorter durations are feasible (reduces risk if SAS is compromised)

Limitations & Constraints:

  • User Delegation SAS: Maximum 7-day expiry, only for blob and Data Lake Storage Gen2, requires Azure AD authentication
  • Stored Access Policy: Maximum 5 policies per container/share/queue/table, changes take up to 30 seconds to propagate
  • Account SAS: Cannot use stored access policy, provides broad access (security risk)
  • Revocation: Ad hoc SAS can only be revoked by regenerating account key (breaks all applications using that key)
  • Signature validation: SAS signed with account key becomes invalid if key is regenerated
  • Time synchronization: Client and server clocks must be synchronized (within 15 minutes) for SAS validation
  • No audit trail: Cannot track who generated a SAS token (only who used it)

💡 Tips for Understanding:

  • Remember the hierarchy: User Delegation (most secure) > Service SAS with policy > Service SAS (ad hoc) > Account SAS (least secure)
  • Stored access policy = revocation power: If you might need to revoke access, always use stored access policy
  • Short expiry = better security: Even if SAS is compromised, damage is limited by expiry time
  • User Delegation = Azure AD: If you see "Azure AD" or "identity-based", think User Delegation SAS
  • Account key regeneration = nuclear option: Breaks all SAS tokens and applications using that key

⚠️ Common Mistakes & Misconceptions:

  • Mistake 1: Using Account SAS for everything

    • Why it's wrong: Violates principle of least privilege, provides unnecessary broad access
    • Correct understanding: Use Service SAS scoped to specific containers/blobs whenever possible
  • Mistake 2: Creating SAS tokens with very long expiry times (months/years)

    • Why it's wrong: If SAS is compromised, attacker has long-term access
    • Correct understanding: Use shortest feasible expiry time (hours/days), regenerate SAS as needed
  • Mistake 3: Thinking stored access policy immediately revokes SAS

    • Why it's wrong: Changes to stored access policy take up to 30 seconds to propagate
    • Correct understanding: Plan for 30-second delay when revoking access via stored access policy
  • Mistake 4: Believing you can audit who generated a SAS token

    • Why it's wrong: Azure Storage doesn't log SAS token generation, only usage
    • Correct understanding: Implement application-level logging if you need to track SAS generation
  • Mistake 5: Using ad hoc SAS when you might need to revoke access

    • Why it's wrong: Only way to revoke ad hoc SAS is to regenerate account key (breaks everything)
    • Correct understanding: Always use stored access policy if there's any chance you'll need to revoke access

🔗 Connections to Other Topics:

  • Relates to Azure AD because: User Delegation SAS uses Azure AD credentials for signing
  • Builds on Storage Accounts by: Providing granular access control without exposing account keys
  • Often used with Azure Functions/Logic Apps to: Provide temporary access to storage for serverless workflows
  • Connects to Security Best Practices because: SAS implements principle of least privilege and defense-in-depth

Section 4: Azure Files and File Shares

Introduction

The problem: Organizations need shared file storage that can be accessed from multiple machines, supports standard file protocols (SMB/NFS), and integrates with existing identity systems.
The solution: Azure Files provides fully managed file shares in the cloud that are accessible via industry-standard SMB and NFS protocols.
Why it's tested: Azure Files is a key storage service for lift-and-shift scenarios and is frequently tested on AZ-104, especially identity-based authentication.

Core Concepts

Azure Files Overview

What it is: Azure Files offers fully managed file shares in the cloud that are accessible via the Server Message Block (SMB) protocol or Network File System (NFS) protocol. Azure file shares can be mounted concurrently by cloud or on-premises deployments.

Why it exists: Traditional file servers require hardware, maintenance, patching, and backup management. Organizations moving to the cloud need a way to replace on-premises file servers without rewriting applications or changing how users access files. Azure Files provides a cloud-native file share solution that works exactly like traditional file servers but without the infrastructure overhead.

Real-world analogy: Think of Azure Files like a network drive (like the H: drive at work) but hosted in the cloud instead of on a local file server. Users can map it as a drive letter, applications can access it using standard file paths, and it supports the same permissions and authentication as traditional file servers.

How it works (Detailed step-by-step):

  1. You create a storage account with the appropriate performance tier (Standard or Premium)
  2. You create a file share within the storage account, specifying the protocol (SMB or NFS) and quota (maximum size)
  3. For SMB shares, you optionally enable identity-based authentication (AD DS, Azure AD DS, or Azure AD Kerberos)
  4. You configure share-level permissions using Azure RBAC roles (Storage File Data SMB Share Reader, Contributor, or Elevated Contributor)
  5. You configure file/folder-level permissions using Windows ACLs (for SMB) or POSIX permissions (for NFS)
  6. Clients mount the file share using standard SMB or NFS mount commands
  7. Applications and users access files using standard file I/O operations (read, write, delete, etc.)
  8. Azure handles all infrastructure including replication, backup, patching, and high availability
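
The endpoint formats used when mounting (steps 6-7) follow a fixed pattern: `<account>.file.core.windows.net` plus the share name. These small helpers (hypothetical function names, sketching the pattern only) build the same paths that appear in the worked examples later in this section:

```python
def smb_unc_path(account: str, share: str) -> str:
    """UNC path Windows clients pass to `net use` for an SMB share."""
    return rf"\\{account}.file.core.windows.net\{share}"

def nfs_mount_source(account: str, share: str) -> str:
    """Source path Linux clients pass to `mount -t nfs` for an NFS 4.1 share
    (note the account name appears twice: once as host, once in the path)."""
    return f"{account}.file.core.windows.net:/{account}/{share}"

print(smb_unc_path("storageaccount", "hr-share"))
# \\storageaccount.file.core.windows.net\hr-share
print(nfs_mount_source("storageaccount", "build-artifacts"))
# storageaccount.file.core.windows.net:/storageaccount/build-artifacts
```

Recognizing these two shapes is useful on the exam: a double-backslash UNC path implies SMB (and therefore possible identity-based auth), while the `host:/account/share` form implies NFS (Premium only, no identity-based auth).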

📊 Azure Files Architecture Diagram:

graph TB
    subgraph "Client Access"
        WIN[Windows Client<br/>SMB 3.x]
        LIN[Linux Client<br/>SMB 3.x or NFS 4.1]
        APP[Applications<br/>Standard File I/O]
    end
    
    subgraph "Authentication Layer"
        AD[On-premises AD DS]
        AADDS[Azure AD Domain Services]
        AADKERB[Azure AD Kerberos]
        KEY[Storage Account Key]
    end
    
    subgraph "Azure Files Service"
        SHARE[File Share<br/>SMB or NFS]
        SNAP[Snapshots]
        BACKUP[Azure Backup]
    end
    
    subgraph "Storage Backend"
        STD[Standard Storage<br/>HDD-based]
        PREM[Premium Storage<br/>SSD-based]
    end
    
    WIN -->|Mount| SHARE
    LIN -->|Mount| SHARE
    APP -->|File I/O| SHARE
    
    AD -.Identity.-> SHARE
    AADDS -.Identity.-> SHARE
    AADKERB -.Identity.-> SHARE
    KEY -.Fallback.-> SHARE
    
    SHARE --> SNAP
    SHARE --> BACKUP
    SHARE --> STD
    SHARE --> PREM
    
    style WIN fill:#e1f5fe
    style LIN fill:#e1f5fe
    style SHARE fill:#c8e6c9
    style AD fill:#fff3e0
    style AADDS fill:#fff3e0
    style AADKERB fill:#fff3e0

See: diagrams/03_domain_2_azure_files_architecture.mmd

Diagram Explanation (detailed):

This diagram illustrates the complete Azure Files architecture. At the top, we see three types of clients that can access Azure Files: Windows clients using SMB 3.x protocol (blue), Linux clients using either SMB 3.x or NFS 4.1 protocol (blue), and Applications using standard file I/O operations (blue). All three mount the file share and access it like a local or network drive.

In the middle, the Authentication Layer shows four options for authenticating to Azure Files. On-premises AD DS (orange) allows you to use your existing Active Directory credentials - users sign in with their domain accounts and access files with the same permissions they have on-premises. Azure AD Domain Services (orange) provides a managed domain controller in Azure for organizations that don't want to maintain on-premises domain controllers. Azure AD Kerberos (orange) is the newest option that allows hybrid identities (synced from on-premises AD to Azure AD) to authenticate without requiring network connectivity to domain controllers. Storage Account Key (orange) is the fallback option that provides full access but doesn't support identity-based permissions.

The Azure Files Service layer (green) shows the file share itself, which can be either SMB or NFS protocol. The service includes Snapshots for point-in-time recovery and Azure Backup for long-term retention and disaster recovery.

At the bottom, the Storage Backend shows two performance tiers: Standard Storage (HDD-based) for cost-effective general-purpose file shares, and Premium Storage (SSD-based) for high-performance workloads requiring low latency and high IOPS.

Detailed Example 1: Replacing On-Premises File Server with SMB Share

A company has an on-premises Windows file server hosting departmental shares (HR, Finance, Engineering). They want to migrate to Azure Files. Here's the process: (1) They create a Standard storage account in Azure with LRS redundancy. (2) They create three SMB file shares: "hr-share", "finance-share", and "engineering-share", each with a 1 TB quota. (3) They enable identity-based authentication using their on-premises Active Directory (AD DS). This requires running the AzFilesHybrid PowerShell module to domain-join the storage account. (4) They configure share-level permissions using Azure RBAC: HR group gets "Storage File Data SMB Share Contributor" on hr-share, Finance group on finance-share, etc. (5) They use Robocopy to migrate files from the on-premises server to Azure Files, preserving all ACLs and timestamps. (6) They configure file-level permissions (Windows ACLs) on folders and files, just like on the old file server. (7) Users map the Azure file shares as network drives using their domain credentials: net use H: \\storageaccount.file.core.windows.net\hr-share. (8) Users access files exactly as before - no training needed, no application changes required. (9) The company decommissions the old file server, saving hardware costs and eliminating maintenance overhead. (10) Azure handles all replication, patching, and high availability automatically.

Detailed Example 2: Linux Application with NFS Share

A software development team runs a Linux-based build system that needs shared storage for source code and build artifacts. They choose Azure Files with NFS protocol. Here's the implementation: (1) They create a Premium storage account (required for NFS) with ZRS redundancy for high availability. (2) They create an NFS 4.1 file share named "build-artifacts" with a 5 TB quota. (3) NFS shares don't support identity-based authentication, so they configure network security instead: they restrict access to specific virtual networks using service endpoints. (4) They mount the NFS share on their Linux build servers using standard mount command: sudo mount -t nfs storageaccount.file.core.windows.net:/storageaccount/build-artifacts /mnt/build. (5) They configure POSIX permissions on directories: build servers have read/write access, developer workstations have read-only access. (6) The build system writes compiled binaries and artifacts to the NFS share. (7) Developers access the artifacts from their workstations for testing and deployment. (8) The Premium SSD storage provides low latency (single-digit milliseconds) required for build performance. (9) They configure Azure Backup to take daily snapshots of the share for disaster recovery. (10) The solution scales to handle thousands of concurrent file operations during peak build times.

Detailed Example 3: Hybrid Cloud with Azure AD Kerberos

A healthcare organization has hybrid infrastructure with on-premises AD synced to Azure AD. They want remote workers to access file shares without VPN. Here's the solution: (1) They create a Standard storage account with GRS redundancy for disaster recovery. (2) They create an SMB file share named "patient-records" with a 10 TB quota. (3) They enable Azure AD Kerberos authentication on the storage account. This allows hybrid identities (users synced from on-premises AD to Azure AD) to authenticate. (4) They configure share-level permissions: "Healthcare-Staff" Azure AD group gets "Storage File Data SMB Share Contributor" role. (5) They configure file-level ACLs on folders: doctors have full access to their patients' folders, nurses have read-only access, billing staff have access only to billing documents. (6) Remote workers on Azure AD-joined laptops (no VPN) can mount the file share using their Azure AD credentials: net use P: \\storageaccount.file.core.windows.net\patient-records. (7) Azure AD issues Kerberos tickets for authentication - no need for network connectivity to on-premises domain controllers. (8) The solution provides secure access to sensitive patient data with full audit trails (Azure AD logs all authentication attempts). (9) If a user's Azure AD account is disabled, they immediately lose access to the file share. (10) The organization meets HIPAA compliance requirements with encryption at rest and in transit, identity-based access control, and comprehensive audit logging.

Must Know (Critical Facts):

  • Two protocols: SMB (Windows/Linux) and NFS (Linux only), cannot mix on same share
  • SMB versions: 3.1.1, 3.0, 2.1 supported; SMB 1.0 not supported (security risk)
  • NFS version: 4.1 only, requires Premium storage (SSD), no identity-based auth
  • Performance tiers: Standard (HDD, cost-effective) and Premium (SSD, low latency)
  • Identity-based auth: Only for SMB shares, three options (AD DS, Azure AD DS, Azure AD Kerberos)
  • Share-level permissions: Azure RBAC roles (Reader, Contributor, Elevated Contributor)
  • File-level permissions: Windows ACLs for SMB, POSIX permissions for NFS
  • Maximum share size: 100 TiB (both Standard and Premium)
  • Snapshots: Up to 200 snapshots per share, point-in-time recovery
  • Soft delete: Retains deleted shares for 1-365 days (configurable)

When to use (Comprehensive):

  • Use Azure Files when: Replacing on-premises file servers, lift-and-shift scenarios, shared storage for applications, user home directories, configuration files
  • Use SMB protocol when: Windows clients, need identity-based authentication, existing applications expect SMB, mixed Windows/Linux environment
  • Use NFS protocol when: Linux-only environment, POSIX semantics required, high-performance workloads (with Premium)
  • Use Standard tier when: General-purpose file shares, cost is primary concern, moderate performance acceptable (up to 10,000 IOPS per share)
  • Use Premium tier when: Low latency required (<10ms), high IOPS needed (up to 100,000 IOPS per share), NFS protocol required
  • Use identity-based auth when: Need Windows ACLs, replacing on-premises file server, compliance requires identity-based access control
  • Don't use Azure Files when: Need block storage for VMs (use Azure Disks), need object storage (use Blob Storage), need database storage (use Azure SQL/Cosmos DB)
  • Don't use NFS when: Windows clients need access, identity-based authentication required

Limitations & Constraints:

  • NFS limitations: No identity-based auth, Premium storage only, Linux clients only, no Windows support
  • SMB limitations: Port 445 must be open (often blocked by ISPs), requires secure transfer (SMB 3.x with encryption)
  • Identity-based auth: Only one identity source per storage account, applies to all shares in account
  • Snapshot limits: Maximum 200 snapshots per share
  • File size limits: Maximum 4 TiB per file (both Standard and Premium)
  • Performance limits: Standard (up to 10,000 IOPS per share), Premium (up to 100,000 IOPS per share)
  • Redundancy for NFS: Only LRS and ZRS supported (no GRS/GZRS)

💡 Tips for Understanding:

  • SMB = Windows-friendly: Supports identity-based auth, Windows ACLs, works with AD
  • NFS = Linux-native: POSIX permissions, no identity auth, Premium only
  • Identity-based auth = three options: AD DS (on-prem), Azure AD DS (managed), Azure AD Kerberos (hybrid, no VPN)
  • Share-level = Azure RBAC: Controls who can mount the share
  • File-level = ACLs/POSIX: Controls who can access specific files/folders
  • Premium = SSD = NFS required: NFS only works with Premium tier

⚠️ Common Mistakes & Misconceptions:

  • Mistake 1: Trying to use NFS with Standard storage

    • Why it's wrong: NFS requires Premium storage (SSD-based)
    • Correct understanding: NFS is only available with Premium file shares
  • Mistake 2: Expecting identity-based authentication with NFS

    • Why it's wrong: NFS doesn't support identity-based authentication
    • Correct understanding: NFS uses host-based authentication and POSIX permissions, not AD/Azure AD
  • Mistake 3: Thinking storage account key provides identity-based permissions

    • Why it's wrong: Storage account key provides full access to everything, bypassing all ACLs
    • Correct understanding: Storage account key is for administrative access only; use identity-based auth for user access
  • Mistake 4: Assuming Azure Files works like Blob Storage

    • Why it's wrong: Azure Files uses file system semantics (directories, ACLs), Blob Storage uses object semantics (containers, blobs)
    • Correct understanding: Azure Files is for file shares (SMB/NFS), Blob Storage is for object storage (REST API)

🔗 Connections to Other Topics:

  • Relates to Active Directory because: SMB shares can use AD DS for identity-based authentication
  • Builds on Storage Accounts by: Providing file share service within storage accounts
  • Often used with Azure Backup to: Protect file shares with automated backups
  • Connects to Virtual Networks because: NFS shares require service endpoints for network security

Chapter Summary

What We Covered

  • Storage Account Fundamentals: Types, performance tiers, replication options
  • Blob Storage: Block blobs, access tiers (hot/cool/cold/archive), lifecycle management
  • Storage Redundancy: LRS, ZRS, GRS, GZRS, RA-GRS, RA-GZRS options and use cases
  • Storage Security: Shared Access Signatures (SAS), stored access policies, user delegation SAS
  • Azure Files: SMB and NFS file shares, identity-based authentication, performance tiers

Critical Takeaways

  1. Storage Redundancy: LRS (11 nines, cheapest), ZRS (12 nines, zone protection), GRS/GZRS (16 nines, regional protection)
  2. Blob Access Tiers: Hot (frequent access), Cool (30-day minimum), Cold (90-day minimum), Archive (180-day minimum, requires rehydration)
  3. SAS Tokens: User Delegation (most secure, Azure AD), Service SAS (container-level), Account SAS (account-wide)
  4. Stored Access Policies: Enable SAS revocation without regenerating account keys, max 5 per container
  5. Azure Files Authentication: AD DS (on-prem), Azure AD DS (managed), Azure AD Kerberos (hybrid, no VPN)
  6. NFS vs SMB: NFS requires Premium storage and doesn't support identity-based auth; SMB supports both Standard/Premium and identity-based auth

Self-Assessment Checklist

Test yourself before moving on:

  • I can explain the difference between LRS, ZRS, GRS, and GZRS
  • I understand when to use hot, cool, cold, and archive blob tiers
  • I can describe the three types of SAS tokens and when to use each
  • I know how to revoke a SAS token using stored access policies
  • I understand the difference between SMB and NFS file shares
  • I can explain the three identity-based authentication options for Azure Files
  • I know when to use Standard vs Premium storage for file shares

Practice Questions

Try these from your practice test bundles:

  • Domain 2 Bundle 1: Questions 1-20 (Storage fundamentals)
  • Domain 2 Bundle 2: Questions 21-40 (Advanced storage scenarios)
  • Expected score: 70%+ to proceed

If you scored below 70%:

  • Review sections: Storage redundancy, SAS tokens, Azure Files authentication
  • Focus on: Decision frameworks for choosing redundancy options and access tiers
  • Practice: Creating SAS tokens with different permissions and expiry times

Quick Reference Card

[One-page summary of chapter - copy to your notes]

Storage Redundancy Options (illustrative pay-as-you-go rates; actual pricing varies by region):

  • LRS: 3 copies in one zone, 11 nines, $0.018/GB/month
  • ZRS: 3 copies across 3 zones, 12 nines, $0.0225/GB/month
  • GRS: LRS + async replication to secondary region, 16 nines, $0.036/GB/month
  • GZRS: ZRS + async replication to secondary region, 16 nines, $0.045/GB/month
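
A quick way to compare these options is to multiply capacity by the per-GB rate. The rates below are the illustrative figures from the list above, not current Azure pricing:

```python
# Illustrative $/GB/month rates (real prices vary by region and tier).
RATES = {"LRS": 0.018, "ZRS": 0.0225, "GRS": 0.036, "GZRS": 0.045}

def monthly_storage_cost(gb: float, redundancy: str) -> float:
    """Approximate monthly storage cost for a given redundancy option."""
    return round(gb * RATES[redundancy], 2)

# For 10 TB (10,240 GB), GRS costs exactly double LRS — matching
# "LRS + async replication to a secondary region".
for option in RATES:
    print(option, monthly_storage_cost(10_240, option))
```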

Blob Access Tiers:

  • Hot: Frequent access, highest storage cost, lowest access cost
  • Cool: Infrequent access, 30-day minimum, 50% cheaper storage
  • Cold: Rare access, 90-day minimum, 70% cheaper storage
  • Archive: Very rare access, 180-day minimum, 90% cheaper storage, requires rehydration

SAS Token Types:

  • User Delegation: Azure AD credentials, most secure, 7-day max expiry
  • Service SAS: Account key, container/blob level, supports stored access policy
  • Account SAS: Account key, account-wide, no stored access policy

Azure Files Protocols:

  • SMB: Windows/Linux, identity-based auth, Standard/Premium
  • NFS: Linux only, no identity auth, Premium only

Decision Points:

  • Need zone protection? → Use ZRS or GZRS
  • Need regional protection? → Use GRS or GZRS
  • Need to revoke SAS? → Use stored access policy
  • Need identity-based file access? → Use SMB with AD DS/Azure AD DS/Azure AD Kerberos
  • Need high-performance file shares? → Use Premium tier
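
The decision arrows above can be encoded as a checklist function for self-testing. This is a hypothetical sketch, not an Azure API:

```python
def storage_decisions(req: dict) -> list:
    """Map stated requirements to the chapter's recommendations."""
    out = []
    if req.get("zone_protection"):
        out.append("ZRS or GZRS")
    if req.get("regional_protection"):
        out.append("GRS or GZRS")
    if req.get("revocable_sas"):
        out.append("stored access policy")
    if req.get("identity_file_access"):
        out.append("SMB with AD DS / Azure AD DS / Azure AD Kerberos")
    if req.get("high_performance_files"):
        out.append("Premium tier")
    return out

# A workload needing zone protection and revocable SAS tokens maps to
# two recommendations; an empty requirements dict maps to none.
```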


Chapter 3: Deploy and Manage Azure Compute Resources (20-25% of exam)

File: 04_domain_3_compute

Chapter Overview

What you'll learn:

  • Virtual machine creation, configuration, and management
  • VM availability options (availability sets, availability zones)
  • VM disks and encryption
  • Azure Virtual Machine Scale Sets
  • Container services (Azure Container Instances, Container Apps, Container Registry)
  • Azure App Service deployment and configuration
  • Infrastructure as Code with ARM templates and Bicep

Time to complete: 10-12 hours
Prerequisites: Chapter 0 (Fundamentals), Chapter 1 (Identities and Governance)


Section 1: Azure Virtual Machines Fundamentals

Introduction

The problem: Organizations need compute resources to run applications, but managing physical servers is expensive, time-consuming, and inflexible.
The solution: Azure Virtual Machines provide on-demand, scalable compute resources without the overhead of physical hardware management.
Why it's tested: VMs are fundamental to Azure infrastructure and are heavily tested on AZ-104, especially availability options and disk management.

Core Concepts

Virtual Machine Basics

What it is: An Azure Virtual Machine (VM) is an on-demand, scalable computing resource that provides the flexibility of virtualization without having to buy and maintain physical hardware.

Why it exists: Organizations need compute resources for various workloads - web servers, application servers, databases, development environments, etc. Buying and maintaining physical servers requires significant capital investment, space, power, cooling, and ongoing maintenance. Azure VMs provide compute resources on-demand, paying only for what you use, with the ability to scale up or down as needed.

Real-world analogy: Think of Azure VMs like renting an apartment instead of buying a house. You get the space you need, pay monthly, can move to a bigger or smaller place easily, and don't worry about maintenance, repairs, or property taxes. The landlord (Azure) handles all the infrastructure.

How it works (Detailed step-by-step):

  1. You select a VM size based on your workload requirements (CPU, memory, storage, network bandwidth)
  2. You choose an operating system from Azure Marketplace (Windows Server, Ubuntu, Red Hat, etc.) or upload your own custom image
  3. Azure provisions the VM by allocating compute resources, attaching storage (OS disk and optional data disks), and configuring networking
  4. You configure the VM by installing applications, configuring settings, and applying security policies
  5. The VM runs continuously (or you can stop/deallocate it to save costs when not needed)
  6. You connect to the VM using RDP (Windows) or SSH (Linux) to manage it
  7. Azure handles the underlying infrastructure including hypervisor, physical servers, networking, and storage
  8. You pay for compute time based on the VM size and running duration (per-second billing)
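
Step 8 (per-second billing) is easy to reason about numerically. A hypothetical sketch — the hourly rate and disk charge below are made-up example figures:

```python
HOURLY_RATE = 0.096          # hypothetical compute rate, $/hour
DISK_COST_PER_MONTH = 10.0   # managed disks bill even when deallocated

def monthly_vm_bill(running_seconds: int) -> float:
    """Compute accrues per second of running time; storage always bills."""
    compute = HOURLY_RATE / 3600 * running_seconds
    return round(compute + DISK_COST_PER_MONTH, 2)

always_on = monthly_vm_bill(30 * 24 * 3600)   # running a full 30-day month
half_time = monthly_vm_bill(15 * 24 * 3600)   # deallocated half the month
# Compute halves in the second case, but the disk charge remains.
```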

📊 VM Architecture Diagram:

graph TB
    subgraph "VM Components"
        VM[Virtual Machine]
        OS[OS Disk<br/>Managed Disk]
        DATA[Data Disks<br/>Optional]
        NIC[Network Interface<br/>Private IP]
        PIP[Public IP<br/>Optional]
        NSG[Network Security Group<br/>Firewall Rules]
    end
    
    subgraph "Availability Options"
        AS[Availability Set<br/>99.95% SLA]
        AZ[Availability Zone<br/>99.99% SLA]
        VMSS[VM Scale Set<br/>99.95% SLA]
    end
    
    subgraph "Storage Backend"
        STD[Standard HDD<br/>Cost-effective]
        STDSSD[Standard SSD<br/>Balanced]
        PREMSSD[Premium SSD<br/>High performance]
        ULTRA[Ultra Disk<br/>Extreme performance]
    end
    
    VM --> OS
    VM --> DATA
    VM --> NIC
    NIC --> PIP
    NIC --> NSG
    
    VM -.Deployed in.-> AS
    VM -.Deployed in.-> AZ
    VM -.Deployed in.-> VMSS
    
    OS --> STD
    OS --> STDSSD
    OS --> PREMSSD
    DATA --> ULTRA
    
    style VM fill:#c8e6c9
    style OS fill:#e1f5fe
    style NIC fill:#fff3e0
    style AS fill:#f3e5f5
    style AZ fill:#f3e5f5
    style VMSS fill:#f3e5f5

See: diagrams/04_domain_3_vm_architecture.mmd

Diagram Explanation (detailed):

This diagram illustrates the complete architecture of an Azure Virtual Machine. At the center, the Virtual Machine (green) is the compute resource that runs your workload. Connected to it are several key components:

The OS Disk (blue) is a managed disk that contains the operating system and is required for every VM. It's typically 127 GB or larger depending on the OS. Data Disks (blue) are optional additional disks you can attach for application data, databases, or other storage needs. You can attach up to 64 data disks per VM depending on the VM size.

The Network Interface (orange) provides network connectivity and is assigned a private IP address from the virtual network subnet. Optionally, you can attach a Public IP (orange) to enable internet access to the VM. The Network Security Group (orange) acts as a firewall, controlling inbound and outbound traffic with rules based on source/destination IP, port, and protocol.

The middle section shows Availability Options (purple). An Availability Set provides 99.95% SLA by distributing VMs across multiple fault domains (separate racks) and update domains (for planned maintenance). Availability Zones provide 99.99% SLA by distributing VMs across physically separate datacenters within a region. VM Scale Sets provide 99.95% SLA and add auto-scaling capabilities.

The bottom section shows Storage Backend options. Standard HDD is the most cost-effective option for dev/test workloads. Standard SSD provides balanced performance for general-purpose workloads. Premium SSD offers high performance with low latency for production workloads. Ultra Disk provides extreme performance with sub-millisecond latency for the most demanding workloads like SAP HANA and SQL Server.

Detailed Example 1: Creating a Web Server VM

A company needs to deploy a web server to host their corporate website. Here's the complete process: (1) They navigate to Azure Portal and click "Create a resource" → "Virtual Machine". (2) They select the subscription and create a new resource group called "web-servers-rg". (3) They name the VM "web-vm-01" and select the region "East US". (4) For availability, they choose "Availability zone" and select "Zone 1" for 99.99% SLA. (5) They select the image "Windows Server 2022 Datacenter" from Azure Marketplace. (6) They choose VM size "Standard_D2s_v3" (2 vCPUs, 8 GB RAM) which is appropriate for a small web server. (7) They configure administrator credentials: username "webadmin" and a strong password. (8) For disks, they keep the default OS disk (127 GB Premium SSD) and add one data disk (256 GB Premium SSD) for website files. (9) For networking, they select an existing virtual network "web-vnet" and subnet "web-subnet". They enable a public IP address so the website is accessible from the internet. (10) They configure the Network Security Group to allow inbound traffic on ports 80 (HTTP) and 443 (HTTPS) from the internet, and port 3389 (RDP) from their office IP address only. (11) They enable Azure Backup with daily backups retained for 30 days. (12) They review the configuration and click "Create". Azure provisions the VM in about 5 minutes. (13) They connect via RDP, install IIS web server, configure the website on the data disk, and the site is live. Total monthly cost: approximately $70 for the VM + $20 for disks + $5 for backup = $95/month.

Detailed Example 2: Database Server with High Availability

A financial services company needs a SQL Server database with high availability. Here's their implementation: (1) They create two VMs in an Availability Set to achieve 99.95% SLA. (2) VM configuration: "Standard_E4s_v3" size (4 vCPUs, 32 GB RAM, optimized for memory-intensive workloads). (3) They select "SQL Server 2022 Enterprise on Windows Server 2022" image from Azure Marketplace. (4) For storage, they attach 4 Premium SSD data disks (1 TB each) and configure them in a storage pool for better performance. (5) They place both VMs in the same Availability Set but Azure automatically distributes them across different fault domains (separate racks) and update domains. (6) They configure SQL Server Always On Availability Groups between the two VMs for automatic failover. (7) They place an Azure Load Balancer in front of the VMs to route database connections to the active primary replica. (8) For security, they configure Network Security Groups to allow SQL Server traffic (port 1433) only from the application tier subnet, and RDP access only from a management subnet. (9) They enable Azure Disk Encryption on all disks to encrypt data at rest. (10) They configure automated backups using SQL Server native backup to Azure Blob Storage. (11) During a planned maintenance event, Azure updates one VM at a time (different update domains), so the database remains available. (12) If a hardware failure occurs in one fault domain, the other VM continues serving requests with automatic failover. Total monthly cost: approximately $600 per VM × 2 = $1,200 + $400 for disks + $50 for load balancer = $1,650/month.

Detailed Example 3: Development Environment with Cost Optimization

A software development team needs VMs for development and testing. Here's their cost-optimized approach: (1) They create VMs using "Standard_B2ms" size (2 vCPUs, 8 GB RAM) which is a burstable VM type - perfect for dev/test workloads that don't need consistent high performance. (2) They select "Ubuntu 20.04 LTS" image (free, no Windows licensing costs). (3) For disks, they use Standard SSD (not Premium) since dev/test doesn't require ultra-low latency. (4) They configure the VMs to automatically shut down at 7 PM every day using Azure DevTest Labs auto-shutdown feature. (5) They start the VMs manually when developers arrive in the morning. (6) On weekends, the VMs remain stopped (deallocated), so they only pay for storage, not compute. (7) They use Azure Spot VMs for non-critical test environments, getting up to 90% discount but accepting that Azure can evict the VMs with 30 seconds notice when capacity is needed. (8) They configure Azure Advisor to monitor for underutilized VMs and right-size them. (9) After 3 months, they discover one VM is consistently using only 10% CPU, so they downsize it to "Standard_B1ms" (1 vCPU, 2 GB RAM), cutting costs in half. (10) Result: Instead of paying $140/month per VM running 24/7, they pay approximately $50/month per VM (running only 10 hours/day, 5 days/week), saving 64% on compute costs.
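
The savings claim in step (10) can be sanity-checked with back-of-envelope arithmetic. The $140/month figure comes from the example above; the implied hourly rate and the $8 storage remainder are assumptions for illustration:

```python
ALWAYS_ON = 140.0            # $/month for a VM running 24/7 (~730 h)
hourly = ALWAYS_ON / 730     # implied hourly compute rate

hours_used = 10 * 22         # 10 h/day, ~22 working days per month
compute = hourly * hours_used
storage = 8.0                # disks still bill while deallocated (assumed)
total = compute + storage

savings = 1 - total / ALWAYS_ON
print(f"${total:.0f}/month, {savings:.0%} saved")
```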

Must Know (Critical Facts):

  • VM sizes: Categorized by family (General Purpose, Compute Optimized, Memory Optimized, Storage Optimized, GPU)
  • Billing: Per-second billing when running, no compute charges when stopped (deallocated), always pay for storage
  • OS disk: Required, typically 127 GB+, can be Standard HDD, Standard SSD, or Premium SSD
  • Data disks: Optional, up to 64 per VM (depending on size), can be different disk types
  • Availability Set: 99.95% SLA, distributes VMs across fault domains and update domains
  • Availability Zone: 99.99% SLA, distributes VMs across physically separate datacenters
  • Single VM: 99.9% SLA only if using Premium SSD or Ultra Disk for all disks
  • VM states: Running (compute billed), Stopped (shut down from within the OS; compute still billed), Stopped (deallocated) (no compute charges; storage still billed)
  • Managed disks: Azure-managed storage, recommended over unmanaged disks
  • VM extensions: Add-ons for configuration management, monitoring, security (e.g., Custom Script Extension, Azure Monitor Agent)
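
The SLA percentages above become more concrete when converted into allowed downtime. A small sketch for a 30-day month (43,200 minutes):

```python
def downtime_minutes_per_month(sla_percent: float) -> float:
    """Maximum downtime per 30-day month permitted by an SLA percentage."""
    return round((1 - sla_percent / 100) * 30 * 24 * 60, 2)

# 99.9% (single VM with Premium SSD), 99.95% (Availability Set),
# 99.99% (Availability Zones):
for sla in (99.9, 99.95, 99.99):
    print(f"{sla}% -> {downtime_minutes_per_month(sla)} min/month")
```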

When to use (Comprehensive):

  • Use Azure VMs when: Need full control over OS and applications, lift-and-shift migrations, running Windows/Linux workloads, custom software installations
  • Use Availability Sets when: Need 99.95% SLA, VMs in same region, protection against hardware failures and planned maintenance
  • Use Availability Zones when: Need 99.99% SLA, protection against datacenter-level failures, can tolerate cross-zone latency (1-2ms)
  • Use Premium SSD when: Production workloads, need low latency (<10ms), need 99.9% single-instance SLA
  • Use Standard SSD when: Dev/test workloads, balanced performance and cost, can tolerate higher latency
  • Use Standard HDD when: Backup storage, infrequent access, cost is primary concern
  • Don't use VMs when: Serverless options available (Azure Functions, Logic Apps), containerized workloads better suited for AKS/Container Apps
  • Don't use Availability Sets when: Need datacenter-level protection (use Availability Zones instead)

Limitations & Constraints:

  • VM size limits: Maximum 416 vCPUs, 12 TB RAM (M-series VMs)
  • Data disk limits: Up to 64 data disks per VM (varies by VM size)
  • Disk size limits: Maximum 32 TiB per managed disk
  • Network bandwidth: Varies by VM size, not guaranteed (best-effort)
  • Availability Set limits: Maximum 3 fault domains, 20 update domains per set
  • Availability Zone limits: Not all regions support zones, not all VM sizes available in all zones
  • Regional limits: Default quota limits per region (can be increased via support request)

💡 Tips for Understanding:

  • Stopped vs Deallocated: "Stopped" still incurs compute charges, "Stopped (deallocated)" does not
  • Availability Set vs Zone: Set = rack-level protection (99.95%), Zone = datacenter-level protection (99.99%)
  • VM families: D-series (general purpose), E-series (memory optimized), F-series (compute optimized)
  • Disk types: HDD < Standard SSD < Premium SSD < Ultra Disk (performance and cost increase)
  • Managed disks: Always use managed disks (Azure handles storage accounts automatically)

⚠️ Common Mistakes & Misconceptions:

  • Mistake 1: Stopping a VM from the OS and expecting no charges

    • Why it's wrong: Stopping from within the OS leaves the VM in "Stopped" state, still incurring compute charges
    • Correct understanding: Must stop (deallocate) from Azure Portal/CLI/PowerShell to stop compute charges
  • Mistake 2: Thinking Availability Sets protect against regional disasters

    • Why it's wrong: Availability Sets only protect against failures within a single datacenter
    • Correct understanding: Use Availability Zones for datacenter-level protection, or geo-redundant solutions for regional disasters
  • Mistake 3: Assuming all VM sizes are available in all regions and zones

    • Why it's wrong: Newer VM sizes and specialized sizes (GPU, HPC) have limited regional availability
    • Correct understanding: Check VM size availability in your target region/zone before designing your solution
  • Mistake 4: Using Standard HDD for production databases

    • Why it's wrong: Standard HDD has high latency (10-20ms) and low IOPS, causing poor database performance
    • Correct understanding: Use Premium SSD or Ultra Disk for production databases requiring low latency

🔗 Connections to Other Topics:

  • Relates to Virtual Networks because: VMs require network interfaces and subnets
  • Builds on Storage by: Using managed disks for OS and data storage
  • Often used with Load Balancer to: Distribute traffic across multiple VMs
  • Connects to Azure Backup because: VMs can be backed up for disaster recovery

Section 2: Azure Virtual Machine Scale Sets

Introduction

The problem: Applications need to scale dynamically based on demand, but manually creating and managing multiple VMs is time-consuming and error-prone.
The solution: Azure Virtual Machine Scale Sets automatically create and manage a group of load-balanced VMs that can scale in or out based on demand or schedule.
Why it's tested: VM Scale Sets are essential for building scalable, highly available applications and are frequently tested on AZ-104.

Core Concepts

VM Scale Sets Overview

What it is: Azure Virtual Machine Scale Sets let you create and manage a group of load-balanced VMs that can automatically increase or decrease in number based on demand or a defined schedule.

Why it exists: Modern applications experience variable load - high traffic during business hours, low traffic at night; seasonal spikes during holidays; unpredictable viral events. Manually scaling by creating/deleting VMs is slow and inefficient. VM Scale Sets automate this process, ensuring you have the right number of VMs to handle current load while minimizing costs.

Real-world analogy: Think of VM Scale Sets like a restaurant that automatically adjusts its staff based on customer traffic. During lunch rush, more servers appear automatically. During slow periods, staff is reduced. You don't manually hire and fire people throughout the day - the system handles it automatically based on demand.

How it works (Detailed step-by-step):

  1. You create a scale set by defining a VM configuration (size, image, disks, networking) and capacity settings (min, max, default instance count)
  2. You configure autoscaling rules based on metrics (CPU, memory, custom metrics) or schedules (scale up at 8 AM, scale down at 6 PM)
  3. Azure monitors the metrics continuously (e.g., average CPU across all instances)
  4. When a threshold is exceeded (e.g., CPU > 75% for 5 minutes), Azure triggers a scale-out action
  5. Azure creates new VM instances automatically using the defined configuration, adds them to the load balancer, and starts routing traffic to them
  6. When load decreases (e.g., CPU < 25% for 10 minutes), Azure triggers a scale-in action
  7. Azure removes VM instances gracefully (drains connections, waits for termination notice period), deletes them, and stops billing
  8. The process repeats continuously, ensuring optimal capacity at all times
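
The scale-out/scale-in loop in steps 4-7 can be sketched as a single decision function. Thresholds and the cooldown match the example values above; the function itself is illustrative, not the Azure autoscale engine:

```python
def autoscale_step(avg_cpu: float, instances: int,
                   minutes_since_last_action: float,
                   cooldown: float = 5, low: float = 25, high: float = 75,
                   min_n: int = 2, max_n: int = 10) -> int:
    """Return the new instance count after one evaluation cycle."""
    if minutes_since_last_action < cooldown:
        return instances              # still in cooldown: prevents flapping
    if avg_cpu > high and instances < max_n:
        return instances + 1          # scale out
    if avg_cpu < low and instances > min_n:
        return instances - 1          # scale in
    return instances                  # within thresholds: no action

# High CPU adds an instance; the same reading during cooldown does not,
# and scale-in never drops below the configured minimum.
```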

Must Know (Critical Facts):

  • Two orchestration modes: Flexible (recommended, up to 1,000 VMs) and Uniform (legacy, up to 1,000 VMs)
  • Flexible orchestration: Supports mixed VM types, Spot + regular instances, full VM lifecycle control
  • Uniform orchestration: All VMs identical, managed as a group, limited individual VM control
  • Autoscaling: Based on metrics (CPU, memory, custom) or schedules (time-based)
  • Scale-out: Adding VM instances when demand increases
  • Scale-in: Removing VM instances when demand decreases
  • Minimum instances: Ensures baseline capacity (e.g., min 2 for high availability)
  • Maximum instances: Caps scaling to control costs (e.g., max 10)
  • Cooldown period: Delay between scaling actions to prevent flapping (default 5 minutes)
  • Health monitoring: Application Health Extension or Load Balancer probes detect unhealthy instances
  • Automatic instance repair: Replaces unhealthy instances automatically

When to use (Comprehensive):

  • Use VM Scale Sets when: Need automatic scaling, variable workload patterns, high availability required, stateless applications
  • Use Flexible orchestration when: Need mixed VM types, Spot instances, full VM control, modern deployments (recommended)
  • Use Uniform orchestration when: Legacy deployments, all VMs must be identical, Service Fabric integration
  • Use metric-based autoscaling when: Load varies unpredictably, need to respond to real-time demand
  • Use schedule-based autoscaling when: Load patterns are predictable (business hours, seasonal)
  • Don't use Scale Sets when: Stateful applications requiring persistent identity, single VM sufficient, manual control preferred

Limitations & Constraints:

  • Maximum instances: 1,000 VMs per scale set (Flexible or Uniform)
  • Autoscaling limits: Minimum 1 instance, maximum 1,000 instances
  • Scaling speed: Takes 5-10 minutes to provision new instances
  • Cooldown period: Minimum 1 minute between scaling actions
  • Flexible orchestration: Requires explicit outbound connectivity (NAT Gateway, Load Balancer, or Public IP)
  • Uniform orchestration: All VMs must use same configuration (size, image, disks)

💡 Tips for Understanding:

  • Flexible = Recommended: Use Flexible orchestration for all new deployments
  • Scale-out = Add VMs: Happens when demand increases (CPU high, queue length long)
  • Scale-in = Remove VMs: Happens when demand decreases (CPU low, queue empty)
  • Cooldown = Wait period: Prevents rapid scaling up and down (flapping)
  • Health probes = Automatic repair: Unhealthy VMs are automatically replaced

⚠️ Common Mistakes & Misconceptions:

  • Mistake 1: Setting minimum instances to 0

    • Why it's wrong: Scaling out from 0 takes longer (cold start), and no instances are available to serve traffic while new ones provision
    • Correct understanding: Set minimum to at least 2 for high availability and faster scale-out
  • Mistake 2: Using very short cooldown periods

    • Why it's wrong: Causes rapid scaling up and down (flapping), wasting resources and money
    • Correct understanding: Use appropriate cooldown (5-10 minutes) to allow metrics to stabilize
  • Mistake 3: Not configuring health probes

    • Why it's wrong: Unhealthy instances continue receiving traffic, causing errors
    • Correct understanding: Always configure Application Health Extension or Load Balancer probes

🔗 Connections to Other Topics:

  • Relates to Load Balancer because: Scale Sets require load balancer to distribute traffic
  • Builds on Virtual Machines by: Automating VM creation and management
  • Often used with Azure Monitor to: Collect metrics for autoscaling decisions
  • Connects to Availability Zones because: Scale Sets can distribute VMs across zones

Section 3: Azure Container Services

Introduction

The problem: Traditional VMs require OS management, patching, and configuration, adding overhead for containerized applications.
The solution: Azure provides multiple container services (ACI, Container Apps, AKS) that run containers without managing VMs.
Why it's tested: Containers are increasingly popular, and AZ-104 tests understanding of when to use each container service.

Core Concepts

Azure Container Instances (ACI)

What it is: Azure Container Instances is the fastest and simplest way to run a container in Azure without managing VMs or orchestrators.

Why it exists: Sometimes you just need to run a single container or a small group of containers for a short time - batch jobs, CI/CD tasks, event-driven processing. Setting up a full Kubernetes cluster or managing VMs is overkill. ACI provides on-demand containers that start in seconds and bill per second.

Real-world analogy: ACI is like renting a car for a few hours from a car-sharing service. You don't buy the car, maintain it, or pay for it when you're not using it. You just grab it when needed, use it, and return it. Perfect for short-term, simple needs.

Must Know (Critical Facts):

  • Fastest container deployment: Containers start in seconds
  • No VM management: Azure manages underlying infrastructure
  • Per-second billing: Pay only for running time
  • Public or private networking: Can deploy in VNet for private access
  • Persistent storage: Can mount Azure Files shares
  • Container groups: Multiple containers sharing resources (like Kubernetes pods)
  • Use cases: Batch jobs, CI/CD tasks, event-driven processing, dev/test
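
The batch-job use case can be sketched in three Azure CLI commands (`rg-demo` and `job-worker` are placeholder names; the image is a public Microsoft sample):

```shell
# Run a one-off container; --restart-policy Never means billing stops
# when the job exits instead of restarting it.
az container create \
  --resource-group rg-demo \
  --name job-worker \
  --image mcr.microsoft.com/azuredocs/aci-helloworld \
  --cpu 1 --memory 1.5 \
  --restart-policy Never

# Inspect output, then clean up when the job is done.
az container logs --resource-group rg-demo --name job-worker
az container delete --resource-group rg-demo --name job-worker --yes
```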

Azure Container Apps

What it is: Azure Container Apps is a fully managed serverless container service for building and deploying modern apps and microservices using containers.

Why it exists: Developers want to deploy containerized applications without managing Kubernetes clusters or infrastructure. Container Apps provides automatic scaling (including scale-to-zero), built-in load balancing, and integrated monitoring without the complexity of Kubernetes.

Must Know (Critical Facts):

  • Serverless containers: No infrastructure management
  • Scale to zero: Automatically scales down to 0 instances when idle (saves costs)
  • Built-in ingress: Automatic HTTPS and load balancing
  • Dapr integration: Distributed application runtime for microservices
  • Revisions: Built-in versioning and traffic splitting
  • Use cases: Microservices, APIs, event-driven apps, background workers
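
A sketch of a scale-to-zero deployment (the environment and app names are placeholders; the image is a public Microsoft sample):

```shell
# A Container Apps environment is the boundary apps are deployed into.
az containerapp env create \
  --resource-group rg-demo --name env-demo --location eastus

# --min-replicas 0 is what enables scale-to-zero when the app is idle;
# --ingress external provisions HTTPS ingress automatically.
az containerapp create \
  --resource-group rg-demo \
  --name api-demo \
  --environment env-demo \
  --image mcr.microsoft.com/azuredocs/containerapps-helloworld:latest \
  --ingress external --target-port 80 \
  --min-replicas 0 --max-replicas 5
```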

Azure Container Registry (ACR)

What it is: Azure Container Registry is a managed Docker registry service for storing and managing private container images.

Why it exists: Public registries like Docker Hub are great for public images, but organizations need private registries for proprietary images with security, compliance, and performance requirements. ACR provides geo-replication, security scanning, and integration with Azure services.

Must Know (Critical Facts):

  • Private registry: Store proprietary container images securely
  • Three tiers: Basic (dev/test), Standard (production), Premium (geo-replication, private endpoints, content trust)
  • Geo-replication: Replicate images across multiple regions (Premium only)
  • Security scanning: Image vulnerability scanning via Microsoft Defender for Cloud (a Defender plan enabled at the subscription level)
  • Azure integration: Works with ACI, Container Apps, AKS, App Service
  • Authentication: Azure AD, service principals, admin account (not recommended)
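
These facts look like this in the Azure CLI (`acrdemo123` is a placeholder — registry names must be globally unique and alphanumeric):

```shell
# Premium tier unlocks geo-replication.
az acr create --resource-group rg-demo --name acrdemo123 --sku Premium

# Replicate the registry's content to a second region (Premium only).
az acr replication create --registry acrdemo123 --location westeurope

# Import a public image into the private registry (no local Docker needed).
az acr import --name acrdemo123 \
  --source mcr.microsoft.com/azuredocs/aci-helloworld:latest \
  --image aci-helloworld:v1
```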

When to use each container service:

  • Use ACI when: Simple containers, batch jobs, short-lived tasks, no orchestration needed
  • Use Container Apps when: Microservices, APIs, need auto-scaling, serverless containers
  • Use AKS when: Complex orchestration, full Kubernetes features, large-scale deployments
  • Use ACR when: Need private container registry, geo-replication, security scanning

Chapter Summary

What We Covered

  • Virtual Machines: Creation, configuration, availability options (sets, zones), disk types
  • VM Scale Sets: Autoscaling, Flexible vs Uniform orchestration, health monitoring
  • Container Services: ACI (simple containers), Container Apps (serverless), ACR (private registry)

Critical Takeaways

  1. VM Availability: Availability Set (99.95%, rack-level), Availability Zone (99.99%, datacenter-level)
  2. VM Billing: Per-second when running, no compute charges when stopped (deallocated)
  3. Disk Types: Standard HDD < Standard SSD < Premium SSD < Ultra Disk (performance and cost)
  4. Scale Sets: Flexible orchestration recommended, autoscaling based on metrics or schedules
  5. Container Services: ACI (simple), Container Apps (serverless), AKS (full Kubernetes)
  6. ACR Tiers: Basic (dev/test), Standard (production), Premium (geo-replication + private endpoints)

Self-Assessment Checklist

  • I understand the difference between Availability Sets and Availability Zones
  • I know when to use Standard HDD vs Premium SSD
  • I can explain how VM Scale Sets autoscaling works
  • I understand the difference between Flexible and Uniform orchestration
  • I know when to use ACI vs Container Apps vs AKS
  • I understand ACR tiers and their features

Practice Questions

Try these from your practice test bundles:

  • Domain 3 Bundle 1: Questions 1-25 (VM fundamentals)
  • Domain 3 Bundle 2: Questions 26-50 (Scale Sets and containers)
  • Expected score: 70%+ to proceed


Chapter 4: Implement and Manage Virtual Networking (15-20% of exam)

File: 05_domain_4_networking

Chapter Overview

What you'll learn:

  • Virtual networks and subnets
  • Network security groups (NSGs) and application security groups
  • Virtual network peering
  • Azure Bastion for secure VM access
  • Service endpoints and private endpoints
  • Azure DNS configuration
  • Load balancing options

Time to complete: 8-10 hours
Prerequisites: Chapter 0 (Fundamentals), Chapter 3 (Compute)


Section 1: Virtual Networks and Subnets

Core Concepts

Virtual Networks (VNets)

What it is: An Azure Virtual Network (VNet) is a logically isolated network in Azure that provides secure communication between Azure resources, the internet, and on-premises networks.

Why it exists: Resources in the cloud need to communicate securely. VNets provide network isolation, IP address management, and connectivity options similar to traditional on-premises networks but with the scale and availability of Azure.

How it works:

  1. You create a VNet with an address space (e.g., 10.0.0.0/16)
  2. You divide the VNet into subnets (e.g., 10.0.1.0/24, 10.0.2.0/24)
  3. You deploy resources (VMs, App Services, etc.) into subnets
  4. Resources communicate using private IP addresses
  5. Azure handles routing between subnets automatically
  6. You control traffic with Network Security Groups
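
Steps 1 and 2 look like this in the Azure CLI (names and address ranges are illustrative):

```shell
# Create a VNet with a /16 address space (65,536 addresses) and a first subnet.
az network vnet create \
  --resource-group rg-demo \
  --name vnet-hub \
  --address-prefixes 10.0.0.0/16 \
  --subnet-name snet-web \
  --subnet-prefixes 10.0.1.0/24

# Carve out a second subnet from the same address space.
az network vnet subnet create \
  --resource-group rg-demo \
  --vnet-name vnet-hub \
  --name snet-db \
  --address-prefixes 10.0.2.0/24
```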

Must Know:

  • Address space: CIDR notation (e.g., 10.0.0.0/16 = 65,536 addresses)
  • Subnets: Divide VNet into smaller networks for organization and security
  • Reserved IPs: Azure reserves 5 IPs per subnet (network address, default gateway, 2 for Azure DNS, broadcast address)
  • Default routing: Azure automatically routes between subnets in same VNet
  • DNS: Azure provides default DNS (168.63.129.16) or use custom DNS
  • VNet peering: Connect VNets for cross-VNet communication

Network Security Groups (NSGs)

What it is: A Network Security Group (NSG) is a stateful packet filter that allows or denies network traffic to and from Azure resources based on prioritized security rules.

Why it exists: Every network needs security controls to allow legitimate traffic and block malicious traffic. NSGs provide stateful firewall capabilities at the subnet and network interface level.

How it works:

  1. You create NSG rules with priority (100-4096, lower = higher priority)
  2. Each rule specifies: source, destination, port, protocol, action (allow/deny)
  3. Azure evaluates rules in priority order
  4. First matching rule determines action (allow or deny)
  5. Default rules allow VNet traffic and Azure Load Balancer traffic inbound, and deny all other inbound traffic

Must Know:

  • Inbound rules: Control traffic coming into resources
  • Outbound rules: Control traffic leaving resources
  • Priority: 100-4096, lower number = higher priority
  • Default rules: Cannot be deleted, priority 65000+
  • Service tags: Predefined groups (Internet, VirtualNetwork, AzureLoadBalancer)
  • Application Security Groups: Group VMs by application role for easier NSG management
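
A sketch of an NSG that admits only HTTPS, attached at the subnet level (names assume the placeholder VNet `vnet-hub` with subnet `snet-web`):

```shell
az network nsg create --resource-group rg-demo --name nsg-web

# Allow HTTPS from anywhere; priority 100 is evaluated before higher numbers,
# and unmatched inbound internet traffic falls through to the default deny.
az network nsg rule create \
  --resource-group rg-demo \
  --nsg-name nsg-web \
  --name Allow-HTTPS \
  --priority 100 \
  --direction Inbound --access Allow --protocol Tcp \
  --source-address-prefixes Internet \
  --destination-port-ranges 443

# Associate the NSG with the subnet so it filters all NICs in it.
az network vnet subnet update \
  --resource-group rg-demo \
  --vnet-name vnet-hub --name snet-web \
  --network-security-group nsg-web
```
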

Section 2: Connectivity Options

Virtual Network Peering

What it is: VNet peering connects two VNets, allowing resources to communicate using private IP addresses as if they were in the same network.

Why it exists: Organizations often have multiple VNets for different environments (dev, test, prod) or different applications. Peering enables secure communication between VNets without internet exposure.

Must Know:

  • Regional peering: Connect VNets in same region
  • Global peering: Connect VNets across regions
  • Non-transitive: If VNet A peers with B, and B peers with C, A cannot reach C (unless A also peers with C)
  • No downtime: Peering established without disrupting existing resources
  • Low latency: Traffic uses Microsoft backbone network
  • Pricing: Charged per GB transferred across peering
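
One detail worth remembering: a peering is two one-way links, so it must be created from both sides. A sketch with placeholder VNets `vnet-a` and `vnet-b` in the same resource group:

```shell
# Link A -> B.
az network vnet peering create \
  --resource-group rg-demo \
  --name a-to-b \
  --vnet-name vnet-a \
  --remote-vnet vnet-b \
  --allow-vnet-access

# Link B -> A; the peering is not Connected until both sides exist.
az network vnet peering create \
  --resource-group rg-demo \
  --name b-to-a \
  --vnet-name vnet-b \
  --remote-vnet vnet-a \
  --allow-vnet-access
```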

Azure Bastion

What it is: Azure Bastion provides secure RDP/SSH connectivity to VMs directly from Azure Portal without exposing VMs to the internet.

Why it exists: Traditional VM access requires public IPs and open RDP/SSH ports, creating security risks. Bastion eliminates these risks by providing secure access through Azure Portal over SSL.

Must Know:

  • No public IP needed: VMs don't need public IPs for management access
  • No NSG changes: No need to open RDP/SSH ports to internet
  • SSL/TLS: All connections encrypted over port 443
  • Deployed per VNet: One Bastion per VNet, requires dedicated subnet (AzureBastionSubnet)
  • Two SKUs: Basic (standard features) and Standard (additional features like native client support)
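
A deployment sketch showing the two hard requirements - the exact subnet name AzureBastionSubnet (at least a /26) and a Standard-SKU public IP (other names are placeholders):

```shell
# The subnet name must be exactly AzureBastionSubnet.
az network vnet subnet create \
  --resource-group rg-demo --vnet-name vnet-hub \
  --name AzureBastionSubnet --address-prefixes 10.0.255.0/26

# Bastion itself needs a Standard public IP; the target VMs do not.
az network public-ip create \
  --resource-group rg-demo --name pip-bastion --sku Standard

az network bastion create \
  --resource-group rg-demo --name bastion-demo \
  --vnet-name vnet-hub --public-ip-address pip-bastion
```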

Service Endpoints vs Private Endpoints

Service Endpoints:

  • Extend VNet identity to Azure services (Storage, SQL, Key Vault)
  • Traffic stays on Microsoft backbone network
  • Free, no additional charges
  • Service-level access control (entire service accessible from VNet)

Private Endpoints:

  • Bring Azure service into your VNet with private IP
  • Complete network isolation
  • Resource-level access control (specific storage account, SQL database)
  • Charged per endpoint per hour
  • Built on Azure Private Link

Must Know:

  • Service Endpoints: Free, service-level access, traffic to public endpoint
  • Private Endpoints: Paid, resource-level access, traffic to private IP
  • Use Service Endpoints when: Cost-sensitive, service-level access sufficient
  • Use Private Endpoints when: Need complete isolation, resource-level control required
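
A sketch of both options (subnet and VNet names are placeholders; `$STORAGE_ID` stands for the target storage account's full resource ID):

```shell
# Service endpoint (free): subnet traffic to Azure Storage stays on the
# Microsoft backbone, but still reaches the service's public endpoint.
az network vnet subnet update \
  --resource-group rg-demo \
  --vnet-name vnet-hub --name snet-web \
  --service-endpoints Microsoft.Storage

# Private endpoint (paid): one specific storage account gets a private IP
# inside the subnet; --group-id blob targets the blob sub-service.
az network private-endpoint create \
  --resource-group rg-demo \
  --name pe-storage \
  --vnet-name vnet-hub --subnet snet-web \
  --private-connection-resource-id "$STORAGE_ID" \
  --group-id blob \
  --connection-name pe-storage-conn
```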

Section 3: Load Balancing and DNS

Azure Load Balancer

What it is: Azure Load Balancer distributes network traffic across multiple VMs to ensure high availability and reliability.

Why it exists: Single VMs are single points of failure. Load balancers distribute traffic across multiple VMs, ensuring applications remain available even if individual VMs fail.

Must Know:

  • Two SKUs: Basic (free, limited features) and Standard (paid, production features)
  • Two types: Public (internet-facing) and Internal (private, within VNet)
  • Layer 4: Operates at transport layer (TCP/UDP), not application layer
  • Health probes: Monitor backend VM health (HTTP, HTTPS, TCP)
  • Load balancing rules: Define how traffic is distributed
  • Session persistence: None (5-tuple hash, default), Client IP (2-tuple), or Client IP and protocol (3-tuple)
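
A Standard Load Balancer sketch with the three core pieces - frontend, health probe, and load balancing rule (all names are placeholders):

```shell
# Standard SKU public load balancer with named frontend and backend pool.
az network lb create \
  --resource-group rg-demo --name lb-web --sku Standard \
  --frontend-ip-name fe-web --backend-pool-name be-web

# Health probe: only VMs answering on /health receive traffic.
az network lb probe create \
  --resource-group rg-demo --lb-name lb-web \
  --name probe-http --protocol Http --port 80 --path /health

# Rule tying frontend port 80 to the backend pool, gated by the probe.
az network lb rule create \
  --resource-group rg-demo --lb-name lb-web \
  --name rule-http --protocol Tcp \
  --frontend-port 80 --backend-port 80 \
  --frontend-ip-name fe-web --backend-pool-name be-web \
  --probe-name probe-http
```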

Azure DNS

What it is: Azure DNS hosts DNS domains and provides name resolution using Microsoft Azure infrastructure.

Why it exists: Every application needs DNS for name resolution. Azure DNS provides reliable, secure DNS hosting with global availability and integration with Azure services.

Must Know:

  • Public DNS zones: Host public domains (example.com)
  • Private DNS zones: Host private domains for VNet name resolution
  • Record types: A, AAAA, CNAME, MX, TXT, SRV, PTR
  • Alias records: Point to Azure resources (Public IP, Traffic Manager, CDN)
  • Auto-registration: VMs automatically register in private DNS zones
  • VNet linking: Link private DNS zones to VNets for name resolution
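
A private DNS sketch showing zone creation, VNet linking with auto-registration, and a manual A record (`contoso.internal` and the VNet name are placeholders):

```shell
az network private-dns zone create \
  --resource-group rg-demo --name contoso.internal

# Link the zone to a VNet; --registration-enabled true makes VMs in that
# VNet register their hostnames automatically.
az network private-dns link vnet create \
  --resource-group rg-demo \
  --zone-name contoso.internal \
  --name link-hub \
  --virtual-network vnet-hub \
  --registration-enabled true

# Add a manual A record alongside the auto-registered ones.
az network private-dns record-set a add-record \
  --resource-group rg-demo --zone-name contoso.internal \
  --record-set-name app --ipv4-address 10.0.1.10
```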

Chapter Summary

What We Covered

  • Virtual Networks: VNets, subnets, address spaces, routing
  • Network Security: NSGs, rules, priorities, service tags, ASGs
  • Connectivity: VNet peering, Azure Bastion, service/private endpoints
  • Load Balancing: Azure Load Balancer SKUs, types, health probes
  • DNS: Public and private DNS zones, record types, alias records

Critical Takeaways

  1. VNet Address Space: Use RFC 1918 private ranges (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16)
  2. NSG Priority: Lower number = higher priority (100-4096)
  3. VNet Peering: Non-transitive, low latency, charged per GB
  4. Azure Bastion: Secure RDP/SSH without public IPs, requires AzureBastionSubnet
  5. Service Endpoints vs Private Endpoints: Service = free + service-level, Private = paid + resource-level
  6. Load Balancer SKUs: Basic (free, limited), Standard (paid, production)

Self-Assessment Checklist

  • I understand VNet address spaces and subnet planning
  • I can create NSG rules with correct priorities
  • I know the difference between regional and global VNet peering
  • I understand when to use Azure Bastion
  • I can explain service endpoints vs private endpoints
  • I know the difference between Basic and Standard Load Balancer

Practice Questions

Try these from your practice test bundles:

  • Domain 4 Bundle 1: Questions 1-20 (Networking fundamentals)
  • Domain 4 Bundle 2: Questions 21-40 (Advanced networking)
  • Expected score: 70%+ to proceed


Chapter 5: Monitor and Maintain Azure Resources (10-15% of exam)

File: 06_domain_5_monitoring

Chapter Overview

What you'll learn:

  • Azure Monitor fundamentals
  • Metrics and logs
  • Log Analytics and KQL queries
  • Alerts and action groups
  • Azure Backup configuration
  • Azure Site Recovery for disaster recovery
  • Network Watcher tools

Time to complete: 6-8 hours
Prerequisites: All previous chapters


Section 1: Azure Monitor

Core Concepts

Azure Monitor Overview

What it is: Azure Monitor is a comprehensive monitoring solution that collects, analyzes, and acts on telemetry from Azure and on-premises environments.

Why it exists: Modern applications generate massive amounts of telemetry data (metrics, logs, traces). Without proper monitoring, issues go undetected until users complain. Azure Monitor provides centralized monitoring, alerting, and diagnostics.

How it works:

  1. Data sources (VMs, App Services, Storage, etc.) emit telemetry
  2. Azure Monitor collects metrics (numerical time-series data) and logs (text-based events)
  3. Metrics are stored in time-series database for fast queries
  4. Logs are stored in Log Analytics workspace for complex queries
  5. Alerts trigger when conditions are met (CPU > 80%, error rate high)
  6. Action groups execute responses (email, SMS, webhook, runbook)
  7. Visualizations display data in dashboards, workbooks, and Power BI

Must Know:

  • Metrics: Numerical time-series data (CPU %, memory, disk I/O)
  • Logs: Text-based events (application logs, security logs, audit logs)
  • Log Analytics workspace: Central repository for log data
  • KQL: Kusto Query Language for querying logs
  • Alerts: Trigger actions based on conditions
  • Action groups: Define what happens when alert fires
  • Retention: Metrics (93 days default), Logs (30-730 days configurable)

Metrics vs Logs

Metrics:

  • Lightweight, near real-time
  • Stored for 93 days
  • Fast queries, limited analysis
  • Examples: CPU %, memory %, request count

Logs:

  • Rich, detailed information
  • Stored for 30-730 days (configurable)
  • Complex queries with KQL
  • Examples: Application logs, security events, audit logs

Must Know:

  • Use metrics for: Real-time monitoring, performance dashboards, quick alerts
  • Use logs for: Troubleshooting, root cause analysis, compliance auditing
  • Metrics Explorer: Visualize and analyze metrics
  • Log Analytics: Query and analyze logs with KQL
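
Logs are queried with KQL; from the CLI this looks like the sketch below (`$WORKSPACE_ID` stands for the Log Analytics workspace GUID, and the query assumes connected machines are populating the Heartbeat table):

```shell
# Count heartbeats per computer over the last hour.
az monitor log-analytics query \
  --workspace "$WORKSPACE_ID" \
  --analytics-query "Heartbeat | where TimeGenerated > ago(1h) | summarize count() by Computer" \
  --timespan PT1H
```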

Section 2: Alerts and Action Groups

Alerts

What it is: Azure Monitor alerts proactively notify you when conditions are met in your monitoring data.

How it works:

  1. Alert rule defines: target resource, condition, action group
  2. Condition specifies: metric/log query, threshold, time window
  3. Azure Monitor evaluates condition continuously
  4. When condition met, alert fires and triggers action group
  5. Action group executes: notifications (email, SMS) and actions (webhook, runbook, Logic App)

Must Know:

  • Alert types: Metric alerts, log alerts, activity log alerts
  • Alert states: New, Acknowledged, Closed
  • Severity levels: 0 (Critical), 1 (Error), 2 (Warning), 3 (Informational), 4 (Verbose)
  • Action groups: Reusable sets of notification and action preferences
  • Alert processing rules: Suppress alerts during maintenance windows

Action Groups

What it is: Action groups define what happens when an alert fires - who gets notified and what actions are taken.

Must Know:

  • Notification types: Email, SMS, push notification, voice call
  • Action types: Webhook, Azure Function, Logic App, Automation Runbook, ITSM
  • Reusable: One action group can be used by multiple alert rules
  • Rate limiting: Max 1 SMS per 5 minutes, 1 voice call per 5 minutes per phone number
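
The two pieces fit together as sketched below - a reusable action group, then a metric alert that references it (`$VM_ID` stands for the target VM's resource ID; names and the email address are placeholders):

```shell
# Reusable action group with one email receiver (short name: 12 chars max).
az monitor action-group create \
  --resource-group rg-demo --name ag-ops --short-name agops \
  --action email oncall ops@example.com

# Severity 2 alert: fire when average CPU exceeds 80% over a 5-minute
# window, evaluated every minute.
az monitor metrics alert create \
  --resource-group rg-demo --name alert-cpu \
  --scopes "$VM_ID" \
  --condition "avg Percentage CPU > 80" \
  --window-size 5m --evaluation-frequency 1m \
  --severity 2 \
  --action ag-ops
```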

Section 3: Azure Backup

Core Concepts

Azure Backup Overview

What it is: Azure Backup provides simple, secure, and cost-effective solutions to back up and recover data from Microsoft Azure cloud.

Why it exists: Data loss can occur from accidental deletion, corruption, ransomware, or disasters. Azure Backup provides automated, reliable backup with long-term retention and easy recovery.

How it works:

  1. Create Recovery Services vault (backup storage location)
  2. Configure backup policy (schedule, retention)
  3. Enable backup on resources (VMs, SQL databases, file shares)
  4. Azure Backup agent takes snapshots/backups automatically
  5. Backups stored in Recovery Services vault (geo-redundant by default)
  6. Restore from any recovery point when needed

Must Know:

  • Recovery Services vault: Storage location for backups
  • Backup policies: Define schedule (daily, weekly) and retention (days, weeks, months, years)
  • Backup types: Full, incremental, differential
  • Supported workloads: Azure VMs, SQL in Azure VMs, SAP HANA, Azure Files, on-premises (via MARS agent)
  • Retention: Up to 9,999 recovery points, long-term retention up to 99 years
  • Soft delete: Deleted backups retained for 14 days (protection against accidental deletion)
  • Encryption: All backups encrypted at rest and in transit
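
Steps 1-3 above can be sketched as follows (vault and VM names are placeholders; DefaultPolicy is the vault's built-in daily policy):

```shell
# 1. Create the Recovery Services vault (geo-redundant storage by default).
az backup vault create \
  --resource-group rg-demo --name vault-demo --location eastus

# 2-3. Enable backup for an existing VM using the built-in policy;
# a custom policy name can be supplied instead.
az backup protection enable-for-vm \
  --resource-group rg-demo --vault-name vault-demo \
  --vm vm-web01 --policy-name DefaultPolicy
```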

VM Backup

What it is: Azure Backup for VMs provides application-consistent backups without impacting VM performance.

Must Know:

  • Application-consistent: Uses VSS (Windows) or pre/post scripts (Linux) to ensure consistent state
  • Crash-consistent: If app-consistent fails, crash-consistent backup taken
  • Instant restore: Restore from snapshots (faster) or vault (slower but longer retention)
  • Selective disk backup: Backup only specific disks to save costs
  • Cross-region restore: Restore to secondary region (if vault is GRS)

Section 4: Azure Site Recovery

Core Concepts

Site Recovery Overview

What it is: Azure Site Recovery (ASR) orchestrates replication, failover, and recovery of workloads to ensure business continuity during outages.

Why it exists: Disasters happen - datacenter failures, regional outages, ransomware attacks. ASR provides automated disaster recovery with minimal RTO (Recovery Time Objective) and RPO (Recovery Point Objective).

How it works:

  1. Enable replication for VMs to secondary region or on-premises to Azure
  2. ASR replicates VM disks continuously (initial full copy, then incremental changes)
  3. Recovery plans define failover order and automation scripts
  4. During disaster, initiate failover to secondary site
  5. VMs start in secondary site from replicated disks
  6. After primary recovers, failback to primary site

Must Know:

  • Replication: Continuous replication of VM disks
  • RPO: Recovery Point Objective, typically 5 minutes (how much data loss acceptable)
  • RTO: Recovery Time Objective, typically 2 hours (how long to recover)
  • Recovery plans: Orchestrate failover of multiple VMs
  • Test failover: Test disaster recovery without impacting production
  • Supported scenarios: Azure to Azure, VMware to Azure, Hyper-V to Azure, Physical servers to Azure

Section 5: Network Watcher

Core Concepts

Network Watcher Overview

What it is: Network Watcher provides tools to monitor, diagnose, and gain insights into network performance and health.

Must Know:

  • IP flow verify: Check if traffic is allowed/denied by NSG rules
  • Next hop: Determine next hop for traffic from a VM
  • Connection troubleshoot: Test connectivity between VMs or to external endpoints
  • Packet capture: Capture network traffic for analysis
  • NSG flow logs: Log all traffic through NSGs for analysis and compliance
  • Traffic Analytics: Visualize and analyze NSG flow logs
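
IP flow verify is the tool the exam most often describes in scenario form ('is this traffic blocked by an NSG?'). A sketch (VM name, IPs, and ports are placeholders):

```shell
# Would inbound TCP 443 from an internet client reach this VM's NIC?
# The result reports Allow/Deny and names the NSG rule that decided.
az network watcher test-ip-flow \
  --resource-group rg-demo \
  --vm vm-web01 \
  --direction Inbound --protocol TCP \
  --local 10.0.1.4:443 \
  --remote 203.0.113.50:54321
```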

Chapter Summary

What We Covered

  • Azure Monitor: Metrics, logs, Log Analytics, KQL queries
  • Alerts: Alert rules, action groups, severity levels
  • Azure Backup: Recovery Services vault, backup policies, VM backup
  • Site Recovery: Replication, failover, recovery plans, RPO/RTO
  • Network Watcher: IP flow verify, connection troubleshoot, NSG flow logs

Critical Takeaways

  1. Metrics vs Logs: Metrics = real-time numbers, Logs = detailed events
  2. Log Analytics: Use KQL to query logs
  3. Alerts: Metric alerts (real-time), log alerts (complex queries), activity log alerts (management operations)
  4. Backup Retention: Up to 9,999 recovery points, long-term retention up to 99 years
  5. Site Recovery RPO: Typically 5 minutes (how much data loss acceptable)
  6. Network Watcher: Essential tools for network troubleshooting

Self-Assessment Checklist

  • I understand the difference between metrics and logs
  • I can write basic KQL queries
  • I know how to create alert rules and action groups
  • I understand Azure Backup policies and retention
  • I can explain Site Recovery replication and failover
  • I know when to use Network Watcher tools

Practice Questions

Try these from your practice test bundles:

  • Domain 5 Bundle 1: Questions 1-25 (Monitoring fundamentals)
  • Domain 5 Bundle 2: Questions 26-50 (Backup and recovery)
  • Expected score: 70%+ to proceed


Integration & Cross-Domain Scenarios

File: 07_integration

Cross-Domain Scenarios

Scenario 1: Secure Web Application Deployment

What it tests: Understanding of VMs, networking, storage, monitoring, and security across multiple domains.

Common pattern:

  • Deploy web application VMs in availability zones
  • Configure NSGs to allow only HTTPS traffic
  • Use Azure Load Balancer for traffic distribution
  • Store application data in Azure Storage with private endpoints
  • Enable Azure Monitor for performance monitoring
  • Configure Azure Backup for disaster recovery

How to approach:

  1. Identify primary requirement: High availability, security, performance, cost
  2. Consider constraints: Compliance, budget, RTO/RPO
  3. Evaluate options: Compare availability sets vs zones, Standard vs Premium storage
  4. Choose best fit: Balance requirements with constraints

Scenario 2: Hybrid Cloud Connectivity

What it tests: Understanding of networking, identity, and governance.

Common pattern:

  • Connect on-premises network to Azure VNet (VPN or ExpressRoute)
  • Extend Active Directory to Azure (Azure AD Connect)
  • Configure Azure Files with AD authentication
  • Implement Azure Policy for compliance
  • Set up Azure Monitor for hybrid monitoring

Scenario 3: Cost Optimization

What it tests: Understanding of compute, storage, and governance.

Common pattern:

  • Right-size VMs based on utilization
  • Use Azure Advisor recommendations
  • Implement auto-shutdown for dev/test VMs
  • Use Azure Spot VMs for non-critical workloads
  • Configure storage lifecycle management
  • Set up budgets and cost alerts

Common Question Patterns

Pattern 1: "Which service should you use?"

How to recognize:

  • Question mentions: "You need to...", "The solution must..."
  • Multiple Azure services as options

What they're testing:

  • Understanding of service capabilities and limitations
  • Ability to match requirements to appropriate services

How to answer:

  1. Identify key requirements (availability, performance, cost, security)
  2. Eliminate options that don't meet requirements
  3. Choose option that best fits all requirements

Pattern 2: "What should you configure?"

How to recognize:

  • Question describes existing setup
  • Asks for specific configuration change

What they're testing:

  • Understanding of service configuration options
  • Knowledge of best practices

How to answer:

  1. Understand current state
  2. Identify gap between current and desired state
  3. Choose configuration that bridges the gap

Pattern 3: "You need to ensure..."

How to recognize:

  • Question states requirement with "ensure", "guarantee", "must"
  • Often related to SLAs, security, or compliance

What they're testing:

  • Understanding of SLAs and guarantees
  • Knowledge of security and compliance features

How to answer:

  1. Identify the guarantee required (99.9% SLA, encryption, etc.)
  2. Eliminate options that don't provide the guarantee
  3. Choose simplest option that meets requirement

Decision Frameworks

Choosing Compute Options

Need full OS control?
├─ Yes → Virtual Machines
└─ No → Containers or PaaS
    ├─ Simple container? → Azure Container Instances
    ├─ Serverless containers? → Azure Container Apps
    ├─ Full orchestration? → Azure Kubernetes Service
    └─ Web app only? → Azure App Service

Choosing Storage Options

What type of data?
├─ Files (SMB/NFS) → Azure Files
├─ Objects (REST API) → Blob Storage
├─ Disks (VMs) → Managed Disks
└─ Structured data → Azure SQL/Cosmos DB

Choosing Networking Options

Need to connect VNets?
├─ Same region → Regional VNet peering
├─ Different regions → Global VNet peering
└─ On-premises → VPN Gateway or ExpressRoute


Study Strategies & Test-Taking Techniques

File: 08_study_strategies

Effective Study Techniques

The 3-Pass Method

Pass 1: Understanding (Weeks 1-6)

  • Read each chapter thoroughly
  • Take notes on ⭐ items
  • Complete practice exercises
  • Focus on understanding WHY, not just WHAT

Pass 2: Application (Week 7-8)

  • Review chapter summaries only
  • Focus on decision frameworks
  • Practice full-length tests
  • Identify weak areas

Pass 3: Reinforcement (Week 9-10)

  • Review flagged items
  • Memorize critical facts
  • Final practice tests
  • Focus on exam patterns

Active Learning Techniques

  1. Teach Someone: Explain concepts out loud to solidify understanding
  2. Draw Diagrams: Visualize architectures and data flows
  3. Write Scenarios: Create your own questions based on real-world situations
  4. Compare Options: Use comparison tables to understand differences

Memory Aids

Mnemonics for NSG Priority:

  • "Lower numbers win" (100 beats 200)

Mnemonics for Storage Redundancy:

  • LRS = Local (1 datacenter)
  • ZRS = Zones (3 zones)
  • GRS = Geographic (2 regions)
  • GZRS = Geographic + Zones (best of both)

Test-Taking Strategies

Time Management

  • Total time: 120 minutes (150 for non-native speakers)
  • Total questions: ~50 questions
  • Time per question: ~2-3 minutes

Strategy:

  • First pass (60 min): Answer all easy questions
  • Second pass (30 min): Tackle flagged questions
  • Final pass (30 min): Review marked answers

Question Analysis Method

Step 1: Read the scenario (30 seconds)

  • Identify: Company, situation, current state
  • Note: Key requirements and constraints

Step 2: Identify constraints (15 seconds)

  • Cost requirements (minimize cost, cost-effective)
  • Performance needs (low latency, high throughput)
  • Compliance requirements (encryption, auditing)
  • Administrative overhead (minimize management)

Step 3: Eliminate wrong answers (30 seconds)

  • Remove options that violate constraints
  • Eliminate technically incorrect options
  • Cross out options that don't meet requirements

Step 4: Choose best answer (45 seconds)

  • Select option that best meets ALL requirements
  • If tied, choose simpler/cheaper option
  • Trust your first instinct

Handling Difficult Questions

When stuck:

  1. Eliminate obviously wrong answers
  2. Look for constraint keywords (must, ensure, guarantee)
  3. Choose most commonly recommended solution
  4. Flag and move on if unsure (don't waste time)

⚠️ Never: Spend more than 3 minutes on one question initially

Common Exam Tricks

Trick 1: "All of the above" options

  • Carefully verify EACH option is correct
  • If even one is wrong, eliminate this choice

Trick 2: Similar-sounding services

  • Microsoft Entra ID vs Microsoft Entra Domain Services vs on-premises Active Directory (AD DS)
  • Read carefully to distinguish

Trick 3: Unnecessary complexity

  • Exam often includes overly complex options
  • Choose simplest solution that meets requirements

Trick 4: Keyword traps

  • "Minimize cost" → Choose cheapest option
  • "Minimize administrative effort" → Choose most automated option
  • "Ensure" or "Guarantee" → Choose option with SLA/guarantee

Final Week Strategy

7 Days Before Exam

Day 7: Full Practice Test 1 (target: 60%+)
Day 6: Review mistakes, study weak areas
Day 5: Full Practice Test 2 (target: 70%+)
Day 4: Review mistakes, focus on patterns
Day 3: Domain-focused tests for weak domains
Day 2: Full Practice Test 3 (target: 75%+)
Day 1: Review cheat sheet, relax, early sleep

Day Before Exam

Do:

  • Review cheat sheet (1 hour)
  • Skim chapter summaries (1 hour)
  • Review flagged items (30 min)
  • Get 8 hours sleep
  • Prepare exam day materials

Don't:

  • Try to learn new topics
  • Cram all night
  • Panic about gaps in knowledge
  • Change study methods


Final Week Checklist

File: 09_final_checklist

7 Days Before Exam

Knowledge Audit

Go through this checklist and mark items you're confident about:

Domain 1: Identities and Governance

  • Microsoft Entra ID users and groups
  • RBAC roles and scope
  • Azure Policy definitions and assignments
  • Resource tags and organization
  • Cost management and budgets
  • Management groups hierarchy

Domain 2: Storage

  • Storage account types and tiers
  • Blob access tiers (hot, cool, cold, archive)
  • Storage redundancy options (LRS, ZRS, GRS, GZRS)
  • SAS tokens (User Delegation, Service, Account)
  • Azure Files authentication options
  • Storage security (firewalls, private endpoints)

Domain 3: Compute

  • VM sizes and families
  • Availability sets vs availability zones
  • VM disk types (Standard HDD, SSD, Premium SSD, Ultra)
  • VM Scale Sets (Flexible vs Uniform)
  • Container services (ACI, Container Apps, ACR)
  • Azure App Service configuration

Domain 4: Networking

  • VNet address spaces and subnets
  • NSG rules and priorities
  • VNet peering (regional and global)
  • Azure Bastion
  • Service endpoints vs private endpoints
  • Load Balancer (Basic vs Standard)
  • Azure DNS (public and private zones)

Domain 5: Monitoring

  • Azure Monitor (metrics vs logs)
  • Log Analytics and KQL queries
  • Alert rules and action groups
  • Azure Backup policies and retention
  • Azure Site Recovery (RPO/RTO)
  • Network Watcher tools

If you checked fewer than 80%: Review those specific chapters


Practice Test Marathon

Week Before Exam Schedule

Day 7: Full Practice Test 1

  • Target score: 60%+
  • Time yourself (120 minutes)
  • Note questions you flagged

Day 6: Review Day

  • Review all incorrect answers
  • Study related chapter sections
  • Understand WHY you got them wrong

Day 5: Full Practice Test 2

  • Target score: 70%+
  • Focus on time management
  • Note improvement areas

Day 4: Focused Review

  • Review mistakes from Test 2
  • Focus on question patterns
  • Practice decision frameworks

Day 3: Domain-Focused Tests

  • Take tests for your weakest domains
  • Target score: 75%+ per domain

Day 2: Full Practice Test 3

  • Target score: 75%+
  • Simulate real exam conditions
  • Build confidence

Day 1: Light Review

  • Review cheat sheet only
  • Skim chapter summaries
  • Relax and prepare mentally

Day Before Exam

Final Review (2-3 hours max)

Hour 1: Cheat Sheet Review

  • Read through entire cheat sheet
  • Focus on ⭐ items
  • Don't try to memorize everything

Hour 2: Chapter Summaries

  • Skim "Critical Takeaways" from each chapter
  • Review decision frameworks
  • Refresh memory on key concepts

Hour 3: Flagged Items

  • Review items you marked during study
  • Focus on areas you struggled with
  • Don't stress about gaps

Don't: Try to learn new topics or cram

Mental Preparation

  • Get 8 hours sleep
  • Prepare exam day materials (ID, confirmation)
  • Review testing center policies
  • Set multiple alarms
  • Plan route to testing center (if in-person)

Exam Day

Morning Routine

3 hours before exam:

  • Light breakfast
  • Quick cheat sheet review (30 min max)
  • Arrive 30 minutes early

At testing center:

  • Use restroom before exam
  • Store all personal items
  • Take deep breaths, stay calm

Brain Dump Strategy

When exam starts, immediately write down on provided materials:

  • Storage redundancy options (LRS, ZRS, GRS, GZRS)
  • NSG priority range (100-4096; lower number = higher priority)
  • VM SLA requirements (Availability Set 99.95%, Zone 99.99%)
  • Blob access tiers (hot, cool, cold, archive)
  • Key service limits you struggle to remember

During Exam

Time Management:

  • Spend 2-3 minutes per question
  • Flag difficult questions, move on
  • Don't get stuck on one question

Question Strategy:

  • Read scenario carefully
  • Identify constraints (cost, performance, security)
  • Eliminate wrong answers
  • Choose best fit

Stay Calm:

  • Don't panic if questions seem hard
  • Trust your preparation
  • Use process of elimination
  • Make educated guesses if needed

Post-Exam

If You Pass

  • Celebrate! 🎉
  • Download certificate from Microsoft Learn
  • Update LinkedIn profile
  • Plan next certification (AZ-305, AZ-500, etc.)

If You Don't Pass

  • Don't be discouraged (many people need 2 attempts)
  • Review exam feedback report
  • Identify weak areas
  • Study those specific topics
  • Retake after 24 hours (first retake)
  • You've got this! 💪

Final Reminders

  • You've prepared thoroughly
  • Trust your knowledge
  • Read questions carefully
  • Manage your time
  • Stay calm and confident

Good luck on your AZ-104 exam!



Appendices

File: 99_appendices

Appendix A: Quick Reference Tables

Storage Redundancy Comparison

Option Copies Location Durability Availability (Read) Cost Use Case
LRS 3 Single zone 11 nines 99.9% $ Dev/test, easily reconstructible data
ZRS 3 3 zones in region 12 nines 99.9% $$ Production, zone-level protection
GRS 6 2 regions (LRS each) 16 nines 99.9% $$$ Business-critical, regional protection
RA-GRS 6 2 regions (LRS each) 16 nines 99.99% $$$$ Read from secondary, regional protection
GZRS 6 2 regions (ZRS + LRS) 16 nines 99.9% $$$$ Mission-critical, zone + regional protection
RA-GZRS 6 2 regions (ZRS + LRS) 16 nines 99.99% $$$$$ Best protection, read from secondary

Blob Access Tiers Comparison

Tier Access Frequency Storage Cost Access Cost Minimum Duration Rehydration Use Case
Hot Frequent Highest Lowest None N/A Active data, frequent access
Cool Infrequent 50% lower Higher 30 days N/A Backups, short-term archives
Cold Rare 70% lower Higher 90 days N/A Long-term backups
Archive Very rare 90% lower Highest 180 days Hours Compliance archives, rarely accessed
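
The Cool, Cold, and Archive minimum durations above translate into an early deletion fee: delete a blob before the minimum and you are billed as if it had stayed for the remaining days. A minimal sketch of that proration (the $0.01/GB/month rate is hypothetical, purely for illustration):

```python
def early_deletion_charge(per_gb_month_rate: float, size_gb: float,
                          minimum_days: int, days_stored: int) -> float:
    """Fee for deleting a blob before the tier's minimum storage duration.
    Billed as if the data had remained for the remaining days."""
    remaining_days = max(minimum_days - days_stored, 0)
    return per_gb_month_rate * size_gb * remaining_days / 30

# 100 GB in Cool (30-day minimum) deleted after only 10 days:
# charged for the 20 remaining days at the (hypothetical) tier rate.
print(round(early_deletion_charge(0.01, 100, 30, 10), 2))  # 0.67
# Past the minimum duration, no fee applies.
print(early_deletion_charge(0.01, 100, 30, 45))            # 0.0
```

The same logic applies to Cold (90-day) and Archive (180-day) minimums; only the rate and minimum change.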

VM Availability Options Comparison

Option SLA Protection Level Max VMs Cost Use Case
Single VM (Premium SSD) 99.9% None 1 $ Single instance, Premium SSD required
Availability Set 99.95% Rack-level 200 $ Same datacenter, fault/update domains
Availability Zone 99.99% Datacenter-level Unlimited $ Cross-datacenter, best availability
VM Scale Set 99.95% Rack-level 1,000 $ Auto-scaling, load-balanced
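
The SLA percentages above become concrete when converted into maximum allowed downtime. A quick sketch (plain arithmetic, not an Azure API) over a 730-hour month:

```python
def max_downtime_minutes(sla_percent: float, period_hours: float = 730) -> float:
    """Maximum downtime an SLA permits over a billing period (default ~1 month)."""
    return period_hours * 60 * (1 - sla_percent / 100)

print(round(max_downtime_minutes(99.9), 1))   # 43.8  -> single VM with Premium SSD
print(round(max_downtime_minutes(99.95), 1))  # 21.9  -> Availability Set
print(round(max_downtime_minutes(99.99), 1))  # 4.4   -> Availability Zones
```

This is why exam scenarios demanding "less than ~5 minutes of downtime per month" point to Availability Zones rather than Availability Sets.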

NSG Default Rules

Priority Name Direction Source Destination Port Protocol Action
65000 AllowVnetInBound Inbound VirtualNetwork VirtualNetwork Any Any Allow
65001 AllowAzureLoadBalancerInBound Inbound AzureLoadBalancer Any Any Any Allow
65500 DenyAllInBound Inbound Any Any Any Any Deny
65000 AllowVnetOutBound Outbound VirtualNetwork VirtualNetwork Any Any Allow
65001 AllowInternetOutBound Outbound Any Internet Any Any Allow
65500 DenyAllOutBound Outbound Any Any Any Any Deny
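
NSG rules are evaluated in ascending priority order and the first match wins, which is why a custom Allow at priority 300 overrides the default DenyAllInBound at 65500. A deliberately simplified sketch of that evaluation (it ignores source/destination and protocol matching, which real NSGs also check):

```python
def first_matching_rule(rules, direction, port):
    """Return the first rule that matches, checking in ascending priority order
    (100-4096 for custom rules, 65000+ for the defaults)."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if rule["direction"] == direction and rule["port"] in (port, "Any"):
            return rule
    return None

rules = [
    {"priority": 65500, "name": "DenyAllInBound", "direction": "Inbound",
     "port": "Any", "action": "Deny"},
    {"priority": 300, "name": "AllowHTTPS", "direction": "Inbound",
     "port": 443, "action": "Allow"},
]

# Port 443 hits the custom rule (priority 300) before the default deny (65500).
print(first_matching_rule(rules, "Inbound", 443)["name"])  # AllowHTTPS
# Port 22 matches nothing until the default DenyAllInBound.
print(first_matching_rule(rules, "Inbound", 22)["name"])   # DenyAllInBound
```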

Azure Load Balancer SKU Comparison

Feature Basic Standard
Backend pool size Up to 300 Up to 1,000
Backend pool endpoints Single VNet Multiple VNets, VMs, Scale Sets, Availability Sets
Health probes HTTP, TCP HTTP, HTTPS, TCP
Availability zones No Yes
SLA None 99.99%
Secure by default No Yes (closed to inbound unless NSG allows)
Cost Free Paid (per rule + data processed)
Use case Dev/test Production

SAS Token Types Comparison

Type Signed With Scope Revocation Max Expiry Use Case
User Delegation Azure AD credentials Blob/Data Lake only Via Azure AD 7 days Most secure, identity-based
Service SAS Account key Container/blob/file/queue Via stored access policy Unlimited Container-level access
Account SAS Account key Account-wide Account key regeneration only Unlimited Cross-service access

Azure Files Authentication Options

Option Identity Source Network Requirement Use Case
Storage Account Key N/A (key-based) None Administrative access only
On-premises AD DS On-premises Active Directory Network connectivity to DC Hybrid environments, existing AD
Azure AD DS Azure AD Domain Services None (managed DC in Azure) Cloud-only, managed domain
Azure AD Kerberos Azure AD (hybrid identities) None (no DC connectivity needed) Hybrid identities, no VPN required

Appendix B: Important Limits and Quotas

Virtual Machines

Resource Limit
VMs per subscription per region 25,000 (default quota, can be increased)
VM cores per subscription per region Varies by VM family (can be increased)
Data disks per VM Up to 64 (depends on VM size)
Max disk size 32 TiB (managed disk)
Max VM size 416 vCPUs, 12 TB RAM (M-series)
Availability Set fault domains 2-3 (depends on region)
Availability Set update domains 5-20 (configurable)
VMs per Availability Set 200
VMs per Scale Set 1,000

Storage

Resource Limit
Storage accounts per subscription per region 250 (default)
Max storage account capacity 5 PiB
Max blob size (block blob) 190.7 TiB
Max blob size (page blob) 8 TiB
Max file share size (Standard) 100 TiB
Max file share size (Premium) 100 TiB
Max file size 4 TiB
Snapshots per blob 200
Snapshots per file share 200
SAS token max expiry Unlimited (but short duration recommended)
User Delegation SAS max expiry 7 days

Networking

Resource Limit
VNets per subscription per region 1,000
Subnets per VNet 3,000
VNet peerings per VNet 500
Private IP addresses per VNet 65,536 (depends on address space)
Public IP addresses per subscription 1,000 (Standard SKU)
NSGs per subscription per region 5,000
NSG rules per NSG 1,000 (inbound + outbound)
Application Security Groups per subscription 3,000
Load Balancers per subscription 1,000
Load Balancer rules per Load Balancer 150

Azure Monitor

Resource Limit
Metric retention 93 days
Log Analytics workspace retention 30-730 days (configurable)
Log Analytics workspace data ingestion 10 GB/day (free tier), unlimited (paid)
Alert rules per subscription 5,000
Action groups per subscription 2,000
Notifications per action group 10

Appendix C: Common Formulas and Calculations

Subnet Calculations

Formula: Usable IPs = 2^(32 - prefix) - 5

Examples:

  • /24 subnet: 2^(32-24) - 5 = 256 - 5 = 251 usable IPs
  • /25 subnet: 2^(32-25) - 5 = 128 - 5 = 123 usable IPs
  • /26 subnet: 2^(32-26) - 5 = 64 - 5 = 59 usable IPs
  • /27 subnet: 2^(32-27) - 5 = 32 - 5 = 27 usable IPs

Azure Reserved IPs (per subnet):

  • First address (e.g., .0 in a /24): Network address
  • Second address (.1): Default gateway
  • Third and fourth addresses (.2, .3): Reserved for Azure DNS
  • Last address (e.g., .255 in a /24): Broadcast address
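
The formula and examples above can be verified with Python's standard ipaddress module; subtracting Azure's 5 reserved addresses from the total gives the usable count:

```python
import ipaddress

def usable_ips(cidr: str) -> int:
    """Usable IPs in an Azure subnet: total addresses minus the 5 Azure reserves."""
    return ipaddress.ip_network(cidr).num_addresses - 5

for cidr in ["10.0.0.0/24", "10.0.0.0/25", "10.0.0.0/26", "10.0.0.0/27"]:
    print(cidr, usable_ips(cidr))
# 10.0.0.0/24 251
# 10.0.0.0/25 123
# 10.0.0.0/26 59
# 10.0.0.0/27 27

# The reserved addresses are the first four and the last in the range:
net = ipaddress.ip_network("10.0.1.0/24")
print(net.network_address)        # 10.0.1.0   (network address)
print(net.network_address + 1)    # 10.0.1.1   (default gateway)
print(net.broadcast_address)      # 10.0.1.255 (broadcast)
```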

Cost Estimation

VM Cost = (VM size hourly rate) × (hours running) + (disk cost) + (bandwidth cost)

Storage Cost = (capacity GB) × (storage tier rate) + (operations cost) + (data transfer cost)

Example VM Cost:

  • Standard_D2s_v3: $0.096/hour
  • Running 730 hours/month: $70.08
  • Premium SSD 128 GB: $19.71/month
  • Total: ~$90/month
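
The cost formula above, applied to the example's illustrative rates (actual pricing varies by region and changes over time):

```python
def monthly_vm_cost(hourly_rate: float, hours: float, disk_cost: float,
                    bandwidth_cost: float = 0.0) -> float:
    """VM monthly cost = (hourly rate x hours running) + disk + bandwidth."""
    return hourly_rate * hours + disk_cost + bandwidth_cost

# Standard_D2s_v3 at $0.096/hour for a full 730-hour month,
# plus a $19.71/month 128 GB Premium SSD, no bandwidth charges:
print(round(monthly_vm_cost(0.096, 730, 19.71), 2))  # 89.79  (~$90/month)
```

Note the compute portion disappears while a VM is deallocated, but the disk cost does not; that distinction is a frequent exam point.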

Appendix D: Glossary

A

Availability Set: Logical grouping of VMs that distributes them across fault domains and update domains for high availability (99.95% SLA).

Availability Zone: Physically separate datacenter within an Azure region with independent power, cooling, and networking (99.99% SLA).

Azure AD (now Microsoft Entra ID): Cloud-based identity and access management service.

Azure Bastion: Secure RDP/SSH connectivity to VMs without exposing them to the internet.

B

Blob Storage: Object storage for unstructured data (files, images, videos, backups).

Burstable VM: VM type (B-series) that accumulates credits during low usage and bursts during high usage.

C

CIDR: Classless Inter-Domain Routing, notation for IP address ranges (e.g., 10.0.0.0/16).

Container: Lightweight, standalone executable package that includes everything needed to run an application.

Cool Tier: Blob storage tier for infrequently accessed data with 30-day minimum storage duration.

D

Data Disk: Additional disk attached to VM for application data (optional, up to 64 per VM).

Deallocated: VM state where compute resources are released and no compute charges apply (storage still charged).

F

Fault Domain: Group of VMs that share a common power source and network switch (rack-level isolation).

Flexible Orchestration: VM Scale Set mode that supports mixed VM types and full VM lifecycle control (recommended).

G

GRS (Geo-Redundant Storage): Storage redundancy that replicates data to a secondary region hundreds of miles away (16 nines durability).

H

Hot Tier: Blob storage tier for frequently accessed data with highest storage cost but lowest access cost.

K

KQL (Kusto Query Language): Query language used in Log Analytics to analyze log data.

L

LRS (Locally Redundant Storage): Storage redundancy that maintains 3 copies within a single zone (11 nines durability).

Log Analytics: Azure Monitor component for collecting and analyzing log data using KQL queries.

M

Managed Disk: Azure-managed storage for VM disks (recommended over unmanaged disks).

Metrics: Numerical time-series data collected by Azure Monitor (CPU %, memory, disk I/O).

N

NSG (Network Security Group): Azure firewall that filters network traffic based on rules with priorities.

NFS: Network File System protocol for Linux file shares (requires Premium storage).

O

OS Disk: Required disk containing the operating system for a VM (typically 127 GB+).

P

Premium SSD: High-performance SSD storage with low latency (<10ms) for production workloads.

Private Endpoint: Brings an Azure service into your VNet with a private IP address for complete network isolation.

R

RBAC (Role-Based Access Control): Authorization system that manages access to Azure resources based on roles.

Recovery Services Vault: Storage location for Azure Backup data with geo-redundant storage by default.

RPO (Recovery Point Objective): Maximum acceptable data loss measured in time (e.g., 5 minutes).

RTO (Recovery Time Objective): Maximum acceptable downtime measured in time (e.g., 2 hours).

S

SAS (Shared Access Signature): URI that grants restricted access to Azure Storage resources without exposing account keys.

Service Endpoint: Extends VNet identity to Azure services, keeping traffic on Microsoft backbone network.

SMB: Server Message Block protocol for Windows file shares (supports identity-based authentication).

Standard SSD: Balanced performance SSD storage for general-purpose workloads.

U

Uniform Orchestration: VM Scale Set mode where all VMs must be identical (legacy, use Flexible instead).

Update Domain: Group of VMs that are updated together during planned maintenance.

V

VNet (Virtual Network): Logically isolated network in Azure for secure communication between resources.

VNet Peering: Connects two VNets allowing resources to communicate using private IP addresses.

Z

ZRS (Zone-Redundant Storage): Storage redundancy that maintains 3 copies across 3 availability zones (12 nines durability).


Appendix E: Additional Resources

Official Microsoft Resources

Microsoft Learn:

Practice Resources:

Community Resources:

Exam Information

Exam Details:

  • Exam Code: AZ-104
  • Duration: 120 minutes (150 for non-native speakers)
  • Passing Score: 700 (out of 1000)
  • Question Types: Multiple choice, case studies, drag-and-drop, hot area
  • Cost: $165 USD (varies by region)

Renewal:

  • Required every 12 months through Microsoft Learn
  • Free renewal assessment
  • Keeps certification active

Next Steps After AZ-104:

  • AZ-305: Azure Solutions Architect Expert
  • AZ-400: Azure DevOps Engineer Expert
  • AZ-500: Azure Security Engineer Associate
  • AZ-700: Azure Network Engineer Associate

Appendix F: Exam Day Checklist

What to Bring

  • Valid government-issued ID (name must match registration)
  • Exam confirmation email/number
  • Arrive 30 minutes early

What NOT to Bring

  • Mobile phones (must be stored)
  • Watches (must be stored)
  • Bags/backpacks (must be stored)
  • Food/drinks (not allowed in testing room)
  • Notes/study materials (not allowed)

Testing Center Rules

  • No personal items in testing room
  • Provided: Scratch paper/whiteboard and marker
  • Breaks: Allowed but time continues
  • Restroom: Before exam starts (time doesn't stop during exam)

Online Proctored Exam (if applicable)

  • Quiet, private room
  • Stable internet connection
  • Webcam and microphone
  • Clear desk (no papers, books, devices)
  • Government-issued ID ready
  • Check system requirements before exam day

Final Words

You're Ready When...

  • You score 75%+ on all practice tests
  • You can explain key concepts without notes
  • You recognize question patterns instantly
  • You make decisions quickly using frameworks
  • You understand WHY, not just WHAT

Remember

  • Trust your preparation: You've studied thoroughly
  • Read questions carefully: Don't rush, understand what's being asked
  • Manage your time: 2-3 minutes per question
  • Don't overthink: Your first instinct is usually correct
  • Stay calm: Take deep breaths if you feel stressed

Confidence Builders

✅ You've completed a comprehensive study guide
✅ You understand all five exam domains
✅ You've practiced with realistic questions
✅ You know the decision frameworks
✅ You're prepared for success

Good luck on your AZ-104 exam! You've got this! 🎯