Complete Exam Preparation Guide
Complete Learning Path for Certification Success
This comprehensive study guide provides a structured learning path from fundamentals to exam readiness for the Microsoft Power BI Data Analyst Associate certification (PL-300). Designed for complete novices, it teaches all concepts progressively while focusing exclusively on exam-relevant content. Extensive diagrams and visual aids are integrated throughout to enhance understanding and retention.
What makes this guide different:
Study Sections (in order):
Total Time: 6-10 weeks (2-3 hours per day)
Weeks 1-8: Core Content
Week 9: Integration & Practice
Week 10: Final Preparation
The 4-Step Method:
Visual Learning:
Use checkboxes to track completion:
Sequential Study (Recommended):
Topic-Focused Study (If you have some experience):
Quick Review (Final week):
PL-300 Certification Details:
Domain Breakdown:
Included with this guide (in ):
Difficulty-Based (6 bundles):
Full Practice Tests (3 bundles):
Domain-Focused (8 bundles):
Service-Focused (5 bundles):
Target Scores:
Effective Learning:
Common Pitfalls to Avoid:
Do This Instead:
What you need before starting:
If you're missing prerequisites:
Before You Start:
During Study:
After Each Chapter:
Official Microsoft Resources:
Practice Environments:
Support:
You're ready for the exam when:
Start here:
Remember:
Ready to begin? Turn to 01_fundamentals to start your certification journey!
This certification assumes you understand basic data concepts. This chapter will build the essential foundation you need for Power BI Data Analyst certification success.
Prerequisites checklist:
If you're missing any: Don't worry! This chapter will explain everything from first principles.
What it is: Business Intelligence is the process of transforming raw data into meaningful insights that help organizations make better decisions.
Why it matters: Organizations collect massive amounts of data every day (sales transactions, customer interactions, inventory movements, website clicks, etc.). Without BI, this data is just numbers in spreadsheets or databases. BI tools like Power BI turn that raw data into visual dashboards, reports, and analytics that reveal patterns, trends, and opportunities.
Real-world analogy: Think of raw data as ingredients in a kitchen. Business Intelligence is like having a skilled chef who knows how to combine those ingredients into delicious meals (insights) that people can actually use and enjoy. Just as a chef transforms flour, eggs, and sugar into a cake, BI transforms rows of numbers into actionable insights.
Why Power BI specifically: Power BI is Microsoft's BI platform that allows you to:
What it is: Data is information stored in a structured format, typically organized into tables with rows and columns.
Why it exists: Organizations need to track and record information to operate effectively. Every business transaction, customer interaction, product sale, or website visit generates data that can provide insights.
Key data concepts you must understand:
Tables: A collection of related data organized into rows and columns
Rows (Records): Each row represents a single item or transaction
Columns (Fields): Each column represents a specific attribute or property
Data Types: The kind of information stored in each column
Real-world example:
Imagine a sales table:
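For illustration, a tiny hypothetical sales table might look like this:

| OrderID | OrderDate | Customer | Product | Quantity | Amount |
|---|---|---|---|---|---|
| 1001 | 2024-01-15 | Contoso Ltd | Laptop | 2 | 2400.00 |
| 1002 | 2024-01-16 | Fabrikam Inc | Monitor | 5 | 1250.00 |

Each row is one order (a record), each column is an attribute (a field), and each column holds a single data type (whole number, date, text, decimal).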
What it is: Power BI is Microsoft's business analytics platform that allows you to connect to data, transform it, build data models, create visualizations, and share insights.
Why it exists: Before Power BI, business analysts needed multiple tools: databases for storage, Excel for analysis, and presentation software for reports. Power BI combines all these capabilities into one integrated platform. It democratizes data analysis - you don't need to be a data scientist or programmer to create powerful analytics.
The Power BI ecosystem consists of three main components:
Power BI Desktop (Windows application, FREE):
Power BI Service (Cloud platform, powerbi.com):
Power BI Mobile (iOS, Android, Windows apps):
How they work together:
What it is: The typical process of creating a Power BI solution follows a consistent pattern: Connect → Transform → Model → Visualize → Share.
Why this order matters: Each step builds on the previous one. You can't visualize data you haven't connected to. You can't build accurate reports with messy, untransformed data. Understanding this workflow helps you approach any BI problem systematically.
The 5-step workflow explained:
Step 1: Connect to Data (Prepare the Data - Domain 1)
Step 2: Transform & Clean Data (Prepare the Data - Domain 1)
Step 3: Model the Data (Model the Data - Domain 2)
Step 4: Visualize & Analyze (Visualize the Data - Domain 3)
Step 5: Manage & Share (Manage and Secure - Domain 4)
📊 Power BI Workflow Diagram:
graph LR
A[1. Connect to Data] --> B[2. Transform & Clean]
B --> C[3. Model the Data]
C --> D[4. Visualize & Analyze]
D --> E[5. Manage & Share]
E -.Iterate.-> A
style A fill:#e1f5fe
style B fill:#fff3e0
style C fill:#f3e5f5
style D fill:#e8f5e9
style E fill:#fce4ec
See: diagrams/01_fundamentals_workflow.mmd
Diagram Explanation:
This diagram shows the five sequential steps of the Power BI workflow. Step 1 (light blue) is where you establish connections to your data sources - this could be databases, files, or cloud services. Step 2 (orange) is the transformation phase in Power Query where you clean and shape the data. Step 3 (purple) is where you build the data model, creating relationships and calculations. Step 4 (green) is visualization where you create charts and reports. Step 5 (pink) is publishing and sharing with stakeholders. The dotted line back to Step 1 shows that this is an iterative process - as requirements change or new data sources are added, you may need to revisit earlier steps. Understanding this flow is critical because the PL-300 exam tests your ability to work through this entire pipeline.
What they are: Power BI offers different ways to connect to data, each with trade-offs between performance, data freshness, and resource usage.
Why they exist: Different business scenarios have different requirements. Some need lightning-fast dashboards with slightly older data (Import). Others need real-time data but can tolerate slower visuals (DirectQuery). Some need to leverage existing enterprise models (Live Connection). Power BI provides flexibility to choose the right approach.
Real-world analogy:
What it is: Import mode copies data from the source into Power BI's internal in-memory database (called VertiPaq). All your data is stored locally in the .pbix file and in the Power BI service after publishing.
Why it exists: Import mode provides the fastest possible performance because all data is stored in Power BI's highly optimized columnar compression engine. Queries don't need to go back to the source - everything is in memory.
How it works (Detailed step-by-step):
Initial Connection: You connect to a data source (SQL Server, Excel file, etc.) and select tables/queries to import.
Data Transform: In Power Query, you can apply transformations (filter rows, change types, merge tables). These transformations define the data extraction logic.
Data Load: Power BI executes the Power Query logic, extracts data from source, and compresses it into VertiPaq columnar format. A 100MB Excel file might compress to 10MB in Power BI!
Storage: The compressed data is stored inside the .pbix file (Desktop) or in Power BI service (after publishing). This becomes your "semantic model."
Query Execution: When you create a visual, Power BI queries the in-memory data at lightning speed (milliseconds). No network calls to the source.
Refresh: Data becomes stale over time. You must manually refresh in Desktop or schedule automatic refresh in Service (up to 8x daily with Pro, 48x with Premium).
Detailed Example 1: Sales Data Import
You have a SQL Server database with 1 million sales transactions. You connect using Import mode, and a Power Query filter keeps only the last 2 years. Power BI loads those 500,000 rows, compresses them from 200MB to 20MB using VertiPaq compression, and stores them in your .pbix file. Now when users view sales dashboards, visuals render in milliseconds because all data is in memory. However, yesterday's sales won't appear until you refresh the dataset. You schedule a daily refresh at 6 AM, so reports always show data through yesterday.
Detailed Example 2: Excel File Import
You maintain a product catalog in Excel with 1,000 products. You import this into Power BI. The entire Excel table is copied into Power BI's data model. When you build a product slicer, it loads instantly because those 1,000 products are in memory. If you update the Excel file (add new products), Power BI Desktop won't see them until you click "Refresh" on the Home ribbon, which re-imports the Excel data.
Detailed Example 3: Multiple Source Import
You import data from SQL Server (sales transactions), SharePoint (customer feedback), and Azure Blob Storage (product images). All three sources are imported and stored in a single Power BI semantic model. Visuals can combine data from all three sources instantly because everything is in the same in-memory model. If any source updates, you need to refresh the entire model to see changes.
⭐ Must Know (Critical Facts):
When to use (Comprehensive):
Limitations & Constraints:
💡 Tips for Understanding:
⚠️ Common Mistakes & Misconceptions:
Mistake 1: "Import mode means my reports update automatically when source data changes"
Mistake 2: "My .pbix file is huge, but I'm importing small tables"
Mistake 3: "I need real-time data, so I'll just refresh every minute"
🔗 Connections to Other Topics:
Troubleshooting Common Issues:
What it is: DirectQuery mode establishes a live connection to the source database without importing any data. Every time a visual refreshes, Power BI sends a query to the underlying data source to retrieve current data.
Why it exists: Some scenarios require up-to-the-minute data (stock trading dashboards, real-time IoT monitoring, operational reports). Other scenarios have data too large to import (multi-terabyte data warehouses). DirectQuery keeps data at the source and queries it on-demand, ensuring you always see the latest data without importing anything.
Real-world analogy: Streaming music from Spotify vs downloading songs. With DirectQuery (streaming), you always hear the latest version of a song, but it requires internet connection and can buffer if connection is slow. With Import (downloading), it plays instantly but might be an older version.
How it works (Detailed step-by-step):
Connection Establishment: You connect to a DirectQuery-supported source (SQL Server, Azure SQL Database, Azure Synapse, etc.) and select DirectQuery mode in the connection dialog.
Schema Import (not data): Power BI imports only the metadata (table names, column names, data types) - no actual data rows. The Data pane in Desktop shows table/column structure but contains zero data.
Visual Creation: When you add a visual (e.g., bar chart of Sales by Region), Power BI doesn't have local data to display.
Query Generation: Power BI translates your visual into native SQL (or other query language) and sends it to the source. For example, a "Sales by Region" visual becomes: SELECT Region, SUM(SalesAmount) FROM Sales GROUP BY Region
Source Execution: The database runs the query, performs aggregations, and returns only the aggregated results (not raw data).
Visual Rendering: Power BI receives the query results and renders the visual. This process happens every time the visual refreshes or filter changes.
Query Caching: Power BI caches query results briefly (configurable, default 1 hour) to avoid re-querying for identical requests.
Detailed Example 1: Real-Time Sales Dashboard
Your organization has a SQL Server database with live sales data updated every second as transactions occur. You build a DirectQuery report showing the current day's sales. When a manager opens the dashboard at 10:00 AM, Power BI sends a query like SELECT SUM(Amount) FROM Sales WHERE Date = CAST(GETDATE() AS date) to SQL Server, retrieves the result, and displays it. At 10:30 AM, when they refresh the visual, Power BI sends the same query again, now returning 30 minutes' worth of additional sales. The manager always sees the absolute latest data without any refresh scheduled in Power BI - the source is the truth.
Detailed Example 2: Large Data Warehouse
You have a 5 TB Azure Synapse data warehouse that far exceeds Power BI's import limits. Using DirectQuery, you can build reports without importing anything. When users slice by Year and Product Category, Power BI generates a query: SELECT Year, ProductCategory, SUM(Revenue) FROM FactSales GROUP BY Year, ProductCategory and sends it to Synapse. Synapse's powerful compute processes the query across billions of rows and returns just the aggregated summary. Power BI displays it - only a few KB of results, not terabytes of raw data.
Detailed Example 3: Security-Sensitive Data
Healthcare data must remain in a HIPAA-compliant database and cannot be exported. Using DirectQuery, Power BI users can analyze patient data without the data ever leaving the secure database. When they filter to a specific patient ID, Power BI sends a WHERE clause to the database. The database applies its row-level security rules and returns only authorized records, which Power BI visualizes. Data stays secure at the source.
⭐ Must Know (Critical Facts):
When to use (Comprehensive):
Limitations & Constraints:
💡 Tips for Understanding:
⚠️ Common Mistakes & Misconceptions:
Mistake 1: "DirectQuery means I don't need to worry about data modeling"
Mistake 2: "My DirectQuery report is slow, so I'll add more visuals"
Mistake 3: "I'll just use DirectQuery for everything - why import?"
🔗 Connections to Other Topics:
Troubleshooting Common Issues:
Issue 1: "Visuals take 30+ seconds to load"
Issue 2: "DAX measure works in Import but fails in DirectQuery"
What it is: Live Connection creates a direct link to an existing Power BI semantic model (dataset) published in Power BI Service or an Analysis Services model. Unlike Import or DirectQuery which connect to raw data sources, Live Connection connects to an already-built data model.
Why it exists: Organizations invest significant effort in creating certified, governed enterprise data models (semantic models). Instead of every analyst rebuilding the same model, they can connect to the centralized model. This ensures consistency (everyone uses same calculations), reduces duplication (one model, many reports), and leverages IT-managed data quality and security.
How it works (Detailed step-by-step):
Existing Model: An enterprise semantic model already exists in Power BI Service (published by IT/BI team) or Analysis Services server, containing cleaned data, relationships, measures, and security rules.
Connection: You create a new Power BI report and choose "Power BI semantic models" or "Analysis Services" as data source instead of SQL/Excel/etc.
Model Reference: Power BI Desktop connects to the published model and displays its structure (tables, fields, measures) in the Data pane. Zero data is copied to your local machine - just metadata.
Report Building: You create visuals using the connected model's fields and measures. All calculations (DAX measures) execute in the remote model.
Security Enforcement: The published model's row-level security (RLS) automatically applies. If the model restricts you to "Western Region" data, you only see Western Region in your visuals.
Query Execution: When you build a visual, Power BI sends the visual definition to the Service/Analysis Services, which queries its model and returns aggregated results. Similar to DirectQuery, but connecting to a model instead of raw tables.
Publishing: When you publish your report to Service, it remains connected to the source semantic model. Changes to the source model (new measures, refreshed data) automatically reflect in your report.
Detailed Example 1: Enterprise Sales Model
Your organization's BI team publishes a certified "Corporate Sales" semantic model to Power BI Service with 3 years of sales data, 50+ DAX measures, and row-level security. As a regional analyst, you create a Live Connection to this model. You build a report showing your region's performance using the model's "Total Sales" and "Sales Growth %" measures. The model's RLS automatically filters data to your region only. When the BI team adds a new "Customer Lifetime Value" measure to the model, it automatically appears in your report's field list. When they refresh the model's data, your report shows updated data without you doing anything.
Detailed Example 2: Analysis Services Connection
Your company runs SQL Server Analysis Services (on-premises tabular model) with financial data. You connect Power BI Desktop to SSAS using Live Connection. The model contains complex financial calculations built by finance team. You create executive dashboards using these pre-built measures without needing to understand the underlying DAX. When finance recalculates budgets in SSAS, your Power BI reports reflect updates immediately because they're querying the live model.
Detailed Example 3: Shared Dataset Across Teams
The Marketing team publishes a Power BI semantic model with customer segmentation, campaign performance, and attribution models. The Sales, Product, and Executive teams each create separate Power BI reports with a Live Connection to this marketing model. All teams use the same definitions of "Customer Lifetime Value" and "Campaign ROI", ensuring consistent metrics across the organization. The Marketing team owns the model and refreshes it daily, while the other teams simply consume the data through their specialized reports.
⭐ Must Know (Critical Facts):
When to use (Comprehensive):
Limitations & Constraints:
💡 Tips for Understanding:
⚠️ Common Mistakes & Misconceptions:
Mistake 1: "I'll connect live and import some additional tables from Excel"
Mistake 2: "I need to refresh my Live Connection report's data"
Mistake 3: "I'll build a Live Connection, then disconnect and import the data"
🔗 Connections to Other Topics:
Troubleshooting Common Issues:
Issue 1: "Can't see all fields from source model"
Issue 2: "My Live Connection report shows different data than colleague's"
Let's compare all three modes side-by-side to help you choose the right one:
| Feature | Import | DirectQuery | Live Connection |
|---|---|---|---|
| Data Storage | Copied to Power BI (VertiPaq) | Remains at source | Remains in source model |
| Query Performance | ⚡ Fastest (in-memory) | ⏱️ Depends on source | ⏱️ Depends on source model |
| Data Freshness | As of last refresh | 🔴 Real-time | As of source model refresh |
| Size Limits | 1 GB (Pro), larger (Premium) | ✅ No limit | ✅ No limit |
| Offline Access | ✅ Yes (Desktop) | ❌ No | ❌ No |
| DAX Support | ✅ All functions | ⚠️ Limited | ✅ Depends on source model |
| Power Query | ✅ Full functionality | ⚠️ Limited (query folding) | ❌ Not available |
| Data Modeling | ✅ Full control | ✅ Full control | ❌ Read-only |
| Security | Defined in model | ✅ Source enforces | ✅ Inherited from model |
| Typical Use Cases | Standard BI reports | Real-time dashboards | Enterprise governed reports |
| Refresh Required | ✅ Yes (scheduled) | ❌ No (always live) | Source model refreshes |
| Best For | Performance-critical, complex calculations | Large data, real-time needs | Leveraging existing models |
📊 Storage Modes Decision Tree:
graph TD
A[Choose Storage Mode] --> B{Existing certified <br/>model available?}
B -->|Yes| C[Live Connection]
B -->|No| D{Data size manageable<br/> and real-time<br/> not required?}
D -->|Yes| E[Import Mode]
D -->|No| F{Need real-time<br/> or data too large?}
F -->|Yes| G[DirectQuery]
F -->|No| H[Consider Composite/<br/>Hybrid Models]
C --> C1[✅ Use existing model<br/>Consistent definitions<br/>RLS inherited]
E --> E1[✅ Best performance<br/>Full DAX support<br/>All transformations]
G --> G1[✅ Real-time data<br/>No size limit<br/>Secure at source]
H --> H1[✅ Mix modes<br/>Import aggregations<br/>DirectQuery details]
style C fill:#c8e6c9
style E fill:#c8e6c9
style G fill:#c8e6c9
style H fill:#fff3e0
See: diagrams/01_fundamentals_storage_modes_decision.mmd
Decision Tree Explanation:
This diagram helps you choose the right storage mode for your scenario. Start at the top by asking if a certified model already exists in your organization - if yes, use Live Connection to leverage it. If building from scratch, next consider if your data size is manageable (under 1 GB for Pro) and you don't need real-time updates - if yes, Import mode gives best performance. If data is too large or you need real-time, DirectQuery is your choice. For complex scenarios (like needing both performance and some real-time data), explore Composite or Hybrid models which combine multiple modes. The green boxes indicate recommended modes, orange indicates advanced hybrid approaches.
What it is: Power Query is the data transformation engine built into Power BI Desktop (also available in Excel). It provides a visual interface to connect to data sources, clean, reshape, and transform data before loading it into the Power BI data model.
Why it exists: Raw data is rarely analysis-ready. It has formatting issues, missing values, wrong data types, unnecessary columns, and inconsistent structures. Power Query solves this by providing a code-free (or low-code) way to prepare data. Instead of writing SQL or Python scripts, you use a visual interface to apply transformations.
Real-world analogy: Think of Power Query as a food processor for data. Just as a food processor chops, blends, and mixes ingredients into usable form for cooking, Power Query cleans, transforms, and reshapes messy data into analysis-ready structure for reporting.
The Power Query Interface:
How transformations work:
What it is: M (also called Power Query Formula Language) is the programming language that records every transformation you make in Power Query. When you click "Remove Duplicates" or "Split Column," Power BI writes M code behind the scenes.
Why it exists: While the visual interface handles 90% of transformation needs, M language provides unlimited flexibility. You can write custom logic, create functions, handle complex scenarios, and automate repetitive tasks. Even if you never write M manually, understanding it helps troubleshoot and optimize queries.
Real-world example of M code:
When you filter a table to show only rows where Country = "USA", Power Query writes:
= Table.SelectRows(PreviousStep, each [Country] = "USA")
When you change a column data type to Date, it writes:
= Table.TransformColumnTypes(#"Previous Step", {{"OrderDate", type date}})
⭐ Must Know (Critical M Concepts):
= Table.SelectRows(...) // This step filters USA only
💡 Tip: Click "View" ribbon → "Advanced Editor" to see all M code for a query. Great for learning and debugging!
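For reference, a complete query in the Advanced Editor follows the let ... in pattern, where each applied step is a named expression that feeds the next. A minimal sketch (the file path and column names below are illustrative, not from this guide):

let
    Source = Csv.Document(File.Contents("C:\Data\Orders.csv"), [Delimiter = ",", Encoding = 65001]),
    // Promote the first row to column headers
    PromotedHeaders = Table.PromoteHeaders(Source, [PromoteAllScalars = true]),
    // Declare data types so bad values surface as errors early
    ChangedTypes = Table.TransformColumnTypes(PromotedHeaders, {{"OrderDate", type date}, {"Amount", type number}}),
    // Each step references the previous one; the expression after "in" is what loads to the model
    FilteredUSA = Table.SelectRows(ChangedTypes, each [Country] = "USA")
in
    FilteredUSA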
What it is: Star schema is a data warehouse design methodology where data is organized into fact tables (containing measurements/transactions) and dimension tables (containing descriptive attributes). When visualized, the design looks like a star - fact table in the center, dimension tables radiating outward.
Why it exists: Star schema optimizes query performance and simplifies reporting. It separates what happened (facts) from who/what/when/where/why (dimensions). This structure is proven over decades to provide fast query times, easy-to-understand models, and efficient storage.
Real-world analogy: Think of a star schema like a receipt system. The receipt itself (fact) records the transaction: items bought, quantities, prices, total amount. The receipt references customers (who), products (what), stores (where), and dates (when) - these are dimensions. The receipt doesn't repeat customer's full address or product description every time - it just references them.
Fact Tables (The Center of the Star):
Dimension Tables (The Points of the Star):
📊 Star Schema Example Diagram:
graph TB
subgraph "Dimension Tables"
D1[Date Dimension<br/>DateKey<br/>Date<br/>Year<br/>Quarter<br/>Month<br/>Day]
D2[Customer Dimension<br/>CustomerKey<br/>Name<br/>City<br/>Country<br/>Segment]
D3[Product Dimension<br/>ProductKey<br/>Name<br/>Category<br/>Subcategory<br/>Price]
D4[Store Dimension<br/>StoreKey<br/>StoreName<br/>City<br/>Region<br/>Manager]
end
subgraph "Fact Table - Center of Star"
F[Sales Fact Table<br/>SalesKey<br/>DateKey FK<br/>CustomerKey FK<br/>ProductKey FK<br/>StoreKey FK<br/>Quantity<br/>SalesAmount<br/>CostAmount]
end
D1 -.1:Many.-> F
D2 -.1:Many.-> F
D3 -.1:Many.-> F
D4 -.1:Many.-> F
style F fill:#fff3e0
style D1 fill:#e1f5fe
style D2 fill:#e1f5fe
style D3 fill:#e1f5fe
style D4 fill:#e1f5fe
See: diagrams/01_fundamentals_star_schema.mmd
Diagram Explanation:
This star schema diagram shows the classic data warehouse pattern used in Power BI. The central orange box is the Fact Table (Sales) containing the quantitative data - actual sales amounts, quantities, costs. Each sale is one row. The blue boxes are Dimension Tables providing context: Date (when the sale happened), Customer (who bought), Product (what was bought), and Store (where it was sold). The dotted arrows show one-to-many relationships: one Date can have many Sales, one Customer can have many Sales, etc. Power BI uses these relationships to filter fact data when you slice by dimensions. For example, if you filter to "Electronics" in Product dimension, Power BI automatically filters the Sales fact table to show only Electronics sales. This structure is the foundation of efficient Power BI models.
Key Relationships in Star Schema:
This one-to-many pattern is the foundation of Power BI relationships and will be covered in detail in Domain 2.
What it is: DAX (Data Analysis Expressions) is Power BI's formula language for creating calculations, measures, and calculated columns. It looks similar to Excel formulas but is far more powerful for working with relational data models.
Why it exists: While Power Query prepares and shapes data, DAX analyzes it. You need DAX to create business metrics (Total Sales, Profit Margin %), time-based calculations (Year-to-Date, Prior Year), and complex analytical logic. DAX is what turns your data model into actionable insights.
Key DAX Concepts:
⭐ Must Know: Measures are computed at query time and change based on slicers/filters. Calculated columns are computed at refresh time and stored in the model. Use measures for aggregations (SUM, AVERAGE, COUNT), calculated columns for row-level logic.
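A minimal DAX sketch of the difference, assuming a hypothetical Sales table with Quantity and Unit Price columns:

-- Calculated column: evaluated row by row at refresh time and stored in the table
Line Revenue = Sales[Quantity] * Sales[Unit Price]

-- Measure: evaluated at query time, so it responds to slicers and filters
Total Revenue = SUM ( Sales[Line Revenue] )

-- Iterator measure: same result without storing the calculated column at all
Total Revenue (Iterator) = SUMX ( Sales, Sales[Quantity] * Sales[Unit Price] )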
| Term | Definition | Power BI Context |
|---|---|---|
| Semantic Model | Collection of tables, relationships, and calculations | Previously called "Dataset" - your data model in Power BI |
| Fact Table | Table containing measurable, quantitative data | Sales transactions, web clicks, financial records |
| Dimension Table | Table containing descriptive attributes | Customers, products, dates, locations |
| Relationship | Connection between tables based on matching columns | Links dimension to fact tables for filtering |
| Cardinality | Type of relationship (1:1, 1:Many, Many:Many) | Defines how rows relate between tables |
| Filter Context | Set of filters applied to a calculation | Determines what data a measure aggregates |
| Row Context | Current row being evaluated | Used in calculated columns and iterator functions |
| Measure | Dynamic calculation aggregating data | Total Sales, Average Price, Customer Count |
| Calculated Column | Row-level calculation creating new column | Full Name = First Name & Last Name |
| Query Folding | Power Query steps translating to source queries | Improves performance by pushing work to database |
| VertiPaq | Power BI's columnar compression engine | Stores imported data in highly compressed format |
Power BI follows a clear pipeline from data to insights:
Key Principle: Data flows one direction through this pipeline. Quality issues should be fixed at the earliest stage - clean in Power Query, not with DAX workarounds.
Before moving to Domain 1, ensure you can answer these questions:
If you answered "no" to any question, review that section before proceeding.
Scenario: You're building a sales analytics solution for a retail company.
Questions:
Answers:
Next Steps: Proceed to 02_domain1_prepare_data to learn Power Query data transformation in depth.
What you'll learn:
Time to complete: 10-12 hours
Prerequisites: Chapter 0 (Fundamentals)
Exam weight: 25-30% (approximately 13-15 questions)
Why this domain matters: Data preparation is the foundation of every Power BI solution. Poor data quality or inefficient transformations lead to inaccurate reports and slow performance. This domain tests your ability to get data from various sources, identify and fix quality issues, and shape data for optimal analysis.
The problem: Business data exists in dozens of formats and locations - SQL databases, Excel files, cloud services, web APIs, SharePoint lists. Each source has different connection methods, authentication requirements, and performance characteristics.
The solution: Power BI provides 100+ data connectors with unified Power Query interface. Understanding connection modes, credentials management, and when to use parameters enables flexible, maintainable solutions.
Why it's tested: The exam verifies you can select appropriate data sources, configure connections securely, and choose the right storage mode for each scenario.
Data source connections are the entry points that allow Power BI to access external data. Each connection type has specific configuration options for authentication, privacy, and refresh capabilities.
Organizations store data across multiple systems - on-premises databases, cloud services, files, APIs. Power BI needs a standardized way to connect to these diverse sources while maintaining security and enabling refresh schedules.
Think of data connections like different keys on a keyring. Each key (connector) is designed for a specific lock (data source). Some keys are simple (file paths), others require special permissions (OAuth tokens), and some need security codes (database credentials). Power BI's connector library is your master keyring.
User initiates connection (Get Data): You select the connector type from 100+ available options. Power BI loads the appropriate connector driver and displays the connection dialog specific to that source type. For example, SQL Server shows server/database fields, while Excel shows file browser.
Configure connection parameters: You provide source-specific information like server address, file path, or API endpoint. Power BI validates the format and availability of the source. For parameterized connections, you can use Power Query parameters to make connections dynamic.
Authentication occurs: Power BI prompts for credentials based on the source type. Options include Windows authentication, database credentials, OAuth tokens, API keys, or anonymous access. Credentials are encrypted and stored separately from the report file for security.
Privacy levels evaluated: Power BI checks privacy level settings (Private, Organizational, Public) to prevent accidental data leakage when combining sources. If privacy levels conflict, you'll get a firewall error that must be resolved before data flows.
Data preview loads: Power Query connects to the source and retrieves sample data (typically first 1000 rows). You see the Navigator window showing available tables, views, or files. This preview uses minimal data transfer to keep response fast.
Connection finalized: Once you select tables and click "Transform Data" or "Load", Power BI creates a query object with connection metadata. This includes the M code defining the connection, which you can view and edit in Advanced Editor.
📊 Data Connection Flow Diagram:
graph TB
Start[User: Get Data] --> SelectConnector[Select Connector Type]
SelectConnector --> ConfigParams[Configure Parameters<br/>Server, File Path, URL]
ConfigParams --> Auth{Authentication<br/>Required?}
Auth -->|Yes| ProvideCredentials[Provide Credentials<br/>Windows/Database/OAuth/API Key]
Auth -->|No| AnonymousAccess[Anonymous Access]
ProvideCredentials --> Privacy[Set Privacy Levels<br/>Private/Organizational/Public]
AnonymousAccess --> Privacy
Privacy --> ValidatePrivacy{Privacy<br/>Compatible?}
ValidatePrivacy -->|No| PrivacyError[Firewall Error<br/>Adjust Privacy Levels]
ValidatePrivacy -->|Yes| Preview[Load Data Preview<br/>Navigator Window]
PrivacyError --> Privacy
Preview --> SelectTables[Select Tables/Objects]
SelectTables --> Choice{Transform or Load?}
Choice -->|Transform| PowerQuery[Open Power Query Editor<br/>Create Query with M Code]
Choice -->|Load| DirectLoad[Load to Model<br/>Skip Transformations]
PowerQuery --> Complete[Connection Complete]
DirectLoad --> Complete
style Start fill:#e1f5fe
style Complete fill:#c8e6c9
style PrivacyError fill:#ffebee
style PowerQuery fill:#f3e5f5
style Auth fill:#fff3e0
See: diagrams/02_domain1_connection_flow.mmd
Diagram Explanation (Detailed):
This flowchart illustrates the complete data connection process in Power BI Desktop. The journey begins when a user clicks "Get Data" (blue start node) and selects from 100+ available connectors. The flow then branches based on whether authentication is required - some sources like public web pages allow anonymous access, while databases and cloud services require credentials. The orange authentication decision diamond is critical because it determines the security model. After authentication, privacy level assignment (organizational data policy compliance) occurs. If privacy levels are incompatible between combined sources, a firewall error (red node) forces you to adjust settings - this prevents accidentally mixing private and public data. Once privacy validates, the Navigator preview window loads sample data. Users then face another choice: transform data in Power Query (purple node) for cleaning and shaping, or load directly for simple scenarios. The green completion node indicates a successful connection with query object created. This entire flow executes in seconds but understanding each step prevents common connection errors.
Detailed Example 1: Connecting to SQL Server Database
Your company stores sales data in SQL Server 2019 on server "SALES-DB-01", database "AdventureWorks". Here's the step-by-step process:
Source = Sql.Database("SALES-DB-01", "AdventureWorks")Why this works: Windows authentication leverages your domain credentials (no password storage needed). Organizational privacy level allows combining with other internal sources. The M code Sql.Database() function creates a query-folding capable connection meaning transformations can push down to SQL Server for better performance.
Detailed Example 2: Connecting to Excel File with Parameters
You have monthly sales files named "Sales_YYYY_MM.xlsx" in SharePoint folder, and need flexible file selection:
Hard-coded: Source = Excel.Workbook(File.Contents("C:\SharePoint\Sales_2024_01.xlsx"))
Parameterized: Source = Excel.Workbook(File.Contents("C:\SharePoint\Sales_" & SelectedMonth & ".xlsx"))
Why this works: Parameters make connections dynamic. The M expression concatenates the parameter value into the file path. When SelectedMonth changes from "2024_01" to "2024_02", Power Query automatically connects to a different file. This avoids creating a separate query for each month's file.
Detailed Example 3: Web API with OAuth Authentication
Connecting to Salesforce API to extract CRM data requires OAuth token-based authentication:
Why this works: OAuth is more secure than embedding passwords - tokens expire and can be revoked. Power BI's Web connector automatically handles token refresh using refresh tokens. The JSON-to-table conversion happens in Power Query where you expand nested fields into columns. This pattern works for any OAuth-enabled API (Microsoft Graph, Google Analytics, etc.).
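A minimal M sketch of this pattern with a hypothetical endpoint and field names (the OAuth credential itself is entered in the data source settings dialog, not written in the query):

let
    // Hypothetical REST endpoint; Power BI prompts separately for OAuth/organizational credentials
    Source = Json.Document(Web.Contents("https://api.example.com/v1/opportunities")),
    // Assume the response is a JSON array of records; each record becomes one row
    AsTable = Table.FromList(Source, Splitter.SplitByNothing(), {"Record"}),
    // Expand nested JSON fields into regular columns
    Expanded = Table.ExpandRecordColumn(AsTable, "Record", {"Id", "Name", "Amount"}, {"Id", "Name", "Amount"})
in
    Expanded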
⭐ Must Know (Critical Facts):
When to use (Comprehensive):
Limitations & Constraints:
💡 Tips for Understanding:
⚠️ Common Mistakes & Misconceptions:
🔗 Connections to Other Topics:
Storage modes determine where data physically resides and how Power BI accesses it. Import stores data in compressed VertiPaq engine. DirectQuery leaves data at source and queries on-demand. Live Connection uses another Power BI dataset or Analysis Services model.
Different business scenarios have conflicting requirements - some need blazing fast performance (Import), others need up-to-the-second data (DirectQuery), while some need centralized governance (Live Connection). Storage modes let you choose the right trade-off for your specific situation.
Think of a library: Import is like checking out books and taking them home (fast access, but you have a copy that might get outdated). DirectQuery is like going to the library every time you need to read (always current, but slower and depends on library being open). Live Connection is like using another library's online catalog that they maintain (they manage the collection, you just search it).
Import Mode Process:
DirectQuery Mode Process:
Live Connection Process:
📊 Storage Mode Comparison Diagram:
graph TB
subgraph "Import Mode"
I1[Data Source] -->|Extract All Data| I2[Power Query Transforms]
I2 -->|Load| I3[VertiPaq Engine<br/>Compressed In-Memory Storage]
I3 -->|DAX Query<br/>Milliseconds| I4[Visual Renders]
I5[Scheduled Refresh] -.->|Update Data<br/>8x Daily Max Pro| I3
I3 -.->|10:1 Compression| I6[1GB Limit Pro<br/>10GB Premium]
end
subgraph "DirectQuery Mode"
D1[Data Source] -->|Metadata Only| D2[Power BI Schema]
D3[User Interaction] -->|Generate SQL| D2
D2 -->|Execute Query| D1
D1 -->|Result Set<br/>1M Row Limit| D4[Visual Renders]
D4 -.->|Cache 1hr| D5[Query Cache]
D5 -.-> D4
end
subgraph "Live Connection"
L1[Published Dataset<br/>or Analysis Services] -->|Connection| L2[Power BI Report]
L3[User Interaction] -->|DAX Query| L1
L1 -->|Result| L4[Visual Renders]
L5[Source Model Refresh] -.->|Updates All Reports| L1
end
style I3 fill:#c8e6c9
style D1 fill:#fff3e0
style L1 fill:#e1f5fe
style I6 fill:#ffebee
See: diagrams/02_domain1_storage_modes.mmd
Diagram Explanation (Detailed):
This diagram contrasts the three storage modes' data flow and architecture. In Import Mode (top, green VertiPaq engine), data flows from source through Power Query transformations into compressed in-memory storage. The VertiPaq engine achieves 10:1 compression using dictionary encoding, but this introduces a 1GB size limit on Pro licenses (10GB on Premium). DAX queries execute in milliseconds against this in-memory data. The dotted arrow shows scheduled refresh updating the dataset up to 8 times daily on Pro. In DirectQuery Mode (middle, orange data source), only metadata (table/column schema) gets stored. Each user interaction generates a native SQL query sent to the source database, which returns results limited to 1 million rows. A 1-hour query cache (dotted) reduces source load. In Live Connection (bottom, blue published dataset), the report connects to an existing Power BI dataset or Analysis Services model. DAX queries forward to the source model's engine, and the source model's refresh schedule updates all connected reports simultaneously. This enables centralized governance where one model serves many reports.
Detailed Example 1: Import Mode for Sales Dashboard
Your retail company has 3 years of sales history (500K transactions, 800MB uncompressed). Dashboard updates nightly:
Setup:
Why it works: Import mode's in-memory columnar storage is optimized for aggregations. The 800MB source data compresses to 80MB (well under 1GB limit). Nightly refresh is acceptable since sales data doesn't need real-time updates. Users get instant visual interactions because DAX queries execute against compressed in-memory data, not hitting the source database.
Detailed Example 2: DirectQuery for Real-Time Inventory
Warehouse management system requires real-time inventory levels (database: 50 million rows, 20GB):
Setup:
SELECT WarehouseID, SUM(StockLevel) FROM Inventory WHERE WarehouseID = 5 GROUP BY WarehouseID
Why it works: DirectQuery eliminates data size limits because no data is stored locally. Every visual query hits the live source database, so inventory levels reflect the current state within the query cache window (default 1 hour). The trade-off is slower performance (2-3 seconds vs milliseconds) because queries travel over the network to the database. This is acceptable for real-time monitoring where accuracy matters more than speed.
Detailed Example 3: Live Connection for Enterprise BI
IT department publishes centralized Sales dataset, 20 report developers need to create departmental reports:
Setup:
Selected Period Sales = [Total Sales]
Why it works: Live Connection enables a "single source of truth" - one dataset, many reports. IT maintains data model quality, security (RLS), and the refresh schedule centrally. Report developers focus on visualization and storytelling without managing data refresh. When the source dataset updates, all connected reports reflect the new data immediately without separate refreshes. This scales better than 20 independent Import models.
📊 Storage Mode Decision Tree:
graph TD
Start[Start: Choose Storage Mode] --> Q1{Data size<br/>compressed?}
Q1 -->|< 1GB| Q2{Real-time<br/>needed?}
Q1 -->|> 1GB| Q3{Have Premium<br/>capacity?}
Q2 -->|No| Import[✅ Import Mode<br/>Best performance<br/>8x daily refresh]
Q2 -->|Yes| Q4{Updates within<br/>minutes?}
Q3 -->|Yes| Q5{Size < 10GB?}
Q3 -->|No| DirectQuery[✅ DirectQuery<br/>No size limit<br/>Near real-time]
Q4 -->|Yes| DirectQuery
Q4 -->|No| ImportAuto[✅ Import + Auto Refresh<br/>Up to 48x daily Premium]
Q5 -->|Yes| Import
Q5 -->|No| DirectQuery
Start --> Q6{Connect to<br/>existing dataset?}
Q6 -->|Yes| LiveConn[✅ Live Connection<br/>Centralized model<br/>No local refresh]
Q6 -->|No| Q1
Start --> Q7{Mix Import +<br/>DirectQuery?}
Q7 -->|Yes| Composite[✅ Composite Model<br/>Import dimensions<br/>DirectQuery facts]
style Import fill:#c8e6c9
style DirectQuery fill:#fff3e0
style LiveConn fill:#e1f5fe
style Composite fill:#f3e5f5
style Start fill:#e0e0e0
See: diagrams/02_domain1_storage_decision.mmd
Decision Logic Explained:
Size-Based Decision Path (left side): Start by estimating compressed data size. If under 1GB (fits Pro license), consider Import for best performance unless real-time needed. If 1-10GB, Premium capacity required for Import. Over 10GB, DirectQuery becomes necessary regardless of license.
Real-Time Requirements Path (middle): If updates needed within minutes, DirectQuery is only option (Import max 48x daily = every 30 min on Premium). If hourly updates acceptable, Import with automatic page refresh works better.
Existing Dataset Path (top right): If connecting to published Power BI dataset or Analysis Services, Live Connection is the answer - you cannot Import from another dataset.
Hybrid Needs Path (bottom right): When you need both real-time and performance (e.g., real-time sales facts with historical customer dimensions), Composite Model combines Import and DirectQuery in same model.
Key Decision Factors:
The problem: Real-world data is messy - null values, duplicates, inconsistent formats, unexpected values, and import errors plague every data source. Loading dirty data leads to inaccurate reports and user mistrust.
The solution: Power Query provides data profiling tools to assess quality and transformation functions to clean issues at source before loading into the model.
Why it's tested: The exam validates you can identify data quality issues, understand their impact, and apply appropriate cleaning techniques. This is critical because "garbage in = garbage out" applies to all BI solutions.
Data profiling is the process of examining data to understand its structure, content, quality, and relationships. Power Query provides three profiling tools: Column Quality, Column Distribution, and Column Profile.
You cannot fix data problems you don't know about. Before transforming data, you need visibility into: How many rows have errors? Are there unexpected null values? What's the range of values? Which values occur most frequently? Profiling answers these questions.
Think of data profiling like a home inspection before buying a house. The inspector checks foundation (structure), plumbing (flow), electrical (connections), and surfaces (quality). They provide a report listing all issues found. Similarly, data profiling inspects your dataset and reports problems before you "buy into" using it for decisions.
Enable profiling: In Power Query Editor, go to View tab → check "Column Quality", "Column Distribution", and "Column Profile". By default, profiling analyzes first 1000 rows only. Change to "Column profiling based on entire dataset" in status bar for full analysis (slower but accurate).
Column Quality indicators appear: Above each column header, you see three bars:
Column Distribution displays: Below column header, you see:
Column Profile pane shows details: Bottom pane displays:
Error identification: Red error bars indicate rows that failed type conversion or transformation. Click error bar to filter to error rows only. Right-click column → "Replace Errors" or "Remove Errors" to handle them.
Quality assessment: Based on the profiling results, you decide on cleaning actions such as replacing errors, fixing typos, removing duplicates, or filtering out bad rows (see the sketch below).
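Beyond the interactive panes, the same kind of statistics can be produced in M with Table.Profile, which is handy for documenting quality checks; the query name below is illustrative:

let
    Source = Sales2024,
    // Returns one row per column with Min, Max, Average, StandardDeviation, Count, NullCount, and DistinctCount
    ProfileStats = Table.Profile(Source)
in
    ProfileStats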
📊 Data Profiling Workflow Diagram:
graph TD
Start[Load Data in Power Query] --> Enable[Enable Profiling Tools<br/>View → Column Quality/Distribution/Profile]
Enable --> Scope{Profiling<br/>Scope?}
Scope -->|First 1000 rows| FastProfile[Quick Profile<br/>Fast but may miss issues]
Scope -->|Entire dataset| FullProfile[Full Profile<br/>Accurate but slower]
FastProfile --> Analyze[Analyze Metrics]
FullProfile --> Analyze
Analyze --> Quality[Column Quality Check<br/>Valid % / Error % / Empty %]
Quality --> HighErrors{Error %<br/>> 5%?}
HighErrors -->|Yes| InvestigateError[Investigate Error Source<br/>Click error bar to filter]
HighErrors -->|No| Distribution
InvestigateError --> FixError{Can fix<br/>at source?}
FixError -->|Yes| FixSource[Modify source query or connection]
FixError -->|No| HandleError[Remove Errors or Replace Errors]
FixSource --> Distribution
HandleError --> Distribution
Distribution[Column Distribution Check<br/>Distinct / Unique counts]
Distribution --> Cardinality{High<br/>cardinality?}
Cardinality -->|Yes| CardinalityImpact[Consider impact on model size<br/>Group values or remove column]
Cardinality -->|No| Profile
CardinalityImpact --> Profile
Profile[Column Profile Details<br/>Statistics + Value Distribution]
Profile --> Outliers{Outliers or<br/>unexpected values?}
Outliers -->|Yes| Clean[Apply Cleaning Steps<br/>Filter/Replace/Remove]
Outliers -->|No| ValidData[Data Quality Acceptable]
Clean --> ValidData
ValidData --> Continue[Continue Transformations]
style Start fill:#e1f5fe
style HighErrors fill:#ffebee
style ValidData fill:#c8e6c9
style Analyze fill:#fff3e0
See: diagrams/02_domain1_profiling_workflow.mmd
Diagram Explanation (Detailed):
This flowchart shows the systematic data profiling process in Power Query. After loading data (blue start), you enable profiling tools from the View tab. The first decision point (orange diamond) is profiling scope - "first 1000 rows" gives quick insights but may miss issues in larger datasets, while "entire dataset" is slower but comprehensive. The workflow then progresses through three analysis stages: (1) Column Quality checks error percentages - if errors exceed 5%, the flow branches red to investigate root cause and either fix at source or apply error handling. (2) Column Distribution reveals cardinality issues - high distinct counts (e.g., millions of unique IDs) impact model size and require grouping or removal. (3) Column Profile displays detailed statistics and value distribution where outliers and unexpected values get identified. Each issue gets addressed through cleaning steps (filter rows, replace values, remove columns). The flow ends at green "Data Quality Acceptable" when all checks pass, allowing you to continue with transformations confident in data quality. This systematic approach ensures comprehensive quality assessment before model loading.
Detailed Example 1: Profiling Sales Data with Errors
CSV file "Sales2024.csv" imported shows quality issues:
Profiling Results:
Revenue column: 85% valid (green), 10% errors (red), 5% empty (gray)
Click red error bar → filter shows 100 rows with "Error" in Revenue
Column Profile shows these have text values like "N/A" or "TBD" instead of numbers
Solution: Replace errors with null: Right-click Revenue → Replace Errors → null
Table.ReplaceErrorValues(#"Changed Type", {{"Revenue", null}})ProductID column: Distinct count 2,500, Unique count 2,500 (every value appears once)
This indicates proper ID column (100% unique values expected for IDs)
Column Profile shows no duplicates - good quality
Category column: Distinct count 5 (Electronics, Clothing, Food, Home, Other)
Value distribution shows: Electronics 45%, Clothing 30%, Food 15%, Home 8%, Other 2%
One outlier value "Electroncs" (typo) with 10 occurrences found in distribution chart
Solution: Replace value: Right-click Category → Replace Values → "Electroncs" to "Electronics"
Table.ReplaceValue(#"Replaced Errors", "Electroncs", "Electronics", Replacer.ReplaceText, {"Category"})Why this works: Profiling first reveals issues before transformations. Replacing errors with null allows DAX measures to handle missing data appropriately (SUM ignores nulls). Fixing typos ensures accurate grouping in visuals. Working in this order - profile, then clean - prevents downstream problems.
Null Handling:
Duplicate Removal:
Error Handling:
Value Correction:
Type Conversion Errors:
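A consolidated M sketch of these cleaning steps, reusing the column names from the profiling example above (the query name Sales2024 is illustrative):

let
    Source = Sales2024,
    // Type conversion: declare the intended types so bad values surface as errors
    Typed = Table.TransformColumnTypes(Source, {{"Revenue", type number}, {"OrderDate", type date}}),
    // Error handling: replace values that failed conversion with null instead of dropping rows
    NoErrors = Table.ReplaceErrorValues(Typed, {{"Revenue", null}}),
    // Value correction: fix a known typo in the Category column
    Corrected = Table.ReplaceValue(NoErrors, "Electroncs", "Electronics", Replacer.ReplaceText, {"Category"}),
    // Duplicate removal: keep one row per ProductID and OrderDate combination
    Deduped = Table.Distinct(Corrected, {"ProductID", "OrderDate"}),
    // Null handling: drop rows where the key column is missing
    NoNullKeys = Table.SelectRows(Deduped, each [ProductID] <> null)
in
    NoNullKeys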
The problem: Source data rarely matches the structure needed for analysis. Sales data might be in wide format when you need long format. Date components are in separate columns when you need a single date. Text values need parsing into multiple fields.
The solution: Power Query M language provides 300+ transformation functions to reshape, enrich, combine, and restructure data into optimal format for modeling and reporting.
Why it's tested: Transformations are the core of data preparation. The exam extensively tests your ability to apply appropriate transformations for common scenarios using both the UI and M code.
Common text operations:
Split Column (by delimiter, number of characters, positions): Breaks one column into multiple
Table.SplitColumn(Source, "FullName", Splitter.SplitTextByDelimiter(" "), {"FirstName", "LastName"})Merge Columns: Combines multiple columns with separator
Table.CombineColumns(Source, {"FirstName", "LastName"}, Combiner.CombineTextByDelimiter(" "), "FullName")Extract: Pull first/last/range of characters
Text.Start([Column], 3)
Text.End([Column], 2)
Text.Range([Column], 5, 3) (start position 5, length 3)
Format: Change case, trim spaces
Text.Upper([Column])
Text.Lower([Column])
Text.Proper([Column])
Text.Trim([Column])
Mathematical operations:
Number.Round([Value], 2) (round to 2 decimals)
Number.Abs([Value])
Number.Mod([Value], 10) (remainder after division)
Date parsing and extraction:
Date.Year([DateColumn]) → creates an integer column
Date.Month([DateColumn]) or Date.MonthName([DateColumn]) for the name
Date.Day([DateColumn])
Date.DayOfWeek([DateColumn]) or Date.DayOfWeekName([DateColumn])
Date.FromText("2024-01-15") or DateTime.FromText("2024-01-15 14:30:00")
Create columns with logic:
If-Then-Else via UI: Add Column → Conditional Column
if [Sales] > 1000 then "High" else "Low"
Nested conditions: Handle multiple cases
Custom column with M: Add Column → Custom Column
if [Country] = "USA" and [Sales] > 500 then [Sales] * 0.9 else [Sales]What it does: Converts unique values in a column into multiple columns (wide format)
When to use: When you have attribute-value pairs that should become separate columns
Example:
Before (Long):
Month | Metric | Value
Jan | Sales | 1000
Jan | Cost | 400
Feb | Sales | 1500
Feb | Cost | 600

After (Wide):
Month | Sales | Cost
Jan | 1000 | 400
Feb | 1500 | 600
M Code: Table.Pivot(Source, List.Distinct(Source[Metric]), "Metric", "Value")
Step-by-step:
What it does: Converts multiple columns into attribute-value pairs (long format)
When to use: When you have months/years/categories as columns but need them as rows
Example:
Before (Wide):
Metric | Jan | Feb | Mar
Sales | 100 | 150 | 200
Cost | 40 | 60 | 80

After (Long):
Metric | Month | Value
Sales | Jan | 100
Sales | Feb | 150
Sales | Mar | 200
Cost | Jan | 40
Cost | Feb | 60
Cost | Mar | 80

M Code: Table.UnpivotOtherColumns(Source, {"Metric"}, "Month", "Value")
Step-by-step:
What it does: Swaps rows and columns (rotates table 90 degrees)
When to use: Rarely - only when column headers are in first column instead of first row
Example:
Before (headers run down the first column):
Name | John Doe | Jane Smith
Age | 30 | 28
City | NYC | LA

After (transposed):
Name | Age | City
John Doe | 30 | NYC
Jane Smith | 28 | LA
M Code: Table.Transpose(Source)
What it does: Combines two tables horizontally based on matching key columns (like SQL JOIN)
When to use: Enriching one table with columns from another (e.g., add Product details to Sales)
Join Types:
Example - Left Outer Join:
Sales Table:
OrderID | ProductID
1001 | P1
1002 | P2
1003 | P3
1004 | P99

Products Table:
ProductID | Name
P1 | Widget
P2 | Gadget
P3 | Tool

Result (Left Outer Join):
OrderID | ProductID | Name
1001 | P1 | Widget
1002 | P2 | Gadget
1003 | P3 | Tool
1004 | P99 | null (no match)
Step-by-step:
M Code:
Table.NestedJoin(Sales, {"ProductID"}, Products, {"ProductID"}, "Products", JoinKind.LeftOuter)
Table.ExpandTableColumn(#"Merged Queries", "Products", {"Name"}, {"Product Name"})
What it does: Combines two or more tables vertically (stacks rows) - like SQL UNION
When to use: Combining data from multiple sources with same structure (e.g., monthly files)
Example:
January Sales:
OrderID | Amount
1001 | 100
1002 | 150

February Sales:
OrderID | Amount
2001 | 200
2002 | 250

Result (Appended):
OrderID | Amount
1001 | 100
1002 | 150
2001 | 200
2002 | 250
Step-by-step:
M Code: Table.Combine({January, February, March})
⭐ Must Know (Critical Facts):
📊 Merge vs Append Diagram:
graph TB
subgraph "Merge Queries (Horizontal Join)"
M1[Sales Table<br/>OrderID | ProductID | Amount] --> M3[Match on ProductID]
M2[Products Table<br/>ProductID | Name | Category] --> M3
M3 --> M4[Result: Sales + Products<br/>OrderID | ProductID | Amount | Name | Category]
M5[Join Types] --> M6[Left Outer: All Sales + matching Products]
M5 --> M7[Inner: Only matching rows]
M5 --> M8[Left Anti: Sales without Products]
end
subgraph "Append Queries (Vertical Union)"
A1[January Sales<br/>OrderID | Amount | Date] --> A3[Stack Rows]
A2[February Sales<br/>OrderID | Amount | Date] --> A3
A3 --> A4[Result: Combined Sales<br/>All rows from both tables]
A5[Requirements] --> A6[Column names must match exactly]
A5 --> A7[Data types should match]
end
style M4 fill:#c8e6c9
style A4 fill:#c8e6c9
style M3 fill:#fff3e0
style A3 fill:#fff3e0
See: diagrams/02_domain1_merge_append.mmd
Diagram Explanation:
Top section shows Merge Queries - a horizontal join operation combining Sales and Products tables based on matching ProductID values (orange matching node). The result (green) is a wider table with columns from both sources. Three common join types are shown: Left Outer keeps all Sales records and adds matching Product details (null if no match), Inner keeps only matching rows, and Left Anti finds Sales without corresponding Products (orphan detection). Bottom section shows Append Queries - a vertical union stacking January and February sales tables (orange stack node). The result (green) is a taller table with all rows from both sources. Critical requirements shown: column names must match exactly (case-sensitive) for proper alignment, and data types should match to avoid errors. Use Merge when you need to ADD COLUMNS from another table based on a key. Use Append when you need to ADD ROWS from tables with identical structure.
Fact Tables (Transaction data):
Dimension Tables (Descriptive data):
Creating Fact Table in Power Query:
Creating Dimension Table in Power Query:
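A minimal sketch of deriving a dimension table from a transactional query by referencing it, keeping only the descriptive columns, and removing duplicates (query and column names are illustrative):

let
    // Reference the existing transactional query (right-click the query → Reference)
    Source = SalesTransactions,
    // Keep only the descriptive product attributes
    ProductColumns = Table.SelectColumns(Source, {"ProductID", "ProductName", "Category"}),
    // One row per product: this becomes the Product dimension table
    ProductDim = Table.Distinct(ProductColumns)
in
    ProductDim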
Reference vs Duplicate Queries:
Reference: Creates linked query pointing to original - changes in original affect reference
= Source (references another query)
Duplicate: Creates independent copy - changes isolated
Table.DuplicateColumn(Source, "Column1", "Column1 - Copy")Test yourself before moving on:
Scenario Type 1: Storage Mode Selection
Scenario Type 2: Privacy Level Errors
Scenario Type 3: Unpivot Requirement
Scenario Type 4: Join Type Selection
Try these from your practice test bundles:
If you scored below 75%:
Key Transformations:
Join Types:
Storage Modes:
Profiling Tools:
Next Steps: Proceed to 03_domain2_model_data to learn data modeling, relationships, and DAX calculations.
What it is: Query folding is Power Query's ability to translate M language transformations into native database queries (SQL, etc.) that execute at the data source instead of locally in Power BI.
Why it exists: When you transform data in Power Query, you can either process it locally (slow, resource-intensive) or push the work to the source database (fast, leverages database optimization). Query folding enables the latter, dramatically improving performance and reducing memory usage.
Real-world analogy: Imagine you're ordering a custom sandwich. Query folding is like telling the deli exactly what you want (filtered, sliced, assembled) rather than ordering all ingredients separately and assembling it yourself at home. The professional deli does it faster and better.
How it works (Detailed step-by-step):
When folding breaks (Common scenarios):
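Folding typically breaks when a step has no SQL equivalent - for example adding an index column, buffering a table with Table.Buffer, or calling M-only text functions. A minimal M sketch (server, database, table, and column names are assumptions, not taken from this guide) showing one step that folds and one that usually breaks folding:
let
    Source = Sql.Database("ServerName", "SalesDB"),    // assumed connection details
    Sales = Source{[Schema = "dbo", Item = "Sales"]}[Data],
    // Folds: a simple row filter translates into a SQL WHERE clause at the source
    #"Filtered Rows" = Table.SelectRows(Sales, each [OrderDate] >= #date(2024, 1, 1)),
    // Usually breaks folding: index columns have no SQL equivalent,
    // so this step and everything after it run locally in Power Query
    #"Added Index" = Table.AddIndexColumn(#"Filtered Rows", "RowIndex", 1, 1)
in
    #"Added Index"
Keeping the folding-friendly steps (filters, column removal, joins) before any folding-breaker ensures the heavy row reduction still happens at the source.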
📊 Query Folding Performance Diagram:
graph TB
subgraph "With Query Folding (Fast)"
A1[Power Query] -->|SQL Query| B1[SQL Server]
B1 -->|Filter at Source| C1[Process 1M → 10K rows]
C1 -->|Transfer 10K rows| D1[Power BI]
D1 --> E1[Load: 2 seconds]
end
subgraph "Without Query Folding (Slow)"
A2[Power Query] -->|Fetch All| B2[SQL Server]
B2 -->|Transfer 1M rows| C2[Power BI Memory]
C2 -->|Filter Locally| D2[Process 1M → 10K rows]
D2 --> E2[Load: 45 seconds]
end
style E1 fill:#c8e6c9
style E2 fill:#ffebee
See: diagrams/02_domain1_query_folding_performance.mmd
Diagram Explanation: The diagram illustrates the dramatic performance difference between query folding and non-folding scenarios. In the top path (with folding), Power Query sends a SQL query to SQL Server that filters 1 million rows down to 10,000 at the source. Only 10,000 rows are transferred over the network, resulting in a 2-second load time. In the bottom path (without folding), all 1 million rows must be transferred to Power BI, consuming network bandwidth and memory, then filtered locally. This results in a 45-second load time - over 20x slower. This performance gap widens with larger datasets. Query folding is essential for working with big data sources efficiently.
Detailed Example 1: Optimizing a Sales Query with Folding
You're connecting to a SQL Server database with 5 million sales transactions. You need only 2024 data for products in the "Electronics" category where revenue exceeded $100. Here's how query folding helps:
Without understanding folding (what beginners do):
With query folding (optimized approach):
How to verify folding is working: Right-click any step in Applied Steps and look for "View Native Query" option. If available, folding is happening. If grayed out, that step broke folding.
Detailed Example 2: When Folding Breaks and How to Fix It
You're working with a database query that was folding perfectly until you added a text transformation:
Scenario: Customer names need to be formatted as "FirstName LastName" but source has "LASTNAME, FIRSTNAME" in all caps.
Approach 1 (breaks folding):
= Table.AddColumn(Source, "FormattedName", each
Text.Proper(
Text.Trim(Text.AfterDelimiter([FullName], ",")) & " " &
Text.BeforeDelimiter([FullName], ",")
)
)
This breaks folding because Power Query's text functions don't map directly to SQL string functions. All data must now be downloaded.
Approach 2 (maintains folding for SQL Server):
Use SQL-compatible logic by doing the transformation at the source:
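One option is a native query, which hands the splitting and reordering to SQL Server so the source still does the heavy lifting. A sketch (connection details, table, and column names are assumptions); any remaining proper-casing can then be applied in Power Query on the much smaller result:
let
    Source = Sql.Database("ServerName", "SalesDB"),    // assumed connection details
    Formatted = Value.NativeQuery(
        Source,
        "SELECT CustomerID,
                LTRIM(SUBSTRING(FullName, CHARINDEX(',', FullName) + 1, LEN(FullName)))
                + ' ' + LEFT(FullName, CHARINDEX(',', FullName) - 1) AS FormattedName
         FROM dbo.Customers",
        null,
        [EnableFolding = true]    // lets later steps keep folding where the connector supports it
    )
in
    Formatted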
When to accept broken folding: If you've already filtered data down to a small subset (say 10,000 rows), breaking folding for a final transformation is acceptable. The key is to maintain folding for heavy operations (filtering millions of rows, joining large tables).
Detailed Example 3: Incremental Refresh with Query Folding
You have a 10-year sales history (100M rows) but only need to refresh the last 30 days daily. Incremental refresh requires query folding:
Setup:
Date >= RangeStart and Date < RangeEnd
Why folding is critical here: Without folding, Power BI would download all 100M rows to check dates locally, defeating the purpose of incremental refresh. With folding, the database returns only the last 30 days (approximately 270K rows), making daily refreshes fast.
⭐ Must Know (Critical Facts):
What it is: Incremental refresh is a technique where Power BI refreshes only new or changed data rather than reloading the entire dataset, dramatically reducing refresh times for large historical datasets.
Why it exists: Consider a sales database with 10 years of history. Daily sales add maybe 50,000 new rows, but historical data (99.9% of the dataset) never changes. Refreshing all 100 million rows daily wastes time, resources, and bandwidth. Incremental refresh solves this by refreshing only recent data while archiving historical data.
Real-world analogy: Think of a library. When new books arrive, you don't reorganize the entire library. You add new books to the "New Arrivals" section and periodically move older items to archives. Incremental refresh works the same way.
How it works (Detailed step-by-step):
Date >= RangeStart and Date < RangeEnd
📊 Incremental Refresh Architecture Diagram:
graph TB
subgraph "Power BI Dataset (Partitioned)"
P1[2020 Data<br/>Archived]
P2[2021 Data<br/>Archived]
P3[2022 Data<br/>Archived]
P4[2023 Data<br/>Archived]
P5[Jan-Nov 2024<br/>Archived]
P6[Dec 2024<br/>Refreshed Daily]
end
DB[(SQL Server<br/>100M Rows)] -->|RangeStart: Dec 1<br/>RangeEnd: Today| P6
DB -.->|First Load Only| P1
DB -.->|First Load Only| P2
DB -.->|First Load Only| P3
DB -.->|First Load Only| P4
DB -.->|First Load Only| P5
P6 -->|New Data| REFRESH[Daily Refresh<br/>5 min]
P1 & P2 & P3 & P4 & P5 -.->|No Refresh| SKIP[Skipped<br/>Saves 95% time]
style P6 fill:#fff3e0
style REFRESH fill:#c8e6c9
style SKIP fill:#e3f2fd
See: diagrams/02_domain1_incremental_refresh.mmd
Diagram Explanation: This diagram shows how incremental refresh partitions data across years. The Power BI dataset is divided into partitions: 2020-2023 data is archived (gray boxes), January-November 2024 is also archived, but December 2024 (orange box) is the active partition that gets refreshed daily. The SQL Server database contains all 100M rows, but only queries the range from December 1 to today (shown by the solid arrow). The dotted arrows indicate historical data was loaded only once during initial setup. During daily refresh, only the December partition is updated (5-minute refresh time shown in green), while all historical partitions are skipped (blue), saving 95% of refresh time. This architecture allows Power BI to handle massive datasets efficiently.
Detailed Example 1: Setting Up Incremental Refresh for Sales Data
You have a sales database with transaction history from 2015 to present (50 million rows). New sales are added daily. You want to:
Step-by-step configuration:
// Create RangeStart parameter (exact name required)
RangeStart = #datetime(2023, 1, 1, 0, 0, 0) meta [IsParameterQuery=true, Type="DateTime"]
// Create RangeEnd parameter (exact name required)
RangeEnd = #datetime(2025, 12, 31, 23, 59, 59) meta [IsParameterQuery=true, Type="DateTime"]
// In your Sales query, add this filter step:
#"Filtered Rows" = Table.SelectRows(Source, each [OrderDate] >= RangeStart and [OrderDate] < RangeEnd)
Verify query folding: Right-click the "Filtered Rows" step → "View Native Query". You should see a SQL WHERE clause with date filters. If this option is grayed out, folding is broken and incremental refresh won't work.
In Power BI Desktop: Right-click the Sales table → "Incremental refresh"
Configure policy:
Publish to workspace: Publish the report to Power BI Service (Premium capacity or PPU required)
First refresh in Service: Takes longer (loads 2 years of data), but subsequent refreshes only process 7 days
What happens behind the scenes:
WHERE OrderDate >= [Today - 7 days] AND OrderDate < [Today]
Performance impact:
Detailed Example 2: Change Detection Scenario
Sometimes historical data changes (order corrections, retroactive adjustments). Change detection handles this:
Scenario: Finance team corrects revenue figures for December 2024 on January 15, 2025. Without change detection, this correction would be missed.
With change detection enabled (set to 30 days):
WHERE OrderDate >= [Today - 30 days]
When to use change detection:
Detailed Example 3: Troubleshooting Incremental Refresh Failures
Common issues and solutions:
Problem 1: "Incremental refresh requires query folding" error
Problem 2: Refresh takes as long as before
Problem 3: Missing historical data
⭐ Must Know (Critical Facts):
What it is: Dataflows are cloud-based ETL (Extract, Transform, Load) processes that run in Power BI Service, allowing you to centralize data preparation logic that multiple datasets can reuse.
Why it exists: In many organizations, the same raw data (e.g., sales transactions) is used by multiple reports. Without dataflows, each report creator writes their own Power Query transformations, leading to duplicated effort, inconsistent logic, and maintenance nightmares. Dataflows solve this by creating a single, centralized source of truth.
Real-world analogy: Imagine multiple chefs in different restaurants all needing prep work done (vegetables chopped, meat marinated). Instead of each chef doing their own prep, a central prep kitchen handles it, delivering ready-to-cook ingredients to all chefs. Dataflows are that central prep kitchen for data.
How it works (Detailed step-by-step):
📊 Dataflows Architecture Diagram:
graph TB
subgraph "Data Sources"
SQL[(SQL Server)]
SP[(SharePoint Lists)]
API[Web APIs]
end
subgraph "Power BI Service - Dataflow"
DF[Dataflow ETL<br/>Power Query Logic]
STORE[(Dataverse/<br/>Azure Data Lake)]
end
subgraph "Consuming Datasets"
DS1[Sales Report<br/>Dataset]
DS2[Finance Report<br/>Dataset]
DS3[Executive Dashboard<br/>Dataset]
end
SQL --> DF
SP --> DF
API --> DF
DF -->|Transform & Store| STORE
STORE -->|Clean Data| DS1
STORE -->|Clean Data| DS2
STORE -->|Clean Data| DS3
DS1 --> R1[Report 1]
DS2 --> R2[Report 2]
DS3 --> R3[Report 3]
style DF fill:#fff3e0
style STORE fill:#e1f5fe
style DS1 fill:#f3e5f5
style DS2 fill:#f3e5f5
style DS3 fill:#f3e5f5
See: diagrams/02_domain1_dataflows_architecture.mmd
Diagram Explanation: This diagram illustrates the dataflows architecture pattern. At the top are three different data sources: SQL Server, SharePoint Lists, and Web APIs (various colors). These all connect to a centralized Dataflow in Power BI Service (orange box), which contains Power Query transformation logic. The dataflow processes data and stores the transformed results in either Dataverse or Azure Data Lake Storage (blue cylinder). Three separate consuming datasets (purple boxes) - Sales Report, Finance Report, and Executive Dashboard - all connect to this centralized storage instead of querying sources directly. Each dataset then produces its respective report. This architecture provides a single source of truth, eliminates duplicate transformation logic across datasets, and allows independent refresh schedules for ETL (dataflow) and consumption (datasets).
What you'll learn:
Time to complete: 12-15 hours
Prerequisites: Chapter 1 (Prepare the Data), Chapter 0 (Fundamentals)
Exam weight: 25-30% (approximately 13-15 questions)
Why this domain matters: The data model is the foundation of every Power BI report. A well-designed model with proper relationships and efficient DAX enables fast, accurate analytics. Poor modeling leads to incorrect calculations, slow performance, and maintenance nightmares. This domain tests your ability to design optimal models and write effective DAX.
The problem: Raw tables from Power Query need structure and relationships to enable analysis. Without a proper model, you cannot create measures that span tables, drill across hierarchies, or leverage time intelligence. Star schema design is essential for BI performance and usability.
The solution: Power BI's modeling view allows you to configure table properties, create relationships, define hierarchies, and implement role-playing dimensions. Following star schema principles ensures optimal query performance and intuitive report building.
Why it's tested: Model design directly impacts solution quality. The exam verifies you can create proper relationships, understand cardinality, configure cross-filter direction, and implement advanced patterns like role-playing dimensions and many-to-many relationships.
Relationships connect tables so DAX can follow paths between them, enabling analysis across multiple tables. Each relationship has a "from" table (many side) and a "to" table (one side), with cardinality defining how rows relate.
Without relationships, tables are isolated islands. You cannot create a measure in Sales that filters by Product Category from the Products table unless a relationship exists. Relationships enable filter context propagation - when you select a category, it automatically filters related sales.
Think of relationships like family trees. A parent (one side) can have many children (many side). When you ask "show me all children of Parent A," you follow the relationship. Similarly, when you ask "show sales for Category X," Power BI follows the Product→Sales relationship to find matching sales records.
Creating a Relationship:
Filter Propagation Flow:
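As a quick illustration of propagation (table and column names assumed), a filter placed on the one-side table reaches the fact table only because a relationship exists between them:
Electronics Sales =
CALCULATE(
    SUM(Sales[Amount]),
    Products[Category] = "Electronics"    -- flows from Products (1) to Sales (*) through the ProductID relationship
)
Without that relationship, the Category filter would never reach Sales and the measure would simply return total sales.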
📊 Relationship Types Diagram:
graph TB
subgraph "One-to-Many (Most Common)"
OM1[Dim_Product<br/>ProductID PK: 1,2,3,4,5<br/>Unique values] -->|1:*| OM2[Fact_Sales<br/>ProductID FK: 1,1,2,3,3,3,4<br/>Duplicates allowed]
OM3[Filter: Category=Electronics] -->|Propagates| OM1
OM1 -->|Filters ProductID 1,2| OM2
end
subgraph "Many-to-Many (Bridge Table)"
MM1[Dim_Student<br/>StudentID: 1,2,3] <--> MM3[Bridge_Enrollment<br/>StudentID | CourseID<br/>1 | A<br/>1 | B<br/>2 | A<br/>3 | C]
MM2[Dim_Course<br/>CourseID: A,B,C] <--> MM3
MM4[Student 1 enrolled in<br/>Courses A and B]
MM5[Course A has<br/>Students 1 and 2]
end
subgraph "Role-Playing Dimension"
RP1[Dim_Date<br/>DateKey: 20240101, 20240102...] -->|OrderDate| RP2[Fact_Sales]
RP1 -->|ShipDate Inactive| RP2
RP1 -->|DeliveryDate Inactive| RP2
RP3[Activate specific<br/>relationship in DAX<br/>USERELATIONSHIP]
end
style OM1 fill:#e1f5fe
style OM2 fill:#fff3e0
style MM3 fill:#f3e5f5
style RP2 fill:#fff3e0
style RP3 fill:#c8e6c9
See: diagrams/03_domain2_relationships.mmd
Diagram Explanation (Detailed):
The diagram illustrates three critical relationship patterns. Top section shows One-to-Many, the most common pattern where one Product (blue dimension with unique ProductIDs 1-5) relates to many Sales records (orange fact with duplicate ProductIDs). When filter "Category=Electronics" applies to Products, it propagates through the relationship to filter only matching Sales rows. The "1:*" notation indicates cardinality - one Product can have many Sales. Middle section demonstrates Many-to-Many using a bridge table (purple). Direct M:M between Students and Courses is impossible since Student 1 enrolls in multiple courses (A,B) and Course A has multiple students (1,2). The Bridge_Enrollment table breaks this into two 1:M relationships, storing all combinations. Bottom section shows Role-Playing Dimension where single Date table serves multiple date roles (OrderDate, ShipDate, DeliveryDate) in Sales. Only one relationship can be active (OrderDate solid line), others are inactive (dashed). DAX function USERELATIONSHIP activates inactive relationships temporarily in calculations (green node). This pattern avoids duplicating the Date table three times.
Cardinality Types:
Cross-Filter Direction:
⭐ Must Know (Critical Facts):
The problem: Imported data contains raw values but business needs calculated metrics - profit margins, year-over-year growth, running totals, rankings. Excel formulas don't work in Power BI because data is compressed and calculations must be dynamic based on filter context.
The solution: DAX (Data Analysis Expressions) provides 200+ functions to create measures (dynamic calculations), calculated columns (row-level values), and calculated tables (generated tables). DAX understands filter context and enables time intelligence, statistical analysis, and complex business logic.
Why it's tested: DAX is the language of Power BI analytics. The exam extensively tests your ability to write measures using CALCULATE, time intelligence functions, iterators, and proper context handling.
Measures:
Total Sales = SUM(Sales[Amount])
Calculated Columns:
Profit = Sales[Revenue] - Sales[Cost]
When to use each:
CALCULATE is the most powerful and frequently used DAX function. It evaluates an expression while modifying the filter context - essentially saying "calculate this measure, but change the filters first."
Without CALCULATE, measures only work within existing filter context. You cannot answer questions like "what were sales last year?" or "show all products even when category is filtered" or "calculate profit margin for just USA while showing global totals." CALCULATE enables these context modifications.
Imagine you're in a library with a search filter set to "Fiction books published in 2023." CALCULATE is like temporarily changing the filter to "Fiction books published in 2022" to compare this year vs last year, then reverting back. It manipulates your filter "lens" through which you view the data.
Basic Syntax:
CALCULATE(
<expression>, -- What to calculate (measure or aggregation)
<filter1>, -- Filter modification 1
<filter2>, -- Filter modification 2
...
)
Common Patterns:
USA Sales = CALCULATE(
SUM(Sales[Amount]),
Country[Country] = "USA"
)
This REMOVES any existing Country filters and applies "USA" only.
High Value Sales = CALCULATE(
SUM(Sales[Amount]),
FILTER(Sales, Sales[Amount] > 1000)
)
FILTER adds condition while keeping other filters intact.
All Categories Sales = CALCULATE(
SUM(Sales[Amount]),
ALL(Products[Category])
)
ALL removes filters from Category column, showing total regardless of category selection.
Sales YTD = CALCULATE(
SUM(Sales[Amount]),
DATESYTD(Date[Date])
)
DATESYTD modifies date filter to include all dates from year start to current date.
Detailed Example 1: Year-over-Year Comparison
Your report shows 2024 sales, but you want to compare with 2023:
Sales This Year = SUM(Sales[Amount]) // Current filter context
Sales Last Year = CALCULATE(
SUM(Sales[Amount]),
DATEADD(Date[Date], -1, YEAR)
)
YoY Growth % = DIVIDE(
[Sales This Year] - [Sales Last Year],
[Sales Last Year]
)
How it works:
Sales This Year calculates within that context → $500K
Sales Last Year uses CALCULATE to shift the date filter back 1 year → $400K
YoY Growth % divides the difference by last year → ($500K - $400K) / $400K = 25% growth
Detailed Example 2: Percentage of Total
Calculate each product's sales as percentage of category total:
Product Sales = SUM(Sales[Amount])
Category Total = CALCULATE(
SUM(Sales[Amount]),
ALLEXCEPT(Products, Products[Category])
)
% of Category = DIVIDE(
[Product Sales],
[Category Total]
)
How it works:
Product Sales = $50K in the current context
Category Total uses CALCULATE with ALLEXCEPT to remove all filters EXCEPT Category
% of Category = $50K / $200K = 25%
📊 CALCULATE Filter Context Modification Diagram:
graph TB
Start[Original Filter Context<br/>Category = Electronics<br/>Year = 2024] --> Measure1[Sales This Year<br/>= SUM Sales Amount]
Start --> Calculate[CALCULATE Function]
Calculate --> Modify[Modify Filter Context]
Modify --> Pattern1[Pattern 1: Replace Filter<br/>CALCULATE SUM, Country = USA<br/>Replaces existing Country filter]
Modify --> Pattern2[Pattern 2: Remove Filter<br/>CALCULATE SUM, ALL Products Category<br/>Removes Category filter]
Modify --> Pattern3[Pattern 3: Add Filter<br/>CALCULATE SUM, FILTER Sales>1000<br/>Adds condition, keeps others]
Modify --> Pattern4[Pattern 4: Time Shift<br/>CALCULATE SUM, DATEADD -1 YEAR<br/>Shifts date, keeps Category]
Pattern1 --> Result1[New Context:<br/>Country = USA<br/>Year = 2024]
Pattern2 --> Result2[New Context:<br/>ALL Categories<br/>Year = 2024]
Pattern3 --> Result3[New Context:<br/>Category = Electronics<br/>Year = 2024<br/>Amount > 1000]
Pattern4 --> Result4[New Context:<br/>Category = Electronics<br/>Year = 2023]
Result1 --> Calc[Calculate Expression<br/>in Modified Context]
Result2 --> Calc
Result3 --> Calc
Result4 --> Calc
style Start fill:#e1f5fe
style Calculate fill:#fff3e0
style Calc fill:#c8e6c9
style Result4 fill:#f3e5f5
See: diagrams/03_domain2_calculate.mmd
Diagram Explanation:
This flowchart shows how CALCULATE modifies filter context before evaluating expressions. Starting with original context (blue) "Electronics, 2024", the orange CALCULATE node branches into four common modification patterns. Pattern 1 (Replace) substitutes "Country=USA" completely removing any existing Country filter. Pattern 2 (Remove) uses ALL() to eliminate Category filter, showing all categories while keeping Year. Pattern 3 (Add) uses FILTER() to add "Amount>1000" condition without removing existing filters. Pattern 4 (Time Shift) uses DATEADD() to change Year to 2023 while preserving Category=Electronics (purple result) - this is key for year-over-year comparisons. All patterns converge at green "Calculate Expression" node where the measure evaluates in the modified context, then returns to original context. Understanding these patterns is critical for exam questions about CALCULATE behavior.
Prerequisites: Marked Date table required (Table Tools → Mark as Date Table)
Year-to-Date (YTD):
Sales YTD = CALCULATE(
SUM(Sales[Amount]),
DATESYTD(Date[Date])
)
// Or use shortcut: TOTALYTD(SUM(Sales[Amount]), Date[Date])
Returns sales from Jan 1 to current date in filter context.
Previous Year Comparison:
Sales PY = CALCULATE(
SUM(Sales[Amount]),
SAMEPERIODLASTYEAR(Date[Date])
)
// Or: DATEADD(Date[Date], -1, YEAR)
Shifts date filter back exactly one year.
Month-to-Date (MTD):
Sales MTD = TOTALMTD(SUM(Sales[Amount]), Date[Date])
Returns sales from month start to current date.
Previous Month:
Sales PM = CALCULATE(
SUM(Sales[Amount]),
PREVIOUSMONTH(Date[Date])
)
// Or: DATEADD(Date[Date], -1, MONTH)
Quarter-to-Date (QTD):
Sales QTD = TOTALQTD(SUM(Sales[Amount]), Date[Date])
Running Total:
Running Total = CALCULATE(
SUM(Sales[Amount]),
FILTER(
ALL(Date[Date]),
Date[Date] <= MAX(Date[Date])
)
)
Accumulates sales from beginning of time to current date.
⭐ Must Know - Time Intelligence:
DATESYTD(Date[Date], "6/30") for fiscal year ending June 30
Test yourself before moving on:
Scenario Type 1: Relationship Troubleshooting
Scenario Type 2: CALCULATE Usage
Scenario Type 3: Time Intelligence
Scenario Type 4: Role-Playing Dimension
Try these from your practice test bundles:
If you scored below 75%:
Relationship Rules:
CALCULATE Patterns:
CALCULATE(SUM, Country = "USA") - replaces Country filter
CALCULATE(SUM, ALL(Table[Column])) - removes filter
CALCULATE(SUM, FILTER(...)) - adds condition
Time Intelligence:
TOTALYTD(SUM, Date[Date])
SAMEPERIODLASTYEAR(Date[Date])
CALCULATE(SUM, FILTER(ALL(Date), Date <= MAX(Date)))
Next Steps: Proceed to 04_domain3_visualize_analyze to learn report creation, visualizations, and data analysis techniques.
The problem: Raw data doesn't answer business questions directly - you need metrics like "total sales", "year-over-year growth", or "average customer lifetime value".
The solution: DAX (Data Analysis Expressions) allows you to create calculated measures, columns, and tables that transform raw data into meaningful business insights.
Why it's tested: DAX represents 40-50% of Domain 2 questions and is critical for Power BI data analysts. Understanding filter context, row context, and time intelligence is essential.
What it is: DAX (Data Analysis Expressions) is a formula language for Power BI, similar to Excel formulas but specifically designed for working with relational data models. DAX formulas are used to create measures (calculations), calculated columns, and calculated tables.
Why it exists: Business intelligence requires dynamic calculations that respond to user interactions (filtering, slicing, grouping). DAX provides this dynamic calculation capability while maintaining optimal performance with large datasets.
Real-world analogy: Think of DAX like a sophisticated calculator that understands your data's relationships. Just as Excel formulas calculate values in cells, DAX formulas calculate values in your reports - but DAX can automatically adjust calculations based on what data the user is viewing.
How it works (Detailed step-by-step):
📊 DAX Evaluation Flow Diagram:
sequenceDiagram
participant User
participant Visual
participant Engine as DAX Engine
participant Model as Data Model
User->>Visual: Selects filters/slicers
Visual->>Engine: Creates query with filter context
Engine->>Model: Retrieves relevant data
Model-->>Engine: Returns filtered rows
Engine->>Engine: Evaluates DAX measure
Engine-->>Visual: Returns calculated result
Visual-->>User: Displays value
Note over Engine: Context determines<br/>which data is used
See: diagrams/03_domain2_dax_evaluation_flow.mmd
Diagram Explanation (200-400 words):
This sequence diagram illustrates how DAX calculations work in Power BI from user interaction to result display. When a user selects filters or slicers in a report, the Visual component sends this information to the DAX Engine, which creates a query with filter context - essentially defining "what data should I calculate on?". The DAX Engine then retrieves only the relevant data from the Data Model based on this filter context. For example, if the user selected "Product Category = Electronics" and "Year = 2024", only rows matching those criteria are retrieved. The DAX Engine evaluates your measure formula using this filtered data. The critical concept here is that the SAME DAX formula produces different results based on context - this is what makes DAX powerful. Finally, the calculated result returns to the visual for display. When the user changes selections (e.g., switches from Electronics to Clothing), the entire process repeats automatically, recalculating the measure with the new filter context. This automatic context-awareness is DAX's key advantage over static calculations.
What it is: Context is the environment in which a DAX formula evaluates. There are two fundamental types of context: Filter Context and Row Context. Understanding context is THE most important concept in DAX mastery.
Why it exists: The same DAX formula needs to produce different results based on what data the user is viewing. Context provides this dynamic calculation capability. For example, "Total Sales" should show different values for different years, products, or regions - context makes this happen automatically.
Real-world analogy: Imagine you're at a restaurant reading a menu. Your "context" includes: what meal time it is (breakfast/lunch/dinner), what you're hungry for, dietary restrictions, and your budget. The same menu gives you different options based on your context. Similarly, the same DAX formula gives different results based on filter context and row context.
Filter Context Explained:
What it is: Filter context is the set of filters applied to your data when a DAX formula evaluates. These filters determine which rows from your tables are included in the calculation.
How it's created:
📊 Filter Context Visualization:
graph TB
subgraph "Report Level"
RF[Report Filter: Region = 'North America']
end
subgraph "Page Level"
PF[Page Filter: Year = 2024]
end
subgraph "Visual Level"
VF[Visual: Group by Product Category]
S[Slicer: Month = 'January']
end
subgraph "Data Context"
DC[Effective Filter Context:<br/>Region = North America<br/>Year = 2024<br/>Month = January<br/>Grouped by Category]
end
RF --> DC
PF --> DC
VF --> DC
S --> DC
DC --> CALC[DAX Measure Evaluates<br/>Using This Context]
style DC fill:#c8e6c9
style CALC fill:#fff3e0
See: diagrams/03_domain2_filter_context_layers.mmd
Diagram Explanation:
This diagram shows how filter context is built from multiple layers in a Power BI report. At the top, we have Report Level filters that apply to all pages (in this example, Region = 'North America'). Below that, Page Level filters apply to all visuals on the current page (Year = 2024). At the Visual Level, we have both the visual's own grouping (Group by Product Category) and any slicers affecting it (Month = 'January'). All these filters combine to create the Effective Filter Context, which is the actual set of filters applied when your DAX measure evaluates. In this example, when a measure like [Total Sales] evaluates, it only includes sales from North America, in 2024, during January, calculated separately for each Product Category. Understanding how these layers combine is crucial because forgetting about a report-level filter can lead to confusion about why calculations aren't showing expected results. The filters form an AND operation - ALL conditions must be true for a row to be included.
Detailed Example 1: Filter Context in a Sales Report
Imagine you have a Sales table with columns: Date, Product, Region, Quantity, SalesAmount. You create a simple measure:
Total Sales = SUM(Sales[SalesAmount])
Scenario A - Card Visual (No Filters):
Scenario B - Card Visual with Year Slicer:
Scenario C - Table Visual Grouped by Product:
Scenario D - Combining Multiple Filters:
Row Context Explained:
What it is: Row context is the concept of "current row" when evaluating a formula. Row context exists when you iterate through a table row-by-row, such as in calculated columns or when using iterator functions.
How it differs from filter context:
When row context exists:
📊 Row Context vs Filter Context:
graph TB
subgraph "Calculated Column (Has Row Context)"
CC[Profit Margin = <br/>DIVIDE([Sales] - [Cost], [Sales])]
R1[Row 1: Sales=$100, Cost=$60<br/>Calculation: (100-60)/100 = 40%]
R2[Row 2: Sales=$200, Cost=$120<br/>Calculation: (200-120)/200 = 40%]
R3[Row 3: Sales=$150, Cost=$100<br/>Calculation: (150-100)/150 = 33%]
CC --> R1
CC --> R2
CC --> R3
end
subgraph "Measure (Has Filter Context)"
M[Total Profit Margin = <br/>DIVIDE(SUM([Sales]) - SUM([Cost]), SUM([Sales]))]
FC[Filter Context determines<br/>which rows to SUM]
RES[Result: Single aggregated value<br/>based on ALL filtered rows]
M --> FC --> RES
end
style R1 fill:#e1f5fe
style R2 fill:#e1f5fe
style R3 fill:#e1f5fe
style RES fill:#c8e6c9
See: diagrams/03_domain2_row_context_vs_filter_context.mmd
Detailed Example 2: Row Context in Calculated Columns
You have a Products table:
| ProductID | ProductName | Cost | RetailPrice |
|---|---|---|---|
| 1 | Laptop | 600 | 1000 |
| 2 | Mouse | 8 | 20 |
| 3 | Keyboard | 25 | 50 |
You create a calculated column:
Profit = [RetailPrice] - [Cost]
How it evaluates:
The formula "knows" which row it's on because of row context. Each row gets its own calculation stored in the table.
⚠️ Critical Difference: Calculated columns STORE values (take up space, calculated during refresh). Measures calculate dynamically (no storage, calculate when visual needs them).
What it is: CALCULATE is THE most important DAX function. It evaluates an expression (usually a measure) in a modified filter context. In simple terms, CALCULATE lets you change which data is included in a calculation.
Why it exists: You often need calculations that don't follow the normal filter flow. For example: "Show sales for ALL products even when user selects one product" or "Show last year's sales alongside this year's sales". CALCULATE makes these scenarios possible.
Real-world analogy: Imagine you're shopping online with filters applied (Category=Electronics, Price<$500, Brand=Sony). CALCULATE is like temporarily removing or changing those filters to see different results (e.g., "Show me ALL brands, not just Sony" or "Show me items from $500-$1000 instead").
Basic Syntax:
CALCULATE(
<expression>,
<filter1>,
<filter2>,
...
)
How it works (Detailed step-by-step):
Three Ways CALCULATE Modifies Filter Context:
1. REPLACE filter (most common):
Sales USA = CALCULATE(
[Total Sales],
Sales[Country] = "USA"
)
This REPLACES any existing filter on Country with "USA". Even if user selected "Canada", this measure shows USA sales.
2. REMOVE filter:
Sales All Countries = CALCULATE(
[Total Sales],
ALL(Sales[Country])
)
This REMOVES the filter on Country completely, showing sales for ALL countries regardless of user selection.
3. ADD filter (using FILTER or table functions):
High Value Sales = CALCULATE(
[Total Sales],
FILTER(Sales, Sales[Amount] > 1000)
)
This ADDS a filter condition on top of existing filters.
📊 CALCULATE Filter Modification Flow:
graph TD
START[User Filter: Country = Canada<br/>Product = Laptop] --> CALC1{CALCULATE with<br/>Country = USA}
START --> CALC2{CALCULATE with<br/>ALL Country}
START --> CALC3{CALCULATE with<br/>FILTER Amount > 1000}
CALC1 --> RES1[New Context:<br/>Country = USA REPLACES Canada<br/>Product = Laptop remains<br/><br/>Result: USA Laptop sales]
CALC2 --> RES2[New Context:<br/>Country filter REMOVED<br/>Product = Laptop remains<br/><br/>Result: All countries Laptop sales]
CALC3 --> RES3[New Context:<br/>Country = Canada remains<br/>Product = Laptop remains<br/>Amount > 1000 ADDED<br/><br/>Result: High-value Canada Laptop sales]
style RES1 fill:#fff3e0
style RES2 fill:#e1f5fe
style RES3 fill:#c8e6c9
See: diagrams/03_domain2_calculate_filter_modification.mmd
Diagram Explanation:
This decision tree shows how CALCULATE modifies filter context in three different ways. Starting from the same user filter state (Country = Canada, Product = Laptop), we see three different CALCULATE patterns. In the first path (orange), using Country = "USA" as a filter argument REPLACES the existing Country filter - so even though the user selected Canada, the measure shows USA data. The Product=Laptop filter remains unchanged. In the second path (blue), using ALL(Sales[Country]) REMOVES the Country filter entirely, so the measure shows data for ALL countries combined, but still only for Laptops. In the third path (green), using FILTER(Sales, Sales[Amount] > 1000) ADDS a new condition without removing existing filters - so we get Canada Laptop sales where the amount exceeds $1000. Understanding which filter modification pattern to use is critical for writing correct DAX measures. Most exam questions test whether you understand the difference between these three patterns.
Detailed Example 3: CALCULATE in Year-over-Year Comparison
You want to show current year sales and previous year sales side-by-side:
Sample Data (Sales table):
| Date | Product | Amount |
|---|---|---|
| 2024-01-15 | Laptop | 1000 |
| 2024-02-20 | Mouse | 25 |
| 2023-01-18 | Laptop | 950 |
| 2023-03-10 | Mouse | 20 |
Measure 1: Current Year Sales (no CALCULATE needed)
Current Sales = SUM(Sales[Amount])
When user has no year filter: Shows all sales ($1,995)
When user selects 2024: Shows 2024 sales ($1,025)
When user selects 2023: Shows 2023 sales ($970)
Measure 2: Previous Year Sales (using CALCULATE)
Previous Year Sales =
CALCULATE(
[Current Sales],
SAMEPERIODLASTYEAR(Date[Date])
)
How this works:
Scenario: User creates table visual with Year in rows:
| Year | Current Sales | Previous Year Sales |
|---|---|---|
| 2024 | $1,025 | $970 (from 2023) |
| 2023 | $970 | (blank - no 2022 data) |
The magic: Same visual row (2024) shows BOTH 2024 data (Current Sales) AND 2023 data (Previous Year Sales) thanks to CALCULATE modifying context!
Detailed Example 4: CALCULATE with Multiple Filters
You want sales for USA in 2024, regardless of what user selected:
USA 2024 Sales =
CALCULATE(
[Total Sales],
Sales[Country] = "USA",
YEAR(Sales[Date]) = 2024
)
How filters combine:
Alternative syntax (equivalent):
USA 2024 Sales =
CALCULATE(
[Total Sales],
Sales[Country] = "USA" && YEAR(Sales[Date]) = 2024
)
⭐ Must Know: CALCULATE Rules:
💡 Tips for Understanding CALCULATE:
⚠️ Common Mistakes with CALCULATE:
Expecting CALCULATE([Sales], Product = "Laptop") to ADD to the existing product filter - a column filter argument REPLACES the filter on that column
Writing raw aggregations such as CALCULATE(SUM(Sales[Amount]), ...) instead of reusing a base measure - prefer CALCULATE([Total Sales], ...) where [Total Sales] = SUM(Sales[Amount])
Using CALCULATE(SUM(Sales[Amount])) with no filter arguments - this is equivalent to SUM(Sales[Amount]); CALCULATE adds no value here
What they are: Iterator functions (SUMX, AVERAGEX, COUNTX, etc.) evaluate an expression for each row in a table and then aggregate the results. The "X" suffix indicates iteration.
Why they exist: Sometimes you need calculations that can't be done with simple SUM or AVERAGE. For example: "Sum of (Quantity × Price)" requires multiplying BEFORE summing - this needs iteration.
Real-world analogy: Imagine calculating your grocery bill. You go through each item (iterate), multiply quantity × price for that item (row-level calculation), then add up all the line totals (aggregate). That's exactly what iterator functions do.
Common Iterator Functions:
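A quick side-by-side of the most common iterators (a sketch - the Sales column names are assumed); each one evaluates the row expression for every row, then aggregates the results differently:
Total Revenue  = SUMX(Sales, Sales[Quantity] * Sales[UnitPrice])        // sum of the row-level results
Avg Line Value = AVERAGEX(Sales, Sales[Quantity] * Sales[UnitPrice])    // average of the row-level results
Largest Line   = MAXX(Sales, Sales[Quantity] * Sales[UnitPrice])        // largest single row-level result
Order Lines    = COUNTX(Sales, Sales[Quantity])                         // rows where the expression is not blank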
Basic Syntax:
SUMX(
<table>,
<expression for each row>
)
Detailed Example 5: SUMX for Revenue Calculation
Why you need it: Your Sales table has Quantity and UnitPrice columns, but not TotalRevenue. You need to calculate Quantity × UnitPrice for each row, then sum.
Sales Table:
| ProductID | Quantity | UnitPrice |
|---|---|---|
| 1 | 5 | 100 |
| 2 | 3 | 50 |
| 3 | 10 | 25 |
Wrong Approach (doesn't work):
Revenue WRONG = SUM(Sales[Quantity]) * SUM(Sales[UnitPrice])
Result: (5+3+10) × (100+50+25) = 18 × 175 = 3,150 ❌ WRONG!
Correct Approach (using SUMX):
Revenue = SUMX(
Sales,
Sales[Quantity] * Sales[UnitPrice]
)
How SUMX works (step-by-step):
📊 Iterator Function Flow:
graph TD
START[SUMX Sales, Quantity × UnitPrice] --> TABLE[Iterate through Sales table]
TABLE --> R1[Row 1: Quantity=5, UnitPrice=100<br/>Expression Result: 5 × 100 = 500]
TABLE --> R2[Row 2: Quantity=3, UnitPrice=50<br/>Expression Result: 3 × 50 = 150]
TABLE --> R3[Row 3: Quantity=10, UnitPrice=25<br/>Expression Result: 10 × 25 = 250]
R1 --> AGG[Aggregate all results]
R2 --> AGG
R3 --> AGG
AGG --> RESULT[Final Result: 500 + 150 + 250 = 900]
style R1 fill:#e1f5fe
style R2 fill:#e1f5fe
style R3 fill:#e1f5fe
style RESULT fill:#c8e6c9
See: diagrams/03_domain2_iterator_sumx_flow.mmd
What they are: Time intelligence functions are specialized DAX functions for working with dates and time periods. They enable calculations like YTD (Year-to-Date), MTD (Month-to-Date), previous year comparisons, and moving averages.
Why they exist: Business analysis heavily relies on time-based comparisons: "How are we doing this year vs last year?", "What's our year-to-date sales?", "Show me a rolling 3-month average". Time intelligence functions make these calculations simple.
Prerequisites: To use time intelligence functions, you MUST have a proper Date table in your model marked as a Date table. The Date table should have continuous dates (no gaps) covering your data range.
Common Time Intelligence Scenarios:
1. Year-to-Date (YTD) Calculations:
Sales YTD = TOTALYTD([Total Sales], Date[Date])
Shows cumulative sales from start of year to the current date in filter context.
Example: On March 15, 2024:
2. Previous Year Comparison:
Sales Last Year = CALCULATE(
[Total Sales],
SAMEPERIODLASTYEAR(Date[Date])
)
Shows sales from the same period in the previous year.
Example: When user views March 2024:
3. Month-to-Date (MTD):
Sales MTD = TOTALMTD([Total Sales], Date[Date])
Shows cumulative sales from start of current month.
4. Year-over-Year Growth:
YoY Growth =
VAR CurrentSales = [Total Sales]
VAR PreviousSales = [Sales Last Year]
RETURN
DIVIDE(CurrentSales - PreviousSales, PreviousSales)
Shows percentage growth compared to previous year.
5. Moving Average (3-Month):
Sales 3M Avg =
AVERAGEX(
DATESINPERIOD(Date[Date], LASTDATE(Date[Date]), -3, MONTH),
[Total Sales]
)
📊 Time Intelligence Visual Timeline:
gantt
title Time Intelligence Calculations for 2024
dateFormat YYYY-MM-DD
section YTD (as of Mar 15)
YTD Period :ytd1, 2024-01-01, 2024-03-15
section MTD (March)
MTD Period :mtd1, 2024-03-01, 2024-03-15
section SPLY (Previous Year)
Same Period 2023 :sply1, 2023-03-01, 2023-03-15
section Moving Avg (3 months)
Jan :ma1, 2024-01-01, 31d
Feb :ma2, 2024-02-01, 29d
Mar (partial) :ma3, 2024-03-01, 15d
See: diagrams/03_domain2_time_intelligence_timeline.mmd
Detailed Example 6: Complete YoY Analysis Dashboard
Data Model Setup:
Measures Created:
// Base measure
Total Sales = SUM(Sales[Amount])
// Previous Year
Sales PY = CALCULATE([Total Sales], SAMEPERIODLASTYEAR(Date[Date]))
// Year-over-Year Difference
Sales YoY Diff = [Total Sales] - [Sales PY]
// Year-over-Year Percentage
Sales YoY % = DIVIDE([Sales YoY Diff], [Sales PY], 0)
// Year-to-Date
Sales YTD = TOTALYTD([Total Sales], Date[Date])
// Previous Year YTD
Sales PY YTD = CALCULATE([Sales YTD], SAMEPERIODLASTYEAR(Date[Date]))
Visual Result (Table by Month for 2024):
| Month | Total Sales | Sales PY | YoY Diff | YoY % | Sales YTD | Sales PY YTD |
|---|---|---|---|---|---|---|
| Jan | $100K | $90K | +$10K | +11.1% | $100K | $90K |
| Feb | $120K | $95K | +$25K | +26.3% | $220K | $185K |
| Mar | $110K | $100K | +$10K | +10.0% | $330K | $285K |
How each column calculates for March row:
Common Time Intelligence Functions:
| Function | Purpose | Example |
|---|---|---|
| TOTALYTD | Year-to-date total | TOTALYTD([Sales], Date[Date]) |
| TOTALMTD | Month-to-date total | TOTALMTD([Sales], Date[Date]) |
| TOTALQTD | Quarter-to-date total | TOTALQTD([Sales], Date[Date]) |
| SAMEPERIODLASTYEAR | Same period previous year | CALCULATE([Sales], SAMEPERIODLASTYEAR(Date[Date])) |
| DATEADD | Shift date by period | CALCULATE([Sales], DATEADD(Date[Date], -1, YEAR)) |
| PARALLELPERIOD | Parallel period in past | CALCULATE([Sales], PARALLELPERIOD(Date[Date], -12, MONTH)) |
| DATESINPERIOD | Date range from point | AVERAGEX(DATESINPERIOD(Date[Date], MAX(Date[Date]), -3, MONTH), [Sales]) |
| DATESYTD | Dates in YTD period | CALCULATE([Sales], DATESYTD(Date[Date])) |
⭐ Must Know: Time Intelligence Requirements:
💡 Tips:
CALCULATE([Measure], DATESYTD(Date[Date]))
DATEADD(Date[Date], -1, YEAR)
⚠️ Common Mistakes:
Using a datetime column from the fact table, e.g. TOTALYTD([Sales], Sales[OrderDateTime])
Instead, use TOTALYTD([Sales], Date[Date]) where the Date table is marked as a Date table
The problem: Large data models can become slow, impacting user experience. Reports take minutes to load, visuals lag when users interact, and refresh times extend beyond acceptable limits.
The solution: Power BI provides tools and techniques to identify performance bottlenecks and optimize model design, DAX calculations, and visual queries.
Why it's tested: Performance optimization is critical for enterprise BI solutions. The exam tests whether you can identify slow areas and apply appropriate optimization techniques.
What it is: Performance Analyzer is a built-in Power BI Desktop tool that records and displays the time taken by each operation when refreshing visuals. It shows DAX query time, visual display time, and other overhead.
How to use:
What the results show:
How to interpret:
💡 Tip: Focus on visuals with total time > 500ms. Optimize highest time consumers first for maximum impact.
Issue 1: Unnecessary Columns Loaded
Problem: Loading columns from source that aren't used in any visual or calculation wastes memory and slows refresh.
Solution: In Power Query, remove unused columns before loading to model.
Example:
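A minimal Power Query sketch (step and column names assumed) that keeps only the columns actually used by visuals and measures - this step also folds back to most sources:
#"Kept Columns" = Table.SelectColumns(
    Source,
    {"OrderID", "OrderDate", "ProductID", "Amount"}    // drop everything the report never touches
)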
Issue 2: Wrong Data Types
Problem: Text columns use more memory than integer columns. Loading dates as text prevents time intelligence and wastes space.
Solution: Use appropriate data types - Integer for IDs, Date for dates, Decimal for money (not Text).
Impact: Text columns can use 10x more memory than integers.
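A sketch of fixing the types in a single Power Query step (column names assumed):
#"Changed Types" = Table.TransformColumnTypes(
    Source,
    {
        {"CustomerID", Int64.Type},     // whole number instead of text
        {"OrderDate", type date},       // real date enables time intelligence
        {"Amount", Currency.Type}       // fixed decimal for money values
    }
)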
Issue 3: High Cardinality Columns
Problem: Columns with millions of unique values (e.g., timestamps, transaction IDs) compress poorly and consume excessive memory.
Solution:
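For example, a one-value-per-row datetime column can be split into a low-cardinality date column plus a time column, which compresses far better (a sketch; the OrderDateTime column name is assumed):
#"Added Date"       = Table.AddColumn(Source, "OrderDate", each Date.From([OrderDateTime]), type date),
#"Added Time"       = Table.AddColumn(#"Added Date", "OrderTime", each Time.From([OrderDateTime]), type time),
#"Removed DateTime" = Table.RemoveColumns(#"Added Time", {"OrderDateTime"})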
Issue 4: Bi-directional Relationships
Problem: Bi-directional cross-filtering can create ambiguous filter paths and slow queries.
Solution: Use single-direction relationships when possible. Only use bi-directional when absolutely necessary (e.g., many-to-many scenarios with proper bridge tables).
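If bidirectional filtering is needed for only one calculation, it can be enabled inside that measure with CROSSFILTER rather than on the model relationship (a sketch; table and column names assumed):
Customers With Sales =
CALCULATE(
    DISTINCTCOUNT(Customers[CustomerID]),
    CROSSFILTER(Sales[CustomerID], Customers[CustomerID], Both)    -- both directions for this measure only
)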
✅ Data Modeling Fundamentals
✅ DAX Calculations
✅ Model Performance Optimization
Test yourself before moving on:
Try these from your practice test bundles:
If you scored below 75%:
Relationship Cardinality:
CALCULATE Patterns:
CALCULATE([Measure], Table[Column] = "Value")
CALCULATE([Measure], ALL(Table[Column]))
CALCULATE([Measure], FILTER(Table, condition))
Time Intelligence:
TOTALYTD([Measure], Date[Date])
CALCULATE([Measure], SAMEPERIODLASTYEAR(Date[Date]))
TOTALMTD([Measure], Date[Date])
DIVIDE([Current] - [Previous], [Previous])
Performance Tips:
Next Steps: Proceed to 04_domain3_visualize_analyze to learn visualization techniques, report creation, and data analysis features.
Variables make DAX more readable, improve performance by avoiding recalculation, and enable complex logic that would otherwise be impossible.
Why Variables Matter:
Basic Variable Syntax:
Measure Name =
VAR VariableName = <expression>
VAR AnotherVariable = <expression>
RETURN
<calculation using variables>
Example 1: Sales Performance with Thresholds
Without variables (inefficient, hard to read):
Sales Performance =
IF(
DIVIDE(
SUM(Sales[Amount]),
CALCULATE(SUM(Sales[Amount]), ALL(Products))
) > 0.1,
"High",
IF(
DIVIDE(
SUM(Sales[Amount]),
CALCULATE(SUM(Sales[Amount]), ALL(Products))
) > 0.05,
"Medium",
"Low"
)
)
With variables (efficient, readable):
Sales Performance =
VAR CurrentSales = SUM(Sales[Amount])
VAR TotalSales = CALCULATE(SUM(Sales[Amount]), ALL(Products))
VAR PercentageOfTotal = DIVIDE(CurrentSales, TotalSales)
RETURN
SWITCH(
TRUE(),
PercentageOfTotal > 0.1, "High",
PercentageOfTotal > 0.05, "Medium",
"Low"
)
Example 2: Customer Segmentation
Customer Segment =
VAR CustomerLifetimeValue = [Total Sales]
VAR CustomerTenure =
DATEDIFF(
RELATED(Customers[FirstPurchaseDate]),
TODAY(),
YEAR
)
VAR AverageOrderValue = DIVIDE([Total Sales], [Total Orders])
RETURN
SWITCH(
TRUE(),
CustomerLifetimeValue > 50000 && CustomerTenure >= 3, "VIP",
CustomerLifetimeValue > 20000 && CustomerTenure >= 2, "Gold",
CustomerLifetimeValue > 5000 && CustomerTenure >= 1, "Silver",
"Bronze"
)
Example 3: YoY Growth with Commentary
Sales Growth Analysis =
VAR CurrentYear = [Total Sales]
VAR PriorYear = [Sales PY]
VAR GrowthAmount = CurrentYear - PriorYear
VAR GrowthPercent = DIVIDE(GrowthAmount, PriorYear)
VAR GrowthText =
SWITCH(
TRUE(),
GrowthPercent > 0.2, "🚀 Exceptional Growth",
GrowthPercent > 0.1, "📈 Strong Growth",
GrowthPercent > 0, "✓ Positive Growth",
GrowthPercent > -0.1, "⚠️ Slight Decline",
"🔻 Significant Decline"
)
RETURN
GrowthText & " (" & FORMAT(GrowthPercent, "0.0%") & ")"
When Each Component Calculates:
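A small sketch of the evaluation order (the [Total Sales] measure is assumed): each VAR is evaluated once, in the filter context where the measure is called, and RETURN only combines the stored values.
% of All Products =
VAR CurrentSales     = [Total Sales]                               // evaluated first, in the current context
VAR AllProductsSales = CALCULATE([Total Sales], ALL(Products))     // evaluated once, not once per use
RETURN
    DIVIDE(CurrentSales, AllProductsSales)                         // no recalculation happens here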
Understanding how to precisely control filters is critical for complex business logic.
Pattern 1: Combining ALL variants
ALL - Removes all filters:
Total Sales All Time =
CALCULATE(
SUM(Sales[Amount]),
ALL(Date) -- Removes all filters from Date table
)
ALLEXCEPT - Removes all filters except specified:
Total Sales This Year =
CALCULATE(
SUM(Sales[Amount]),
ALLEXCEPT(Date, Date[Year]) -- Keep Year filter, remove Month/Day
)
ALLSELECTED - Removes row context but keeps slicers/filters:
% of Filtered Total =
VAR CurrentSales = SUM(Sales[Amount])
VAR FilteredTotal = CALCULATE(SUM(Sales[Amount]), ALLSELECTED(Products))
RETURN
DIVIDE(CurrentSales, FilteredTotal)
Example: Understanding the Difference
Setup: Report with Year slicer = 2024, Product table visual
| Measure | With "Electronics" Selected | What It Shows |
|---|---|---|
| SUM(Sales[Amount]) | $50K | Electronics sales in 2024 |
| CALCULATE(..., ALL(Date)) | $200K | Electronics sales all years |
| CALCULATE(..., ALL(Products)) | $150K | All products sales in 2024 |
| CALCULATE(..., ALLEXCEPT(Date, Date[Year])) | $50K | Electronics in 2024 (year kept) |
| CALCULATE(..., ALLSELECTED(Products)) | $150K | All visible products in 2024 |
Pattern 2: Stacking Filters
Filters in CALCULATE combine with AND logic by default:
West Electronics Sales 2024 =
CALCULATE(
[Total Sales],
Products[Category] = "Electronics", -- Filter 1
Stores[Region] = "West", -- Filter 2 (AND)
Date[Year] = 2024 -- Filter 3 (AND)
)
-- Result: Electronics AND West AND 2024
To create OR logic, use a different approach:
Electronics OR Computers =
CALCULATE(
[Total Sales],
FILTER(
Products,
Products[Category] = "Electronics" ||
Products[Category] = "Computers"
)
)
Or even better:
Electronics OR Computers =
CALCULATE(
[Total Sales],
Products[Category] IN {"Electronics", "Computers"}
)
Pattern 3: Complex Time Intelligence
Same Period Multiple Years Ago:
Sales 3 Years Ago =
VAR YearsBack = 3
RETURN
CALCULATE(
[Total Sales],
DATEADD(Date[Date], -YearsBack, YEAR)
)
Comparing to Best Month Ever:
% of Best Month =
VAR CurrentMonthSales = [Total Sales]
VAR BestMonthSales =
CALCULATE(
[Total Sales],
TOPN(1, ALL(Date[Year], Date[Month]), [Total Sales], DESC)
)
RETURN
DIVIDE(CurrentMonthSales, BestMonthSales)
Rolling 12-Month Average:
Rolling 12-Month Avg =
VAR Last12Months =
DATESINPERIOD(
Date[Date],
MAX(Date[Date]),
-12,
MONTH
)
RETURN
CALCULATE(
AVERAGE(Sales[Amount]),
Last12Months
)
Year-Over-Year with Custom Fiscal Year:
// Fiscal year ends June 30
Sales Fiscal Year LY =
CALCULATE(
    [Total Sales],
    DATEADD(Date[Date], -1, YEAR)
)
Shifting every date back exactly one year works for any fiscal calendar, as long as the visual slices by the Date table's fiscal columns (e.g., FiscalYear, FiscalQuarter).
DAX has two types of evaluation context that can exist simultaneously. Mastering this is the key to DAX expertise.
Filter Context Deep Dive:
Filter context is like layers of filters stacked on top of each other:
Layer 1: Report-level filters (applied to all visuals)
Layer 2: Page-level filters (applied to all visuals on page)
Layer 3: Visual-level filters (applied to one visual)
Layer 4: Slicer selections (user-driven filters)
Layer 5: Row/column in visual (auto-generated filter)
Example Scenario:
Report setup:
When a measure calculates for the "Laptop" row:
Current filter context:
Measure without CALCULATE:
Simple Sales = SUM(Sales[Amount])
-- Respects ALL filters above
-- Shows: Laptop sales in West/East regions for Electronics in 2024
Measure with CALCULATE - Remove Year Filter:
All Time Sales =
CALCULATE(
SUM(Sales[Amount]),
ALL(Date[Year])
)
-- Year filter removed
-- Shows: Laptop sales in West/East for Electronics in ALL years
Measure with CALCULATE - Change Category:
Computers Sales =
CALCULATE(
SUM(Sales[Amount]),
Products[Category] = "Computers"
)
-- Category filter REPLACED
-- Shows: Computers (not Electronics) for Laptop row context
-- Usually gives BLANK because Laptop is not in Computers category
Row Context Deep Dive:
Row context happens when iterating through table rows. It does NOT automatically filter related tables.
Example: Why This Fails:
-- WRONG: Products[Cost] cannot be read from the Sales row context, so this formula fails
Wrong Margin =
SUMX(
Sales,
Sales[Revenue] - Products[Cost] -- Products[Cost] not in row context!
)
Corrected with RELATED:
-- CORRECT: RELATED converts row context to filter context
Correct Margin =
SUMX(
Sales,
Sales[Revenue] - RELATED(Products[Cost])
)
How it works:
Sales[Revenue] reads the current row's revenue
RELATED(Products[Cost]) follows the relationship from the current Sales row to the Products table
Example: Calculating Weighted Average
Without understanding row context (WRONG):
-- This gives average of prices, not weighted by quantity
Wrong Weighted Avg = AVERAGE(Sales[Price])
With proper row context (CORRECT):
-- This weights each price by its quantity
Weighted Avg Price =
VAR TotalRevenue = SUMX(Sales, Sales[Quantity] * Sales[Price])
VAR TotalQuantity = SUM(Sales[Quantity])
RETURN
DIVIDE(TotalRevenue, TotalQuantity)
Example: Ranking Products by Sales
Product Rank =
RANKX(
ALL(Products[ProductName]), -- Table to rank within
[Total Sales], -- Expression to rank by
, -- Value (blank = current product)
DESC, -- Order (DESC = highest ranked #1)
DENSE -- Rank type (DENSE = no gaps)
)
How it works:
Visualization: Evaluation Context in a Matrix Visual
Matrix visual structure:
| Region | Q1 | Q2 | Q3 | Total |
|---|---|---|---|---|
| West | 100K | 120K | 110K | 330K |
| East | 90K | 95K | 100K | 285K |
| Total | 190K | 215K | 210K | 615K |
For cell "West, Q1" (100K):
For "Total" column cell "West" (330K):
For bottom-right "Total" (615K):
Pattern 1: Move Filtering to Model
SLOW (filtering in measure):
Active Customers Sales =
SUMX(
FILTER(
Sales,
RELATED(Customers[Status]) = "Active"
),
Sales[Amount]
)
FAST (filter in model with relationship):
Create calculated table:
Active Customers = FILTER(Customers, Customers[Status] = "Active")
Then use simple measure:
Active Customers Sales = SUM(Sales[Amount])
Pattern 2: Avoid Calculated Columns in Large Tables
SLOW (calculated column on 10M row table):
Sales[Margin] = Sales[Revenue] - RELATED(Products[Cost])
This calculates 10M times and stores 10M values.
FAST (measure instead):
Total Margin =
SUMX(
Sales,
Sales[Revenue] - RELATED(Products[Cost])
)
This calculates only when needed and stores nothing.
When to use calculated columns:
Pattern 3: Use Variables to Avoid Recalculation
SLOW (calculates Total Sales 3 times):
Performance Metric =
IF(
[Total Sales] > 100000,
[Total Sales] * 1.1,
[Total Sales] * 0.9
)
FAST (calculates once):
Performance Metric =
VAR Sales = [Total Sales]
RETURN
IF(Sales > 100000, Sales * 1.1, Sales * 0.9)
Pattern 4: Reduce Cardinality
High cardinality columns (many unique values) hurt performance:
Optimization:
A proper Date table is absolutely critical for time intelligence. The exam will test your knowledge of creating and configuring date tables.
Power BI auto date/time creates hidden date tables automatically, but:
Custom Date table:
= List.Dates(
#date(2020, 1, 1), // Start date
Duration.Days( // Number of days
#date(2030, 12, 31) - #date(2020, 1, 1)
) + 1,
#duration(1, 0, 0, 0) // Increment by 1 day
)
Then convert to table and add columns:
= Table.FromList(
DateList,
Splitter.SplitByNothing(),
{"Date"},
null,
ExtraValues.Error
)
Add calculated columns in Power Query:
Year = Date.Year([Date])
Quarter = "Q" & Text.From(Date.QuarterOfYear([Date]))
Month = Date.Month([Date])
MonthName = Date.MonthName([Date])
DayOfWeek = Date.DayOfWeek([Date])
DayName = Date.DayOfWeekName([Date])
Basic version:
Date =
CALENDAR(DATE(2020, 1, 1), DATE(2030, 12, 31))
Or automatically match data range:
Date =
CALENDAR(
DATE(YEAR(MIN(Sales[OrderDate])), 1, 1),
DATE(YEAR(MAX(Sales[OrderDate])), 12, 31)
)
Add calculated columns:
Year = YEAR(Date[Date])
Quarter = "Q" & FORMAT(Date[Date], "Q")
QuarterNumber = QUARTER(Date[Date])
Month = MONTH(Date[Date])
MonthName = FORMAT(Date[Date], "MMMM")
MonthShort = FORMAT(Date[Date], "MMM")
DayOfWeek = WEEKDAY(Date[Date])
DayName = FORMAT(Date[Date], "DDDD")
DayShort = FORMAT(Date[Date], "DDD")
IsWeekend = IF(WEEKDAY(Date[Date]) IN {1, 7}, TRUE(), FALSE())
Fiscal Year (assuming fiscal year ends June 30):
FiscalYear =
VAR MonthNumber = MONTH(Date[Date])
VAR CalendarYear = YEAR(Date[Date])
RETURN
IF(
MonthNumber >= 7,
"FY" & CalendarYear + 1,
"FY" & CalendarYear
)
Fiscal Quarter:
FiscalQuarter =
VAR MonthNumber = MONTH(Date[Date])
VAR FiscalMonth = IF(MonthNumber >= 7, MonthNumber - 6, MonthNumber + 6)
RETURN
"FQ" & ROUNDUP(FiscalMonth / 3, 0)
Is Holiday (example for US):
IsHoliday =
VAR MonthNum = MONTH(Date[Date])
VAR DayNum = DAY(Date[Date])
VAR DayOfWeek = WEEKDAY(Date[Date])
RETURN
SWITCH(
TRUE(),
// New Year
MonthNum = 1 && DayNum = 1, TRUE(),
// Independence Day
MonthNum = 7 && DayNum = 4, TRUE(),
// Christmas
MonthNum = 12 && DayNum = 25, TRUE(),
// Thanksgiving (4th Thursday of November)
MonthNum = 11 && DayOfWeek = 5 && DayNum >= 22 && DayNum <= 28, TRUE(),
FALSE()
)
Working Days (excluding weekends and holidays):
IsWorkingDay =
IF(
Date[IsWeekend] = TRUE() || Date[IsHoliday] = TRUE(),
FALSE(),
TRUE()
)
Week Number:
WeekNumber = WEEKNUM(Date[Date])
Relative Period Columns (useful for filtering):
IsCurrentMonth =
VAR Today = TODAY()
RETURN
YEAR(Date[Date]) = YEAR(Today) &&
MONTH(Date[Date]) = MONTH(Today)
IsLastMonth =
VAR LastMonthStart = EOMONTH(TODAY(), -2) + 1 // day after the end of two months ago = first day of last month
VAR LastMonthEnd = EOMONTH(TODAY(), -1)
RETURN
Date[Date] >= LastMonthStart &&
Date[Date] <= LastMonthEnd
IsCurrentYear =
YEAR(Date[Date]) = YEAR(TODAY())
IsLastYear =
YEAR(Date[Date]) = YEAR(TODAY()) - 1
Critical step - won't work without this!
In Power BI Desktop:
Why this matters:
Create relationships:
Role-playing dimensions: Same date table used for multiple date columns
To use inactive relationship:
Sales by Ship Date =
CALCULATE(
[Total Sales],
USERELATIONSHIP(Sales[ShipDate], Date[Date])
)
❌ WRONG: Using OrderDate directly from fact table
❌ WRONG: Multiple date tables
❌ WRONG: Forgetting to mark as date table
✅ CORRECT: One date table, marked, related to all fact date columns
See diagrams/03_domain2_star_schema.mmd for visual representation of proper date table relationships.
See diagrams/03_domain2_calculate_patterns.mmd for CALCULATE evaluation flow.
What it is: DAX operates in two types of contexts - filter context (what data is visible) and row context (iterating through rows). Understanding the difference is critical for writing correct DAX measures.
Why it exists: Power BI needs different evaluation modes for different operations. When calculating a SUM across all rows, it uses filter context. When evaluating a calculated column row-by-row, it uses row context. The distinction determines which data is accessible and how calculations behave.
Real-world analogy: Filter context is like looking at a filtered spreadsheet where you only see certain rows based on criteria (e.g., only showing 2024 sales). Row context is like moving your finger down each visible row one at a time to perform a calculation. They serve different purposes and work differently.
How it works (Detailed step-by-step):
Filter Context:
Row Context:
📊 Context Types Comparison Diagram:
graph TB
subgraph "Filter Context (Measures)"
FC1[User Selects:<br/>Year = 2024] --> FC2[Filter Applied<br/>to Dataset]
FC2 --> FC3[Measure Evaluates<br/>Across Filtered Rows]
FC3 --> FC4[Single Result:<br/>Total Sales = $1.2M]
end
subgraph "Row Context (Calculated Columns)"
RC1[Power BI Iterates<br/>Row 1] --> RC2[Evaluate Formula<br/>for Row 1]
RC2 --> RC3[Store Result<br/>in Row 1]
RC3 --> RC4[Move to Row 2]
RC4 --> RC5[Repeat for<br/>All Rows]
end
subgraph "Context Transition (CALCULATE)"
CT1[Row Context:<br/>Current Customer Row] --> CT2[CALCULATE Creates<br/>Filter Context]
CT2 --> CT3[Filter: CustomerID<br/>= Current Row]
CT3 --> CT4[Measure Evaluates<br/>in New Filter Context]
end
style FC4 fill:#c8e6c9
style RC5 fill:#fff3e0
style CT4 fill:#e1f5fe
See: diagrams/03_domain2_context_types.mmd
Diagram Explanation: This diagram compares three DAX context scenarios. The top section (Filter Context in green) shows how measures work: a user filter (Year = 2024) creates a filter context, the measure evaluates across all filtered rows, and returns a single aggregated result ($1.2M). The middle section (Row Context in orange) illustrates calculated columns: Power BI iterates row by row, evaluates the formula for each row individually, stores the result, and moves to the next row until complete. The bottom section (Context Transition in blue) shows what happens when CALCULATE is used in row context: it converts the current row's context into a filter context, allowing measures to be evaluated for that specific row's identifier (e.g., CustomerID from current row). Understanding these differences is essential for writing correct DAX.
Detailed Example 1: Calculated Column vs Measure - Common Mistake
Scenario: You want to calculate profit (Revenue - Cost) for sales transactions.
Approach 1 - Calculated Column (increases model size, fixed at refresh):
Profit = Sales[Revenue] - Sales[Cost]
This creates a new column in the Sales table. Each row stores its profit value. The column:
Approach 2 - Measure (dynamic, memory-efficient):
Total Profit = SUM(Sales[Revenue]) - SUM(Sales[Cost])
This creates a measure that calculates dynamically. The measure:
When to use each:
Detailed Example 2: Context Transition with CALCULATE
Scenario: You want to show each customer's sales as a percentage of total sales.
The problem: In a calculated column, you're in row context (current customer row). To get total sales across ALL customers, you need filter context.
Solution using CALCULATE for context transition (as a calculated column on the Customer table):
Customer Sales % =
DIVIDE(
CALCULATE(SUM(Sales[Amount])), // Context transition: current customer's sales
CALCULATE(SUM(Sales[Amount]), ALL(Customer)) // Remove the customer filter: total sales across all customers
)
What happens step-by-step:
Without CALCULATE (gives the wrong result):
Customer Sales % WRONG =
DIVIDE(
SUM(Sales[Amount]), // No context transition: sums the ENTIRE Sales table
SUM(Sales[Amount]) // Same: also the grand total
)
This doesn't raise an error, but it returns 100% for every customer: without CALCULATE there is no context transition, so both SUM calls ignore the current row and aggregate the whole table.
Detailed Example 3: Iterator Functions and Row Context
Scenario: Calculate average order value (total revenue / number of orders) per customer.
Approach 1 - Using measure with aggregations:
Avg Order Value =
DIVIDE(
SUM(Sales[Revenue]),
DISTINCTCOUNT(Sales[OrderID])
)
This works at the visual level but doesn't give you order-level detail.
Approach 2 - Using AVERAGEX iterator:
Avg Order Value =
AVERAGEX(
VALUES(Sales[OrderID]), // Create table of distinct orders in current context
CALCULATE(SUM(Sales[Revenue])) // For each order, calculate revenue
)
How AVERAGEX works:
Why this matters: The iterator approach correctly handles orders with multiple line items, while the simple division approach might give incorrect results with complex data structures.
What you'll learn:
Time to complete: 10-12 hours
Prerequisites: Chapters 0-2 (Fundamentals, Data Preparation, Data Modeling)
The problem: Raw data in tables is overwhelming and doesn't communicate insights effectively. Users need visual representations that make patterns, trends, and anomalies immediately obvious.
The solution: Power BI provides 30+ built-in visual types, each optimized for specific data storytelling scenarios. Selecting the right visual type transforms data into actionable insights.
Why it's tested: Visual selection is fundamental to effective reporting. The exam tests whether you know which visual to use for different business scenarios.
What makes a good visual:
The analytical question determines visual type:
| Question Type | Best Visual(s) | Why |
|---|---|---|
| What are the values? | Table, Matrix, Card | Shows exact numbers for reference |
| How do categories compare? | Bar/Column Chart | Length comparison is highly accurate |
| How does a value change over time? | Line Chart, Area Chart | Shows trends and patterns clearly |
| What is the composition/part-to-whole? | Pie, Donut, Treemap | Shows relative proportions |
| How do two measures correlate? | Scatter Chart | Reveals relationships between variables |
| How is data distributed? | Histogram, Box Plot | Shows distribution and outliers |
| Where are things located? | Map, Filled Map | Geographic context matters |
| What is the ranking? | Bar Chart (sorted), Ribbon Chart | Shows relative position clearly |
📊 Visual Selection Decision Tree:
graph TD
START[What question am I answering?] --> Q1{Need exact<br/>values?}
Q1 -->|Yes| TABLE[Table/Matrix]
Q1 -->|No| Q2{Comparing<br/>categories?}
Q2 -->|Yes| Q2A{Time series?}
Q2A -->|Yes| LINE[Line/Area Chart]
Q2A -->|No| BAR[Bar/Column Chart]
Q2 -->|No| Q3{Showing<br/>composition?}
Q3 -->|Yes| PIE[Pie/Donut/Treemap]
Q3 -->|No| Q4{Correlation?}
Q4 -->|Yes| SCATTER[Scatter Chart]
Q4 -->|No| Q5{Geographic?}
Q5 -->|Yes| MAP[Map Visual]
Q5 -->|No| Q6{Single KPI?}
Q6 -->|Yes| CARD[Card/Gauge/KPI]
Q6 -->|No| OTHER[Consider:<br/>Waterfall, Funnel,<br/>Decomposition Tree]
style TABLE fill:#e1f5fe
style LINE fill:#c8e6c9
style BAR fill:#fff3e0
style PIE fill:#f3e5f5
style SCATTER fill:#ffe0b2
style MAP fill:#c5e1a5
style CARD fill:#ffccbc
See: diagrams/04_domain3_visual_selection_tree.mmd
Diagram Explanation:
This decision tree guides visual selection by asking analytical questions in sequence. Starting with "What question am I answering?", the tree first checks if exact values are needed - if yes, use Table or Matrix visuals which display precise numbers. If no, it checks if you're comparing categories. For category comparison with time dimension, Line or Area charts show trends best; without time, Bar or Column charts provide clear comparisons. If not comparing categories, the tree checks for composition analysis (part-to-whole relationships) where Pie, Donut, or Treemap visuals excel. For correlation analysis between two measures, Scatter charts are optimal. Geographic questions require Map visuals. Single KPI displays use Card, Gauge, or KPI visuals. Finally, specialized scenarios might need Waterfall (for sequential changes), Funnel (for stage-based processes), or Decomposition Tree (for hierarchical analysis). This systematic approach ensures you select visuals based on analytical purpose rather than aesthetics.
What they are: Bar charts show categories on vertical axis with bars extending horizontally. Column charts show categories on horizontal axis with bars extending vertically. Both use bar length to represent values.
Why they exist: Human eyes are extremely accurate at comparing lengths. Bar/column charts leverage this for precise category comparison.
When to use:
When NOT to use:
Detailed Example: Sales by Product Category
Scenario: Compare 2024 sales across 6 product categories.
Data:
| Category | Sales |
|---|---|
| Electronics | $450K |
| Clothing | $380K |
| Home & Garden | $320K |
| Sports | $280K |
| Books | $120K |
| Toys | $90K |
Column Chart Configuration:
What makes this effective:
⭐ Must Know: Bar vs Column:
What they are: Line charts connect data points with lines. Area charts fill the space below the line. Both are optimized for showing changes over continuous dimensions, especially time.
Why they exist: Lines show trends, patterns, and rate of change better than other visual types. Human eyes naturally follow lines to detect patterns.
When to use:
When NOT to use:
Detailed Example: Monthly Sales Trend
Scenario: Show sales trend for 2024 by month to identify seasonality.
Data:
| Month | Sales |
|---|---|
| Jan | $85K |
| Feb | $92K |
| Mar | $110K |
| Apr | $105K |
| May | $98K |
| Jun | $115K |
| Jul | $125K |
| Aug | $120K |
| Sep | $130K |
| Oct | $140K |
| Nov | $180K |
| Dec | $220K |
Line Chart Configuration:
Insights immediately visible:
Area Chart vs Line Chart:
💡 Pro Tip: For multiple time series, limit to 3-5 lines for readability. Use legend labels and consistent colors across reports.
What they are:
Why they exist: Sometimes users need exact values for reference, detailed drill-down, or to export data. Tables and matrices provide precision that charts don't.
Table vs Matrix Decision:
| Feature | Table | Matrix |
|---|---|---|
| Structure | Flat list | Grouped rows & columns |
| Subtotals | No (just grand total) | Yes (at group levels) |
| Column expansion | Fixed columns | Dynamic (can expand/collapse) |
| Use for | Detail records, lists | Aggregated summaries |
When to use Table:
When to use Matrix:
Detailed Example: Matrix for Sales Analysis
Scenario: Show sales by Product Category (rows) and Year (columns) with quarterly drill-down.
Matrix Configuration:
Result Structure:
2023 2024 Grand Total
Q1 Q2 Total Q1 Q2 Total
Electronics 50K 60K 110K 65K 70K 135K 245K
- Laptops 30K 35K 65K 40K 42K 82K 147K
- Phones 20K 25K 45K 25K 28K 53K 98K
Clothing 40K 45K 85K 48K 52K 100K 185K
Grand Total 90K 105K 195K 113K 122K 235K 430K
Why this works:
⚠️ Common Mistake: Using table when matrix needed
What they are: Single-value visuals that display one key metric prominently.
Card Visual:
KPI Visual:
Gauge Visual:
Detailed Example: Sales Dashboard KPIs
Scenario: Executive dashboard showing key sales metrics.
Card Visuals (4 across top):
KPI Visual (Sales Target):
Gauge Visual (Quota Achievement):
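Behind a gauge sits an ordinary ratio measure. A minimal sketch, assuming a hypothetical Targets table with a QuotaAmount column:
Quota Achievement % =
DIVIDE(
[Total Sales],
SUM(Targets[QuotaAmount])
)
Typically you then set the gauge minimum to 0, the maximum to 1 (or to the quota itself), and format the measure as a percentage.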
Configuration Best Practices:
The problem: Default visual formatting often doesn't align with corporate branding or doesn't emphasize the right information.
The solution: Power BI provides extensive formatting options for every visual type including colors, fonts, data labels, titles, and conditional formatting.
1. Colors and Themes
Theme Application:
Manual Color Override:
Color Best Practices:
2. Data Labels
What they are: Text labels showing exact values on visual elements (bars, lines, pie slices).
When to enable:
When to disable:
Configuration:
Example: Column chart with 6 categories:
Sales by Category
Electronics [$450K] ████████████████████
Clothing [$380K] ████████████████
Home/Garden [$320K] █████████████
Sports [$280K] ███████████
Books [$120K] ████
Toys [$90K] ███
Data labels ([$XXX]) make exact values clear without hovering.
3. Conditional Formatting
What it is: Automatically format visual elements based on values or rules (e.g., color high values green, low values red).
Available in:
Common Patterns:
Pattern 1: Traffic Light Colors (Background)
Sales Target Achievement:
- >= 100%: Green background
- 80-99%: Yellow background
- < 80%: Red background
Configuration (Matrix/Table):
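One way to implement the traffic-light rule is a color measure applied through the "Format by: Field value" option (a sketch; the [Target Achievement %] measure and the hex codes are assumptions, not fixed requirements):
Achievement Color =
SWITCH(
TRUE(),
[Target Achievement %] >= 1.00, "#2E7D32", // green: target met
[Target Achievement %] >= 0.80, "#F9A825", // yellow: 80-99%
"#C62828" // red: below 80%
)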
Pattern 2: Data Bars
Shows horizontal bars inside table cells proportional to value (like Excel's data bars).
Use for: Quick visual comparison within table rows
Pattern 3: Icon Sets
Shows icons (arrows, shapes, flags) based on value ranges.
Example: Trend indicators
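If the built-in icon sets don't fit, you can roll your own indicator with a measure that returns an arrow character (a sketch; assumes a [YoY Growth %] measure exists):
Trend Indicator =
SWITCH(
TRUE(),
[YoY Growth %] > 0, UNICHAR(9650), // ▲ improving
[YoY Growth %] < 0, UNICHAR(9660), // ▼ declining
UNICHAR(9644) // ▬ flat
)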
💡 Pro Tip: Combine multiple conditional formats
Slicers: Visual filters that users can click to filter report data.
Filter Pane: Behind-the-scenes filters at visual, page, or report level.
Slicer Types:
| Type | Best For | Example |
|---|---|---|
| List | 5-20 options | Product categories |
| Dropdown | >20 options | Customer list |
| Numeric Range | Continuous numbers | Price range, Age |
| Date Range | Date filtering | Order date range |
| Relative Date | Dynamic dates | Last 30 days, YTD |
| Hierarchy | Drill-down filtering | Region → State → City |
Slicer Configuration Best Practices:
Enable Multi-Select (if appropriate):
Show "Select All" (for list slicers):
Responsive Design:
Sync Slicers Across Pages:
Detailed Example: Sales Report Slicers
Page 1: Overview Dashboard
Slicers present:
Page 2: Product Detail
Same slicers, synced from Page 1:
Benefit: User selects "2024, Electronics, West" on Page 1, navigates to Page 2, sees filtered detail automatically.
Filter Levels:
Visual-level filter:
Page-level filter:
Report-level filter:
⚠️ Common Mistake: Too many slicers
What they are: Bookmarks capture the current state of a report page (filter selections, visual visibility, spotlight) and let you return to that state later via buttons or bookmark pane.
Why they exist: Enable interactive storytelling, create navigation menus, show/hide visuals, and build guided analytical experiences.
Common Use Cases:
1. Show/Hide Visuals (Toggle)
Create buttons that show/hide different visual sets for different analysis perspectives.
Example: Sales Analysis with Two Views
Setup:
User experience: Click "Show Table" → Charts disappear, table appears
2. Story Navigation
Guide users through analytical narrative with previous/next buttons.
Example: Monthly Sales Story
Setup:
3. Reset Filters
Clear all slicers with one button click.
Setup:
Bookmark Properties:
| Property | Captures | Use Case |
|---|---|---|
| Data | Filter states, slicer values | Reset filters, apply specific filter sets |
| Display | Visual visibility, spotlight | Show/hide visuals, focus on specific visual |
| Current Page | Which page is active | Navigate between pages |
📊 Bookmark Navigation Flow:
graph LR
START[Landing Page] --> BTN1{User clicks<br/>'View Charts'}
START --> BTN2{User clicks<br/>'View Table'}
BTN1 --> BM1[Bookmark: Chart View<br/>Charts: Visible<br/>Table: Hidden]
BTN2 --> BM2[Bookmark: Table View<br/>Charts: Hidden<br/>Table: Visible]
BM1 --> DISPLAY1[Display:<br/>Column Chart<br/>Line Chart]
BM2 --> DISPLAY2[Display:<br/>Detailed Table]
DISPLAY1 --> BTN2
DISPLAY2 --> BTN1
style BM1 fill:#c8e6c9
style BM2 fill:#e1f5fe
See: diagrams/04_domain3_bookmark_navigation.mmd
⚠️ Common Mistakes:
What it is: Right-click on a data point in one visual, select "Drill through" to navigate to a detail page filtered to that specific data point.
Why it exists: Users need to go from summary to detail without cluttering the summary page with details.
Setup Requirements:
Detailed Example: Sales Summary to Customer Detail
Page 1: Sales Overview (summary page)
Page 2: Customer Detail (drill-through target page)
User Flow:
Why this is powerful:
Advanced: Multiple Drill-Through Fields
Add Region AND Customer to drill-through fields → Page filters to both Region AND Customer when user drills through.
Drill-Through Filters:
Can add additional filters to drill-through page:
💡 Pro Tip: Keep drill-through pages hidden from normal navigation (right-click page tab → Hide page). Users only access via drill-through, keeping report navigation clean.
What they are: Custom report pages that appear as tooltips when hovering over visuals.
Why they exist: Default tooltips only show field name and value. Custom tooltips can show charts, multiple metrics, formatted layouts.
Setup:
Detailed Example: Product Tooltip
Main Page: Bar chart showing Sales by Product Category
Tooltip Page (named: "Product Tooltip"):
User Experience:
Benefit: Rich context without cluttering main visual or requiring clicks.
Tooltip Fields:
Add fields to "Tooltip fields" well → Tooltip automatically filters to those values when shown.
Example: Add Product Category to tooltip fields → When hovering over Electronics, tooltip filters to Electronics.
⚠️ Common Mistake: Tooltip too large or complex
What it is: AI visual that analyzes your data to find factors that influence a target metric (increase/decrease, or classification).
Why it exists: Automatically discovers what drives changes in KPIs without manual analysis.
Use Cases:
Configuration:
Example: What Increases Sales?
Data: Sales transactions with Product, Region, Season, Discount Level, Sales Amount
Key Influencers Configuration:
Results Shown:
How it works: AI tests combinations of dimension values to find statistically significant correlations with target metric.
💡 Pro Tip: Requires enough data for statistical significance (typically 100+ rows minimum).
What it is: AI-powered hierarchical breakdown visual that shows how a metric decomposes across dimensions.
Why it exists: Lets users interactively drill down to find where values are concentrated.
Use Cases:
Configuration:
User Interaction:
Example: Sales Decomposition
Analyze: Sum of Sales ($1.2M total)
Level 1: Split by Region
Level 2: User expands West, splits by Product
Level 3: User expands Electronics, splits by Customer Segment
Insight: West region, Electronics category, Enterprise segment is the highest revenue path ($120K).
AI High/Low Analysis:
Select node → Choose "High value" split → AI automatically expands dimension with highest value.
What it is: Natural language query visual where users type questions and get visual answers.
Why it exists: Democratizes data access - users don't need to know DAX or visual creation.
How it works:
Setup:
Example Questions:
Teaching Q&A:
💡 Pro Tip: Add Q&A visual to executive dashboards for ad-hoc exploration.
✅ Visual Types and Selection
✅ Formatting and Customization
✅ Interactive Features
✅ Advanced Interactivity
✅ AI-Powered Analytics
Try these from your practice test bundles:
Visual Selection:
Interactivity Levels:
AI Visuals:
Next Steps: Proceed to 05_domain4_manage_secure to learn workspace management, sharing, security (RLS), and governance.
Power BI supports custom visuals from AppSource, but the exam focuses on knowing when built-in visuals are insufficient.
Built-in Visuals (Always Prefer These):
When You Need Custom Visuals:
Exam Tip: Questions asking "which visual should you use?" will have answers using BUILT-IN visuals. Don't overthink it.
The matrix visual is powerful but complex. Understanding its features is critical for the exam.
Matrix vs Table:
| Feature | Table | Matrix |
|---|---|---|
| Rows | Flat list | Hierarchical groups |
| Columns | Fixed | Dynamic (can pivot) |
| Subtotals | No | Yes |
| Expand/Collapse | No | Yes |
| Drill down | No | Yes |
| Use case | Detail records | Aggregated analysis |
Example Business Scenario: Sales by Region > Store > Product
Table visual would show:
Region | Store | Product | Sales
West | S1 | P1 | 100
West | S1 | P2 | 150
West | S2 | P1 | 120
...
Flat list, no grouping, no subtotals.
Matrix visual would show:
+ West $5,270
+ Store 1 $2,250
- Product 1 $100
- Product 2 $150
+ Store 2 $3,020
- Product 1 $120
...
+ East $4,830
Hierarchical with expand/collapse and subtotals.
Advanced Matrix Features:
1. Stepped Layout:
2. Conditional Formatting on Matrix:
Background color by value:
Data bars in cells:
Icons for indicators:
3. Show Values As:
Instead of absolute values, show:
Example:
% of Total Sales =
DIVIDE(
[Total Sales],
CALCULATE([Total Sales], ALL(Products))
)
In matrix, this shows each product's contribution to total.
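A closely related pattern (sketch) swaps ALL for ALLSELECTED so the percentage is relative to whatever the user has sliced to, rather than the grand total of every product:
% of Selected Total =
DIVIDE(
[Total Sales],
CALCULATE([Total Sales], ALLSELECTED(Products))
)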
4. Drill-Through from Matrix:
Right-click any cell → Drill through to detail page showing:
Performance Analyzer is critical for identifying slow visuals. The exam tests your knowledge of interpreting and fixing performance issues.
Using Performance Analyzer:
Reading Performance Analyzer Results:
Each visual shows three timings:
Total Time = DAX + Display + Other
Example Results:
Sales by Category (Column Chart)
├─ DAX query: 2,450 ms ⚠️ SLOW
├─ Visual display: 120 ms ✓ OK
└─ Other: 45 ms ✓ OK
Total: 2,615 ms
Top 10 Products (Table)
├─ DAX query: 85 ms ✓ FAST
├─ Visual display: 1,850 ms ⚠️ SLOW
└─ Other: 30 ms ✓ OK
Total: 1,965 ms
Diagnosis and Fixes:
Problem: DAX query slow (>2 seconds)
Causes:
Solutions:
Problem: Visual display slow (>1 second)
Causes:
Solutions:
Problem: Both DAX and Display slow
Causes:
Solutions:
Example Optimization Workflow:
Before:
After:
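As an illustration of what such a rewrite can look like (hypothetical measures, assuming a Sales fact table related to a Territory dimension):
// Before: FILTER iterates the entire fact table row by row
West Sales Slow =
SUMX(
FILTER(Sales, RELATED(Territory[Region]) = "West"),
Sales[Quantity] * Sales[UnitPrice]
)
// After: filter the small dimension column instead
West Sales Fast =
CALCULATE(
SUMX(Sales, Sales[Quantity] * Sales[UnitPrice]),
Territory[Region] = "West"
)
Re-run Performance Analyzer after each change to confirm the DAX query time actually dropped.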
Power BI reports can have separate mobile layouts. Understanding when and how to create them is tested on the exam.
When to Create Mobile Layout:
When NOT Needed:
Mobile Layout Best Practices:
1. Visual Priority:
2. Visual Types for Mobile:
3. Interaction Design:
4. Phone vs Tablet:
Example Mobile Layout Structure:
Desktop layout (3 columns, 8 visuals):
[KPI Card] [KPI Card] [KPI Card]
[Trend Chart spanning 2 columns] [Slicer]
[Table spanning 3 columns]
[Map] [Category Chart]
Mobile layout (1 column, 5 visuals):
[KPI Card - Sales]
[Trend Chart]
[Slicer - Year]
[Category Chart]
[Top 5 Products Table]
Removed: Low-priority visuals (map, 2 KPI cards)
Simplified: Table shows only top 5 instead of all
Understanding filter hierarchy and interactions is critical for exam scenarios.
Filter Levels (from broadest to most specific):
Report-level filters
Page-level filters
Visual-level filters
Drill-through filters
Example Scenario: Sales Dashboard
Report Filter (affects all pages):
Page 1 Filter (overview page):
Page 2 Filter (regional analysis page):
Visual Filter (top products chart on Page 1):
Result:
Filter Interactions:
When you click a data point in one visual, it filters other visuals on the page. You can control this behavior.
Interaction Types:
Filter (default for most visuals)
Highlight
None
Example Configuration:
Page with 3 visuals:
Scenario: User clicks "West" in region slicer
Option A (both set to Filter):
Option B (Category=Highlight, Trend=Filter):
Option C (Category=None, Trend=None):
How to Configure:
Common Exam Scenario:
"Users should be able to select a product category without affecting the sales trend chart. What should you do?"
Answer:
Bookmarks capture the state of a report page and enable sophisticated navigation patterns.
What Bookmarks Capture:
Common Bookmark Patterns:
Pattern 1: View Switcher
Create "Chart View" and "Table View" buttons:
Setup:
Result: Click buttons to toggle between views
Pattern 2: Presets
Create "YTD View", "Last Month", "Last Year" buttons:
Setup:
Result: One-click to switch time periods
Pattern 3: Story Telling
Create presentation mode with "Next" button:
Setup:
Result: Guided tour through the data
Pattern 4: Reset Filters
Create "Clear All" button:
Setup:
Result: One-click to clear all user selections
Bookmark Settings:
Each bookmark can be configured:
Example Configuration:
| Bookmark | Data | Display | Use Case |
|---|---|---|---|
| Chart View | ❌ | ✅ | Toggle visuals, keep filters |
| YTD Filter | ✅ | ❌ | Change filters, keep visuals |
| Full Reset | ✅ | ✅ | Reset everything |
Button Actions:
Buttons can have multiple actions:
Exam Tip: "Users need to quickly switch between chart and table views" → Use bookmarks with button actions
Power BI includes AI visuals that use machine learning. The exam tests when to use each one and how to interpret results.
Allows users to ask questions in natural language and get automatic visualizations.
When to Use:
How It Works:
Configuring Q&A:
Teach Q&A synonyms:
Add featured questions:
Example Questions That Work Well:
Questions That May Fail:
Exam Tip: Q&A requires well-modeled data with proper relationships and synonyms defined.
Analyzes what factors influence a metric (increase or decrease).
When to Use:
What It Shows:
Increase/Decrease tab:
Top Segments tab:
Example Business Question: "What increases sales?"
Key Influencers Results:
When Category is Electronics, sales are 1.5x higher
When Region is West, sales are 1.3x higher
When Discount > 10%, sales are 1.2x higher
Top Segments Results:
Segment 1: Electronics + West = $250K avg (500 customers)
Segment 2: Computers + Enterprise = $220K avg (300 customers)
Segment 3: Electronics + Discount >10% = $210K avg (800 customers)
How to Configure:
Requirements:
Exam Scenario:
"Management wants to understand what drives high revenue. Which visual should you use?"
Answer: Key Influencers visual
Shows hierarchical breakdown of a measure, letting users explore paths interactively.
When to Use:
How It Works:
Key Difference from Matrix:
AI Features (when enabled):
High Value: Automatically highlight highest value branches
Low Value: Automatically highlight lowest value branches
These use AI to find significant splits automatically.
Example Setup:
Analyze: Total Sales
Explain by: Region, Category, Product, Customer Type, Sales Rep, Month
User Flow:
Total Sales: $1M
├─ [User picks Region]
├─ West: $600K
│ ├─ [User picks Category]
│ ├─ Electronics: $350K
│ │ ├─ [User picks Product]
│ │ ├─ Laptop: $200K
│ │ └─ Phone: $150K
│ └─ Computers: $250K
└─ East: $400K
Exam Tip: Decomposition tree = user-driven drill path. Matrix = fixed hierarchy.
Automatically generate text summaries of visual data using AI.
When to Use:
What It Shows:
Example narrative for sales visual:
In 2024, total sales reached $1.2M, representing a 15% increase compared to 2023.
The West region was the top performer with $600K in sales, driven primarily by
Electronics category which contributed 45% of total revenue. The strongest month
was December with $150K in sales, 25% higher than the average month.
How It Works:
Customization:
You can edit the narrative to:
Dynamic Values:
Insert measure values that update with filters:
Sales this year are [Total Sales], which is [YoY Growth %] compared to last year.
When user filters to 2024, it shows:
"Sales this year are $1.2M, which is +15% compared to last year."
Exam Tip: Smart narratives update automatically with filter context.
See diagrams/04_domain3_conditional_formatting.mmd for formatting options.
See diagrams/04_domain3_slicer_sync.mmd for slicer synchronization.
See diagrams/04_domain3_drill_through.mmd for drill-through flow.
What it is: Custom visuals are specialized visualizations beyond Power BI's standard set, either downloaded from AppSource marketplace or created using the Power BI Visuals SDK. They extend visualization capabilities for specific use cases.
Why it exists: Power BI's built-in visuals cover common scenarios, but specialized industries or unique requirements need custom solutions. For example, healthcare needs patient journey maps, logistics needs route optimization visuals, and finance needs advanced statistical charts. Custom visuals fill these gaps.
Real-world analogy: Standard visuals are like pre-built furniture from IKEA - they work for most people. Custom visuals are like hiring a carpenter to build exactly what you need for your unique space. More effort, but perfect fit.
How it works (Detailed step-by-step):
📊 Custom Visual Integration Flow:
sequenceDiagram
participant User
participant PowerBI
participant AppSource
participant Visual
participant Data
User->>PowerBI: Insert → More Visuals
PowerBI->>AppSource: Browse Marketplace
AppSource-->>PowerBI: Available Visuals List
User->>AppSource: Select & Add Visual
AppSource->>PowerBI: Download Visual Package
PowerBI->>Visual: Load Visual SDK
User->>Visual: Map Data Fields
Visual->>Data: Query Filtered Data
Data-->>Visual: Return Dataset
Visual->>Visual: Execute Rendering Logic
Visual-->>PowerBI: Display Chart
PowerBI-->>User: Show Visual
See: diagrams/04_domain3_custom_visual_flow.mmd
Diagram Explanation: This sequence diagram shows the complete flow of adding and using a custom visual. The process starts with the user selecting "More Visuals" in Power BI, which queries the AppSource marketplace for available visuals. After the user selects a visual, it's downloaded as a package and loaded using the Visual SDK. The user then maps data fields to the visual's requirements. When rendering, the visual queries the filtered dataset from Power BI's data model, receives the data, executes its custom rendering logic (often using D3.js or similar frameworks), and displays the result back to the user. This architecture allows third-party developers to extend Power BI's visualization capabilities while maintaining security through Microsoft's certification process.
Detailed Example 1: Using Gantt Chart for Project Management
Scenario: You need to visualize project tasks with start dates, durations, and dependencies - something standard visuals can't do well.
Step-by-step implementation:
Add visual: Insert → More visuals → Search "Gantt" → Add Gantt chart by MAQ Software
Prepare data: Ensure you have these columns:
Map fields:
Configure formatting:
What you get: A visual showing:
When to use: Project tracking, production scheduling, event planning, resource allocation visualization.
Detailed Example 2: Sankey Diagram for Flow Analysis
Scenario: You want to show how customers move through your sales funnel stages with drop-off visualization.
Why Sankey works: Standard visuals show stage counts but not flow between stages. Sankey shows the actual customer journey with proportional flows.
Data structure needed:
Source | Target | Value
------------|-------------|-------
Visit | Sign Up | 10000
Visit | Bounce | 5000
Sign Up | Trial | 7000
Sign Up | Abandoned | 3000
Trial | Purchase | 4000
Trial | Expired | 3000
Implementation:
Result: Visual showing:
Business insight: Immediately see that 33% bounce at the visit stage (5,000 of 15,000 visits), 30% abandon after signup, and 57% of trials convert to purchase.
Detailed Example 3: R/Python Visuals for Statistical Analysis
What it is: Embed R or Python scripts directly in Power BI visuals, allowing advanced statistical visualizations not available in standard visuals.
Setup requirements:
Example - Python Box Plot for Outlier Detection:
Scenario: You have sales data and want to identify outlier transactions per region using a box plot.
Python script visual:
import matplotlib.pyplot as plt
import pandas as pd
# Power BI passes filtered data as 'dataset'
df = dataset
# Create box plot
plt.figure(figsize=(12, 6))
df.boxplot(column='SalesAmount', by='Region', figsize=(12,6))
plt.suptitle('Sales Distribution by Region')
plt.xlabel('Region')
plt.ylabel('Sales Amount ($)')
plt.xticks(rotation=45)
plt.show()
What Power BI does:
Advantages:
Limitations:
When to use:
What you'll learn:
Time to complete: 6-8 hours
Prerequisites: Chapters 0-3 (Fundamentals, Data Prep, Modeling, Visualization)
What they are: Workspaces are containers in Power BI Service that hold related content (reports, dashboards, datasets, dataflows). Think of them as project folders in the cloud.
Why they exist: Organization needs collaborative spaces where teams can build, share, and manage BI content together with appropriate access control.
Workspace Types:
| Type | Use Case | Licensing | Collaboration |
|---|---|---|---|
| My Workspace | Personal sandbox | Free/Pro | Individual only |
| Workspace (Modern) | Team collaboration | Pro or Premium | Multiple users, roles |
Workspace Roles:
| Role | Can View | Can Edit | Can Publish | Can Manage Users | Can Delete Workspace |
|---|---|---|---|---|---|
| Viewer | ✅ | ❌ | ❌ | ❌ | ❌ |
| Contributor | ✅ | ✅ | ✅ | ❌ | ❌ |
| Member | ✅ | ✅ | ✅ | ✅ | ❌ |
| Admin | ✅ | ✅ | ✅ | ✅ | ✅ |
Detailed Role Permissions:
Viewer:
Contributor:
Member:
Admin:
📊 Workspace Collaboration Flow:
graph TD
ADMIN[Admin Creates Workspace] --> ADD[Adds Team Members]
ADD --> ASSIGN{Assigns Roles}
ASSIGN --> VIEWER[Viewer:<br/>Consumes reports]
ASSIGN --> CONTRIB[Contributor:<br/>Creates content]
ASSIGN --> MEMBER[Member:<br/>Manages users]
CONTRIB --> PUBLISH[Publishes<br/>from Desktop]
PUBLISH --> DATASET[Dataset in Workspace]
PUBLISH --> REPORT[Report in Workspace]
REPORT --> APP[Package as App]
APP --> DISTRIB[Distribute to<br/>End Users]
style ADMIN fill:#f3e5f5
style DATASET fill:#e1f5fe
style REPORT fill:#c8e6c9
style APP fill:#fff3e0
See: diagrams/05_domain4_workspace_flow.mmd
From Power BI Desktop to Service:
Step-by-step publish process:
What gets published:
After Publishing - Required Steps:
1. Configure Data Source Credentials (for cloud sources):
2. Setup Scheduled Refresh (for Import mode):
3. Configure Gateway (for on-premises sources):
Method 1: Apps (Recommended for End Users)
What it is: Apps package related dashboards and reports into a single, easily discoverable unit distributed to large audiences.
Why use apps:
Creating an App:
App vs Direct Sharing:
| Feature | App | Direct Share |
|---|---|---|
| Audience | Hundreds/thousands | <100 users |
| Navigation | Custom menu | Standard Power BI navigation |
| Updates | One app update | Must reshare |
| Workspace access | Not required | Viewer role needed |
| Best for | External distribution | Team collaboration |
Method 2: Direct Sharing
What it is: Share individual report/dashboard with specific users by email.
How to share:
Permissions granted:
⚠️ Common Mistake: Sharing without considering licensing
Method 3: Publish to Web (Public)
What it is: Generate embed code to publish report publicly on internet (no authentication).
When to use:
When NOT to use:
⚠️ Critical Security Warning: Publish to Web makes data publicly accessible. Anyone with link can view. Use only for truly public data.
Method 4: Embed in Applications
What it is: Embed Power BI reports in custom applications using iFrame or JavaScript SDK.
Types:
What it is: Row-Level Security restricts which rows users can see in a dataset based on their identity. Same report shows different data to different users.
Why it exists: Enable secure data sharing where users should only see their own data (e.g., sales reps see only their sales, regional managers see only their region).
How it works:
In Power BI Desktop:
Step 1: Create Role
Common Filter Patterns:
Pattern 1: Filter by User Email
[SalesPersonEmail] = USERPRINCIPALNAME()
Shows only rows where SalesPersonEmail matches logged-in user's email.
Pattern 2: Filter by User in Related Table
[Region] IN
CALCULATETABLE(
VALUES(UserRegions[Region]),
UserRegions[Email] = USERPRINCIPALNAME()
)
Uses lookup table (UserRegions) mapping users to regions.
Pattern 3: Manager Hierarchy
PATHCONTAINS(
Employee[ManagerPath],
LOOKUPVALUE(Employee[EmployeeID], Employee[Email], USERPRINCIPALNAME())
)
Shows data for user and all subordinates in reporting hierarchy.
Step 2: Test Role in Desktop
Example: Test as "Sales_Region" role with "john@contoso.com"
Step 3: Publish to Service
📊 RLS Flow Diagram:
sequenceDiagram
participant User
participant Service as Power BI Service
participant Dataset
participant RLS as RLS Engine
User->>Service: Opens report
Service->>RLS: Who is this user?
RLS->>RLS: Check user's role assignments
RLS->>Dataset: Apply DAX filter for user's role
Dataset->>Dataset: Filter rows
Dataset-->>Service: Return filtered data
Service-->>User: Display report<br/>(only user's data)
Note over RLS,Dataset: Filter applied<br/>automatically
See: diagrams/05_domain4_rls_flow.mmd
Testing RLS in Service:
Advanced RLS Scenarios:
Dynamic RLS with Table:
Create UserRoles table:
| Email | Region |
|---|---|
| john@contoso.com | West |
| jane@contoso.com | East |
| admin@contoso.com | ALL |
Role filter DAX:
[Region] =
IF(
LOOKUPVALUE(UserRoles[Region], UserRoles[Email], USERPRINCIPALNAME()) = "ALL",
[Region], // No filter for ALL
LOOKUPVALUE(UserRoles[Region], UserRoles[Email], USERPRINCIPALNAME())
)
Multiple Roles:
User can be assigned to multiple roles → Filters combine with OR logic (sees union of all role filters).
⭐ Must Know: RLS Best Practices:
Build Permission:
What it is: Permission that allows users to create new reports connected to a shared dataset.
Why it matters: Separates dataset governance (one published dataset) from report creation (multiple analysts creating custom reports).
How to grant Build permission:
What Build permission allows:
What Build permission does NOT allow:
Scenario: Central BI team publishes certified sales dataset. Sales analysts get Build permission → They create custom reports for their needs using trusted dataset.
What they are: Classification labels (e.g., Public, Internal, Confidential, Highly Confidential) applied to Power BI content to indicate sensitivity level.
Why they exist: Compliance and data governance requirements mandate classifying and protecting sensitive data.
Requirements:
How to apply:
Label Inheritance:
What labels do:
Example Labels:
What it is: On-premises data gateway is software installed on-premises that enables secure data transfer between on-premises data sources and Power BI Service.
Why it exists: Corporate data often resides on-premises (SQL Server, file shares, legacy systems). Gateway provides secure bridge without opening firewall or moving data permanently to cloud.
Gateway Types:
1. On-premises data gateway (Standard)
2. On-premises data gateway (Personal mode)
Gateway Architecture:
graph LR
PBI[Power BI Service<br/>Cloud] <-->|Encrypted<br/>Outbound Only| GW[On-Premises<br/>Gateway]
GW <--> SQL[(SQL Server)]
GW <--> FILE[File Share]
GW <--> SAP[SAP System]
style PBI fill:#e1f5fe
style GW fill:#c8e6c9
style SQL fill:#fff3e0
See: diagrams/05_domain4_gateway_architecture.mmd
Gateway Installation & Configuration:
Prerequisites:
Installation Steps:
Adding Data Sources to Gateway:
Using Gateway in Dataset:
⚠️ Common Gateway Issues:
Issue 1: Gateway offline
Issue 2: Authentication failure
Issue 3: Firewall blocking
✅ Workspaces and Collaboration
✅ Content Distribution
✅ Row-Level Security (RLS)
✅ Permissions and Access Control
✅ Data Protection and Governance
✅ Gateway Management
Try these from your practice test bundles:
Workspace Roles:
Distribution Methods:
RLS Key Functions:
Gateway:
Permissions Hierarchy:
Next Steps: Proceed to 06_integration to learn how concepts from all domains integrate in real-world scenarios and cross-domain problem solving.
Row-Level Security (RLS) is one of the most tested topics. Understanding dynamic and complex RLS scenarios is critical.
The most common enterprise pattern uses a separate user mapping table.
Scenario: Regional sales managers can only see their assigned regions.
Setup:
Step 1: Create UserRegions table (in database or manually)
| UserEmail | Region |
|---|---|
| john@company.com | West |
| sarah@company.com | East |
| mike@company.com | West |
| admin@company.com | All |
Step 2: Load this table into Power BI model
Step 3: Create relationship (or not, depending on approach)
Approach A: With Relationship
Create relationship: UserRegions[Region] → Sales[Region]
RLS role "Regional Managers":
[UserEmail] = USERPRINCIPALNAME()
Filter on UserRegions table only.
How it works:
Approach B: Without Relationship (More flexible)
No relationship between UserRegions and Sales.
RLS role "Regional Managers":
VAR CurrentUser = USERPRINCIPALNAME()
VAR UserRegion =
CALCULATETABLE(
VALUES(UserRegions[Region]),
UserRegions[UserEmail] = CurrentUser
)
RETURN
[Region] IN UserRegion
Filter on Sales table.
How it works:
Example with Multiple Regions:
UserRegions table:
| UserEmail | Region |
|---|---|
| john@company.com | West |
| john@company.com | South |
| sarah@company.com | East |
John sees West AND South. Sarah sees only East.
Scenario: Managers see their direct reports' sales plus their own.
Setup:
Employees table:
| EmployeeID | Name | ManagerID | Email |
|---|---|---|---|
| 1 | Alice | NULL | alice@company.com |
| 2 | Bob | 1 | bob@company.com |
| 3 | Carol | 1 | carol@company.com |
| 4 | Dave | 2 | dave@company.com |
Hierarchy: Alice manages Bob and Carol. Bob manages Dave.
Sales table has EmployeeID (who made the sale).
RLS DAX (on Sales table):
First, add a calculated column on Employees that stores each employee's management chain:
ManagerPath = PATH(Employees[EmployeeID], Employees[ManagerID])
Then the role filter on the Sales table becomes:
VAR CurrentUserID =
LOOKUPVALUE(
Employees[EmployeeID],
Employees[Email],
USERPRINCIPALNAME()
)
VAR VisibleEmployees =
CALCULATETABLE(
VALUES(Employees[EmployeeID]),
FILTER(
ALL(Employees),
PATHCONTAINS(Employees[ManagerPath], CurrentUserID)
)
)
RETURN
[EmployeeID] IN VisibleEmployees
PATHCONTAINS matches the employee's own ID as well, so this covers the user's own sales plus every subordinate at any depth.
Simpler version (one level only):
VAR CurrentUserID =
LOOKUPVALUE(
Employees[EmployeeID],
Employees[Email],
USERPRINCIPALNAME()
)
RETURN
[EmployeeID] = CurrentUserID ||
RELATED(Employees[ManagerID]) = CurrentUserID
Result:
Users can belong to multiple roles. Filters combine with OR logic.
Example Setup:
Role 1: "West Region"
[Region] = "West"
Role 2: "Product Managers"
[Category] = "Electronics"
User assigned to BOTH roles:
Best Practice: Use single comprehensive role instead:
Role: "West Product Managers"
[Region] = "West" && [Category] = "Electronics"
OR use user mapping table approach for better control.
Scenario: Students can see data for their enrolled classes.
Tables:
RLS Setup:
On Enrollment table:
VAR CurrentStudent =
LOOKUPVALUE(
Students[StudentID],
Students[Email],
USERPRINCIPALNAME()
)
RETURN
[StudentID] = CurrentStudent
Relationships:
Result: Student sees only their enrolled classes and grades.
In Power BI Desktop:
View shows:
In Power BI Service:
Two approaches:
Approach 1: Test Users
Approach 2: Built-in Testing
Common Test Cases:
| Test | Verify |
|---|---|
| User in single role | Sees only allowed data |
| User in multiple roles | Sees OR of both roles |
| User in no roles | Sees all data (or nothing, if "Everyone" role exists) |
| Admin (no RLS) | Sees all data |
| Non-existent user email | Error or no data |
RLS Performance Testing:
RLS can significantly impact performance:
Both of the following express the same filter - a variable version and an inline version:
VAR UserRegion =
LOOKUPVALUE(
UserRegions[Region],
UserRegions[Email],
USERPRINCIPALNAME()
)
RETURN
[Region] = UserRegion
Inline equivalent:
[Region] =
LOOKUPVALUE(
UserRegions[Region],
UserRegions[Email],
USERPRINCIPALNAME()
)
Either way the filter resolves to a simple column comparison, which the engine handles efficiently. Performance problems typically come from role filters that iterate the fact table (for example FILTER over millions of rows) or from deeply nested lookups; keep the filter on a small mapping or dimension table whenever possible. For DirectQuery sources, also ensure the source database has indexes on the filtered columns, because the RLS filter is pushed into the source query.
❌ WRONG: Testing as yourself without role assigned
❌ WRONG: Filtering dimension table instead of fact table
❌ WRONG: Using USERNAME() instead of USERPRINCIPALNAME()
❌ WRONG: Multiple overlapping roles without planning
✅ CORRECT:
Understanding refresh capabilities and gateway configuration is essential for managing production reports.
Import Mode Refresh:
Characteristics:
Refresh Limits (Free/Pro license):
Refresh Limits (Premium/PPU):
Manual Refresh:
DirectQuery (no import refresh):
Live Connection (no import refresh):
For large datasets, refreshing everything takes too long. Incremental refresh only refreshes recent data.
When to Use:
How It Works:
Example: Sales table with 10 years of history
Without incremental refresh:
With incremental refresh:
Configuration:
Step 1: Create RangeStart and RangeEnd parameters
In Power Query:
RangeStart = #datetime(2024, 1, 1, 0, 0, 0) meta [IsParameterQuery=true, Type="DateTime"]
RangeEnd = #datetime(2025, 1, 1, 0, 0, 0) meta [IsParameterQuery=true, Type="DateTime"]
Step 2: Filter query using parameters
= Table.SelectRows(
Sales,
each [OrderDate] >= RangeStart and [OrderDate] < RangeEnd
)
Step 3: Configure incremental refresh policy
In Desktop:
Step 4: Publish to Service
Service takes over, applies policy automatically.
Result:
Requirements:
Exam Scenario:
"Sales table has 50 million rows covering 10 years. Daily refresh takes 6 hours and fails. What should you do?"
Answer: Configure incremental refresh
Gateways enable Power BI Service to access on-premises data sources.
Gateway Architecture:
Power BI Service (cloud)
↕ (encrypted connection)
Gateway (on-premises server)
↕ (local network)
Data Source (SQL Server, file share, etc.)
Installation Requirements:
Server:
Network:
Account:
Gateway Installation Steps:
Gateway Configuration:
Add Data Sources:
Add Users:
Testing:
Single gateway = single point of failure. Clusters provide redundancy.
Cluster Setup:
Primary Gateway:
Add Cluster Members:
Load Balancing:
Exam Tip: High availability = Use gateway cluster.
Issue: Refresh fails with "Can't reach data source"
Diagnosis:
Solutions:
Issue: Refresh very slow (takes hours)
Diagnosis:
Solutions:
Issue: "The credentials provided for the X source are invalid"
Solutions:
These data sources don't require a gateway:
Azure Services:
Cloud Services:
When Gateway IS Required:
Exam Tip: "Company wants to eliminate gateway dependency" → Migrate to Azure SQL Database or other cloud sources.
Sensitivity labels classify and protect data based on sensitivity level.
Label Hierarchy:
Typical organization labels:
What Labels Do:
Applying Labels in Power BI:
Option 1: Manual
Option 2: Automatic (Premium)
Option 3: Recommended
Label Inheritance:
Hierarchy:
Example:
Audit and Compliance:
With sensitivity labels, admins can:
Requirements:
Exam Scenario:
"Finance team's reports contain sensitive financial data. Reports should be marked confidential and encrypted. What should you do?"
Answer:
Power BI deployment pipelines automate moving content between Development, Test, and Production environments.
Pipeline Stages:
Development:
Test:
Production:
Deployment Process:
What Gets Deployed:
What Doesn't Get Deployed:
Deployment Rules:
Configure per stage to change parameters:
Example: Database connection differs per environment
Development:
Production:
Deployment rule:
When deploying Dev → Prod, rule automatically updates connection string.
Requirements:
Exam Scenario:
"Company wants to test reports before releasing to users. Reports use different databases in test vs production. What should you do?"
Answer:
See diagrams/05_domain4_workspace_flow.mmd for workspace collaboration.
See diagrams/05_domain4_rls_flow.mmd for RLS evaluation.
See diagrams/05_domain4_gateway_architecture.mmd for gateway architecture.
What it is: Dynamic RLS uses DAX functions like USERNAME() or USERPRINCIPALNAME() to automatically filter data based on who is viewing the report, without needing to create separate roles for each user.
Why it exists: Imagine a company with 500 salespeople, each needing to see only their own data. Creating 500 static RLS roles is impractical. Dynamic RLS solves this by using a single role with a formula that reads the current user's identity and filters accordingly.
Real-world analogy: It's like a hotel key card system. Instead of creating a unique key for each guest that only opens their specific room (500 static roles), you use smart cards that read the guest's ID and automatically grant access to their assigned room (1 dynamic role). Same result, dramatically simpler management.
How it works (Detailed step-by-step):
[UserEmail] = USERNAME() or [UserEmail] = USERPRINCIPALNAME()
📊 Dynamic RLS Architecture Diagram:
graph TB
subgraph "Data Model"
SEC[User_Security Table<br/>UserEmail | Territory<br/>john@co.com | West<br/>jane@co.com | East]
SALES[Sales Table<br/>Territory | Amount<br/>West | $1000<br/>East | $800]
SEC -->|Relationship| SALES
end
subgraph "RLS Role: 'Sales Rep'"
RULE[DAX Filter:<br/>User_Security UserEmail<br/>= USERNAME]
end
subgraph "User: john@co.com Views Report"
U1[USERNAME Returns:<br/>john@co.com] --> F1[Filter Applied:<br/>UserEmail = john@co.com]
F1 --> R1[Sees Only:<br/>West Territory<br/>$1000]
end
subgraph "User: jane@co.com Views Report"
U2[USERNAME Returns:<br/>jane@co.com] --> F2[Filter Applied:<br/>UserEmail = jane@co.com]
F2 --> R2[Sees Only:<br/>East Territory<br/>$800]
end
style SEC fill:#e1f5fe
style RULE fill:#fff3e0
style R1 fill:#c8e6c9
style R2 fill:#c8e6c9
See: diagrams/05_domain4_dynamic_rls.mmd
Diagram Explanation: This diagram illustrates dynamic RLS in action. At the top is the data model with a User_Security table (containing user-to-territory mappings in blue) related to the Sales table. The middle shows a single RLS role with a DAX filter using USERNAME(). The bottom two sections demonstrate what happens when different users view the same report. When john@co.com views the report, USERNAME() returns "john@co.com", which filters the User_Security table to the West territory row, which in turn filters the Sales table to only West territory data ($1000 shown in green). When jane@co.com views the same report, USERNAME() returns "jane@co.com", filtering to East territory ($800). Both users access the same report with one role, but see different data automatically based on their identity.
Detailed Example 1: Sales Territory Security
Scenario: Your company has 200 sales representatives across 50 territories. Each rep should see only their assigned territory's data. Regional managers should see their entire region (multiple territories). VPs should see all data.
Data model setup:
Table 1: User_Security
UserEmail | Role | Territory | Region
--------------------|-----------|-----------|--------
john@co.com | Rep | CA-North | West
jane@co.com | Rep | NY-Metro | East
bob@co.com | Manager | NULL | West
alice@co.com | VP | NULL | NULL
Table 2: Territory (dimension)
Territory | Region | Country
-----------|--------|--------
CA-North | West | USA
CA-South | West | USA
NY-Metro | East | USA
TX-Central | Central| USA
Table 3: Sales (fact)
Date | Territory | Amount
---------|------------|-------
2024-1-1 | CA-North | $5000
2024-1-1 | NY-Metro | $3000
Relationships:
RLS roles configuration:
Role 1: Sales Territory Security
// Applied to User_Security table
User_Security[UserEmail] = USERNAME()
That's it! One role handles all scenarios:
Testing the role:
Detailed Example 2: Handling Multiple Security Attributes
Scenario: Users need filtering by BOTH department AND cost center. For example, HR can see all HR data across cost centers, but Finance in Cost Center 101 sees only Finance data in CC 101.
Complex security table:
UserEmail | Department | CostCenter
---------------|------------|------------
hr1@co.com | HR | NULL // All HR, all cost centers
hr2@co.com | HR | 101 // HR in CC 101 only
fin1@co.com | Finance | 101 // Finance in CC 101 only
fin2@co.com | Finance | NULL // All Finance, all cost centers
RLS DAX filter (handles both attributes):
// On User_Security table
User_Security[UserEmail] = USERNAME()
// On fact table (double security layer)
OR(
ISBLANK(LOOKUPVALUE(User_Security[Department], User_Security[UserEmail], USERNAME())),
Sales[Department] = LOOKUPVALUE(User_Security[Department], User_Security[UserEmail], USERNAME())
)
&&
OR(
ISBLANK(LOOKUPVALUE(User_Security[CostCenter], User_Security[UserEmail], USERNAME())),
Sales[CostCenter] = LOOKUPVALUE(User_Security[CostCenter], User_Security[UserEmail], USERNAME())
)
How this works:
Result:
Detailed Example 3: USERNAME() vs USERPRINCIPALNAME()
What's the difference:
When to use each:
Use USERNAME() when:
Use USERPRINCIPALNAME() when:
Example with USERPRINCIPALNAME():
// User_Security table filter
[UserEmail] = USERPRINCIPALNAME()
Your security table would have:
UserEmail (must match UPN)
-------------------------
john.doe@company.com
jane.smith@company.com
Testing considerations:
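A quick way to see what the two functions return in your own environment is a throwaway debug measure shown in a card visual while testing (remove it before publishing):
Current User Debug =
"USERNAME: " & USERNAME() & " | UPN: " & USERPRINCIPALNAME()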
⭐ Must Know (Critical Facts):
This chapter shows how concepts from all four domains integrate in real-world scenarios. Exam questions often test multiple domains simultaneously.
Business Requirement: Create quarterly sales dashboard for regional managers showing sales trends, product performance, and customer insights with appropriate security.
Domain 1 - Prepare Data:
= {Number.From(#date(2020,1,1))..Number.From(#date(2025,12,31))}
"Q" & Text.From(Date.QuarterOfYear([Date]))
Domain 2 - Model Data:
Total Sales = SUM(Sales[Amount])
Sales QTD = TOTALQTD([Total Sales], Date[Date])
Sales PY = CALCULATE([Total Sales], SAMEPERIODLASTYEAR(Date[Date]))
YoY Growth % = DIVIDE([Total Sales] - [Sales PY], [Sales PY])
Domain 3 - Visualize:
Domain 4 - Manage & Secure:
[Region] = USERPRINCIPALNAME()
Key Integration Points:
Problem: Report takes 30+ seconds to load, users complaining about slowness.
Domain 1 - Data Preparation:
WHERE OrderDate >= '2020-01-01'
Domain 2 - Model Optimization:
Replace CALCULATE(SUM, FILTER(ALL...)) patterns with simpler filters
Domain 3 - Visual Optimization:
Domain 4 - Service Optimization:
Result: Load time reduced from 30s to 3s.
Pattern 1: Data Prep → Modeling
Question shows unprepared data and asks for model design.
Example: "You have OrderDate as text 'MM/DD/YYYY'. What should you do to enable time intelligence?"
Answer: Convert the column to a Date data type (in Power Query, using the correct locale for MM/DD/YYYY), then create a date table, mark it as a date table, and relate it to the fact table's date column.
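For illustration, the conversion could also be done as a DAX calculated column (a sketch; assumes a hypothetical text column Sales[OrderDateText] always formatted as MM/DD/YYYY with leading zeros - doing the type change in Power Query remains the cleaner answer):
Order Date =
DATE(
VALUE(RIGHT(Sales[OrderDateText], 4)), // year
VALUE(LEFT(Sales[OrderDateText], 2)), // month
VALUE(MID(Sales[OrderDateText], 4, 2)) // day
)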
Pattern 2: Modeling → Visualization
Question asks which visual is appropriate given model structure.
Example: "You have fact table with OrderDate, ProductID, Quantity. You want to show quantity trend over time. Which visual?"
Answer: Line chart with OrderDate (D2 relationship) on axis, SUM(Quantity) on values (D3 visual selection)
Pattern 3: Security → Performance
Question about RLS impact on performance.
Example: "You have 1 million row sales table with RLS filtering by SalesRep. Users report slow performance. What to do?"
Answer:
Next Steps: Proceed to 07_study_strategies for exam preparation techniques.
Diagram Explanation: This diagram shows an end-to-end data flow from data ingestion through visualization. The workflow begins with raw data sources (SQL Server, Excel, APIs) connecting to Power Query for data transformation. Power Query applies cleaning, shaping, and business logic transformations before loading into the data model. The data model implements star schema with fact and dimension tables, relationships, and DAX calculations. Visuals query the data model using DAX measures in filter context, and the results are displayed in interactive reports. This complete pipeline demonstrates how all four exam domains work together in a real implementation.
Build a comprehensive sales analytics solution for a retail company with:
📊 Complete Solution Architecture:
graph TB
subgraph "Domain 1: Data Preparation"
SQL[(SQL Server<br/>Transactions)]
EXCEL[Excel File<br/>Sales Targets]
SP[(SharePoint<br/>Product Catalog)]
SQL --> PQ[Power Query]
EXCEL --> PQ
SP --> PQ
PQ --> TRANS[Transformations:<br/>- Clean nulls<br/>- Join tables<br/>- Add calculated columns<br/>- Filter last 3 years]
end
subgraph "Domain 2: Data Modeling"
TRANS --> DM[Data Model]
DM --> FACT[Fact: Sales<br/>Date | Product | Territory | Amount]
DM --> DIM1[Dim: Date<br/>Fiscal Calendar]
DM --> DIM2[Dim: Product<br/>Category | Subcategory]
DM --> DIM3[Dim: Territory<br/>Region | Manager]
FACT -.->|Relationships| DIM1
FACT -.->|Relationships| DIM2
FACT -.->|Relationships| DIM3
DM --> DAX[DAX Measures:<br/>- Total Sales<br/>- YoY Growth %<br/>- Profit Margin<br/>- Target Variance]
end
subgraph "Domain 3: Visualization"
DAX --> VIS[Visuals:<br/>- KPI Cards<br/>- Trend Lines<br/>- Regional Map<br/>- Product Matrix]
VIS --> REP[Interactive Report:<br/>- Bookmarks<br/>- Drill-through<br/>- Mobile layout<br/>- Tooltips]
end
subgraph "Domain 4: Security & Deployment"
REP --> RLS[Row-Level Security:<br/>- Regional Managers<br/>- Sales Reps<br/>- Executives]
RLS --> WS[Workspace]
WS --> APP[Workspace App]
APP --> USERS[End Users]
WS --> REFRESH[Scheduled Refresh:<br/>Hourly + Nightly]
end
style PQ fill:#fff3e0
style DM fill:#e1f5fe
style VIS fill:#f3e5f5
style RLS fill:#c8e6c9
See: diagrams/06_integration_complete_solution.mmd
Step 1: Connect to SQL Server (DirectQuery vs Import decision)
Analysis:
Decision: Composite model
Power Query implementation:
// Historical Sales (Import)
let
Source = Sql.Database("server", "salesdb"),
Sales = Source{[Schema="dbo",Item="Sales"]}[Data],
FilterHistorical = Table.SelectRows(Sales, each [OrderDate] < Date.AddDays(Date.From(DateTime.LocalNow()), -7))
in
FilterHistorical
// Recent Sales (DirectQuery)
let
SourceDQ = Sql.Database("server", "salesdb", [Query="SELECT * FROM Sales WHERE OrderDate >= DATEADD(day, -7, GETDATE())"])
in
SourceDQ
Step 2: Clean and Transform Excel Targets
Challenge: Excel file has merged cells, inconsistent formatting, and header rows scattered throughout.
Power Query steps:
Table.SelectRows(#"Promoted Headers", each not List.IsEmpty(List.RemoveMatchingItems(Record.FieldValues(_), {"", null})))Date.FromText([Month] & " 1, 2024")Step 3: Integrate SharePoint Product Catalog
Challenge: SharePoint list has incremental updates, need to detect changes.
Solution: Use dataflow with incremental refresh
Benefits:
Step 1: Design Star Schema
Fact Table: Sales
Dimension Tables:
Date (generated in DAX):
Date =
ADDCOLUMNS(
CALENDAR(DATE(2022,1,1), DATE(2025,12,31)),
"Year", YEAR([Date]),
"Quarter", "Q" & FORMAT([Date], "Q"),
"Month", FORMAT([Date], "MMM"),
"MonthNum", MONTH([Date]),
"FiscalYear", IF(MONTH([Date]) <= 6, YEAR([Date]), YEAR([Date]) + 1),
"FiscalQuarter", "FQ" & IF(MONTH([Date]) <= 3, 4,
IF(MONTH([Date]) <= 6, 1,
IF(MONTH([Date]) <= 9, 2, 3)))
)
Product (from SharePoint dataflow):
Territory:
Step 2: Create Relationships
Step 3: Build DAX Measures
Total Sales:
Total Sales = SUM(Sales[TotalAmount])
Sales Last Year (time intelligence):
Sales LY =
CALCULATE(
[Total Sales],
SAMEPERIODLASTYEAR(Date[Date])
)
YoY Growth %:
YoY Growth % =
DIVIDE(
[Total Sales] - [Sales LY],
[Sales LY],
0
)
Profit Margin %:
Profit Margin % =
DIVIDE(
SUM(Sales[ProfitAmount]),
[Total Sales],
0
)
Target Variance:
Target Variance =
VAR CurrentSales = [Total Sales]
VAR TargetAmount = SUM(Targets[Target])
RETURN
DIVIDE(CurrentSales - TargetAmount, TargetAmount, 0)
Running Total Sales (for cumulative charts):
Running Total =
CALCULATE(
[Total Sales],
FILTER(
ALLSELECTED(Date[Date]),
Date[Date] <= MAX(Date[Date])
)
)
Report Page 1: Executive Overview
Layout:
KPI Card Configuration (Sales card):
Line Chart with Forecast:
Map Visual:
Report Page 2: Product Deep Dive
Decomposition Tree:
Matrix with Conditional Formatting:
Drill-through Configuration:
Report Page 3: Territory Analysis
Map with Custom Tooltips:
Bookmark Navigation:
Mobile Layout:
Step 1: Implement Row-Level Security
Security Table (User_Security):
UserEmail | Role | Region
--------------------|--------------|----------
exec1@co.com | Executive | NULL
exec2@co.com | Executive | NULL
mgr.west@co.com | Manager | West
mgr.east@co.com | Manager | East
rep1@co.com | Rep | West
rep2@co.com | Rep | East
Relationship: User_Security[Region] → Territory[Region] (many-to-one, both directions)
RLS Role 1: Regional Access
// On User_Security table
[UserEmail] = USERPRINCIPALNAME()
That's it! Security propagates through relationships:
Optional: Separate rep vs manager logic:
// On Territory table, if you want different access for reps vs managers
VAR CurrentUser = USERPRINCIPALNAME()
VAR UserRole = LOOKUPVALUE(User_Security[Role], User_Security[UserEmail], CurrentUser)
VAR UserRegion = LOOKUPVALUE(User_Security[Region], User_Security[UserEmail], CurrentUser)
RETURN
OR(
ISBLANK(UserRegion), // Executive - see all
Territory[Region] = UserRegion // Manager/Rep - see assigned region
)
Step 2: Deployment to Workspace
Create Premium Workspace:
Publish from Desktop:
Configure RLS Group Membership (in Service):
Configure Scheduled Refresh:
Create and Configure App:
Step 3: Monitor and Maintain
Usage Metrics:
Performance Optimization:
Domain 1 ↔ Domain 2:
Domain 2 ↔ Domain 3:
Domain 3 ↔ Domain 4:
All Domains:
Week 1-2: Domain Mastery
Week 3: Integration & Practice
Week 4: Final Prep
1. Hands-On Practice
2. Teach Someone
3. Create Flashcards
CALCULATE Function Mnemonic: "CAN"
Visual Selection: "CTRL"
Exam Stats:
Strategy:
Time Savers:
Step 1: Identify Domain (5 sec)
Step 2: Read Carefully (20 sec)
Step 3: Eliminate Wrong Answers (30 sec)
Step 4: Choose Best Answer (30 sec)
Trap 1: "Works but isn't optimal"
Trap 2: "Sounds right but technically wrong"
Trap 3: "Over-engineering"
Trap 4: "Missing prerequisites"
Domain 1 (Prepare Data):
Domain 2 (Model Data):
Domain 3 (Visualize):
Domain 4 (Manage & Secure):
When Stuck:
Never:
Next Steps: Proceed to 08_final_checklist for final week preparation checklist.
Why it works: Power BI is a practical tool. Reading about transformations or DAX won't cement understanding like actually building reports. Active practice creates muscle memory and reveals edge cases documentation doesn't cover.
Method 1: Rebuild Sample Reports
Why this works: Forces you to make design decisions independently, troubleshoot when stuck, and discover multiple solutions to the same problem.
Method 2: Daily DAX Challenge
Example daily challenge - CALCULATE:
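One self-contained prompt of this kind (illustrative only, using the Total Sales measure and Territory table from the worked solution earlier) might be:
```dax
// Challenge: show each region's share of sales across all regions,
// regardless of any Region filter applied by slicers or the visual
Region Share % =
DIVIDE(
    [Total Sales],
    CALCULATE([Total Sales], REMOVEFILTERS(Territory[Region])),
    0
)
```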
Method 3: Error-Driven Learning
Common errors to explore:
// Error: Circular dependency
Measure1 = [Measure2] + 100
Measure2 = [Measure1] * 2
// Error: Cannot convert value to type
Text Measure = "Sales: " & SUM(Sales[Amount]) // Wrong
Text Measure Fixed = "Sales: " & FORMAT(SUM(Sales[Amount]), "$#,##0") // Correct
// Error: The function expects a table expression
Wrong = CALCULATE(Sales[Amount]) // Sales[Amount] is a column, not a measure
Correct = CALCULATE(SUM(Sales[Amount])) // SUM() returns a scalar
What it is: Reviewing material at increasing intervals (1 day, 3 days, 1 week, 2 weeks) to combat forgetting curve.
Power BI Spaced Repetition Plan:
Week 1: Initial Learning
Week 2: Continue + Review
Week 3-4: Deep Practice + Cumulative Review
Week 5-6: Pattern Recognition + Weak Area Focus
📊 Spaced Repetition Schedule:
gantt
title 6-Week Study Plan with Spaced Repetition
dateFormat YYYY-MM-DD
section Week 1
Domain 1 Initial Learn :done, d1, 2024-01-01, 2d
Domain 2 Initial Learn :done, d2, 2024-01-03, 2d
Review D1+D2 :active, r1, 2024-01-05, 1d
section Week 2
Domain 3 Initial Learn :d3, 2024-01-08, 2d
Review D1+D2 (Spaced) :r2, 2024-01-10, 1d
Domain 4 Initial Learn :d4, 2024-01-11, 1d
Full Review :r3, 2024-01-12, 1d
section Week 3
D1 Deep Dive :d1b, 2024-01-15, 1d
D2 Deep Dive + Review D1 :d2b, 2024-01-16, 1d
D3 Deep Dive :d3b, 2024-01-17, 1d
D4 Deep Dive + Review D2 :d4b, 2024-01-18, 1d
Practice Test 1 :pt1, 2024-01-19, 1d
section Week 4
Review Weak Areas :weak1, 2024-01-22, 3d
Practice Test 2 :pt2, 2024-01-25, 1d
Review All Domains :r4, 2024-01-26, 1d
section Week 5
Focused Practice :prac1, 2024-01-29, 4d
Practice Test 3 :pt3, 2024-02-02, 1d
section Week 6
Final Review :final, 2024-02-05, 4d
Exam Day :milestone, exam, 2024-02-09, 0d
See: diagrams/07_study_spaced_repetition.mmd
Pattern 1: Scenario-Based Transformation Questions
Question format: "You have a table with columns A, B, C. You need to achieve result X. What transformation should you use?"
How to approach:
Example keywords to watch:
Pattern 2: DAX Function Selection
Question format: "You need to calculate X that considers Y filter. Which function?"
Decision tree approach:
Does it need to modify filter context?
├─ Yes → CALCULATE or CALCULATETABLE
│ └─ Returning single value? → CALCULATE
│ └─ Returning table? → CALCULATETABLE
│
└─ No → Does it iterate row-by-row?
├─ Yes → Iterator function (SUMX, AVERAGEX, etc.)
└─ No → Simple aggregation (SUM, AVERAGE, etc.)
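To make the decision tree concrete, here is a minimal sketch of all three branches, assuming the Sales and Products tables used throughout this guide:
```dax
// No filter change, no iteration → plain aggregation
Units Sold = SUM(Sales[Quantity])

// Row-by-row arithmetic needed → iterator
Revenue = SUMX(Sales, Sales[Quantity] * Sales[UnitPrice])

// Filter context must change → CALCULATE
Electronics Revenue = CALCULATE([Revenue], Products[Category] = "Electronics")
```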
Common traps:
Pattern 3: Security Implementation Questions
Question format: "Users in Group A should see X, users in Group B should see Y. How to implement?"
Decision matrix:
Red flags in answer choices:
Time allocation strategy:
First pass (60 minutes):
Second pass (20 minutes):
Review pass (10 minutes):
Final 10 minutes:
Strategy 1: Eliminate by Category
Many questions have 4 options from different categories. Eliminate entire categories first.
Example: "Which DAX function calculates running total?"
Elimination logic:
Strategy 2: Keyword Matching
Question keywords → Likely answer type:
Strategy 3: Scenario Requirements Checklist
For complex scenarios, make a quick checklist of requirements:
Example: "Solution must: (1) real-time data, (2) <1GB dataset, (3) complex transformations"
Evaluate each answer:
Common exam tricks:
Cornell Method for Power BI:
Page layout:
┌─────────────────┬────────────────────────────────────┐
│ Cue Column │ Notes Column │
│ (Keywords) │ (Detailed Explanation) │
├─────────────────┼────────────────────────────────────┤
│ CALCULATE │ Changes filter context. Syntax: │
│ - When to use? │ CALCULATE(expression, filter1, │
│ - Common errors?│ filter2, ...). Use when need to │
│ │ override slicer/visual filters. │
│ │ Common error: Forgetting to wrap │
│ │ column in aggregation like SUM() │
├─────────────────┴────────────────────────────────────┤
│ Summary (Bottom section): │
│ CALCULATE is the most important DAX function. │
│ Master it by practicing filter modifications. │
└──────────────────────────────────────────────────────┘
Digital alternative - Notion/OneNote structure:
What to include:
Card format (use Anki, Quizlet, or physical cards):
Front:
Function: RELATED
Category: ?
Use case: ?
Back:
Category: Relationship function
Use case: Retrieve value from related table in calculated column (row context)
Syntax: RELATED(column)
Example: Product[CategoryName] = RELATED(Category[Name])
Related: RELATEDTABLE (opposite direction, returns table)
Topics to create flashcards for:
Review schedule:
Domain 1: Prepare the Data (27.5%)
Domain 2: Model the Data (27.5%)
Domain 3: Visualize and Analyze (27.5%)
Domain 4: Manage and Secure (17.5%)
If you checked fewer than 80%: Focus final week on specific gaps.
Hour 1: Quick Reference Review
Hour 2: Weak Areas Only
Hour 3: Mental Preparation
Don't:
When exam starts, immediately write down on provided materials:
DAX Formulas:
Key Numbers:
Decision Trees:
Time Management:
Do:
Don't:
If you pass:
If you don't pass:
Good luck on PL-300! 🎯
Next: Review 99_appendices for quick reference tables and glossary during final study sessions.
Data Source Connectivity (⭐ Must memorize):
Key decision - Import vs DirectQuery:
Data Transformation (⭐ Practice these):
Most tested transformation scenarios:
Query Folding (🎯 Exam focus):
Incremental Refresh (🎯 Frequently tested):
Relationships (⭐ Must memorize):
Star Schema Design:
Role-Playing Dimensions:
Core Functions (⭐ Must memorize syntax):
Aggregations:
SUM(column) - Total of a column
AVERAGE(column) - Mean value
COUNT(column) - Count of non-blank values
DISTINCTCOUNT(column) - Count of unique values
MIN(column) / MAX(column) - Minimum/Maximum value
CALCULATE (🎯 Most important DAX function):
CALCULATE(expression, filter1, filter2, ...)
CALCULATE([Total Sales], REMOVEFILTERS(Date[Year])) - ignore year filter
Filter Functions:
ALL(table) or ALL(column) - Remove all filters
ALLEXCEPT(table, column1, column2) - Remove all filters except specified columns
FILTER(table, condition) - Return filtered table
REMOVEFILTERS(table/column) - Same as ALL but more explicit (recommended)
Time Intelligence (🎯 Frequently tested):
TOTALYTD(expression, dates) - Year-to-date total
SAMEPERIODLASTYEAR(dates) - Same period last year
DATEADD(dates, number, interval) - Shift dates by interval
DATESYTD(dates) - Returns year-to-date dates table
Common Time Intelligence Pattern:
Sales LY = CALCULATE([Total Sales], SAMEPERIODLASTYEAR(Date[Date]))
YoY Growth % = DIVIDE([Total Sales] - [Sales LY], [Sales LY], 0)
Iterator Functions:
SUMX(table, expression) - Iterate and sum
AVERAGEX(table, expression) - Iterate and average
Calculated Columns vs Measures (🎯 Frequently tested):
Common DAX Errors to Avoid:
Sales[Amount] → should be SUM(Sales[Amount])
When to use each visual (🎯 Frequently tested):
Bookmarks (🎯 Frequently tested):
Drill-through:
Tooltips:
Sync Slicers:
Common exam scenario: "Apply conditional formatting to show negative values in red, positive in green"
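One way to answer this type of question is a measure that returns a color name and is bound through Conditional formatting → Format by: Field value. A minimal sketch, assuming a [Target Variance] measure like the one built earlier:
```dax
// Bind this measure to the visual's font or background color
// using the "Field value" conditional formatting option
Variance Color = IF([Target Variance] < 0, "Red", "Green")
```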
Workspace Roles (⭐ Must memorize):
Publishing & Distribution:
Scheduled Refresh:
Row-Level Security (RLS) (🎯 Very frequently tested):
[Column] = VALUE() or dynamic with USERNAME()
[UserEmail] = USERPRINCIPALNAME() filters per user automatically
USERNAME() vs USERPRINCIPALNAME():
Object-Level Security (OLS):
Sensitivity Labels:
YoY Growth = DIVIDE([This Year] - [Last Year], [Last Year], 0)
Running Total = CALCULATE([Total], FILTER(ALLSELECTED(Date), Date <= MAX(Date)))
% of Total = DIVIDE([Value], CALCULATE([Value], ALL(Dimension)))
[UserEmail] = USERPRINCIPALNAME()
Time management:
Question approach:
Common traps to avoid:
✅ Passing score: 700/1000 (approximately 70%)
✅ You don't need 100% - Missing 15 questions still passes
✅ Some questions are experimental - Not all questions count toward score
✅ Time is generous - 100 minutes for 50 questions allows review
✅ Partial credit scenarios - Some multiple-answer questions give partial credit
If you pass ✅:
If you don't pass ❌:
You've got this! Trust your preparation, manage your time, and remember: This certification validates practical skills you'll use daily as a Power BI analyst. Good luck!
| Function | Syntax | Purpose | Example |
|---|---|---|---|
| TOTALYTD | TOTALYTD(<expression>, <dates>[, <filter>]) | Calculates year-to-date total | TOTALYTD(SUM(Sales[Amount]), Date[Date]) |
| TOTALQTD | TOTALQTD(<expression>, <dates>[, <filter>]) | Calculates quarter-to-date total | TOTALQTD([Total Sales], Date[Date]) |
| TOTALMTD | TOTALMTD(<expression>, <dates>[, <filter>]) | Calculates month-to-date total | TOTALMTD([Total Sales], Date[Date]) |
| SAMEPERIODLASTYEAR | SAMEPERIODLASTYEAR(<dates>) | Returns same period in previous year | CALCULATE([Total Sales], SAMEPERIODLASTYEAR(Date[Date])) |
| PREVIOUSMONTH | PREVIOUSMONTH(<dates>) | Returns previous month's dates | CALCULATE([Total Sales], PREVIOUSMONTH(Date[Date])) |
| PREVIOUSQUARTER | PREVIOUSQUARTER(<dates>) | Returns previous quarter's dates | CALCULATE([Total Sales], PREVIOUSQUARTER(Date[Date])) |
| PREVIOUSYEAR | PREVIOUSYEAR(<dates>) | Returns previous year's dates | CALCULATE([Total Sales], PREVIOUSYEAR(Date[Date])) |
| DATEADD | DATEADD(<dates>, <number_of_intervals>, <interval>) | Shifts dates by specified interval | DATEADD(Date[Date], -1, YEAR) |
| DATESBETWEEN | DATESBETWEEN(<dates>, <start_date>, <end_date>) | Returns dates between two dates | DATESBETWEEN(Date[Date], DATE(2024,1,1), DATE(2024,12,31)) |
| DATESYTD | DATESYTD(<dates>[, <year_end_date>]) | Returns dates from start of year to current date | DATESYTD(Date[Date]) |

| Function | Syntax | Purpose | Example |
|---|---|---|---|
| CALCULATE | CALCULATE(<expression>, <filter1>, <filter2>, ...) | Modifies filter context | CALCULATE(SUM(Sales[Amount]), Products[Category]="Electronics") |
| FILTER | FILTER(<table>, <filter_expression>) | Returns filtered table | FILTER(Products, Products[Price] > 100) |
| ALL | ALL(<table_or_column>) | Removes all filters from table/column | CALCULATE(SUM(Sales[Amount]), ALL(Date)) |
| ALLEXCEPT | ALLEXCEPT(<table>, <column1>, <column2>, ...) | Removes all filters except specified | CALCULATE(SUM(Sales[Amount]), ALLEXCEPT(Date, Date[Year])) |
| ALLSELECTED | ALLSELECTED(<table_or_column>) | Removes context filters while keeping slicer filters | CALCULATE(SUM(Sales[Amount]), ALLSELECTED(Products)) |
| REMOVEFILTERS | REMOVEFILTERS(<table_or_column>) | Removes filters (newer alternative to ALL) | CALCULATE([Total Sales], REMOVEFILTERS(Date)) |
| KEEPFILTERS | KEEPFILTERS(<filter>) | Adds filter without removing existing | CALCULATE([Total Sales], KEEPFILTERS(Products[Color]="Red")) |
| USERELATIONSHIP | USERELATIONSHIP(<column1>, <column2>) | Activates inactive relationship | CALCULATE([Total Sales], USERELATIONSHIP(Sales[ShipDate], Date[Date])) |

| Function | Syntax | Purpose | Example |
|---|---|---|---|
| SUMX | SUMX(<table>, <expression>) | Row-by-row sum | SUMX(Sales, Sales[Quantity] * Sales[Price]) |
| AVERAGEX | AVERAGEX(<table>, <expression>) | Row-by-row average | AVERAGEX(Products, Products[Price]) |
| MINX | MINX(<table>, <expression>) | Row-by-row minimum | MINX(Sales, Sales[Quantity] * Sales[Price]) |
| MAXX | MAXX(<table>, <expression>) | Row-by-row maximum | MAXX(Sales, Sales[Quantity] * Sales[Price]) |
| COUNTX | COUNTX(<table>, <expression>) | Row-by-row count of non-blank | COUNTX(Sales, Sales[OrderID]) |
| RANKX | RANKX(<table>, <expression>[, <value>][, <order>]) | Ranks value in table | RANKX(ALL(Products), [Total Sales]) |

| Function | Syntax | Purpose | Example |
|---|---|---|---|
| SUM | SUM(<column>) | Sum of column | SUM(Sales[Amount]) |
| AVERAGE | AVERAGE(<column>) | Average of column | AVERAGE(Products[Price]) |
| MIN | MIN(<column>) | Minimum value | MIN(Sales[OrderDate]) |
| MAX | MAX(<column>) | Maximum value | MAX(Sales[OrderDate]) |
| COUNT | COUNT(<column>) | Count of non-blank values | COUNT(Sales[OrderID]) |
| COUNTA | COUNTA(<column>) | Count of non-blank (any type) | COUNTA(Customers[Email]) |
| COUNTROWS | COUNTROWS(<table>) | Count rows in table | COUNTROWS(Sales) |
| DISTINCTCOUNT | DISTINCTCOUNT(<column>) | Count unique values | DISTINCTCOUNT(Sales[CustomerID]) |

| Function | Syntax | Purpose | Example |
|---|---|---|---|
| IF | IF(<logical_test>, <value_if_true>[, <value_if_false>]) | Conditional logic | IF([Total Sales] > 10000, "High", "Low") |
| SWITCH | SWITCH(<expression>, <value>, <result>[, ...][, <else>]) | Multiple conditions | SWITCH([Category], "Electronics", 0.1, "Clothing", 0.15, 0.05) |
| AND | AND(<logical1>, <logical2>) | Both conditions true | IF(AND([Quantity]>10, [Price]>100), "Premium", "Standard") |
| OR | OR(<logical1>, <logical2>) | Either condition true | IF(OR([Category]="A", [Category]="B"), "Priority", "Regular") |
| NOT | NOT(<logical>) | Negates condition | NOT([IsActive]) |
| IFERROR | IFERROR(<value>, <value_if_error>) | Handle errors | IFERROR(DIVIDE([Sales], [Quantity]), 0) |
| ISBLANK | ISBLANK(<value>) | Checks if blank | IF(ISBLANK([CustomerName]), "Unknown", [CustomerName]) |

| Function | Syntax | Purpose | Example |
|---|---|---|---|
| CONCATENATE | CONCATENATE(<text1>, <text2>) | Joins text (two arguments only) | CONCATENATE([FirstName], [LastName]) |
| LEFT | LEFT(<text>, <num_chars>) | Left characters | LEFT([ProductCode], 3) |
| RIGHT | RIGHT(<text>, <num_chars>) | Right characters | RIGHT([ProductCode], 2) |
| MID | MID(<text>, <start_num>, <num_chars>) | Middle characters | MID([ProductCode], 4, 2) |
| LEN | LEN(<text>) | Length of text | LEN([Description]) |
| UPPER | UPPER(<text>) | Uppercase | UPPER([Status]) |
| LOWER | LOWER(<text>) | Lowercase | LOWER([Email]) |
| TRIM | TRIM(<text>) | Remove extra spaces | TRIM([ProductName]) |
| SUBSTITUTE | SUBSTITUTE(<text>, <old_text>, <new_text>) | Replace text | SUBSTITUTE([Phone], "-", "") |

| Function | Syntax | Purpose | Example |
|---|---|---|---|
| RELATED | RELATED(<column>) | Gets related value (many-to-one) | RELATED(Products[Category]) |
| RELATEDTABLE | RELATEDTABLE(<table>) | Gets related table (one-to-many) | COUNTROWS(RELATEDTABLE(Sales)) |
| CROSSFILTER | CROSSFILTER(<column1>, <column2>, <direction>) | Modifies filter direction | CALCULATE([Total Sales], CROSSFILTER(Sales[ProductID], Products[ProductID], Both)) |

| Function | Syntax | Purpose | Example |
|---|---|---|---|
| USERNAME | USERNAME() | Returns domain\user | [Region] = LOOKUPVALUE(Users[Region], Users[Username], USERNAME()) |
| USERPRINCIPALNAME | USERPRINCIPALNAME() | Returns user@domain.com | [SalesRep] = USERPRINCIPALNAME() |
| HASONEVALUE | HASONEVALUE(<column>) | True if column filtered to one value | IF(HASONEVALUE(Products[Category]), VALUES(Products[Category]), "Multiple") |
| SELECTEDVALUE | SELECTEDVALUE(<column>[, <alternate_result>]) | Gets single selected value | SELECTEDVALUE(Products[Category], "All Categories") |
| Question | Best Visual | Second Choice | Avoid |
|---|---|---|---|
| Compare categories | Bar/Column Chart | Table | Pie (>5 slices) |
| Show trend over time | Line Chart | Area Chart | Bar Chart |
| Show composition | Stacked Bar, Pie | Treemap | Multiple pies |
| Show distribution | Histogram | Scatter | Line |
| Show relationship | Scatter Plot | Bubble | Bar |
| Show part-to-whole | Pie, Donut | Treemap | Stacked column |
| Show ranking | Bar Chart (sorted) | Table (sorted) | Pie |
| Show exact values | Table, Matrix | Card | Charts |
| Show geographic | Map, Filled Map | Table with location | Bar |
| Show hierarchy | Matrix, Treemap | Decomposition Tree | Table |
| Show KPIs | Card, KPI | Gauge | Table |
| Show multiple measures | Combo Chart | Multiple charts | Single bar |
| Data Points | Visual Type | Why |
|---|---|---|
| 1 value | Card | Shows single number prominently |
| 2-5 values | Bar, Column, Pie | All categories visible at once |
| 6-20 values | Bar (sorted), Column | Readable comparisons |
| 21-50 values | Table, Matrix | Too many for chart |
| 50+ values | Table (with search), Treemap | Charts become cluttered |
| Time series (<20 points) | Line, Column | Shows trend clearly |
| Time series (20-100 points) | Line, Area | Column becomes cluttered |
| Time series (100+ points) | Line only | Other visuals unreadable |
| Data Type | Visual | Example |
|---|---|---|
| Categorical | Bar, Column, Pie | Product categories, Regions |
| Continuous | Line, Area | Temperature, Stock price |
| Geographic | Map, Filled Map | Sales by country |
| Temporal | Line, Area | Sales over time |
| Hierarchical | Matrix, Treemap | Category > Subcategory > Product |
| Relationship (2 measures) | Scatter | Price vs Quantity |
| Relationship (3 measures) | Bubble | Price vs Quantity sized by Profit |
| Part-to-whole | Pie, Donut, Stacked Bar | Market share |
| Deviation | Waterfall | Profit bridges |
| Distribution | Histogram | Age distribution |

| Operation | M Formula | Example |
|---|---|---|
| Add custom column | Table.AddColumn(source, "NewCol", each [Col1] * [Col2]) | Table.AddColumn(Sales, "Total", each [Qty] * [Price]) |
| Filter rows | Table.SelectRows(source, each [Column] > value) | Table.SelectRows(Sales, each [Amount] > 1000) |
| Remove columns | Table.RemoveColumns(source, {"Col1", "Col2"}) | Table.RemoveColumns(Sales, {"CreatedBy", "ModifiedBy"}) |
| Rename column | Table.RenameColumns(source, {{"OldName", "NewName"}}) | Table.RenameColumns(Sales, {{"Amt", "Amount"}}) |
| Change type | Table.TransformColumnTypes(source, {{"Col", type}}) | Table.TransformColumnTypes(Sales, {{"Date", type date}}) |
| Replace values | Table.ReplaceValue(source, "old", "new", Replacer.ReplaceText, {"Col"}) | Table.ReplaceValue(Sales, null, 0, Replacer.ReplaceValue, {"Qty"}) |
| Group by | Table.Group(source, {"GroupCol"}, {{"NewCol", each List.Sum([ValueCol]), type number}}) | Table.Group(Sales, {"Product"}, {{"TotalSales", each List.Sum([Amount]), type number}}) |
| Sort | Table.Sort(source, {{"Column", Order.Ascending}}) | Table.Sort(Sales, {{"Date", Order.Descending}}) |

| Function | Purpose | Example |
|---|---|---|
| Date.Year([Date]) | Extract year | Date.Year(#date(2024,3,15)) returns 2024 |
| Date.Month([Date]) | Extract month number | Date.Month(#date(2024,3,15)) returns 3 |
| Date.Day([Date]) | Extract day | Date.Day(#date(2024,3,15)) returns 15 |
| Date.DayOfWeek([Date]) | Day of week (0=Sunday) | Date.DayOfWeek(#date(2024,3,15)) returns 5 |
| Date.DayOfYear([Date]) | Day number in year | Date.DayOfYear(#date(2024,3,15)) returns 75 |
| Date.MonthName([Date]) | Month name | Date.MonthName(#date(2024,3,15)) returns "March" |
| Date.DayOfWeekName([Date]) | Day name | Date.DayOfWeekName(#date(2024,3,15)) returns "Friday" |
| Date.QuarterOfYear([Date]) | Quarter number | Date.QuarterOfYear(#date(2024,3,15)) returns 1 |
| Date.AddDays([Date], n) | Add days | Date.AddDays(#date(2024,3,15), 7) returns #date(2024,3,22) |
| Date.AddMonths([Date], n) | Add months | Date.AddMonths(#date(2024,3,15), 2) returns #date(2024,5,15) |
| Date.AddYears([Date], n) | Add years | Date.AddYears(#date(2024,3,15), 1) returns #date(2025,3,15) |
| Date.From(value) | Convert to date | Date.From("2024-03-15") returns #date(2024,3,15) |

| Function | Purpose | Example |
|---|---|---|
| Text.Upper(text) | Uppercase | Text.Upper("hello") returns "HELLO" |
| Text.Lower(text) | Lowercase | Text.Lower("HELLO") returns "hello" |
| Text.Proper(text) | Title case | Text.Proper("john smith") returns "John Smith" |
| Text.Trim(text) | Remove spaces | Text.Trim(" hello ") returns "hello" |
| Text.Length(text) | Text length | Text.Length("hello") returns 5 |
| Text.Start(text, n) | First n characters | Text.Start("hello", 3) returns "hel" |
| Text.End(text, n) | Last n characters | Text.End("hello", 3) returns "llo" |
| Text.Middle(text, start, n) | Middle characters | Text.Middle("hello", 1, 3) returns "ell" |
| Text.Replace(text, old, new) | Replace text | Text.Replace("hello", "ll", "yy") returns "heyyo" |
| Text.Contains(text, substring) | Check if contains | Text.Contains("hello", "ell") returns true |
| Text.Combine(list, separator) | Join text | Text.Combine({"A","B","C"}, "-") returns "A-B-C" |

| Function | Purpose | Example |
|---|---|---|
| List.Sum(list) | Sum list | List.Sum({1,2,3}) returns 6 |
| List.Average(list) | Average | List.Average({1,2,3}) returns 2 |
| List.Min(list) | Minimum | List.Min({3,1,2}) returns 1 |
| List.Max(list) | Maximum | List.Max({3,1,2}) returns 3 |
| List.Count(list) | Count items | List.Count({1,2,3}) returns 3 |
| List.Distinct(list) | Unique values | List.Distinct({1,2,2,3}) returns {1,2,3} |
| List.Sort(list) | Sort list | List.Sort({3,1,2}) returns {1,2,3} |
| Shortcut | Action |
|---|---|
| Ctrl + S | Save file |
| Ctrl + O | Open file |
| Ctrl + N | New file |
| Ctrl + Z | Undo |
| Ctrl + Y | Redo |
| Ctrl + F | Find (in data view) |
| Ctrl + C | Copy |
| Ctrl + V | Paste |
| Ctrl + X | Cut |
| Delete | Delete selected visual/item |
| Shortcut | Action |
|---|---|
| Ctrl + 1 | Report view |
| Ctrl + 2 | Data view |
| Ctrl + 3 | Model view |
| Shortcut | Action |
|---|---|
| Ctrl + Click | Multi-select visuals |
| Ctrl + G | Group visuals |
| Ctrl + Shift + G | Ungroup visuals |
| Ctrl + D | Duplicate visual |
| Alt + Shift + F10 | Filter pane |
| Alt + Shift + F12 | Analytics pane |
| Shortcut | Action |
|---|---|
| Ctrl + B | Bold (text box) |
| Ctrl + I | Italic (text box) |
| Ctrl + U | Underline (text box) |
| Shortcut | Action |
|---|---|
| Alt + Q | Open Power Query Editor |
| Ctrl + Alt + R | Refresh preview |
| Ctrl + Click column | Select multiple columns |
| Shift + Click column | Select range of columns |
| Right-click | Context menu |
| Delete | Remove selected step |
| Shortcut | Action |
|---|---|
| Ctrl + Space | Auto-complete |
| Ctrl + K, Ctrl + C | Comment line |
| Ctrl + K, Ctrl + U | Uncomment line |
| Ctrl + Enter | Commit measure |
| Esc | Cancel edit |
| Error | Cause | Solution |
|---|---|---|
| Expression.Error: The column 'X' of the table wasn't found | Column renamed/deleted in source | Update query to use correct column name |
| DataFormat.Error: We couldn't convert to Number | Non-numeric value in number column | Use Number.From() with error handling |
| DataSource.Error: Couldn't refresh the entity | Connection issue | Check credentials, network, source availability |
| Formula.Firewall: Query references other queries | Privacy levels conflict | Configure privacy levels in Options |
| Expression.Error: We cannot apply operator & to types Text and Number | Type mismatch | Convert to same type: Text.From([Number]) |
| Error | Cause | Solution |
|---|---|---|
| A single value for column 'X' cannot be determined | Multiple values returned where one expected | Use aggregation: SUM(), MAX(), etc. |
| The value for column 'X' in table 'Y' cannot be determined | Ambiguous relationship path | Use CALCULATE with USERELATIONSHIP |
| Circular dependency detected | Measure references itself directly/indirectly | Restructure measure logic |
| A function 'X' has been used in a True/False expression | Wrong return type | Ensure function returns true/false |
| The syntax for 'X' is incorrect | DAX syntax error | Check parentheses, commas, quotes |
| Error | Cause | Solution |
|---|---|---|
| Relationship cannot be created. Both columns must have unique values | Many-to-many without intermediate table | Create bridge table with unique keys |
| Circular dependency detected between tables | Relationship loop | Remove/deactivate one relationship |
| This table has no rows | Empty query result | Check source data and filters |
| Error | Cause | Solution |
|---|---|---|
| Couldn't load the visual | Visual not supported/corrupted | Remove and re-add visual |
| This visual has exceeded the available resources | Too much data | Reduce data volume or use sampling |
| No data available | All values filtered out | Check filters and slicers |
Use variables (VAR) to avoid recalculation
Guard single-value logic with IF(HASONEVALUE())
Use DIVIDE() instead of / to handle division by zero
Scenario: Filter data based on logged-in user email
DAX Filter:
[UserEmail] = USERPRINCIPALNAME()
When to use: Each user sees only their own data (e.g., salesperson sees own sales)
Scenario: Filter by user's assigned role/region
Setup:
UserEmail | Region
DAX Filter (on UserRoles table):
[UserEmail] = USERPRINCIPALNAME()
When to use: Users assigned to specific groups (e.g., regional managers)
Scenario: Manager sees own data + subordinates' data
DAX Filter:
PATHCONTAINS(
[ManagerPath],
LOOKUPVALUE(
Users[EmployeeID],
Users[Email],
USERPRINCIPALNAME()
)
)
When to use: Organizational hierarchies
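The [ManagerPath] column referenced above is typically a calculated column built with PATH(); a minimal sketch, assuming the Users table has EmployeeID and ManagerID columns:
```dax
// Calculated column on Users: pipe-delimited chain of IDs from the
// top-level manager down to the employee, e.g. "1|4|17"
ManagerPath = PATH(Users[EmployeeID], Users[ManagerID])
```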
Scenario: Look up user's allowed regions from separate table
DAX Filter:
[Region] = LOOKUPVALUE(
UserRegions[Region],
UserRegions[UserEmail],
USERPRINCIPALNAME()
)
When to use: Centralized security table separate from main data
Scenario: User can see multiple regions
Setup: UserRegions table with multiple rows per user
DAX Filter:
[Region] IN VALUES(UserRegions[Region])
When to use: Users with access to multiple categories/regions
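If UserRegions is not related to the table being secured, a common variant (a sketch using the same hypothetical UserRegions columns) restricts the allowed regions to the signed-in user explicitly:
```dax
// RLS filter on the secured table: allow any region assigned to this user
[Region] IN
    CALCULATETABLE(
        VALUES(UserRegions[Region]),
        UserRegions[UserEmail] = USERPRINCIPALNAME()
    )
```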
Scenario: Users can only see current and future data
DAX Filter:
[Date] >= TODAY()
When to use: Restrict historical data access
| Scenario | Cardinality | Direction | Notes |
|---|---|---|---|
| Dimension → Fact | One-to-many (1:*) | Single (Dim→Fact) | Standard |
| Date → Fact | One-to-many (1:*) | Single (Date→Fact) | Most common |
| Fact → Fact | Many-to-many (:) | Both (with bridge) | Use bridge table |
| Dimension → Dimension | One-to-many (1:*) | Single | Snowflake (avoid) |
| Role-playing dimension | One-to-many (1:*) | Only one active | Use USERELATIONSHIP |
| Use Calculated Column When... | Use Measure When... |
|---|---|
| Need to filter/slice by result | Need aggregated value |
| Result is row-level | Result is context-dependent |
| Value doesn't change with filters | Value changes with filters |
| Example: Full Name = First + Last | Example: Total Sales = SUM(Amount) |
| Example: Age Group from BirthDate | Example: YoY Growth % |
General rule: Prefer measures over calculated columns for better performance.
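A quick sketch of the same business logic written both ways (column names follow the Sales table used throughout this guide); the calculated column is stored for every row, while the measure is evaluated on demand in filter context:
```dax
// Calculated column: materialized per row, useful for slicing/grouping
Line Revenue = Sales[Quantity] * Sales[UnitPrice]

// Measure: preferred for aggregated results, responds to slicers
Total Revenue = SUMX(Sales, Sales[Quantity] * Sales[UnitPrice])
```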
| Data | Recommended Type | Why |
|---|---|---|
| IDs, SKUs | Text | May contain letters |
| Prices, amounts | Decimal (Currency) | Precision |
| Quantities | Integer | Whole numbers |
| Percentages | Decimal | Values like 0.15 |
| Dates | Date | Not datetime |
| Timestamps | Datetime | Includes time |
| True/False | Boolean | Yes/No |
Active Relationship: The default relationship used for filtering between two tables. Only one relationship between two tables can be active.
Aggregation: Combining multiple values into a single value (e.g., SUM, AVG, COUNT).
Bidirectional Relationship: A relationship where filters flow in both directions (from table A to B and B to A).
Bookmark: A saved state of a report page, including filter state, slicer selections, and visual properties.
Calculated Column: A column created using DAX that is computed row-by-row and stored in the model.
Calculated Table: An entire table created using DAX, computed when the model is refreshed.
Cardinality: The uniqueness of values in a column. High cardinality = many unique values; low cardinality = few unique values.
Composite Model: A data model that uses both Import and DirectQuery storage modes.
Cross-Filter Direction: The direction that filters flow in a relationship (single or both).
DAX (Data Analysis Expressions): The formula language used in Power BI for creating measures, calculated columns, and calculated tables.
Dataflow: A cloud-based ETL tool in Power BI Service for creating reusable data preparation logic.
Dimension Table: A table containing descriptive attributes (e.g., Products, Customers, Date).
DirectQuery: A storage mode where queries are sent directly to the data source rather than importing data.
Drill-Down: Navigating from summary level to detail level within a hierarchy (e.g., Year → Quarter → Month).
Drill-Through: Navigating from one report page to another with filtered context passed through.
Fact Table: A table containing measurable quantities and foreign keys to dimension tables (e.g., Sales, Orders).
Filter Context: The set of filters applied to a DAX calculation, including slicers, visual filters, and row filters.
Gateway: Software that connects Power BI Service to on-premises data sources for refresh.
Implicit Measure: An automatic aggregation (SUM, COUNT, etc.) created when you drag a column to a visual.
Inactive Relationship: A relationship that exists but is not used by default. Can be activated using USERELATIONSHIP().
Incremental Refresh: A refresh strategy that only refreshes new or changed data rather than the entire dataset.
M Language: The formula language used in Power Query for data transformation.
Many-to-Many Relationship: A relationship where both sides can have duplicate values. Requires a bridge table.
Measure: A DAX formula that performs calculations based on filter context. Recalculated dynamically.
Model View: The view in Power BI Desktop where you manage tables, relationships, and model properties.
One-to-Many Relationship: A relationship where one side has unique values and the other can have duplicates.
Premium Capacity: A Power BI licensing option that provides dedicated resources and advanced features.
Query Folding: When Power Query transformations are converted to native data source queries (e.g., SQL).
Row Context: The context of iterating row-by-row through a table, used in calculated columns and iterator functions.
Row-Level Security (RLS): Security that filters data at the row level based on user identity.
Slicer: A visual that filters other visuals on the page or across pages.
Star Schema: A data model design with fact tables in the center connected to dimension tables.
Storage Mode: How data is stored in Power BI (Import, DirectQuery, or Dual).
Tooltip: A small popup that appears when hovering over a data point in a visual.
Workspace: A container in Power BI Service for organizing and collaborating on content.
Business Requirements:
Implementation Steps:
Data Preparation:
Data Modeling:
Total Sales = SUM(Sales[Amount])
Total Units = SUM(Sales[Quantity])
Avg Transaction = DIVIDE([Total Sales], DISTINCTCOUNT(Sales[TransactionID]))
Sales LY = CALCULATE([Total Sales], SAMEPERIODLASTYEAR(Date[Date]))
Sales Growth % = DIVIDE([Total Sales] - [Sales LY], [Sales LY])
Visualization:
Security:
[Region] = USERPRINCIPALNAME()
Business Requirements:
Key Measures:
Current Stock =
CALCULATE(
SUM(Inventory[Quantity]),
FILTER(
Inventory,
Inventory[Date] = MAX(Inventory[Date])
)
)
Stock Value =
SUMX(
Inventory,
Inventory[Quantity] * RELATED(Products[Cost])
)
Low Stock Alert =
IF(
[Current Stock] < [Reorder Point],
"⚠️ Reorder",
"✅ OK"
)
Inventory Turnover =
DIVIDE(
[Total Sales],
AVERAGE(Inventory[StockValue])
)
Visuals:
Business Requirements:
Key Measures:
Total Employees =
DISTINCTCOUNT(Employees[EmployeeID])
Active Employees =
CALCULATE(
[Total Employees],
Employees[Status] = "Active"
)
Attrition Rate =
VAR TerminatedThisYear =
CALCULATE(
[Total Employees],
Employees[TerminationDate] >= DATE(YEAR(TODAY()), 1, 1),
Employees[TerminationDate] <= TODAY()
)
VAR AvgHeadcount =
CALCULATE(
AVERAGE(Employees[Headcount]),
Date[Year] = YEAR(TODAY())
)
RETURN
DIVIDE(TerminatedThisYear, AvgHeadcount)
Average Tenure =
AVERAGEX(
FILTER(Employees, Employees[Status] = "Active"),
DATEDIFF(Employees[HireDate], TODAY(), YEAR)
)
Security:
Final Note: This appendix is designed as a quick reference during your final study sessions. Bookmark frequently referenced sections (DAX functions, visual matrix, keyboard shortcuts) for easy access during practice tests.
Previous Chapter: Return to 08_final_checklist
End of Study Guide 📚
| Function | Syntax | Purpose | Example | Notes |
|---|---|---|---|---|
| SUM | SUM(column) | Total of numeric column | SUM(Sales[Amount]) | Most common aggregation |
| AVERAGE | AVERAGE(column) | Mean value | AVERAGE(Sales[Amount]) | Ignores blanks |
| COUNT | COUNT(column) | Count non-blank values | COUNT(Sales[OrderID]) | Numbers and dates only |
| COUNTA | COUNTA(column) | Count non-blank (any type) | COUNTA(Sales[Status]) | Includes text |
| COUNTROWS | COUNTROWS(table) | Count rows in table | COUNTROWS(Sales) | Preferred over COUNT |
| DISTINCTCOUNT | DISTINCTCOUNT(column) | Count unique values | DISTINCTCOUNT(Sales[CustomerID]) | Use for customer counts |
| MIN | MIN(column) | Minimum value | MIN(Sales[Date]) | Works with dates too |
| MAX | MAX(column) | Maximum value | MAX(Sales[Date]) | Works with dates too |

| Function | Syntax | Purpose | Example | When to Use |
|---|---|---|---|---|
| SUMX | SUMX(table, expression) | Iterate and sum | SUMX(Sales, [Qty] * [Price]) | Row-by-row calculation needed |
| AVERAGEX | AVERAGEX(table, expression) | Iterate and average | AVERAGEX(Products, [Price] * [Cost]) | Average of calculated values |
| COUNTX | COUNTX(table, expression) | Count non-blank results | COUNTX(Sales, IF([Amount]>100,1)) | Conditional counting |
| MINX | MINX(table, expression) | Minimum of expression | MINX(Sales, [Amount]/[Qty]) | Min of calculation |
| MAXX | MAXX(table, expression) | Maximum of expression | MAXX(Sales, [Amount]/[Qty]) | Max of calculation |
| RANKX | RANKX(table, expression, value, order) | Rank value in table | RANKX(ALL(Product), [Total Sales]) | Product ranking |
Key difference: Regular aggregations (SUM, AVERAGE) operate on a column. Iterator functions (SUMX, AVERAGEX) iterate row-by-row, allowing complex calculations.
Example showing why SUMX matters:
// WRONG - multiplying the two column totals gives (total quantity) x (total price),
// not the sum of each row's quantity x price
Total Revenue = SUM(Sales[Quantity]) * SUM(Sales[UnitPrice]) // Incorrect!
// CORRECT - SUMX iterates each row, multiplies, then sums
Total Revenue = SUMX(Sales, Sales[Quantity] * Sales[UnitPrice]) // Correct!
| Function | Syntax | Purpose | Example | Notes |
|---|---|---|---|---|
| CALCULATE | CALCULATE(expr, filter1, ...) | Modify filter context | CALCULATE([Sales], Year=2024) | Most important function |
| CALCULATETABLE | CALCULATETABLE(table, filter1, ...) | Modify context, return table | CALCULATETABLE(Sales, Year=2024) | Like CALCULATE but returns table |
| FILTER | FILTER(table, condition) | Filter table by condition | FILTER(Sales, [Amount] > 100) | Returns filtered table |
| ALL | ALL(table/column) | Remove filters from table/column | ALL(Sales) or ALL(Date[Year]) | Ignores slicers |
| ALLEXCEPT | ALLEXCEPT(table, col1, col2, ...) | Remove all filters except specified | ALLEXCEPT(Sales, Sales[Region]) | Keep Region filter only |
| ALLSELECTED | ALLSELECTED(table/column) | Remove filters but keep visual context | ALLSELECTED(Sales) | Respects visual filters |
| REMOVEFILTERS | REMOVEFILTERS(table/column) | Remove filters (explicit) | REMOVEFILTERS(Date) | Preferred over ALL |
| VALUES | VALUES(column) | Distinct values in filter context | VALUES(Product[Category]) | Visible categories |
| DISTINCT | DISTINCT(column) | Distinct values (alternate) | DISTINCT(Product[ID]) | Similar to VALUES |
Filter function hierarchy:
Common pattern - % of total:
% of Total Sales =
DIVIDE(
[Total Sales],
CALCULATE([Total Sales], ALL(Product)), // Total sales ignoring product filter
0
)
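A closely related distinction is ALL versus ALLSELECTED: ALL removes every filter on the table or column, while ALLSELECTED keeps slicer selections and removes only the filter added by the visual itself. A minimal sketch, assuming the same Total Sales measure and Product table:
```dax
// Denominator ignores all Product filters, including slicers
% of Grand Total =
DIVIDE([Total Sales], CALCULATE([Total Sales], ALL(Product)), 0)

// Denominator respects slicer selections but ignores the visual's own filter
% of Selected Total =
DIVIDE([Total Sales], CALCULATE([Total Sales], ALLSELECTED(Product)), 0)
```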

| Function | Syntax | Purpose | Example | Requirements |
|---|---|---|---|---|
| TOTALYTD | TOTALYTD(expr, dates, filter) | Year-to-date total | TOTALYTD([Sales], Date[Date]) | Contiguous date table |
| TOTALQTD | TOTALQTD(expr, dates, filter) | Quarter-to-date total | TOTALQTD([Sales], Date[Date]) | Contiguous date table |
| TOTALMTD | TOTALMTD(expr, dates, filter) | Month-to-date total | TOTALMTD([Sales], Date[Date]) | Contiguous date table |
| SAMEPERIODLASTYEAR | SAMEPERIODLASTYEAR(dates) | Dates from last year | CALCULATE([Sales], SAMEPERIODLASTYEAR(Date[Date])) | Date table |
| PARALLELPERIOD | PARALLELPERIOD(dates, number, interval) | Parallel period (month/quarter/year) | PARALLELPERIOD(Date[Date], -1, YEAR) | Date table |
| DATEADD | DATEADD(dates, number, interval) | Shift dates by interval | DATEADD(Date[Date], -12, MONTH) | Date table |
| PREVIOUSMONTH | PREVIOUSMONTH(dates) | Previous month dates | PREVIOUSMONTH(Date[Date]) | Date table |
| PREVIOUSQUARTER | PREVIOUSQUARTER(dates) | Previous quarter dates | PREVIOUSQUARTER(Date[Date]) | Date table |
| PREVIOUSYEAR | PREVIOUSYEAR(dates) | Previous year dates | PREVIOUSYEAR(Date[Date]) | Date table |
| DATESYTD | DATESYTD(dates, yearend) | Year-to-date dates | DATESYTD(Date[Date]) | Date table |
| DATESQTD | DATESQTD(dates) | Quarter-to-date dates | DATESQTD(Date[Date]) | Date table |
| DATESMTD | DATESMTD(dates) | Month-to-date dates | DATESMTD(Date[Date]) | Date table |
CRITICAL Requirements for Time Intelligence:
Common YoY Growth Pattern:
Sales LY =
CALCULATE(
[Total Sales],
SAMEPERIODLASTYEAR(Date[Date])
)
YoY Growth = [Total Sales] - [Sales LY]
YoY Growth % =
DIVIDE(
[YoY Growth],
[Sales LY],
0
)
| Function | Syntax | Purpose | Example | Context |
|---|---|---|---|---|
| RELATED | RELATED(column) | Get related value (many-side) | RELATED(Category[Name]) | Calculated column |
| RELATEDTABLE | RELATEDTABLE(table) | Get related rows (one-side) | RELATEDTABLE(Sales) | Calculated column |
| USERELATIONSHIP | USERELATIONSHIP(col1, col2) | Activate inactive relationship | CALCULATE([Sales], USERELATIONSHIP(Sales[ShipDate], Date[Date])) | In CALCULATE |
| CROSSFILTER | CROSSFILTER(col1, col2, direction) | Change cross-filter direction | CROSSFILTER(Sales[ProductID], Product[ID], Both) | In CALCULATE |
RELATED vs RELATEDTABLE:
Example:
// In Sales table (many side), get product category
Category = RELATED(Product[Category]) // Returns single category name
// In Product table (one side), count related sales
Sales Count = COUNTROWS(RELATEDTABLE(Sales)) // Returns count of sales for this product
| Function | Syntax | Purpose | Example | Notes |
|---|---|---|---|---|
| IF | IF(condition, true_value, false_value) | Conditional logic | IF([Sales] > 1000, "High", "Low") | Most common |
| SWITCH | SWITCH(expr, val1, result1, val2, result2, default) | Multiple conditions | SWITCH([Category], "A", 1, "B", 2, 0) | Cleaner than nested IF |
| AND | AND(condition1, condition2) | Logical AND | IF(AND([Sales]>100, [Qty]>10), "Yes", "No") | Both must be true |
| OR | OR(condition1, condition2) | Logical OR | IF(OR([Status]="New", [Status]="Pending"), "Active", "Closed") | Either must be true |
| NOT | NOT(condition) | Logical NOT | NOT([IsActive]) | Negation |
| ISBLANK | ISBLANK(value) | Check if blank | IF(ISBLANK([Value]), 0, [Value]) | Handle nulls |
| IFERROR | IFERROR(expression, value_if_error) | Handle errors | IFERROR([Sales]/[Target], 0) | Avoid divide-by-zero |
SWITCH vs Nested IF:
// Nested IF (hard to read)
Rating =
IF([Score] >= 90, "A",
IF([Score] >= 80, "B",
IF([Score] >= 70, "C", "F")
)
)
// SWITCH (cleaner)
Rating =
SWITCH(TRUE(),
[Score] >= 90, "A",
[Score] >= 80, "B",
[Score] >= 70, "C",
"F"
)
| Function | Syntax | Purpose | Example | Notes |
|---|---|---|---|---|
| CONCATENATE | CONCATENATE(text1, text2) | Join two texts | CONCATENATE([First], [Last]) | Use & operator instead |
| CONCATENATEX | CONCATENATEX(table, expr, delimiter) | Iterate and join | CONCATENATEX(Products, [Name], ", ") | Useful for comma-separated lists |
| FORMAT | FORMAT(value, format) | Format value as text | FORMAT([Amount], "$#,##0.00") | For display |
| LEFT | LEFT(text, num_chars) | Left N characters | LEFT([ProductCode], 3) | Extract prefix |
| RIGHT | RIGHT(text, num_chars) | Right N characters | RIGHT([ProductCode], 3) | Extract suffix |
| MID | MID(text, start, num_chars) | Middle substring | MID([ProductCode], 4, 2) | Extract middle |
| LEN | LEN(text) | Length of text | LEN([Description]) | Character count |
| UPPER | UPPER(text) | Convert to uppercase | UPPER([Name]) | Case conversion |
| LOWER | LOWER(text) | Convert to lowercase | LOWER([Email]) | Case conversion |
| TRIM | TRIM(text) | Remove extra spaces | TRIM([Name]) | Clean whitespace |

| Function | Syntax | Purpose | Example | Returns |
|---|---|---|---|---|
| SUMMARIZE | SUMMARIZE(table, col1, col2, "Name", expression) | Group and aggregate | SUMMARIZE(Sales, Product[Category], "Total", [Total Sales]) | Table |
| SUMMARIZECOLUMNS | SUMMARIZECOLUMNS(col1, col2, "Name", expression) | Group and aggregate (preferred) | SUMMARIZECOLUMNS(Product[Category], "Total", [Total Sales]) | Table |
| ADDCOLUMNS | ADDCOLUMNS(table, "NewCol", expression) | Add calculated columns | ADDCOLUMNS(Products, "Revenue", [Sales]) | Table |
| SELECTCOLUMNS | SELECTCOLUMNS(table, "NewName", column) | Select and rename columns | SELECTCOLUMNS(Sales, "Amount", [Total]) | Table |
| GROUPBY | GROUPBY(table, col1, col2, "Name", expression) | Group by columns | GROUPBY(Sales, Product[Cat], "Total", SUMX(CURRENTGROUP(), [Amount])) | Table |
| UNION | UNION(table1, table2, ...) | Combine tables vertically | UNION(Sales2023, Sales2024) | Table |
| INTERSECT | INTERSECT(table1, table2) | Common rows | INTERSECT(Customers_A, Customers_B) | Table |
| EXCEPT | EXCEPT(table1, table2) | Rows in table1 not in table2 | EXCEPT(AllCustomers, ActiveCustomers) | Table |
| CROSSJOIN | CROSSJOIN(table1, table2) | Cartesian product | CROSSJOIN(Products, Regions) | Table |

| Function | Syntax | Purpose | Example | Notes |
|---|---|---|---|---|
| STDEV.P | STDEV.P(column) | Population standard deviation | STDEV.P(Sales[Amount]) | Entire population |
| STDEV.S | STDEV.S(column) | Sample standard deviation | STDEV.S(Sales[Amount]) | Sample data |
| VAR.P | VAR.P(column) | Population variance | VAR.P(Sales[Amount]) | Entire population |
| VAR.S | VAR.S(column) | Sample variance | VAR.S(Sales[Amount]) | Sample data |
| MEDIAN | MEDIAN(column) | Median value | MEDIAN(Sales[Amount]) | Middle value |
| PERCENTILE.INC | PERCENTILE.INC(column, k) | Kth percentile (inclusive) | PERCENTILE.INC(Sales[Amount], 0.95) | 95th percentile |
| RANK.EQ | RANK.EQ(value, column, order) | Rank of value | RANK.EQ([Sales], ALL(Product[Sales]), DESC) | Ranking |

| Function | Syntax | Purpose | Example | Notes |
|---|---|---|---|---|
| DATE | DATE(year, month, day) | Create date | DATE(2024, 12, 31) | Specific date |
| TODAY | TODAY() | Current date | TODAY() | No time component |
| NOW | NOW() | Current date and time | NOW() | Includes time |
| YEAR | YEAR(date) | Extract year | YEAR([OrderDate]) | 2024 |
| MONTH | MONTH(date) | Extract month | MONTH([OrderDate]) | 1-12 |
| DAY | DAY(date) | Extract day | DAY([OrderDate]) | 1-31 |
| WEEKDAY | WEEKDAY(date, returntype) | Day of week | WEEKDAY([OrderDate], 2) | 1=Monday (type 2) |
| WEEKNUM | WEEKNUM(date, returntype) | Week number | WEEKNUM([OrderDate]) | 1-53 |
| EOMONTH | EOMONTH(date, months) | End of month | EOMONTH([OrderDate], 0) | Last day of month |
| CALENDAR | CALENDAR(start_date, end_date) | Generate date table | CALENDAR(DATE(2020,1,1), DATE(2025,12,31)) | Table of dates |
| CALENDARAUTO | CALENDARAUTO(fiscal_year_end_month) | Auto generate date table | CALENDARAUTO(6) | Fiscal year ends June |

| Function | Syntax | Purpose | Example | Notes |
|---|---|---|---|---|
| USERNAME | USERNAME() | Current user (domain\user) | USERNAME() | For RLS (on-prem) |
| USERPRINCIPALNAME | USERPRINCIPALNAME() | Current user (email/UPN) | USERPRINCIPALNAME() | For RLS (cloud) |
| HASONEVALUE | HASONEVALUE(column) | True if single value in context | IF(HASONEVALUE(Product[ID]), ...) | Conditional logic |
| HASONEFILTER | HASONEFILTER(column) | True if single filter applied | IF(HASONEFILTER(Date[Year]), ...) | Conditional logic |
| ISFILTERED | ISFILTERED(column) | True if filtered | IF(ISFILTERED(Product[Category]), ...) | Detect filtering |
| ISCROSSFILTERED | ISCROSSFILTERED(column) | True if cross-filtered | IF(ISCROSSFILTERED(Sales[ProductID]), ...) | Detect cross-filter |
| SELECTEDVALUE | SELECTEDVALUE(column, alternate) | Value if single, else alternate | SELECTEDVALUE(Product[Name], "Multiple") | Simplified HASONEVALUE |

| Function | Purpose | Example | Notes |
|---|---|---|---|
| Table.SelectRows | Filter rows | Table.SelectRows(Source, each [Amount] > 100) | Conditional filter |
| Table.RemoveRows | Remove rows by position | Table.RemoveRows(Source, 0, 5) | Remove first 5 rows |
| Table.FirstN | Keep first N rows | Table.FirstN(Source, 1000) | Top 1000 rows |
| Table.SelectColumns | Keep specific columns | Table.SelectColumns(Source, {"ID", "Name"}) | Column filter |
| Table.RemoveColumns | Remove columns | Table.RemoveColumns(Source, {"Temp1", "Temp2"}) | Drop columns |
| Table.RenameColumns | Rename columns | Table.RenameColumns(Source, {{"Old", "New"}}) | Column rename |
| Table.AddColumn | Add calculated column | Table.AddColumn(Source, "Total", each [Qty] * [Price]) | Calculated column |
| Table.Sort | Sort rows | Table.Sort(Source, {{"Amount", Order.Descending}}) | Sort operation |
| Table.Distinct | Remove duplicates | Table.Distinct(Source, {"CustomerID"}) | Deduplication |
| Table.Group | Group and aggregate | Table.Group(Source, {"Category"}, {{"Total", each List.Sum([Amount]), type number}}) | Group by |
| Table.Pivot | Pivot (long → wide) | Table.Pivot(Source, List.Distinct(Source[Month]), "Month", "Sales") | Pivot operation |
| Table.Unpivot | Unpivot (wide → long) | Table.Unpivot(Source, {"Jan", "Feb", "Mar"}, "Month", "Value") | Unpivot operation |

| Function | Purpose | Example | Notes |
|---|---|---|---|
| Table.NestedJoin | Merge queries (join) | Table.NestedJoin(Table1, "ID", Table2, "ID", "Table2", JoinKind.LeftOuter) | Horizontal join |
| Table.Combine | Append queries (union) | Table.Combine({Table1, Table2}) | Vertical stack |
| Table.Join | Join (alternate) | Table.Join(Table1, "ID", Table2, "ID", JoinKind.Inner) | Inner join |
JoinKind Options:
JoinKind.Inner - Inner join (matching rows only)
JoinKind.LeftOuter - Left outer join (all from left + matching from right)
JoinKind.RightOuter - Right outer join
JoinKind.FullOuter - Full outer join (all from both)
JoinKind.LeftAnti - Left anti join (left rows WITHOUT match in right)
JoinKind.RightAnti - Right anti join

| Function | Purpose | Example | Notes |
|---|---|---|---|
| Text.Upper | Uppercase | Text.Upper("hello") | "HELLO" |
| Text.Lower | Lowercase | Text.Lower("HELLO") | "hello" |
| Text.Proper | Title case | Text.Proper("hello world") | "Hello World" |
| Text.Trim | Remove spaces | Text.Trim(" hello ") | "hello" |
| Text.Length | Text length | Text.Length("hello") | 5 |
| Text.Start | First N characters | Text.Start("hello", 3) | "hel" |
| Text.End | Last N characters | Text.End("hello", 3) | "llo" |
| Text.Middle | Substring | Text.Middle("hello", 1, 3) | "ell" (0-indexed) |
| Text.Split | Split by delimiter | Text.Split("a,b,c", ",") | {"a", "b", "c"} |
| Text.Combine | Join with delimiter | Text.Combine({"a", "b"}, "-") | "a-b" |
| Text.Replace | Replace text | Text.Replace("hello", "l", "r") | "herro" |
| Text.Contains | Check if contains | Text.Contains("hello", "ell") | true |

| Function | Purpose | Example | Notes |
|---|---|---|---|
| Number.From | Convert to number | Number.From("123") | 123 |
| Text.From | Convert to text | Text.From(123) | "123" |
| Date.From | Convert to date | Date.From("2024-01-01") | #date(2024,1,1) |
| DateTime.From | Convert to datetime | DateTime.From("2024-01-01 10:00") | #datetime(...) |
| Logical.From | Convert to logical | Logical.From("true") | true |

| Function | Purpose | Example | Notes |
|---|---|---|---|
| if...then...else | Conditional | if [Amount] > 100 then "High" else "Low" | Basic condition |
| and | Logical AND | if [Amount] > 100 and [Qty] > 10 then ... | Both true |
| or | Logical OR | if [Status] = "New" or [Status] = "Pending" then ... | Either true |
| not | Logical NOT | if not [IsActive] then ... | Negation |
| Shortcut | Action | Context |
|---|---|---|
| Ctrl + S | Save report | Desktop |
| Ctrl + O | Open file | Desktop |
| Ctrl + N | New file | Desktop |
| Ctrl + C | Copy visual | Report canvas |
| Ctrl + V | Paste visual | Report canvas |
| Ctrl + X | Cut visual | Report canvas |
| Ctrl + Z | Undo | Report canvas |
| Ctrl + Y | Redo | Report canvas |
| Ctrl + A | Select all visuals | Report canvas |
| Ctrl + G | Group visuals | Selection |
| Ctrl + Shift + G | Ungroup visuals | Grouped selection |
| Alt + Shift + F10 | Open selection pane | Report view |
| Alt + F1 | Insert visual | Report canvas |
| F5 | Preview report (reading view) | Desktop |
| Ctrl + F | Search/Find | Data view |
Solution: Row-Level Security with dynamic filtering
// On User_Territory table
[UserEmail] = USERPRINCIPALNAME()
Key: Relationship from User_Territory to dimension tables propagates filter.
Solution: Unpivot columns in Power Query
Solution: CALCULATE with date filter modification
Running Total =
CALCULATE(
[Total Sales],
FILTER(
ALLSELECTED(Date[Date]),
Date[Date] <= MAX(Date[Date])
)
)
Solution: Time intelligence function
Sales LY =
CALCULATE(
[Total Sales],
SAMEPERIODLASTYEAR(Date[Date])
)
Requirement: Marked date table with relationships.
Solution: Composite model with DirectQuery + Aggregations
Solution: Bookmarks with buttons
Solution: Fix Power Query transformation to enable folding
Solution: CALCULATE with ALL to remove filters
% of Total =
DIVIDE(
[Total Sales],
CALCULATE([Total Sales], ALL(Product)),
0
)
End of Appendices