Power Query Basics
1. What is Power Query?
Power Query is a data connectivity and data preparation technology built into Excel, Power BI, and other Microsoft tools. It allows users to import, transform, and clean data from hundreds of sources (databases, files, web pages, APIs) using a visual interface or the underlying M language. All transformation steps are recorded as a query, making the process repeatable and auditable. It is also known as Get & Transform in Excel.
2. What is the M language?
M (also called the Power Query Formula Language) is the functional, case-sensitive programming language that powers Power Query. Every visual transformation step in the Power Query editor generates M code behind the scenes, which can be viewed and edited in the Advanced Editor. M is expression-based and works by passing data through a chain of transformation steps, where each step builds on the previous one. Understanding M enables advanced transformations not available through the visual interface.
3. What are the main data sources Power Query can connect to?
Power Query connects to over 100 data sources including flat files (CSV, Excel, JSON, XML), relational databases (SQL Server, MySQL, PostgreSQL, Oracle), cloud services (Azure, SharePoint, Salesforce, Google Analytics), web pages and REST APIs, ODBC and OLE DB connections, Hadoop and Spark clusters, and data warehouses such as Snowflake and BigQuery. Power BI has more connectors than Excel and new connectors are regularly added by Microsoft.
4. What is the difference between Power Query in Excel and Power BI?
Power Query is functionally very similar in both tools — both use the M language and provide the same visual editor. Power BI Desktop has a broader range of data connectors and supports DirectQuery and live connections to certain sources. Excel Power Query is called Get & Transform and produces queries that output to worksheets or the Data Model. In Power BI, queries feed directly into the data model. Power BI also supports incremental refresh policies on Power Query queries.
5. How do you load data into Power Query?
In Power BI Desktop, data is loaded via Home → Get Data → choose source → enter connection details → Transform Data to open the Power Query Editor. In Excel, it is accessed via Data → Get Data. Once in the Editor, transformations are applied. When done, clicking Close & Apply loads the transformed data into the data model. The query is saved and can be refreshed manually or on a schedule to re-import updated data from the source.
6. What is the Query Dependencies view?
The Query Dependencies view (View → Query Dependencies) in Power Query shows a visual diagram of how all queries in the workbook or Power BI file relate to each other — which queries depend on other queries as sources. It helps understand complex query structures, identify performance bottlenecks, and troubleshoot refresh errors. Queries that serve as inputs to multiple others should be marked as staging queries and have Load to Report disabled to avoid redundant processing.
7. What are the three main panes in the Power Query editor?
The Query Pane (left) lists all queries in the file. The Data Preview pane (centre) shows a preview of the transformed data for the selected query. The Applied Steps pane (right) lists every transformation step applied to the query in order, generated automatically as you apply transformations. Steps can be renamed, reordered, deleted, or edited. Clicking a step shows the data state at that point, which is invaluable for debugging transformations.
8. What is a staging query and why is it used?
A staging query (also called a reference query or helper query) loads and lightly transforms a data source and is then referenced by other queries rather than being loaded directly into the data model. They prevent the same data from being loaded from the source multiple times. A staging query is set to not load to the report (right-click → Enable Load → uncheck) so it only exists as an intermediate step. This follows the separation of concerns principle in data modelling.
9. How do you handle errors in Power Query?
Errors are handled using the Remove Errors option (right-click column → Remove Errors) to remove rows with errors. Replace Errors (right-click column → Replace Errors) replaces errors with a specified value. In M, try ... otherwise syntax catches errors: try Number.FromText([col]) otherwise null. Table.TransformColumns with try handles column-level errors. The Error.Record() function provides error details. The Quality bar below column headers shows the percentage of errors, empty values, and valid values.
10. What is the difference between Close & Apply and Close & Discard?
Close & Apply saves all transformations and loads the data into the data model, making it available for reporting and DAX calculations. Close & Discard exits the Power Query editor without saving any changes made in the current session — useful if you accidentally changed a query and want to revert. Any queries loaded before the current session remain unchanged when discarding. All changes in the current session are lost permanently on Close & Discard.
Data Transformation
11. How do you unpivot columns in Power Query?
Unpivoting transforms multiple columns into rows. Select the columns to keep as identifiers, then right-click → Unpivot Other Columns to pivot all other columns into two columns: Attribute (the original column header) and Value (the data). Alternatively, select the columns to unpivot and choose Unpivot Columns. This converts wide-format data into long format, which is required for proper data modelling. It is the equivalent of pd.melt() in Pandas and is commonly needed for crosstab data from Excel.
12. How do you merge queries (join) in Power Query?
Merging combines two queries based on a matching column, equivalent to a SQL JOIN. Home → Merge Queries → select the join key columns from both tables → choose join kind (Inner, Left Outer, Right Outer, Full Outer, Left Anti, Right Anti). The result has a new column containing the related table, which is expanded by clicking the expand icon to select which columns to bring in. The key columns do not need to have the same name but must have the same data type.
13. How do you append queries in Power Query?
Appending stacks two or more queries vertically (like SQL UNION ALL). Home → Append Queries → Two tables or Three or more tables → select the queries to combine. Column matching is done by column name, not position — columns with the same name are combined, and columns that don't match appear in the result with nulls for rows from the other query. It is used to combine multiple files or sheets with the same structure, such as monthly sales files.
14. How do you use the Group By function?
Group By aggregates rows with the same value in one or more columns. Home → Group By → select grouping columns → add aggregation operations (Sum, Average, Count, Min, Max, All Rows). All Rows keeps all rows as a nested table, allowing subsequent operations. In M, it is Table.Group(Source, {"GroupCol"}, {{"Total", each List.Sum([Amount]), type number}}). Group By is equivalent to SQL GROUP BY and is used to summarise data, compute totals, or count occurrences.
15. How do you add a custom column using an M formula?
Add Column → Custom Column opens a dialog where you write an M formula using column references in square brackets. For example, [Quantity] * [Unit Price] calculates revenue. M functions such as Text.Upper([Name]), Date.Year([OrderDate]), and if [Score] >= 60 then "Pass" else "Fail" are used for string manipulation, date extraction, and conditional logic. The Conditional Column option provides a point-and-click interface for if-then-else logic without writing M directly.
16. How do you split a column in Power Query?
A column is split using Transform → Split Column, with options including By Delimiter (split on a character like comma or space), By Number of Characters (split after a fixed number of characters), By Positions, By Uppercase/Lowercase Transition, or By Digit/Non-Digit Transition. Splitting by delimiter at each occurrence creates multiple columns, while splitting into rows creates a new row per segment. In M, this uses Table.SplitColumn() and Splitter.SplitTextByDelimiter().
17. How do you filter rows in Power Query?
Rows are filtered using the dropdown arrow on any column header, similar to Excel's AutoFilter. Options include filtering by value, by condition (equals, greater than, contains), by date-specific options (is in this year, before a date), or by removing/keeping top/bottom N rows, alternate rows, or rows with errors. Multiple filters stack as separate Applied Steps. In M, Table.SelectRows(Source, each [Amount] > 1000) is the equivalent programmatic approach.
18. What is query folding and why does it matter?
Query folding is the ability of Power Query to translate transformations back into the native query language of the data source (such as SQL) so they execute on the source system rather than in Power Query. This dramatically improves performance for large datasets because data is filtered, aggregated, and transformed at the source before being transferred. Not all transformations support folding — adding custom columns, certain merges, or using certain M functions break the fold. The View → Diagnostics → Step Diagnostics shows whether steps are folded.
19. How do you use parameters in Power Query?
Parameters are named variables that can be referenced in queries, making them reusable and dynamic. Created via Home → Manage Parameters → New Parameter, they can hold text, numbers, dates, or logical values. Parameters are referenced in query steps as variables (e.g., to dynamically set a folder path, SQL filter, or date range). They can be surfaced as What-If parameters in Power BI for interactive filtering or integrated into report-level date slicers for incremental refresh.
20. How do you combine multiple files from a folder?
Data → Get Data → Folder → select the folder path → Power Query lists all files. Clicking Combine & Transform opens a sample file transformation window where you define transformations to apply to each file. Power Query auto-generates a helper query and function that applies the same transformation to every file. This pattern is used to combine monthly sales files, regional reports, or any scenario where many identically-structured files need to be unified.
DAX Fundamentals
21. What is DAX?
DAX (Data Analysis Expressions) is the formula language used in Power BI, Analysis Services, and Power Pivot to create calculated columns, measures, and tables. It uses familiar functions similar to Excel but is designed for relational data models and analytical queries. DAX is context-aware — the same measure can return different results depending on the filter context applied by a slicer, row, or visual. Understanding filter context and row context is fundamental to writing correct DAX.
22. What is the difference between a measure and a calculated column?
A measure is a dynamic aggregation formula evaluated at query time based on the current filter context — it exists only in memory and is recalculated with every visual interaction. A calculated column is evaluated at data refresh time for each row, stored as a new column in the table, and is always available regardless of context. Use measures for aggregations (SUM, AVERAGE) and KPIs; use calculated columns for row-level attributes needed for slicing, filtering, or relationships.
23. What is filter context in DAX?
Filter context is the set of active filters applied to the data model at the moment a measure is evaluated, including filters from slicers, visual filters, page filters, and report filters. When a visual renders, DAX evaluates each measure within the filter context defined by the intersection of all active filters. Understanding filter context is essential because the same measure can produce different results in different rows of a table visual due to the applied filters.
24. What is row context in DAX?
Row context exists when DAX iterates through rows of a table, occurring in calculated columns (each row is evaluated in its own row context) and iterator functions like SUMX, AVERAGEX, and FILTER. In row context, column references return the value of that column for the current row. Row context does not automatically filter related tables; you must use RELATED() to access values from related tables. CALCULATE() can convert row context to filter context.
25. What is the difference between SUM and SUMX?
SUM(Table[Column]) aggregates all values in the specified column in the current filter context. SUMX(Table, Expression) iterates through each row of the table, evaluates the expression for each row in that row context, and then sums the results. For example, SUMX(Sales, Sales[Quantity] * Sales[Price]) multiplies quantity by price for each row before summing, while SUM(Sales[Revenue]) sums an existing revenue column. Use SUMX when the calculation requires row-level operations before aggregation.
26. What does CALCULATE() do?
CALCULATE(expression, filter1, filter2, ...) is the most powerful and commonly used DAX function. It evaluates an expression in a modified filter context — the filters specified in CALCULATE replace or add to the existing filter context. For example, CALCULATE(SUM(Sales[Amount]), Sales[Region] = "North") returns total sales filtered to the North region, regardless of what slicers are applied. CALCULATE also transitions row context to filter context, enabling calculations within iterators.
27. What is ALL() and when is it used?
ALL(Table or Column) removes all filters from a table or column and is used inside CALCULATE to override the current filter context. ALL(Sales) removes all filters on the Sales table. ALL(Sales[Product]) removes filters on just the Product column. It is commonly used in ratio calculations: Measure / CALCULATE(Measure, ALL(Table)) computes a metric as a percentage of the grand total. ALLEXCEPT(Table, Column) removes all filters except the specified columns.
28. What is the difference between RELATED() and RELATEDTABLE()?
RELATED(RelatedTable[Column]) is used in calculated columns or row context to retrieve a scalar value from a related table on the one side of a one-to-many relationship. RELATEDTABLE(Table) returns the full table of related rows from the many side of the relationship for the current row context. For example, in a Products calculated column, RELATED(Category[CategoryName]) retrieves the category name, while RELATEDTABLE(Sales) returns all sales rows for that product.
29. What are time intelligence functions in DAX?
Time intelligence functions perform calculations over date periods and require a contiguous date table with a date column marked as a date table. Common functions include TOTALYTD(measure, Dates[Date]) for year-to-date, SAMEPERIODLASTYEAR(Dates[Date]) for the same period one year ago, DATEADD(Dates[Date], -1, MONTH) to shift a date range, DATESYTD(Dates[Date]) to get all dates from year start, and PREVIOUSMONTH(Dates[Date]) for the prior month. They simplify period-over-period comparisons.
30. What is a calculated table in DAX?
A calculated table is a new table created using a DAX expression, evaluated at data refresh time and stored in the model. It is created using Table → New Table in Power BI Desktop. Examples include SUMMARIZE(Sales, Products[Category], "Total Sales", SUM(Sales[Amount])) for aggregated summary tables, CALENDAR(DATE(2020,1,1), DATE(2026,12,31)) for date tables, and DISTINCT(Sales[CustomerID]) for unique value tables. Calculated tables are useful for date tables, disconnected filter tables, and bridge tables.
DAX Measures & Calculations
31. How do you create a Year-to-Date measure?
A Year-to-Date (YTD) measure uses TOTALYTD: Sales YTD = TOTALYTD(SUM(Sales[Amount]), Dates[Date]). This accumulates the sum from the first day of the current year through the last visible date in the filter context. A custom year-end date can be specified: TOTALYTD(SUM(Sales[Amount]), Dates[Date], "03-31") for a fiscal year ending March 31. An equivalent expression is CALCULATE(SUM(Sales[Amount]), DATESYTD(Dates[Date])).
32. How do you calculate the previous year's value for comparison?
Sales PY = CALCULATE(SUM(Sales[Amount]), SAMEPERIODLASTYEAR(Dates[Date])) returns the sum for the same period one year prior. YoY Growth = ([Sales Current] - [Sales PY]) / [Sales PY] computes the year-over-year growth rate. DATEADD(Dates[Date], -1, YEAR) is an alternative that shifts the current date selection back by one year. PARALLELPERIOD(Dates[Date], -1, YEAR) returns the entire prior year period as a full period comparison.
33. What is a running total (cumulative sum) measure in DAX?
Running Total = CALCULATE(SUM(Sales[Amount]), FILTER(ALL(Dates[Date]), Dates[Date] <= MAX(Dates[Date]))) calculates the cumulative sum up to the current date in context. FILTER(ALL(Dates[Date]), ...) removes the existing date filter and reapplies only the condition that dates must be less than or equal to the current maximum date. Running totals are used in trend charts, cumulative revenue tracking, and target vs. actuals analysis.
34. How do you calculate a percentage of total?
% of Total = DIVIDE(SUM(Sales[Amount]), CALCULATE(SUM(Sales[Amount]), ALL(Sales[Product]))) calculates each product's sales as a percentage of total sales across all products. DIVIDE(numerator, denominator, alternate_result) is preferred over the / operator because it handles division by zero gracefully by returning the alternate result (default 0) instead of an error. ALL() removes the product filter context to compute the grand total denominator.
35. What is DIVIDE() and why is it preferred over the division operator?
DIVIDE(numerator, denominator, [alternateResult]) performs safe division that returns the alternate result (default blank or 0) when the denominator is zero or blank, preventing BLANK/0 errors that would otherwise propagate through visuals. The / operator returns an error for division by zero. In KPIs and percentage calculations where the denominator might occasionally be zero (e.g., no sales last year), DIVIDE ensures a clean, error-free report. It is considered a DAX best practice.
36. What is RANKX() and how is it used?
RANKX(ALL(Products[Product]), [Sales Measure]) ranks each product by its sales measure across all products. RANKX(ALL(Products[Product]), [Sales Measure], , DESC, Dense) specifies descending order and dense ranking (no gaps). Setting ALL(Products[Product]) ensures ranking is computed over all products regardless of slicer context. The resulting measure returns the rank number for each row in a visual. RANKX is used for leaderboards, top-N analysis, and performance rankings.
37. How do you use SWITCH() in DAX?
SWITCH(expression, value1, result1, value2, result2, ..., else) evaluates an expression and returns the corresponding result for the first matching value, similar to a CASE statement in SQL. SWITCH(TRUE(), [Score] >= 90, "A", [Score] >= 80, "B", [Score] >= 70, "C", "F") creates a grade banding measure using SWITCH with TRUE() to evaluate conditions. It is cleaner than nested IF statements and is commonly used for creating conditional labels, categories, or dynamic display logic.
38. What is the difference between COUNTROWS() and COUNT()?
COUNTROWS(Table) counts the number of rows in a table or the table returned by a filter expression. COUNT(Column) counts the number of non-blank values in a column. COUNTA(Column) counts non-blank values including text. COUNTBLANK(Column) counts blank values. DISTINCTCOUNT(Column) counts unique non-blank values. COUNTROWS(FILTER(Sales, Sales[Amount] > 1000)) counts rows meeting a condition. COUNTROWS is generally preferred as it is more explicit and works with any table expression.
39. What is HASONEVALUE() and SELECTEDVALUE()?
HASONEVALUE(Column) returns TRUE if the column has exactly one value in the current filter context — useful for conditional logic that should only execute when a single item is selected. SELECTEDVALUE(Column, alternative) returns the single selected value if only one value is in context, or the alternative if multiple values or no values are selected. These are used to display the selected dimension in card visuals, create dynamic titles, or trigger conditional calculations based on slicer selections.
40. How do you create a dynamic title or label using DAX?
Title = "Sales for " & SELECTEDVALUE(Region[Region], "All Regions") creates a dynamic text measure that shows the selected region in a card visual or text box. "Period: " & FORMAT(MAX(Dates[Date]), "MMM YYYY") formats the current date context. IF(HASONEVALUE(Product[Product]), "Product: " & VALUES(Product[Product]), "All Products") handles single vs. multiple selection. Dynamic titles are referenced in Format → Title → fx in Power BI visuals to make reports self-explanatory.
Advanced DAX
41. What is USERELATIONSHIP() and when is it needed?
CALCULATE(measure, USERELATIONSHIP(Table1[Col1], Table2[Col2])) activates an inactive relationship between two tables for the duration of the expression. Inactive relationships exist when there are multiple relationships between two tables (e.g., Sales has both OrderDate and ShipDate linked to the Date table, but only one can be active). USERELATIONSHIP temporarily switches which relationship is used, enabling date calculations on both order date and ship date without duplicating the date table.
42. What is bidirectional cross-filtering and when should it be used?
By default, filters propagate from the one side to the many side of a relationship. Bidirectional cross-filtering allows filters to propagate in both directions. It is useful in many-to-many relationship scenarios or when filtering a dimension table based on facts. However, bidirectional filtering can cause ambiguity, circular dependencies, and performance problems. It should be used sparingly — only when necessary and when the model is simple enough to avoid ambiguity. DAX functions like CROSSFILTER() can activate it selectively.
43. What is the CALCULATE modifier KEEPFILTERS()?
KEEPFILTERS(filter) modifies how CALCULATE applies a filter — instead of replacing the existing filter context for a column, it intersects (AND) with the existing context. Without KEEPFILTERS, CALCULATE(measure, Sales[Product] = "A") overrides any product slicer selection. With KEEPFILTERS, CALCULATE(measure, KEEPFILTERS(Sales[Product] = "A")) only returns values where both the slicer filter AND the explicit filter are satisfied. It is used when you want a filter to be additive rather than overriding.
44. What is a disconnected table in Power BI?
A disconnected table (also called a parameter table or slicer table) has no relationship to other tables in the model and is used exclusively to drive slicer selections that control DAX measure logic. For example, a Metric Selector table containing {"Revenue", "Units", "Margin"} allows users to select which KPI to display. A measure then uses SWITCH(SELECTEDVALUE(MetricSelector[Metric]), "Revenue", [Revenue Measure], ...) to return the appropriate value. This pattern creates dynamic, flexible reports without complex filtering.
45. How do you optimise DAX performance?
DAX performance is improved by using measures instead of calculated columns for aggregations to leverage in-memory optimisation. Avoiding deeply nested CALCULATE calls and complex iterators on large tables. Using SUMMARIZE or ADDCOLUMNS instead of complex CROSSJOIN operations. Enabling query folding in Power Query to reduce data loaded. Avoiding FILTER(ALL(Table), complex condition) in favour of targeted column filters. Using the Performance Analyzer in Power BI to identify slow visuals and DAX queries and using DAX Studio to profile query execution.
46. What is TREATAS() in DAX?
TREATAS(table_expression, Column1, Column2, ...) treats the values in a table expression as if they were filters applied to the specified columns, establishing a virtual relationship. For example, CALCULATE(measure, TREATAS(VALUES(Region[Region]), Budget[Region])) applies the current Region filter context to the Budget table even if there is no physical relationship between Region and Budget tables. It is used to establish context transitions between tables that are not directly related in the model.
47. What is the VAR keyword in DAX?
VAR name = expression stores the result of a DAX expression in a named variable, and RETURN specifies what the measure should return. Variables are evaluated once and reused, improving both readability and performance: VAR TotalSales = SUM(Sales[Amount]) VAR TotalCost = SUM(Sales[Cost]) RETURN DIVIDE(TotalSales - TotalCost, TotalSales). Variables in DAX capture the filter context at the point of declaration, which is important to understand when using them inside iterators.
48. What is ALLSELECTED() and how does it differ from ALL()?
ALLSELECTED(Column) removes filters applied directly to the column but preserves filters coming from slicers and report-level filters — it returns values within the "visible" selection. ALL(Column) removes all filters regardless of source. ALLSELECTED is used for percentage-of-selected-total calculations where the denominator should be the total of what is currently visible, not the absolute grand total. For example, DIVIDE([Sales], CALCULATE([Sales], ALLSELECTED(Products[Product]))) gives % of selected products.
49. What is the difference between EARLIER() and variables in calculated columns?
EARLIER(Column, 1) accesses the value of a column in an outer row context during nested iterations, commonly used in calculated columns to rank or compare rows against each other: COUNTROWS(FILTER(Sales, Sales[Amount] > EARLIER(Sales[Amount]))). However, EARLIER is considered legacy and is difficult to read. Variables (VAR CurrentAmount = Sales[Amount]) are the modern replacement — they capture the current row's value and can be referenced inside nested FILTER or CALCULATE calls clearly and without ambiguity.
50. What is a what-if parameter in Power BI?
A What-If parameter (Modeling → New Parameter) creates a numeric slicer with a generated single-column disconnected table and an accompanying measure that returns the selected value. It allows users to dynamically adjust an input value (such as a discount rate, growth assumption, or FX rate) and see its effect on KPI measures. The parameter table is referenced in DAX: Profit Scenario = [Revenue] * (1 - 'Discount Rate'[Discount Rate Value]). What-If parameters enable interactive scenario analysis within reports.