Power BI can be used as “self-service” analytics tooling, where you ingest, transform and analyze data. However, it can also work with Datasets that are already developed by others.
What is a Dataset?
A Power BI Dataset is a Data Model (a collection of tables and relations between tables) ready for analysis. You can embed all kinds of logic into a Dataset, so new questions can be answered easily with a few clicks. A few terms are used extensively, and thus worth explaining:
- Measures are the figures you want to analyse.
- Often simple, additive values (“number of widgets sold”)
- A Dataset makes it possible to create more advanced measures, like:
- Percentage of budget used
- Number of screws in a particular warehouse
- Year-to-Date sales of screws
- By creating a well-defined Dataset even complex measures will be calculated correctly along several dimensions
- Dimensions are the “axes” used to break down measures:
- In your analysis requirements, listen for the per:
- Sales per year
- Homicides per district per month
- Residential garbage per street
- These per definitions are often dimensions:
- District
- Period (month/year/date)
- Product Category
- etc.
Datasets also make it possible to have the extraction and modeling of data done by people other than making the reports themselves. But even if you do everything by yourself, you always create a dataset underwater.
Before we can start with this dataset, you need to prepare the dataset in your own Power BI environment. To do this, take the following steps:
You have just set up a Power BI dataset for yourself. This was a bit of preparation that we now had to do ourselves, but normally this can also be done by someone else. For example, by someone from a Business Intelligence or Data Engineering team, who has prepared data that is widely used in various places in the organization.
Attention: it is important that all Power BI windows are closed after completing these steps (if not, close them now).
We will now connect tot the Power BI Dataset we just published in a new Power BI report. This way, we simulate the situation where the Dataset has already been prepared for us.
The case
In this course we look at a fictional company that Microsoft often uses as an example in its courses: AdventureWorks.
AdventureWorks sells bicycles and bicycle parts - both directly to > customers and through distributors. They have an office in the Netherlands, and sell bicycles in the US, Germany and Spain - both through the various distributors and directly through the website to customers.
Because we have tapped into an existing data set, we can use high-quality data. This has already been checked for missing values, inconsistency and other issues that we often run into when we load data ourselves. The data model also indicates how the various relationships between the tables lie, and measures are usually created in which calculations are “ready” that are often more difficult for ourselves. So we can easily combine data from multiple tables - the dataset ensures the right interactions.
In this first exploration, we’re going to look at data from AdventureWorks. Specifically, we look at the Internet Total Sales (the total amount of Internet sales) and the Internet Total Tax amt (the amount of tax they pay through Internet sales). To get a sense of what the data looks like, we make a few intersections with it.
We will do this by creating two visualizations:
This one is fairly simple:
As you can see, this chart actually goes well automatically. And although the amount of tables and columns can be very large, you can easily search in the list of fields.
Because Total Children is a numeric value, Power BI automatically creates a measure that can be summed. So you see the sum of all children of all customers now displayed next to the tax paid. When you use the sigma character Σ before a field, this is an indication that Power BI can use it as measure (although the dataset didn’t explicitly provide it as measure!)
This indicates that “Total Children” is not a field that we want to add, but that we want to put on the axis, to divide along it. The result is now as follows:
Suppose we now want to make an extra breakdown according to the number of children who are still at home (Number of Children at Home), then we also see a sigma sign in the fields list.
However, we can also explicitly specify how Power BI adds this field to a chart:
Click on one of the countries in the first chart. As you can see, the other graph changes with it.
Since Power BI is new within AdventureWorks, your manager is very curious about what its possibilities. Among other things, he has heard a lot about the geographic representations that are standard in it, and has asked you to try a few things with it.
Below are the steps to create a report at a slightly higher level. This is deliberate (so that your experience with the tool is increased) - but don’t worry if you get stuck, just ask for help!
To discover what a Card visual is, you will have to explore the names of the visualizations. The name of a visual is displayed as tooltip when you rest the cursor on it for a while:
For example, the end result might look like this:
Here’s the endpoint of this lab: 02-Solution
Here is a Walkthrough video (no sound)
The next module is Module 3: Visuals and interaction. Below is a complete overview of all available modules: