With the finals weekend of the Australian Open coming up, a dataset about tennis would make a good example for the ‘filter problem’. In this tutorial, we’re going to use a dataset with data of the ATP Men’s tour of the past decade. Click here for the dataset.
In this dataset, one match equals one line, with the name of the loser in one column, and the name of the winner in another. With this dataset, we have the problem that we cannot filter on the name of a player and see all matches in which he participated. We can only filter on the winner or the loser. As a Federer fan, I would love to see only the matches of Federer as a winner, but for the sake of the tutorial let’s try and make it possible to filter on a player in general (as a winner and a loser).
The solution to our problem is, in short, to double our dataset. For each tennis match we need two rows: one for the winner and one for the loser. We add an extra column to show the name of the player. In the case of a winners’ row, the name of the winner, or in the case of a losers’ row, the name of the loser. We can use this column as a filter. Since the visuals of Power BI do not show duplicate data, the duplicate rows do not appear in the other visuals.
Step 1: Add a column to your original dataset
In this step we add a column to define whether the row is a winner or a loser row. In the example, I have named the original dataset ‘Loser’ (to make things more convenient). In this table, we add a (custom) column and name it ‘LoserWinner’. We want the column to have the value “Loser” for every row. Therefor we enter in the custom column formula = “Loser”. In your dataset, you can see the column you just added.
Step 2: Make a new table, based on your original table
When you right-click on your original dataset query, you have the option to ‘reference’ to this query. Power BI creates a new query, in which the source for the query is based on your original dataset (‘Loser’). I have named our new query/dataset ‘Winner’.
Step 3: In the new query, edit the custom column
In your new table, the value of the ‘LoserWinner’ column is still the same as for the original dataset. We have to adjust this to make sure we know that this data is a view of the ‘winner’ side. Therefore, we alter the values of ‘LoserWinner’ to ‘Winner’.
Step 4: Combine the two tables
When you open the advanced editor, the first few steps are already defined. The source is the original dataset on which your query is based. This line was generated when we made a new query by referencing to ‘Loser’. The second line was generated when we replaced the value in our custom column (step 3).
In this step, we want to combine the two tables. Therefore, we add a new step in the advanced editor and call it ‘Appended Query’. The formula we use is Table.Combine. In this formula we define which two tables we want to combine. The first table is the one we are currently working on, the ‘Winner’ table. To reference this table, we use the name of our previous step ‘Replaced Value’. The second table is the ‘Loser’ table. Now we have combined the two tables, which means we have for each tennis match two rows: one for the winner and one for the loser.
Step 5: Add a custom column with the filter value
Here comes the magic. We add a column (with the Table.AddColumn formula) to our new table, in which we want to define the name of the player. We call the new column ‘Player’. For each row in the dataset we are going to evaluate: is the row from the ‘Winner’ table (which we indicated with the value ‘Winner’ in the ‘LoserWinner’ column), then we want to show the name of the winner column. If the value is ‘Loser’ we want to show the name of the loser column.
And there it is: our filter value ‘Player’. Since Power BI visuals ignore duplicate values, we can filter on our new column, without having duplicate tennis matches in our visual.