
I have 2 queries:

Premium:

[screenshot of the Premium query]

and Losses:

[screenshot of the Losses query]

How can I summarize the data from the Premium query and LEFT JOIN it to the summarized data from the Losses query using DAX?

In SQL it would look like this:

declare @PremiumTable table (PolicyNumber varchar(50), Premium money)
insert into @PremiumTable values 
                                ('Pol1', 100),
                                ('Pol1', 50),
                                ('Pol2', 300),
                                ('Pol3', 500),
                                ('Pol3', 200),
                                ('Pol4',400)

declare @LossesTable table (PolicyNumber varchar(50), Losses money)
insert into @LossesTable values ('Pol1',115),
                                ('Pol1',25),
                                ('Pol2',0),
                                ('Pol3',110),
                                ('Pol3',75)


select  p.PolicyNumber,
        sum(p.Premium) as Premium,
        sum(l.Losses) as Losses
from @PremiumTable p
        LEFT JOIN @LossesTable l on p.PolicyNumber = l.PolicyNumber
group by p.PolicyNumber

Result:

[screenshot of the query result]

I tried using NATURALLEFTOUTERJOIN but it gives me an error:

*An incompatible join column, (''[PolicyNumber]) was detected. 'NATURALLEFTOUTERJOIN' doesn't support joins by using columns with different data types or lineage.*

MyTable = 
    VAR Premium = 
        SELECTCOLUMNS(
            fact_Premium,
            "PolicyNumber", fact_Premium[PolicyNumber],
            "Premium", fact_Premium[Premium]
        )
    VAR Losses = 
        SELECTCOLUMNS(
            fact_Losses,
            "PolicyNumber", fact_Losses[PolicyNumber],
            "Losses", fact_Losses[PaymentAmount]
        )
    VAR Result = NATURALLEFTOUTERJOIN(Premium, Losses)
    RETURN Result
Serdia

3 Answers


There are a few interdependent "bugs" or limitations around the use of variables (VAR) and NATURALLEFTOUTERJOIN, which make this a weird case to debug.

Some notable limitations are:

VAR:

Columns in table variables cannot be referenced via TableName[ColumnName] syntax.

NATURALLEFTOUTERJOIN:

Either:

The relationship between both tables has to be defined before the join is applied AND the names of the columns that define the relationship need to be different.

Or:

In order to join two columns with the same name and no relationship, these columns must have a data lineage.

(I'm a bit confused, because the link mentioned says the columns must not have a data lineage, while the official documentation says that only columns from the same source table (i.e. with the same lineage) are joined on.)
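To illustrate the VAR limitation listed above, here is a minimal sketch. It assumes the fact_Premium table from the question; the table name Pol1Premium and the filter value "Pol1" are only illustrative.

```dax
Pol1Premium =
VAR Premium =
    SELECTCOLUMNS (
        fact_Premium,
        "PolicyNumber", fact_Premium[PolicyNumber],
        "Premium", fact_Premium[Premium]
    )
RETURN
    -- Premium[PolicyNumber] does not resolve here, because Premium is a
    -- variable and not a table in the model. The bare column name does.
    FILTER ( Premium, [PolicyNumber] = "Pol1" )
```

This is why functions that need a fully qualified column reference cannot point at the columns of a table variable.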


Coming back to this case:

  1. SUMMARIZE should be used instead of SELECTCOLUMNS to obtain summary tables for Premium and Losses, i.e.:

    Premium = 
    SUMMARIZE(
        fact_Premium,
        fact_Premium[PolicyNumber],
        "Premium", SUM(fact_Premium[Premium])
    )
    
    Losses = 
    SUMMARIZE(
        fact_Losses,
        fact_Losses[PolicyNumber],
        "Losses", SUM(fact_Losses[PaymentAmount])
    )
    
  2. Applying NATURALLEFTOUTERJOIN to the two tables above returns the error "No common join columns detected", because no relationship is established between them.

[screenshot of the error message]

  3. To resolve this, we can use TREATAS, as suggested in this blog post. However, TREATAS requires referencing columns of the Premium and Losses tables by name, so they cannot be declared with VAR; they have to be created as actual calculated tables.

To conclude, the solution would be:

  1. Create calculated tables for Premium and Losses as shown above.

[screenshot of the Premium table]

[screenshot of the Losses table]

  2. Use TREATAS to mimic a data lineage, then join the Premium table with Losses_TreatAs instead of Losses.

    MyTable = 
    VAR Losses_TreatAs = TREATAS(Losses, Premium[PolicyNumber], Losses[Losses])
    RETURN NATURALLEFTOUTERJOIN(Premium, Losses_TreatAs)
    

Results:

[screenshot of the resulting table]

Foxan Ng

There's a sleazy hack that can successfully work around this awful limitation (what were the product designers thinking?).

If you add zero (e.g. + 0) or concatenate an empty string (e.g. & "") to each join column within SELECTCOLUMNS, the column breaks out of the data lineage straitjacket and NATURALLEFTOUTERJOIN runs using column names alone.

You can use this in a Measure to run dynamic logic (based on the query context from filters etc), not just while creating a calculated table.

Here's a tweaked version of your code:

MyTable = 
VAR Premium = 
    SELECTCOLUMNS(
        fact_Premium,
        "PolicyNumber", fact_Premium[PolicyNumber] & "",
        "Premium", fact_Premium[Premium]
    )
VAR Losses = 
    SELECTCOLUMNS(
        fact_Losses,
        "PolicyNumber", fact_Losses[PolicyNumber] & "",
        "Losses", fact_Losses[PaymentAmount]
    )
VAR Result = NATURALLEFTOUTERJOIN(Premium, Losses)
RETURN Result

H/T to example #7 on this page, which shows this in code (without really explaining it). https://www.sqlbi.com/articles/from-sql-to-dax-joining-tables/#code7
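The code above joins the raw rows. Since the question asks for per-policy totals, the same trick can probably be combined with a grouping step. The following is only a sketch: it assumes the table and column names from the question, and the name MyTableSummarized is made up for illustration.

```dax
MyTableSummarized =
VAR Premium =
    SELECTCOLUMNS (
        -- One row per policy number found in fact_Premium
        SUMMARIZE ( fact_Premium, fact_Premium[PolicyNumber] ),
        -- & "" breaks the lineage so the join matches on column name
        "PolicyNumber", fact_Premium[PolicyNumber] & "",
        -- CALCULATE restricts the sum to the current policy number
        "Premium", CALCULATE ( SUM ( fact_Premium[Premium] ) )
    )
VAR Losses =
    SELECTCOLUMNS (
        SUMMARIZE ( fact_Losses, fact_Losses[PolicyNumber] ),
        "PolicyNumber", fact_Losses[PolicyNumber] & "",
        "Losses", CALCULATE ( SUM ( fact_Losses[PaymentAmount] ) )
    )
RETURN
    NATURALLEFTOUTERJOIN ( Premium, Losses )
```

Policies that have premium rows but no loss rows should keep their row, with a blank in the Losses column.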

Mike Honey

Hello, I suggest doing it this way.

In Power Query, build a table of policy numbers as follows:

  1. Duplicate the Premium table and remove the premium column from the duplicate. Call it PremiumPol.
  2. Duplicate the Losses table and remove the losses column from the duplicate. Call it LossesPol.
  3. Use the Append Queries button to append PremiumPol and LossesPol. Call the result policynumber.
  4. Remove duplicates from the appended table.
  5. Click Close & Apply.
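If you would rather stay in DAX than use Power Query, the steps above could be approximated with a calculated table. This is a sketch and not part of the original steps; it assumes the fact_Premium and fact_Losses tables from the question.

```dax
policynumber =
DISTINCT (
    -- Stack the policy numbers of both tables, then drop duplicates
    UNION (
        SELECTCOLUMNS ( fact_Premium, "PolicyNumber", fact_Premium[PolicyNumber] ),
        SELECTCOLUMNS ( fact_Losses, "PolicyNumber", fact_Losses[PolicyNumber] )
    )
)
```

Either way, the relationships from policynumber to the two fact tables still have to be created in the model.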

Check that your model looks like this:

[screenshot of the data model]

Adding losses and premium per policy is then trivial: select a table visual and add these fields:

[screenshot of the selected fields]

The result looks like this:

[screenshot of the resulting table visual]

Hope that helps!

Nelson Gomes Matias