4

How to lookup the first non blank value in unrelated table. I want an equivalent of:

  • SQL outer apply (select top 1 ...) or
  • Excel VLOOKUP.

The DAX LOOKUPVALUE function as used in that code does not work.

LOOKUPVALUE(
      SearchTable[name]
    , SearchTable[id] -- how to get the fist value, if here are multiple id?
    , ThisTable[id]
    )

DAX LOOKUPVALUE function works fine, if only one unique value is returned as a result. The function LOOKUPVALUE returns error message, if multiple values are found in the search table for one key:

A table of multiple values was supplied where a single value was expected.

error message I do not want error message, I am happy with the top 1 result.

Sample data file:

DAX lookup top 1 value.pbix

Comment to accepted answer

I favor Alexis Olson's second approach the most, though I prefer VAR to EARLIER variation:

MyLookup = 
var LookupKey = ThisTable[Product]
return
CALCULATE (
    MAX ( SearchTable[Category] ),
    SearchTable[Product] = LookupKey
)

I accepted it though I have a strong feeling that this is not the best answer which may exist. The function returns MAX instead of TOP 1 which is obviously additional effort.

The sample data now contains 4 working solutions. I have numbered them according to my preferences. The first two are by Alexis Olson. As far as I could learn from DAX Studio, both solutions are similarly fast and have a simple execution plan. The second solution is the only solution that performs just two table scans (all other approaches have 3 scans). The third solution has a complicated execution plan, and during the third scan it passes through the entire search table, regardless of whether it is a million rows.

This question has a follow up question where I try to compare the queries in Dax Studio:

How to check in Daxstudio which DAX query has better performance?

Update seeking further improvements

I wonder if it is possible to achieve a shorter query plan then the current winning answer. The winning code will be evaluated in DaxStudio and should have shorter query plan then total 20 line of both physical and logical lines of query plan.

-- Bounty code
EVALUATE
ADDCOLUMNS (
    VALUES ( ThisTable[Product] ),
    "FilterLookup",
    VAR LookupKey = ThisTable[Product]
    RETURN
        CALCULATE ( MAX ( SearchTable[Category] ), SearchTable[Product] = LookupKey )
)
Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
Przemyslaw Remin
  • 6,276
  • 25
  • 113
  • 191
  • This might help: https://stackoverflow.com/questions/52525377/return-top-value-ordered-by-another-column – Alexis Olson Dec 12 '19 at 17:55
  • For the benefit of other readers, I am cross-referencing here the suggestions I have made [about bounty negotiations in questions](https://stackoverflow.com/questions/59931193/filtering-summarizecolumns). The summary points I have made are: (a) bounty negotiating is not of interest to most readers, who do not sign in; (b) on this platform, bounty negotations are sometimes used as a way to obtain free work, and this is normally frowned upon here; (c) once such a bounty has been paid, the outdated offer is rarely removed from where it has been added. – halfer Feb 10 '20 at 13:35
  • (I will continue the conversation on the other thread as required). – halfer Feb 10 '20 at 13:35
  • Please don’t put meta notices in your question posts. You can use a comment or the bounty notice field. – Martijn Pieters Feb 13 '20 at 21:08

4 Answers4

1

Try this:

Column =
VAR row_id = ThisTable[id] 
VAR ret_val =
    SELECTCOLUMNS (
        TOPN (
            1;
            CALCULATETABLE (
                SearchTable;
                SearchTable[id] = row_id;
                SearchTable[name] <> BLANK ()
            )
        );
        "name"; SearchTable[name]
    )
RETURN
    ret_val
Karl Anka
  • 2,529
  • 1
  • 19
  • 30
  • Thank you. Works great. I included your solution in sample data as expected results. – Przemyslaw Remin Dec 13 '19 at 10:01
  • Do you have any idea how your query could be optimized? The 2nd and 3rd query proposed by Alexis Olson seem to be better in terms of performance as far as I can figure it out from DAX Studio. The drawback of your query is that in the last scan, it scans entire SearchTable (be it a milion rows). Alexis's scans seem to be going only through the number of unique key rows. Also Alexis's query plans are way simpler and shorter. The query plan is bad, but your code delivers fresh idea to cracking problems without MAXX. – Przemyslaw Remin Dec 16 '19 at 15:03
1

There are plenty of ways to do this. Here are the top three that come to mind:

This simplest is a lookup function but only works if there is a single category associated with each product in the SearchTable.

LookupFunction =
LOOKUPVALUE ( SearchTable[Category], SearchTable[Product], ThisTable[Product] )

The following will should still work with multiply categorized products:

MaxxLookup =
MAXX (
    FILTER ( SearchTable, SearchTable[Product] = ThisTable[Product] ),
    SearchTable[Category]
)
CaclculateMaxLookup =
CALCULATE (
    MAX ( SearchTable[Category] ),
    SearchTable[Product] = EARLIER ( ThisTable[Product] )
)

If you want 2nd or 3rd as a measure instead of a calculated column, they'd need to be adjusted.


Note: The above work fine as calculated columns. If you want them to work as both calculated columns and as measures then adjust them as follows:

LookupMeasure =
LOOKUPVALUE (
    SearchTable[Category],
    SearchTable[Product], CALCULATE ( SELECTEDVALUE ( ThisTable[Product] ) )
)
MaxxMeasure=
VAR CurrProduct = CALCULATE ( SELECTEDVALUE ( ThisTable[Product] ) )
RETURN
    MAXX (
        FILTER ( SearchTable, SearchTable[Product] = CurrProduct ),
        SearchTable[Category]
    )
CaclculateMaxMeasure =
VAR CurrProduct = CALCULATE ( SELECTEDVALUE ( ThisTable[Product] ) )
RETURN
    CALCULATE ( MAX ( SearchTable[Category] ), SearchTable[Product] = CurrProduct )

The difference is that when you're writing a calculated column, there is row context to tell you which product to use. When you write a measure, you need to specify which value from the column ThisTable[Product] you intend to use.

Note: If there are multiple products in your filter context and you want the measure to look up the category of the maximal one, then use MAX instead of SELECTEDVALUE. The latter will return a blank if there are multiple values.


Edit:

The above formulas worked for the original set of data provided. However, if there are multiple categories associated with a single product in the SearchTable then the LOOKUPVALUE function will break (since it expects only a single value) and you will need to use one of the MAX or MAXX versions instead.

Alexis Olson
  • 38,724
  • 7
  • 42
  • 64
  • I am surprised that the first solution works with sample data. So far I have been experiencing this behavior: https://radacad.com/find-the-data-value-using-lookupvalue-dax-function-in-power-bi-simple-and-useful (please scroll down to "What if multiple values as the output?" I wonder in what conditions it works. – Przemyslaw Remin Dec 13 '19 at 17:39
  • I made comment to your answer in my question. Please see update. – Przemyslaw Remin Dec 13 '19 at 17:51
  • The last one could be rewritten in this way: `CaclculateMaxLookup = var LookupKey = ThisTable[Product] return CALCULATE ( MAX ( SearchTable[Category] ), SearchTable[Product] = LookupKey )` – Przemyslaw Remin Dec 13 '19 at 18:17
  • I don't think you understood me. I am grateful you extended your answer to measures, but all the question is about calculated column. So we have one value of ThisTable in row context. The problem starts when we have two or more different values found in the SearchTable for one key. Please update my sample data or just input two different values in Category field for `apple`. You will not be able to create calculated column using your first approach which is actually the same as the one I used in my question. – Przemyslaw Remin Dec 16 '19 at 09:42
  • It was obvious mistake in sample data. This solution was banned from the beginning:-) Thank you! – Przemyslaw Remin Dec 17 '19 at 11:58
1

I think I solved this issue: Measure0 is measure Measure1 is Calculated Column measure Apple has no Category because it has 3 categories and can’t pick one, so it returns blank

Measure0 =
FIRSTNONBLANK (
    SearchTable[Category],
    CALCULATE (
        MAX ( SearchTable[Product] ),
        TREATAS ( VALUES ( ThisTable[product] ), SearchTable[Product] )
    )
)
Measure1 =
CALCULATE (
    FIRSTNONBLANK (
        SearchTable[Category],
        CALCULATE (
            MAX ( SearchTable[Product] ),
            TREATAS ( VALUES ( ThisTable[product] ), SearchTable[Product] )
        )
    )
)
Przemyslaw Remin
  • 6,276
  • 25
  • 113
  • 191
  • Thank you. I hope you do not mind that I have used daxformatter.com for structuring your DAX code. I will examine performance of your proposal with the benchmark solution. – Przemyslaw Remin Aug 22 '20 at 13:08
  • Can you please explain the logic of the two codes you have proposed? – Przemyslaw Remin Aug 22 '20 at 13:10
  • Simply, I used CALCULATE to achieve context transition because FIRSTNONBLANK is an iterator, i.e. doesn't understand row context only. TREATAS to achieve a data lineage of the unrelated values. I didn't consider any optimization in those 2 queries but maybe the 2nd one is more optimized (I am not sure). – Ahmed Hafidh Aug 24 '20 at 02:24
  • These methods have lower performance than the two approaches by Alexis Olsen. However the use of TREATAS might be a hint for improvement of the accepted answer. I appreciate your contribution. – Przemyslaw Remin Aug 25 '20 at 11:44
0

There is a FIRSTNONBLANK(column, expression) function. Doesn't it help?

Dreekun
  • 422
  • 2
  • 12
  • Please see the edit of my question. I uploaded sample data. I would be grateful if you gave it a shot. I think that FIRSTNOBLANK approach needs relationship. – Przemyslaw Remin Dec 13 '19 at 10:50