0

I'm looking for insights into some intriguing issues while working with SQL Server.

Background

  • We have few SSRS reports running on APS (SQL Server PDW/MPP)
  • The reports datasets have the embedded queries are run directly on the database. Now one of the bigger table has been made to split into a "current" and a "history" table ("Current" table has recent years data), apparently to improve performance of these reports.
  • I need to change the SSRS reports in a way that the same reports should now redirect to either of these table based upon what "year" has been selected as one of the parameter.
  • My design approach: I decided to use database stored procedure (SP) in the SSRS reports datasets, building the required logic in the stored procedure.

Problem statement

  • Incidentally, the multi-valued params would not work the way I assumed.

  • When I looked up on this, it seemed this is a known issue and one of the poorly supported feature of SSRS. It seems the multiple values in the multi-valued parameter are considered as single value only, even though when printed they look like multiple values as in

    val1,val2,val3

  • Here are few posts I referred to: Link1 Link2

  • I then decided to find some workaround to get this working, following some of the suggestion in the above and below posts.

Link3

  • After many attempts, I managed to get this working by building the logic as something like below

    • Step 1: From SSRS, the parameter is changed as:

      =Join(Parameters!param_1.Value,"|")
      
    • Step2: Used a piece of code in SP to convert/format the multi-valued parameter and store it in a new variable (SQLString)

      --Relevant code:
      
      -- Block of conversion code
      -- to convert/format the multivalued param and store it in a new variable
      -- after this the SQLSTRING variable's final value is set as 
      -- |val1|val2|val3|
      
      -- Relevant where clause
      AND 
          (@SQLString LIKE '%|' + tbl.attr_1 + '|%')  
      
  • This worked! Although, the caveats mentioned this is certainly not performant. The results were quick in SSRS report (though it was just a sample query).

  • Now I decided to convert "Block of conversion code" into a scalar function, referencing below, so that I can call this function each time for each of the multivalued parameter. Link 4

  • This also worked but it was TOO slow (for the same sample query with only one multivalued param i.e. one call to this function).

Question(s)

  • It is puzzling to me (I don't have much knowledge of SQL Server internals) as to WHY converting a piece of code to a function would slow down query execution to that extent.

Would appreciate any pointers.

Thanks in advance.

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Ankur
  • 11,239
  • 22
  • 63
  • 66
  • Leading wildcards kill performance. – avery_larry Apr 23 '20 at 19:01
  • 1
    Fundamentally, platforms like PDW/MPP shouldn't require these front end acrobatics to get required performance. The table is being "manually" partitioned here. You should be able to achieve this transparently in the back end using distributions, partitioning, Ordered CCI, non clustered indexes. How many rows does this table have? – Nick.Mc Apr 24 '20 at 02:20
  • @avery_larry True. But I don't understand why. – Ankur Apr 24 '20 at 07:14
  • @Nick.McDermaid Agreed. The decision maker is aware of this. But this has been decided for now as a temporary solution. Distributions are already in place. And the reporting queries utilize these distributions when joining multiple such tables. I'll explore on the rest (partitioning, ordered cci, non clustured indexes) w.r.t. to this model on APS. I don't much experience in these areas, would appreciate any pointers here. This table has 2342 million rows (before the split into current and history). – Ankur Apr 24 '20 at 07:28
  • @Ankur https://www.brentozar.com/archive/2010/06/sargable-why-string-is-slow/ – avery_larry Apr 24 '20 at 13:39

2 Answers2

0

I always manipulate the joined parameters into a table in the stored procedure and then inner join to that table.

Sample using my own data:

The SSRS join I use:

=Join(Parameters!param_1.Value,"#~--~#")

Then the stored procedure:

create proc whatever
@profs varchar(8000)
as

declare @profs_t table (ProfName varchar(70))
declare @delim varchar(10) = '#~--~#'
declare @delimlen int = len(@delim)
declare @idx int = 1 - @delimlen

while charindex(@delim, @profs, @idx + @delimlen) <> 0
    begin
        insert into @profs_t
        select substring(@profs, @idx + @delimlen, charindex(@delim, @profs, @idx + @delimlen) - @idx - @delimlen)
        set @idx = charindex(@delim, @profs, @idx + @delimlen)
    end
insert into @profs_t
select substring(@profs, @idx + @delimlen , len(@profs))

select *
from @profs_t pt
inner join professionals p on p.profname = pt.profname

This transforms the joined parameters from SSRS into a table variable (you could just as easily use a temp table). This is fairly generic and can use any delimiters, even multiple character delimiters (as shown). I suppose the delimiter could be passed to the proc as a parameter too.

avery_larry
  • 2,069
  • 1
  • 5
  • 17
  • Thanks for suggesting another implementation. Apparently "table" data type is not supported in PDW (https://learn.microsoft.com/en-us/sql/t-sql/data-types/table-transact-sql?view=sql-server-ver15), otherwise I would used it earlier. Presently I'm using/building a string variable to store the results after processing the multi-valued parameter. The problem is as soon as I put this "processing logic" in a function (to make it re-usable for other multi-valued params) it starts to run too slow whereas if this logic is in main StoredProc, it works fast. I'm wondering whats going on here. – Ankur Apr 24 '20 at 07:12
  • Use a "regular" temp table instead . . ? – avery_larry Apr 24 '20 at 13:37
0

I think I found the answer to my question as explained in the following page. Scaler UDF

Ankur
  • 11,239
  • 22
  • 63
  • 66