0

Background

Once again, I am working in a very restricted T-SQL environment, an application where one may only define the "body" of a VIEW: presumably the ... in

CREATE VIEW My_View AS ...

under the hood. The ... must be written (as SQL) within its own text field in the application's GUI, which imposes certain inflexibilities described under Note.

Here is my @@VERSION:

Microsoft SQL Server 2019 (RTM-CU19) (KB5023049) - 15.0.4298.1 (X64)
    Jan 27 2023 16:44:09  
    Copyright (C) 2019 Microsoft Corporation     
    Web Edition (64-bit) on Linux (Amazon Linux 2) <X64>

Note

The body (...) is limited to a single SQL statement! Furthermore, stored PROCEDUREs are out of play.

Goal

I wish to "simulate" the presence of "local" variables, so certain criteria can be dynamically updated, rather than hardcoded as literals throughout the query. Specifically, I want these "variables" to be at the top, so they can be easily adjusted.

Issue

Unfortunately, variables of the form @x are unavailable in views.

Approach

I have postulated this approach, with a vars CTE whose "fields" are the variables. Their values can be accessed as a subquery of the form (SELECT x FROM vars):

/*********************
** Variable Scalars **
**********************/
WITH vars AS (SELECT
    'Greg'    AS c_name,  -- A client named "Greg".
    GETDATE() AS t_date   -- Transactions occurring today.
    
    
    
/**********
** Query **
**********/
) SELECT
    *
FROM
    Transactions
WHERE
--                     |-------- 'Greg' -------|
    Client_Name      = (SELECT c_name FROM vars) AND
    Transaction_Date = (SELECT t_date FROM vars)
--                     |--------- Today -------|

I further suggest that variable lists of values can be specified as further CTEs, derived via STRING_SPLIT():

/*********************
** Variable Scalars **
**********************/
WITH vars AS (SELECT
    'Greg' AS c_name
    
    
    
/******************
** Variable List **
*******************/
), vars_dates AS (SELECT DISTINCT CAST([Value] AS DATE) AS vals FROM STRING_SPLIT('
    2023-06-28
    2023-06-01
    2023-01-01
', Char(10)) WHERE [Value] <> ''



/**********
** Query **
**********/
) SELECT
    *
FROM
    Transactions
WHERE
--                      |----------- 'Greg' ----------|
    Client_Name      =  (SELECT c_name FROM vars      ) AND
    Transaction_Date IN (SELECT vals   FROM vars_dates)
--                      |-----------------------------|
--                         {1/1/23, 6/1/23, 6/28/23}

Notice now the values in vars_dates are listed each on a separate line, so the user need not worry about punctuation (commas, etc.) clashing with any delimiter.

Question

So far, I have been able to call (SELECT x FROM vars) wherever I need the value of x. Indeed, both vars and vars_dates have worked perfectly throughout my use cases.

But are there any situations where these expressions will not work, in place of literal values and lists? More generally, what are the limitations of this approach?

Greg
  • 3,054
  • 6
  • 27
  • The values are hard coded though, in the CTE - there are no `@parameters` here – Stu Jun 28 '23 at 19:31
  • @Stu Indeed. The goal is to _"simulate"_ the presence of parameters, for use in the query criteria. That way, if I use a criterion many times throughout the query, I only need to update it once: in `params` at the top. – Greg Jun 28 '23 at 19:33
  • I don't see this serves any purpose other than to add overhead you may as well use the string literals inline and benefit from accurate statistics. As far as limitations go, it's valid t-sql so it should work wherever you wish to use it. – Stu Jun 28 '23 at 19:36
  • `VIEW`s cannot be parametrised, use an inline table value function if you want a "parametrised view". – Thom A Jun 28 '23 at 19:47
  • 1
    *I can't talk for anyone else, but your post reads a lot like a blog, not a Q&A post; it's a little jarring, as [so] isn't a blog site.* – Thom A Jun 28 '23 at 19:48
  • @Stu _"...and benefit from accurate statistics."_ Do you mean this might introduce inaccuracy? Please elaborate. _"I don't see this serves any purpose other than to add overhead..."_ I have colleagues who are not as well versed in SQL, and if they want to "tweak" the criteria in a complex query, I want to make it as straightforward as possible for them (and for me!) to update a recurring criterion. Otherwise, a user might update one instance of that criterion (`'Greg'` → `'Devon'`) while overlooking another (`'Greg'` unaltered) buried deep in the query, and the query will fail. – Greg Jun 28 '23 at 19:49
  • 1
    SQL Server should be able to use column stats with this approach as though you had used a literal. e.g. the estimates are accurate here https://i.stack.imgur.com/30jaG.png – Martin Smith Jun 28 '23 at 19:53
  • 1
    If the solution works for you then use it! You could also run into problems if someone adds more than one row in the CTE. I'm not sure without testing it but I would assume the optimizer can't make use of the content of a string_split at compile time...? – Stu Jun 28 '23 at 19:55
  • 1
    I would assume that too and that having a multi row `VALUES` clause might make things less opaque – Martin Smith Jun 28 '23 at 19:56
  • @ThomA _"`VIEW`s cannot be parametrised..."_ That's why I'm trying to _"simulate"_ parametrized behavior, within the `VIEW`s to which I am limited. – Greg Jun 28 '23 at 19:58
  • 1
    I might find the `SELECT FROM` to get a bit tedious though and use this approach instead https://dbfiddle.uk/tBuC9WbA - but depends on what the rest of the code in the view looks like – Martin Smith Jun 28 '23 at 20:00
  • Why not actually use the right object type? Why constrain yourself to a `VIEW` when it's the wrong object type here. Or, otherwise, `SELECT` against the `VIEW` with parameters from the calling application? – Thom A Jun 28 '23 at 20:00
  • @ThomA _"I can't talk for anyone else, but your post reads a lot like a blog, not a Q&A post; it's a little jarring, as [Stack Overflow](https://stackoverflow.com) isn't a blog site."_ I generally research a bit before posting a question on SO, to see if I can solve it myself. So when I _do_ post, I'm already a ways down the rabbit hole, and I try to supply necessary context on how I got there. Otherwise I get a lot of _"why do you even **want** this"_, or _"why not do it [some way I've already had to eliminate]"_. I break it into sections to let the _"TLDR"_ folks skip to what matters to them – Greg Jun 28 '23 at 20:01
  • @ThomA _"Why constrain yourself to a VIEW when it's the wrong object type here?"_ I'm working in a GUI application that allows you to add `VIEW`s, but it only permits you to define the _"body"_ of the `VIEW`, within the editor. For example, I cannot write `CREATE VIEW My_View AS ...`. Instead, I must enter `My_View` into the **Name** field of the GUI, and compose the query `...` within the **SQL** field of the GUI. How would you suggest I clarify my **Background** section, where I tried to explain these constraints? – Greg Jun 28 '23 at 20:09
  • You say you want to "simulate" parameters but these are all literals, and aren't going to change. If you want parameters then the values would need to change some how. How are you expecting the calling application to change the definition of the `VIEW` "on the fly" here? The intent of the question, and what you describe you're doing don't really seem to align in my opinion. – Thom A Jun 28 '23 at 20:09
  • So can this application not pass parameters to the query it runs when it calls these `VIEW`s either? If you are arbitrarily required to use `VIEW`s then to have a parametrised query you would do so when referencing the `VIEW`: `SELECT FROM dbo.YourView WHERE Some column = @SomeParam;`. Why not code your `VIEW`s definition so that a query against it can be easily parametrised? – Thom A Jun 28 '23 at 20:11
  • @ThomA _"You say you want to "simulate" parameters but these are all literals."_ Perhaps I should have said _"simulate **VARIABLES**"_. The idea is simply to update the "settings" for a particular `VIEW`, in one centralized location rather than in every instance throughout the query. _"So can this application not pass parameters to the query it runs when it calls these `VIEW`s either?"_ Correct. The only thing I can do is add a `VIEW` via the GUI, with name and "body" entered in separate text fields. That is why I cannot use an inline table-valued function, let alone any stored procedure – Greg Jun 28 '23 at 20:15
  • So what's the point of the `VIEW` is you can't use them..? Seems an odd application at this point. I feel like we are missing context at this stage. – Thom A Jun 28 '23 at 20:16
  • 1
    Variables and parameters are quite different, especially in this context. – Thom A Jun 28 '23 at 20:19
  • @ThomA It is indeed an odd application, but it's what my company has to work with, and these constraints are present even when I'm logged in as the _super user_. **The whole thing is maladaptive**, but I'm just trying to optimize under these constraints. – Greg Jun 28 '23 at 20:20
  • I just don't understand the point in these `VIEW`s at this stage if you aren't even going to query them; why make a `VIEW` that isn't going to be used..? Is it just for experience with this application? – Thom A Jun 28 '23 at 20:21
  • Working within your constraints, you could create a vew that contains all of your "variables" instead of having a CTE that needs to be amended in each view, then reference this view to get your relevant variables, then only this single view needs to be amended. – Stu Jun 28 '23 at 20:25
  • @ThomA _"why make a `VIEW` that isn't going to be used..?"_ The `VIEW`s are indeed used: they are treated as tables in their own right, and via a (very convoluted GUI) these tables can be linked and combined and have relationships defined. The result is built as its own analytics database, and the GUI once again "gatekeeps" the queries that can be run on it. It's frustrating, since I myself could do (via `JOIN` and `FOREIGN KEY`) what the GUI "covers up" under the hood...and I could do it with far less pain. But even with max privileges, the best I can do in SQL is write the "body" of a `VIEW` – Greg Jun 28 '23 at 20:30
  • 1
    Perhaps you should be talking to the developer of the application on your goals and they might be able to explain best how to achieve it (I suspect all of this is an [XY Problem](//xyproblem.info), especially after the parameters -> variables change). If you truly wanted parameters, and had to use a `VIEW` then you'd just parametrise the statement that references the `VIEW`. Trying to simulate parameters isn't possible, and a variable would be replaced by a literal or (sub)query that against a table. The benefit of a variable is it's assigned once, rather than maybe running several times. – Thom A Jun 28 '23 at 20:35
  • @Stu _Working within your constraints, you could create a view that contains all of your 'variables'..."_ I like the idea! So far, I've only had occasion to "declare" some "local" variables within an individual `VIEW`. But should the occasion arise, your idea would let me define "global" variables accessible by multiple `VIEW`s, which would keep those `VIEW`s "in sync". I should note that when creating a new custom `VIEW`, I can access all the custom `VIEW`s I have defined in the past, as if they were tables in their own right. – Greg Jun 28 '23 at 20:35
  • It is worth noting that nesting `VIEW`s can have some (significant) impacts on performance. The data engine can end up not inlining the statements, which can result in some very bad, and expensive, query plans. – Thom A Jun 28 '23 at 20:37
  • @ThomA _"I suspect all of this is an [XY Problem](https://xyproblem.info), especially after the parameters -> variables change"_. I think you're right: my misuse of "parameters" over "variables" was likely responsible for 90% of the confusion. _"The benefit of a variable is it's assigned once, rather than maybe run many times"_. That's my goal in a nutshell! – Greg Jun 28 '23 at 20:38
  • 1
    Basically you are just using the approach here but scoped within a view rather than being accessible to other modules https://learn.microsoft.com/en-us/archive/blogs/sql_server_appendix_z/sql-server-variables-parameters-or-literals-or-constants - so "locally scoped pseudo constant" – Martin Smith Jun 28 '23 at 20:44
  • 1
    Regarding your edit, my suggestion is a single statement - the rest of it is just setting up some dummy data outside the parameters. So `#T` is taking the place of `Transactions`. I'm not suggesting that the `CREATE TABLE`/`INSERT`/`DROP TABLE` can be part of a view. I was pointing out that `FROM Params CROSS APPLY YourTable` or `FROM Params CROSS JOIN YourTable` is potentially less tedious than doing a load of `select from` sub queries – Martin Smith Jun 28 '23 at 20:52
  • @MartinSmith Good catch! Sorry I misinterpreted your example: my brain was on overload, and I thought you were using `#T` in place of a `vars*` CTE, with a more elegant approach than `STRING_SPLIT()`. – Greg Jun 28 '23 at 21:06
  • For that you could consider doing something like `SELECT CAST([Value] AS DATE) AS vals FROM (VALUES ('2023-06-28'), ('2023-06-01'), ('2023-01-01')) V([Value])` - I'm not sure whether potentially it would be better off estimates wise with all the values cast to date rather than having them as strings and a cast – Martin Smith Jun 28 '23 at 21:11
  • 1
    @MartinSmith _"so 'locally scoped pseudo constant'"_ Pretty much! I didn't realize before that this idea existed or was considered credible, which is why I inquired on Stack Overflow. I suppose my idea is simply a "locally scoped psuedo constant" (LSPC), but adapted to constraints where only a single SQL statement was permitted in total...hence the `vars` CTE (maybe it should be `const`?) preceding the actual query for data. **Great find, and thanks for taking my idea seriously!** – Greg Jun 28 '23 at 21:14
  • @MartinSmith _"For that you could consider doing something like `SELECT ... V([Value])`"_. Interesting! I didn't realize you could assemble complex data structures via `(..., ...)` and `VALUES`. This has the clear benefit of avoiding any string parsing, though I confess I like the ergonomics and visual clarity of typing the `2023-0*-**` dates on their own lines, without any SQL syntax getting in the middle. **I don't suppose there's a syntax to enter them within a single list: `('2023-06-28', '2023-06-01', ...)`?** – Greg Jun 28 '23 at 21:18
  • 1
    No - entering them in a single list would create multiple columns `SELECT * FROM (VALUES ('2023-06-28', '2023-06-01', '2023-01-01')) V(a, b, c)` – Martin Smith Jun 28 '23 at 21:23
  • @MartinSmith _"a single list would create multiple columns"_. Thanks for the tip! I really appreciate your thoughtfulness on this, so if you want to post a very short answer, I'll be happy to accept and upvote. No rush, though. :) IMO, the highlights were: **(1)** the reassurance that `(SELECT x FROM vars)` will work pretty much anywhere, in place of a literal; **(2)** the `CROSS APPLY` snippet, to clean up the syntax; **(3)** the precedent for "psuedo constants"; and **(4)** the `VALUES ((...) ...)` syntax, which stabilizes the `vars_dates` better than parsing via `STRING_SPLIT()`. – Greg Jun 28 '23 at 21:29

1 Answers1

1

You just want to define some values once at the top and then use them in repeated points throughout the view so you can change them in one place in the future.

Given the constraints you are working with the approach you have seems fine (if these values are not required to be shared in other views).

It is basically a more locally scoped version of the "Pseudo-Constant" idea here and at least for scalar values SQL Server should be able to use the underlying literal value for cardinality estimation (so better than an actual TSQL @variable in that respect).

For the scalar values rather than needing sub queries such as SELECT c_name FROM vars I'd be minded to just CROSS JOIN the single row vars so then the columns are available for easier use. See this example here:

-- An example table, in place of 'Transactions'.
CREATE TABLE #T(X INT);

INSERT INTO #T VALUES (1),(2),(2),(3),(3),(3);


-- The CROSS technique itself.
WITH vars AS 
(
SELECT 3 AS foo
)
SELECT #T.*
FROM vars 
CROSS JOIN #T 
WHERE X = vars.foo


-- Cleanup.
DROP TABLE #T

The below demonstrates the cardinality estimation was done using the value 3 and it correctly estimates 3 rows.

enter image description here

For the lists I'd likely use a Table Value Constructor rather than string splitting. It should allow SQL Server to see the number of rows in the lists easier (and maybe have other benefits for cardinality estimation if it looks at the actual values in these).

-- ...

/******************
** Variable List **
******************/
), vars_dates AS (SELECT vals FROM (VALUES
    (CAST('2023-06-28' AS date)), /*So the column from the constant scan is typed as date*/
    ('2023-06-01'),
    ('2023-01-01')
) tbl(vals)

-- ...

Unlike an actual variable this approach won't guarantee that the value will stay the same throughout the query if you use a non deterministic function but you don't have any way of actually guaranteeing that with your current constraints.

Martin Smith
  • 438,706
  • 87
  • 741
  • 845
  • 1
    Thank you, Martin! I've [updated](https://stackoverflow.com/posts/76576910/timeline#history_52508d59-8a07-442e-a636-b08d80682c80) your answer to include the [**db<>fiddle** example](https://dbfiddle.uk/tBuC9WbA) that you kindly shared ([here](https://stackoverflow.com/questions/76576105/simulate-variables-in-a-view#comment135014506_76576105)) in the comments. – Greg Jun 28 '23 at 21:56
  • Hi again, Martin! I have **_tentatively_** [updated](https://stackoverflow.com/posts/76576910/timeline#history_7024f3ef-f761-4393-aea5-fefebd257d21) your answer, to include a helpful illustration of the Table Value Constructor, very similar to your suggestion ([here](https://stackoverflow.com/questions/76576105/simulate-variables-in-a-view#comment135015150_76576105)) in the comments. Since this is technically not your work _verbatim_ — I changed a the temporary aliases to `tbl` and `col` — I'm letting you know, so I don't "put words in your mouth". – Greg Jun 28 '23 at 22:14
  • 1
    @Greg - That's fine, I slightly changed it to remove a compute scalar/cast from the plan and just have the constant scan emit the right data type (`date` has higher data type precedence than `varchar` so casting one of the values is enough) – Martin Smith Jun 28 '23 at 22:30
  • 1
    Here's an added bonus for the Table Value Constructor, which makes it further superior to `STRING_SPLIT()`: one can easily insert _dynamic_ values into the list, as with `VALUES ((` **`GETDATE()`** `), ('2023-06-01'), ...) tbl(vals)`. – Greg Jun 28 '23 at 22:52