2

I have a TSQL view. Apart from a few columns it is pretty basic in that it simply does a few joins and then glues everything together to present a nice view as it should be. However, the few columns which are not so simple makes the view code very hard to extend, now that new requirements have come in that invalidates the complicated columns' business logic.

Without going too much into detail, there is a table in my database:

tblEmployment

This consists of rows of "employments". Every time any of the columns in a row, for a given employment changes (let's say the employmentTitle changes), then the current row is pushed into another table tblEmploymentHistory, and the row in tblEmployment is changed so it contains the newest employmentTitle.

Essentially, what the view does, is that it tries to join the tblEmployment on tblEmploymentHistory with the unique EmploymentIdentifier, which makes sense.

The more complicated column in the view will try to calculate a number (elapsedTime for each row) by doing various calculations from the rows that it has joined together (i.e. from tblEmployment and tblEmploymentHistory). To get the elapsed time it performs calculations based on dictated business logic e.g. only specific datetime columns in the history table should count towards the total elapsedTime and it should only do so if other columns in that row is set to specific values etc.

Now that new requirements have come in, the business logic is much more complicated than before. I find it hard to extend the view to include this since it's getting very messy and I feel that this could be done much more structured in the application layer where the rest of business logic also resides!

Is it "correct" to scrap the view and instead move it to the application layer of my application? Obviously the benefit by having the view is that it's fast, and doing the calculations in the code for roughly 100.000 rows will take some time. However, it can be optimised by filtering away rows to make the number around 10.000.

What is the "standard" and the cleanest way of tackling this problem?

Force444
  • 3,321
  • 9
  • 39
  • 77
  • why a view and not a stored procedure? – Paolo Sep 09 '15 at 12:59
  • @Paolo I don't know. Would a stored proc be the way to go here? I haven't created the view myself, my task it just to extend it, but it's very hard to do so. If I use a stored proc, would it be better? – Force444 Sep 09 '15 at 13:02
  • using a stored procedure you can rely upon CTE and temp tables that imho will greatly simplify your task. you didn't share any code so i can't tell for sure but it is very likely so. – Paolo Sep 09 '15 at 13:24
  • @Paolo you can have a CTE in a view. – Sean Lange Sep 09 '15 at 13:36
  • @Paolo, If you just collect data a SP is the wrong tool to go with. Use a single-statement ("inline") UDF for your Business Logic. The biggest advantage: You can have your View with the "easy" columns, and add your complicated columns simply by adding a JOIN to your FROM clause. This way your VIEW and your buisiness logic are nicely separated. – Shnugo Sep 09 '15 at 13:41
  • @Shnugo without any technical detail all suggestions are just a matter of taste (read the IMHO in my first comment): i prefer stored procedures others may prefer functions. agreed that the proper choice must be made knowing the whole picture, with data and requirements at hand. – Paolo Sep 09 '15 at 13:49
  • @Paolo, no, I totally disagree... It's not a matter of taste. A sinlge-statement UDF is **real** ad-hoc SQL. The optimizer will integrate it as if this was written within the VIEW itself. A data collecting SP is closed for the optimizer. The result cannot be joined directly to another query, you must insert this into some kind of table first. Doing this you'll lose all advantages of indexes and statistics... SPs are for **doing** something and UDFs are for **reading** data. – Shnugo Sep 09 '15 at 13:58
  • @Shnugo So with your UDF approach, if I'm having problems extending the code which calculates data for one column in the view, then if I outsource this to a UDF, will it enable me to write out the code in a cleaner way? and allow me to extend the code? E.g. would you use cursors and other common code seen in SPs inside the UDF? – Force444 Sep 09 '15 at 14:04
  • 1
    @D.Singh, I'll fix up a small example in an answer. Just in short: No! No cursors, no loop whenever possible (anyway, they are not allowed in a single-statement UDF...) Btw: There are very rare situations where you should think of a multi-statement UDF. – Shnugo Sep 09 '15 at 14:07
  • @Shnugo you made lots of assumptions without technical information about the context of the question. IMHO udf and sp can be valid alternatives in the right context but in this particular case i can't tell which one si the best because of the missing information (please read my 2nd comment). – Paolo Sep 09 '15 at 14:20
  • @Paolo, it is a clear fact, that SPs are the wrong tool if you only want to read data. The OP stated, that all the work is done within a VIEW. So it is clear, that there's no need for the abilities of a SP. Just try to join a SP's result to a VIEW and you'll see that this is not the best approach... – Shnugo Sep 09 '15 at 14:24
  • @Shnugo it's clear to me that you prefer functions and want to sell your point: that's fine but without technical details there is no chance to tell for sure that udf are the proper choice for the OP to handle the **new and unknown business logic**. also please don't write (as you did) that sp are closed for the optimizer: it is [plainly wrong](http://dba.stackexchange.com/questions/6534/does-sql-server-optimize-or-pre-parse-stored-procedures). if sp are used as sp (and not to mimic udf) they get optimized as any other sql query. – Paolo Sep 09 '15 at 14:53
  • @Paolo, Sorry if I'd been to rough... Obviously you know what you're talking about... Yes, you are right, that a SP is not closed for the optimizer, but its result is closed. It is the same problem with multi-statement functions and - in a way - even with CTEs. The optimizer cannot predict their results ...The point is **joining data**. If you want to split the work in a VIEW and some extended (calculated) columns, the inline-function is the best approach. Sorry again for my wording and happy coding! – Shnugo Sep 09 '15 at 15:04

2 Answers2

3

The answer depends on a few things. First, make sure that the database is doing set-based work. If you start getting into cursors (generally speaking) or some sort of loop, you'd be better off placing that in the application. Relational databases aren't efficient working in that manner. Another thing I'd consider is what is standard for the environment you're working in? Are other things maintained in the DB like this where you are? If so you may want to stay consistent.

In the end whatever returns those results most efficiently, and without affecting other queries, ought to be the answer.

Jeffrey Van Laethem
  • 2,601
  • 1
  • 20
  • 30
  • For the environment that I'm working in, I have rarely seen business logic in views. From what I understand it should just be as a virtual table, so only showing columns that exists already but in different tables. At most a case statement. I don't know if this is the general way a view should be used? Have you seen "complex" columns in views before? Where the column is based on a several lines of computation? I noticed you said set-based work, so maybe I could reduce the view to do the set based work, then pull it into the application layer and apply the business logic based on that? – Force444 Sep 09 '15 at 13:53
  • 1
    I've seen all sorts... many of the views I work with regularly are a bit complex. I think your last sentence, doing the set-based work in the view and the remainder in the application, is a decent idea, as long as all involved parties understand what's going on. A benefit to that is you will have a more generic view that more bits of the application could potentially use. – Jeffrey Van Laethem Sep 09 '15 at 13:56
  • Good point! There is actually another view which does largely the same joins, but then has a few columns where some business logic takes place. So reducing that to one view, and then using that in the application to apply different business logic seems like a clean way to do it. – Force444 Sep 09 '15 at 13:59
1

As promised, I'll give you an example of the UDF-approach: In one of my project I use this with more than 40 different and hierarchically structured UDFs to get a result set with nearly 1000 columns back.

CREATE TABLE dbo.TestTable(Col1 INT,Col2 INT,Col3 INT);
INSERT INTO dbo.TestTable VALUES(1,2,3),(4,5,6),(7,8,9);
GO

CREATE FUNCTION dbo.TestFunc(@Prm1 INT,@Prm2 INT)
RETURNS TABLE
AS
RETURN
    SELECT @Prm1 AS Func_Prm1 --use names, which will be unique in any usage, this makes things much easier!
          ,@Prm2 AS Func_Prm2
          ,@Prm1 * @Prm2 AS Func_Calculated;
GO

--This would be your simple VIEW, consisting of any columns you can easily get
SELECT *
FROM dbo.TestTable
--This is how you join the "buiness logic" to your view
CROSS APPLY dbo.TestFunc(TestTable.Col1,TestTable.Col2) AS func

DROP TABLE dbo.TestTable;
GO
DROP FUNCTION dbo.TestFunc;      
GO
Shnugo
  • 66,100
  • 9
  • 53
  • 114
  • Thanks :). Gives me something to think about in terms of whether to use your approach, or just create a simple view and do some of the business logic in the application layer of the application instead. – Force444 Sep 09 '15 at 14:31
  • 1
    @D.Singh, Both should be fine... If you need the calculated columns else where without the application (e.g. reports, statistics...) the pure SQL solution should be better. If your application is a "single point of entry" I'd prefer the application. – Shnugo Sep 09 '15 at 14:33