I have several programs written in R
that now I need to translate in T-SQL to deliver them to the client. I am new to T-SQL and I'm facing some difficulties in translating all my R
functions.
An example is the numerical derivative function, which for two input columns (values and time) would return another column (of same length) with the computed derivative.
My current understanding is:
I can't use SP, because I'll need to use this functions inline with
select
statement, like:SELECT Customer_ID, Date, Amount, derivative(Amount, Date) FROM Customer_Detail
I can't use UDF, because they can take, as input parameter, only scalar. I'll need vectorised function due to speed and also because for some functions I have, like the one above, running row by row wouldn't be meaningful (for each value it needs the next and the previous)
- UDA take whole column but, as the name says..., they will aggregate the column like
sum
oravg
would.
If the above is correct, which other techniques would allow me to create the type of function I need? An example of SQL
built-in function similar to what I'm after is square()
which (apparently) takes a column and returns itself^2. My goal is creating a library of functions which behave like square
, power
, etc. But internally it'll be different cause square
takes and returns each scalar is read through the rows. I would like to know if is possible to have User Defied with an accumulate method (like the UDA) able to operates on all the data at the end of the import and then return a column of the same length?
NB: At the moment I'm on SQL-Server 2005 but we'll switch soon to 2012 (or possibly 2014 in few months) so answers based on any 2005+ version of SQL-Server are fine.
EDIT: added the R
tag for R developers who have, hopefully, already faced such difficulties.
EDIT2: Added CLR
tag: I went through CLR
user defined aggregate as defined in the Pro t-sql 2005 programmers guide. I already said above that this type of function wouldn't fit my needs but it was worth looking into it. The 4 methods needed by a UDA are: Init
, Accumulate
, Merge
and Terminate
. My request would need the whole data being analysed all together by the same instance of the UDA
. So options including merge
methods to group together partial results from multicore processing won't be working.