How would I return the result of SQL math operations?

Question

So I was taking a test recently with some higher level SQL problems. I only have what I would consider "intermediate" experience in SQL and I've been working on this for a day or so now. I just can't figure it out.

Here's the problem:

You have a table with 4 columns as such:

EmployeeID     int unique
EmployeeType   int  
EmployeeSalary int  
Created        date

Goal: I need to retrieve the difference between the latest two EmployeeSalary for any EmployeeType with more than 1 entry. It has to be done in one statement (nested queries are fine).

Example Data Set: http://sqlfiddle.com/#!9/0dfc7

EmployeeID | EmployeeType | EmployeeSalary | Created
-----------|--------------|----------------|--------------------
1          | 53           | 50             | 2015-11-15 00:00:00
2          | 66           | 20             | 2014-11-11 04:20:23
3          | 66           | 30             | 2015-11-03 08:26:21
4          | 66           | 10             | 2013-11-02 11:32:47
5          | 78           | 70             | 2009-11-08 04:47:47
6          | 78           | 45             | 2006-11-01 04:42:55

So for this data set, the proper return would be:

EmployeeType | EmployeeSalary 
-------------|---------------
66           | 10
78           | 25

The 10 comes from subtracting the latest two EmployeeSalary values (30 - 20) for the EmployeeType of 66. The 25 comes from subtracting the latest two EmployeeSalary values (70-45) for EmployeeType of 78. We skip EmployeeID 53 completely because it only has one value.

This one has been destroying my brain. Any clues?

Thanks!

http://sqlfiddle.com/#!9/0dfc7 I think I did it properly. Thanks for the help! — SalsaGuy, Nov 23 '15 at 18:00
you did. Thank you. Now that is a great way to doc it. I am sure you will get an answer soon. If not, I will be back soon — Drew, Nov 23 '15 at 18:06

Lukasz Szozda · Accepted Answer · 2015-11-23T19:15:45.397

How to make really simple query complex?

One funny way(not best performance) to do it is:

SELECT final.EmployeeType, SUM(salary) AS difference
FROM (
  SELECT b.EmployeeType, b.EmployeeSalary AS salary
  FROM tab b
  JOIN (SELECT EmployeeType, GROUP_CONCAT(EmployeeSalary ORDER BY Created DESC) AS c
        FROM tab
        GROUP BY EmployeeType
        HAVING COUNT(*) > 1) AS sub
    ON b.EmployeeType = sub.EmployeeType
    AND FIND_IN_SET(b.EmployeeSalary, sub.c) = 1
  UNION ALL
  SELECT b.EmployeeType, -b.EmployeeSalary AS salary
  FROM tab b
  JOIN (SELECT EmployeeType, GROUP_CONCAT(EmployeeSalary ORDER BY Created DESC) AS c
        FROM tab
        GROUP BY EmployeeType
        HAVING COUNT(*) > 1) AS sub
    ON b.EmployeeType = sub.EmployeeType
    AND FIND_IN_SET(b.EmployeeSalary, sub.c) = 2
) AS final
GROUP BY final.EmployeeType;

SqlFiddleDemo

EDIT:

The keypoint is MySQL doesn't support windowed function so you need to use equivalent code:

For example solution in SQL Server:

SELECT EmployeeType, SUM(CASE rn WHEN 1 THEN EmployeeSalary 
                                 ELSE -EmployeeSalary END) AS difference
FROM (SELECT *,
       ROW_NUMBER() OVER(PARTITION BY EmployeeType ORDER BY Created DESC) AS rn
      FROM #tab
     ) AS sub
WHERE rn IN (1,2)
GROUP BY EmployeeType
HAVING COUNT(EmployeeType) > 1

LiveDemo

And MySQL equivalent:

SELECT EmployeeType, SUM(CASE rn WHEN 1 THEN EmployeeSalary 
                          ELSE -EmployeeSalary END) AS difference
FROM (
       SELECT t1.EmployeeType, t1.EmployeeSalary,
        count(t2.Created) + 1 as rn
      FROM #tab t1
      LEFT JOIN #tab t2
        ON t1.EmployeeType = t2.EmployeeType
       AND t1.Created < t2.Created
      GROUP BY t1.EmployeeType, t1.EmployeeSalary
     ) AS sub
WHERE rn IN (1,2)
GROUP BY EmployeeType
HAVING COUNT(EmployeeType) > 1;

LiveDemo2

Hey lad2025! I'm also getting timeouts on fiddle, but I tested the code locally and it worked! Thanks for all your help. I'm going to comb through it now to try and understand what I missed. You're the best! — SalsaGuy, Nov 23 '15 at 18:31
@SalsaGuy This code is for fun, you should not use it in production because there exists better ways (magnitude faster) :) — Lukasz Szozda, Nov 23 '15 at 18:32
I've been doing software dev for about a decade and didn't do much SQL until recently. I didn't know SQL could even be this difficult. When I started doing challenges I was amazed at how powerful it is. I'll use your code to learn what's goin on and try to optimize it. Thanks again for your help! — SalsaGuy, Nov 23 '15 at 18:45
You might want to look up how to emulate LEAD/LAG in mysql to optimize your query. [This](http://stackoverflow.com/questions/11303532/simulate-lag-function-in-mysql) is one example. — Julien Blanchard, Nov 23 '15 at 18:53
@JulienBlanchard Yes, this can be optimized in many ways, the quickest is to emulate by `ROW NUMBER`. I don't like MySQL because it lacks of windowed function. **[SQL Server Demo](https://data.stackexchange.com/stackoverflow/query/397237)** Working on emulation :) — Lukasz Szozda, Nov 23 '15 at 18:54

Strawberry · Answer 2 · 2015-11-23T19:04:14.313

The dataset of the fiddle is different from the example above, which is confusing (not to mention a little perverse). Anyway, there's lots of ways to skin this particular cat. Here's one (not the fastest, however):

SELECT a.employeetype, ABS(a.employeesalary-b.employeesalary) diff
  FROM 
     ( SELECT x.*
            , COUNT(*) rank 
         FROM employees x 
         JOIN employees y 
           ON y.employeetype = x.employeetype 
          AND y.created >= x.created 
        GROUP
           BY x.employeetype
            , x.created
     ) a
  JOIN
     ( SELECT x.*
            , COUNT(*) rank 
         FROM employees x 
         JOIN employees y 
           ON y.employeetype = x.employeetype 
          AND y.created >= x.created 
        GROUP
           BY x.employeetype
            , x.created
     ) b
    ON b.employeetype = a.employeetype
   AND b.rank = a.rank+1
 WHERE a.rank = 1;

a very similar but faster solution looks like this (although you sometimes need to assign different variables between tables a and b - for reasons I still don't fully understand)...

SELECT a.employeetype
     , ABS(a.employeesalary-b.employeesalary) diff
  FROM 
     ( SELECT x.* 
            , CASE WHEN @prev = x.employeetype THEN @i:=@i+1 ELSE @i:=1 END i
            , @prev := x.employeetype prev
         FROM employees x
            , (SELECT @prev := 0, @i:=1) vars
        ORDER 
           BY x.employeetype
            , x.created DESC
     ) a
  JOIN
     ( SELECT x.* 
            , CASE WHEN @prev = x.employeetype THEN @i:=@i+1 ELSE @i:=1 END i
            , @prev := x.employeetype prev
         FROM employees x
            , (SELECT @prev := 0, @i:=1) vars
        ORDER 
           BY x.employeetype
            , x.created DESC
     ) b
    ON b.employeetype = a.employeetype
   AND b.i = a.i + 1
 WHERE a.i = 1;

How would I return the result of SQL math operations?

2 Answers2