0

I am currently doing something very similar as here: MySQL - Subtracting value from previous row, group by

My current query is:

SELECT a.x, a.y, a.z, COALESCE(a.z - b.z,0) AS diff
FROM [bla] AS a
LEFT JOIN EACH
[bla] AS b
ON b.x=a.x
AND b.y = (SELECT MAX(y) FROM [bla] WHERE x = a.x AND y < a.y)

However, I end up with the following error:

Error: An internal error occurred and the request could not be completed.

This error is not really helpful and I do not know what is wrong here. The problem seems to be the final ON statement with the SELECT subquery.

Community
  • 1
  • 1
fsociety
  • 1,791
  • 4
  • 22
  • 32

3 Answers3

3

Using data from above mentioned link - MySQL - Subtracting value from previous row, group by : Solution for BigQuery as simple as below statement

SELECT SN, Date, COALESCE(ROUND(Value - NextValue, 2), 0) as consumption 
FROM ( 
    SELECT *, LAG(Value, 1) OVER (PARTITION BY SN ORDER BY Date) as NextValue 
    FROM temp.EnergyLog)
ORDER BY SN, Date

Now, below is attempt to write it with your [bla] table:

SELECT x, y, z, COALESCE(ROUND(z - Nextz, 2), 0) as diff 
FROM ( 
    SELECT *, LAG(z, 1) OVER (PARTITION BY x ORDER BY y) as Nextz 
    FROM temp.bla)
ORDER BY x, y

I think above has good chance to work - but you might need to do some extra adjustments

Community
  • 1
  • 1
Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
1

Another solution is based on recently introduced JS UDF.
It can look more heavy than one I propsed already above, but I like it too as it gives great/fine control on analytics logic.

I doubt this will be your practical choice, but conceptually this can be useful

So, for example from MySQL - Subtracting value from previous row, group by the solution would be

SELECT SN, Date, ROUND(consumption,2) as consumption FROM 
js( // input table
(SELECT SN, NEST(STRING(Date) + ',' + STRING(Value)) as Metric
FROM temp.EnergyLog GROUP BY SN) ,
// input columns
SN, Metric,
// output schema
"[{name: 'SN', type: 'integer'},
{name: 'Date', type: 'string'},
{name: 'consumption', type: 'float'}]",
// function
"function(r, emit){
  pair = r.Metric.sort(function (a,b) {return a > b;});
  val = pair[0].split(','); Date = val[0];
  emit({SN: r.SN, Date: Date, consumption: 0});
  for (var i=0; i<pair.length -1; i +=1){
    val = pair[i].split(','); Date = val[0]; Value1 = val[1]; 
    val = pair[i+1].split(','); Value2 = val[1];
    emit({SN: r.SN, Date: Date, consumption: Value2 - Value1});
  }
}" 
) ORDER BY SN, Date  

You can check UDF documentation here: https://cloud.google.com/bigquery/user-defined-functions

The output will be exactly the same as for previosly suggested solution using LAG

Hope you will be able "translate" above code to your case with [bla] table

Community
  • 1
  • 1
Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
0

I don't know the specific reason for your internal error, but note that join conditions in BigQuery must be conjunctions of equalities (e.g., a.x = b.x AND a.y = b.y). You cannot put constants, inequalities, or subqueries in your join conditions.

Also, I would discourage the use of self-joins in BigQuery, since they usually lead to performance problems. It looks like you're trying to find something like the largest y for any given x? If so, you might be able to use an analytic function instead (e.g., MAX(y) OVER(PARTITION BY x))?

Jeremy Condit
  • 6,766
  • 1
  • 28
  • 30
  • No, I like to take the difference of a current value with the value of the next row ordered by time. So if you imagine having 7 rows for each day, I would like to subtract the value from Tuesday from Monday, then the one from Wednesday from Tuesday and so on. – fsociety Sep 03 '15 at 23:14
  • Then LAG and LEAD are your friends. :-) Looks like Mikhail's answer got it! – Jeremy Condit Sep 04 '15 at 20:48