
I'm getting this error but only when grouping by specific columns:

Arithmetic overflow error converting expression to data type int.

And I can't wrap my head around why. This is the query causing it (the SUM function is the culprit):

SELECT  a.AtgardAvvattningId, 
        a.ObjektId,
        sum(p.SlutLopLangd - p.StartLopLangd) As TotalLangd
    FROM AtgardAvvattning a 
        INNER JOIN Objekt o ON o.ObjektId = a.ObjektId 
        INNER JOIN Position p ON p.AvvattningAtgardId = a.AtgardAvvattningId 
        INNER JOIN Vna v ON v.PositionId = p.PositionId 
    WHERE v.OID IN (...) 
    GROUP BY a.AtgardAvvattningId, a.ObjektId, o.AtgardsDatum
    ORDER BY a.ObjektId

p.SlutLopLangd and p.StartLopLangd are both int columns. If I convert the values to bigint before summation, it works:

    sum(CONVERT(bigint, p.SlutLopLangd - p.StartLopLangd)) As TotalLangd

Giving this result:

AtgardAvvattningId ObjektId TotalLangd
DC9... 9B2... 25684
ECD... 9B2... 25700
3D0... 9B2... 170005
959... 9B2... 170005
BEC... 214... 11814
C31... 214... 11815

As you can see, no sum is even near the int limit. The weird thing is that if I include PositionId in the GROUP BY clause like this, it doesn't raise an error:

SELECT  a.AtgardAvvattningId, 
        a.ObjektId,
        sum(p.SlutLopLangd - p.StartLopLangd) As TotalLangd
    FROM AtgardAvvattning a 
        INNER JOIN Objekt o ON o.ObjektId = a.ObjektId 
        INNER JOIN Position p ON p.AvvattningAtgardId = a.AtgardAvvattningId 
        INNER JOIN Vna v ON v.PositionId = p.PositionId 
    WHERE v.OID IN (...) 
    GROUP BY a.AtgardAvvattningId, a.ObjektId, o.AtgardsDatum, p.PositionId
    ORDER BY a.ObjektId

In this case it's a 1-to-1 relationship between AtgardAvvattning and Position. This query gives the exact same result as above.

Why is it raising an arithmetic overflow in the first place when the values are so small? And why does the second query work? What's different? I know it's probably hard to answer without the data and table structures, but any hint would be helpful.

Update:

When I remove the GROUP BY completely, with this query:

    SELECT  a.AtgardAvvattningId, 
            a.ObjektId,
            p.PositionId,
            v.VnaId,
            p.StartLopLangd,
            p.SlutLopLangd,
            p.SlutLopLangd - p.StartLopLangd as Subtraction
        FROM AtgardAvvattning a 
            INNER JOIN Objekt o ON o.ObjektId = a.ObjektId 
            INNER JOIN Position p ON p.AvvattningAtgardId = a.AtgardAvvattningId 
            INNER JOIN Vna v WITH (NOLOCK) ON v.PositionId = p.PositionId 
        WHERE v.OID IN (...) 
        ORDER BY a.ObjektId

The result is not many rows at all:

AtgardAvvattningId ObjektId PositionId VnaId StartLopLangd SlutLopLangd Subtraction
DC96... 9B2... 473... 1345183 168501 174922 6421
ECD4... 9B2... 07E... 1252649 74602 81027 6425
ECD4... 9B2... 07E... 1252651 74602 81027 6425
ECD4... 9B2... 07E... 1252652 74602 81027 6425
ECD4... 9B2... 07E... 1252650 74602 81027 6425
DC96... 9B2... 473... 1345180 168501 174922 6421
DC96... 9B2... 473... 1345181 168501 174922 6421
DC96... 9B2... 473... 1345182 168501 174922 6421
3D08... 9BC... F18... 1374284 199000 233001 34001
3D08... 9BC... F18... 1374283 199000 233001 34001
9590... 9BC... A2D... 1374285 16591 50592 34001
9590... 9BC... A2D... 1374286 16591 50592 34001
9590... 9BC... A2D... 1374287 16591 50592 34001
9590... 9BC... A2D... 1374289 16591 50592 34001
9590... 9BC... A2D... 1374288 16591 50592 34001
3D08... 9BC... F18... 1374281 199000 233001 34001
3D08... 9BC... F18... 1374280 199000 233001 34001
3D08... 9BC... F18... 1374282 199000 233001 34001
C31B... 214... B20... 1349999 32756 44571 11815
BEC3... 214... F21... 1349998 205022 216836 11814

However you sum these rows, it should be hard to reach the int overflow limit: even adding up every Subtraction value above gives 4 × 6421 + 4 × 6425 + 10 × 34001 + 11815 + 11814 = 415,023, nowhere near 2,147,483,647.

Johan
  • A `DISTINCT` with a `GROUP BY` is always a sign of a flaw in your query. A `GROUP BY` already causes your data to be returned in *distinct* sets, so if you are getting duplicates, it likely means your `GROUP BY` is wrong. Otherwise the `DISTINCT` is redundant and unneeded overhead. – Thom A Jan 20 '22 at 14:52
  • @Larnu I have now removed the DISTINCT from the query, still the same results and not really relevant to the question. – Johan Jan 20 '22 at 14:56
  • It *is* relevant to your SQL though, @Johan . `DISTINCT` and `GROUP BY` should rarely (read never) mix. – Thom A Jan 20 '22 at 15:00
  • Double-check that the columns in the expression are `Int`. Does the query also work if your `CONVERT` uses `Int` instead of `BigInt`? That the result returns a number well below the limits of the int data type could be hiding the issue. You could have one row whose expression return value is out of range, but when summed together they are not. Try running a `MAX(CONVERT(bigint, p.SlutLopLangd - p.StartLopLangd))` – FlexYourData Jan 20 '22 at 15:01
  • @FlexYourData If I convert to int I still get the same error. Running a MAX returns even lower numbers than the sum (all subtractions return positive numbers). – Johan Jan 20 '22 at 15:10
  • Long and short of it is that SQL Server does not take into account possible errors happening when building a query plan. So a calculation could for example be calculated over values that are later filtered out anyway. Therefore you must code defensively and ensure that errors cannot happen even on data that would logically be filtered out. For example, you should cast values to `bigint` *before* summing them, you should null out `0` *before* using it as a divisor, and you should use `TRY_CONVERT` instead of `CONVERT`. – Charlieface Jan 21 '22 at 00:51
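
A minimal sketch of the defensive patterns Charlieface describes, against a hypothetical dbo.SomeTable (the table and columns are made up purely for illustration):

    -- Hypothetical table and columns, only to illustrate the defensive
    -- patterns from the comment above; not the tables from the question.
    SELECT
        -- widen to bigint BEFORE summing, so the accumulator is bigint:
        SUM(CONVERT(bigint, t.Amount)) AS Total,
        -- NULL out a zero divisor BEFORE dividing (NULL propagates instead of erroring):
        SUM(t.Amount / NULLIF(t.Quantity, 0)) AS PerUnit,
        -- TRY_CONVERT returns NULL instead of raising a conversion error:
        SUM(TRY_CONVERT(bigint, t.AmountText)) AS ParsedTotal
    FROM dbo.SomeTable t;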

1 Answer


The final value doesn't actually matter. What is likely happening is that at some point during your SUM the running total goes over the maximum value (2,147,483,647) or under the minimum value (-2,147,483,648) for an int, and you get the error.

Take this example:

SELECT SUM(V.I)
FROM (VALUES(2147483646),
            (2),
            (-2006543543))V(I);

This will likely generate the same error:

Arithmetic overflow error converting expression to data type int.

The result of the SUM, however, would be 140,940,105 (well below the maximum). This is because if 2147483646 and 2 are summed first, you get 2147483648, which is larger than the maximum value of an int. If you CAST/CONVERT the values to bigint first, the SUM accumulates in bigint and you don't get the error:

SELECT SUM(CONVERT(bigint,V.I))
FROM (VALUES(2147483646),
            (2),
            (-2006543543))V(I);
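
Note that the conversion has to happen on the values going into the SUM. Wrapping the SUM itself would likely still fail, because the aggregate still accumulates (and can overflow) in int before the outer conversion is ever applied:

-- Likely still raises the same error as the first example: the SUM runs
-- in int, so it can overflow before CONVERT ever sees its result.
SELECT CONVERT(bigint, SUM(V.I))
FROM (VALUES(2147483646),
            (2),
            (-2006543543))V(I);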
Thom A
  • Yea, that's what I suspected. But what I don't understand is why it calculates differently when adding PositionId to the GROUP BY when there's a 1-to-1 relationship. It should be the exact same numbers to sum. – Johan Jan 20 '22 at 15:07
  • But not in the same groups, @Johan. *If*, for example, the value `2` in the above was part of a different group, then you wouldn't get the error, despite the same numbers being `SUM`ed; they're just in different "boxes" (see the sketch after these comments). – Thom A Jan 20 '22 at 15:09
  • @Johan any change in the query text can lead to a different plan (and potentially different estimates for which and how many rows will flow into and out of any operator, which operators are used, where implicit conversions may happen, which calculations are performed before or after any filtering, etc.). [Similar question yesterday, about string lengths vs. numeric range](https://stackoverflow.com/a/70770249/61305). – Aaron Bertrand Jan 20 '22 at 15:12
  • Yea, this has to be the explanation. But I still don't understand how it manages to reach the overflow limit with so few rows and such low numbers (I updated the question with the results without the GROUP BY). However the rows flow, I don't see how it's possible, but clearly it is somehow. – Johan Jan 20 '22 at 15:31
  • Your sample data doesn't replicate the error, @Johan . [db<>fiddle](https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=21eadf53eec4a8965041d4d055ff0321) – Thom A Jan 20 '22 at 15:37
  • @Larnu The source tables are a bit more complicated and have a lot more data in them; I just posted the output of the filtered data, which I thought was what was summed in the end. But Charlieface's explanation in a comment above seems reasonable: that the calculations are done before all the filtering is. When I look at the execution plan, that seems to be the case ("compute scalar" is done in the middle, before two more index seeks are done). – Johan Jan 21 '22 at 07:26
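
A minimal sketch of the "different boxes" point from the comments above, reusing the values from the answer's example (the group numbers are made up):

-- Same three values, but split across two groups. Group 1 accumulates
-- 2147483646 + (-2006543543) = 140940103 and group 2 just holds 2, so
-- neither running total ever exceeds the int maximum.
SELECT V.Grp, SUM(V.I) AS Total
FROM (VALUES(1, 2147483646),
            (2, 2),
            (1, -2006543543))V(Grp, I)
GROUP BY V.Grp;

As with the answer's example, whether the error actually appears still depends on the accumulation order the plan happens to choose.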