1

I'm doing geography calculations, and ultimately end up with a latitude and longitude to store in a Geography::Point object. Both latitude and longitude can have 7 digits at most (which also gives precision up to 11 mm, which is plenty).

The problem is: if the value of a field cannot be stored correctly in a Double, MS SQL rounds towards the nearest number that can, but does so by adding a bunch of digits.
=> e.g. 5.9395772 is stored as 5.9395771999999996

The problem this creates, is that [Position].ToString() then exceeds the maximum amount of characters is allowed for that column (and no, I can't increase that limit).
Since we're dealing with Latitude, Longitude, Altitude and Accuracy, there's space for exactly 11 characters for Latitude and Longitude each:
String.Format(CultureInfo.InvariantCulture, "{0:##0.0######}", num)

I've tried simply Math.Round()ing to 6 digits, but then other numbers (e.g. 6.098163 to 6.0981629999999996) get the same problem.

How do I Math.Round towards the nearest 7-digit valid bit representation?

EDIT/ADD

    Public Function ToString_LatLon(ByVal num As Double) As String
        num = Math.Round(num, 7, MidpointRounding.AwayFromZero)
        Return String.Format(CultureInfo.InvariantCulture, "{0:##0.0######}", num) 
    End Function 'IN = 5.9395772, OUT = 5.9395772
  • The above code receives a Double and correctly returns the String representation. I've checked it, this is correct also for troubling numbers.

  • It's stored in SQL Server through the framework we use. I think the problem occurs when storing the value

  • When I retrieve the value, I get an error in VB, saying the value is wider than the framework allows (max of 50 characters).

  • If I run a query in SSMS, I find e.g. POINT (X.0981629999999996 XX.664725 NULL 15602.707) (51 characters, anonimized).

EDIT 2

I've done some more research and some calculations. It seems that the stored value 5.9395772 is converted to binary and returned as 5.9395771999999996, which is stored as a double inside the database (in a binary Geography::Point object, not to worry.) Convert the binary 0 10000000001 0111110000100010000010000110100010000100010011011101 back to decimal, and you get 5.93957719999999955717839839053340256214141845703125, but abbreviated at 16 decimals - whereas I would like it abbreviated at 7 decimals.

Solutions:

  1. Round the value down/up to the nearest value where everything from the 8th decimal onward is 0 (or enough zeroes before another nonzero digit is found)
  2. Query for only so many decimals.
  3. Query the actual (hexadecimal) value, and convert that (instead of the string representation)
  4. Keep the string representation, but round the values before storing and after retrieving to the required amount of decimals.

Discussions:

  1. Both in office and here (at @RobertBaron's answer): this is quite tricky, might have a huge decrease in precision, and is basically a lot of work.
  2. Perhaps this is possible, I don't know.
  3. This would be the cleanest solution, as my colleagues and I agree, however this is a lot of work in developing and testing.
  4. Instead of caring about the value in memory to be equal to the value in the database, we don't care about the value in the database (too much).

In the end, after quite some whiteboard bit-calculations and a lengthy discussion, we've gone with option 4. After we retrieve the [Position].ToString() (for which we've increased the string limit) from the database, we convert that as we're already doing, and as additional step before using it anywhere we round the value to the required amount of decimals. When returning the value to the database, we once again round the value to the amount of decimals, and don't care what the database really does with it.
Essentially, this is option 2, but then on the program-side instead of database-side.

Diamundo
  • 148
  • 11
  • 7
    Why allow it to use a float here at all? Decimal would be a much safer option. – JNevill Jul 01 '19 at 14:26
  • 6
    Possible duplicate of [Difference between numeric, float and decimal in SQL Server](https://stackoverflow.com/questions/1056323/difference-between-numeric-float-and-decimal-in-sql-server). Use `DECIMAL` if you need exact 7 decimal precision. – Tim Biegeleisen Jul 01 '19 at 14:27
  • 2
    It's pretty unclear where your troubles with precision come in. By the time you call `.ToString()`, SQL Server is not involved at all anymore, and the `String.Format` will definitely cut down the representation to the number of desired digits. What is the type of values you are reading, from where, using which code, and where does it go wrong? Your literal question ("how do I use `Round` to round to a 7 digit representation") has no answer, as you cannot round a `Double` (a *binary* floating-point type) in such a way that it can store an exact number of *decimal* digits. – Jeroen Mostert Jul 01 '19 at 14:30
  • What's the code for retrieving the value? It's true that if you call (e.g.) `SELECT geography::Point(5.9395772, 5.9395772, 4326).ToString()`, you get `POINT (5.9395771999999996 5.9395771999999996)`, but ideally code should not be reading these values through the string representation at all (but the `SqlGeography`/`Geometry` types). If it does choose to use a string, the restriction to 50 characters seems to be purely arbitrary. – Jeroen Mostert Jul 01 '19 at 14:44
  • @JeroenMostert, Personally I agree. But that's not something I can change... That's why I'm looking for a solution to store it in a semi-correct notation, to prevent errors like the one I'm dealing with now. – Diamundo Jul 01 '19 at 14:52
  • To be clear, you cannot change the fact that SQL Server uses floating-point to store these values internally (that's just how the `geography` type works), nor can you affect the way it chooses to round these values when it converts them to a string (i.e., not at all), so *something* has to give. It is possible to write formatting code on the T-SQL end (e.g. `SELECT CONCAT('POINT (', FORMAT(geography::Point(5.9395772, 4.9395772, 4326).Lat, '0.#######'), ' ', FORMAT(geography::Point(5.9395772, 4.9395772, 4326).Long, '0.#######'), ')')`), but this will not generalize to arbitrary objects. – Jeroen Mostert Jul 01 '19 at 14:56
  • If you need to get perfect round-tripping for a `String` representation, your only recourse would be to use a `VARCHAR`. Of course this will be far less efficient than storing it as `geography`, and you would need to convert the value every time you need it as that. – Jeroen Mostert Jul 01 '19 at 14:59
  • It is also, technically, theoretically possible to "jitter" the values if they're very close to the desired value but "too long as a `String`", so you get a value that's also "close" to your exact value and ends up OK when rounded (i.e. `5.93957721` does not suffer from the same problem, *not* because it has an exact representation but because the string representation happens to be pleasingly rounded). This comes closest to your original question. Writing code for that would be tricky, though (I can't think of an easy way that will work in all cases), and quite the ugly hack. – Jeroen Mostert Jul 01 '19 at 15:09
  • Incidentally, the core issue is this: when formatting the floating-point components of a `geography` instance, SQL Server will directly call into a .NET assembly which formats them using `Double.ToString` with the `R` (roundtrip) format specifier. The implementation of this has some quirks that causes it to sometimes use an excess of digits necessary to format a round-trippable value. This is fixed in .NET Core 3 (where `(5.9395772).ToString("R") == "5.9395772"`), but it will be a long time before SQL Server uses this framework (if ever). – Jeroen Mostert Jul 02 '19 at 11:16

2 Answers2

1

This is only a partial answer.

If by valid bit representation you mean exact bit representation, then this is possible. The decimal numbers that have exact bit representation are 1/2, 1/4, 3/4, 1/8, 3/8, 5/8, 7/8, 1/16, 3/16, ...

The challenge is to characterize among these powers of two, those whose base 10 representation has 7 digits or less, and then to round any base 10 number to the closest of these numbers.

I am posting this in the hope that it may get you one step further toward a solution.

RobertBaron
  • 2,817
  • 1
  • 12
  • 19
  • Yes, this! This is exactly what I meant, thanks for putting it in words! I've altered the question title to reflect this. Now we just need a solution ;) – Diamundo Jul 02 '19 at 07:18
  • @Diamundo: the smallest dyadic rational representable in 7 digits or less is 1/128 (0.0078125), which means these numbers are accurate only up to two decimal digits, or about 1.1 km of precision. Is that enough? I suspect not. – Jeroen Mostert Jul 02 '19 at 10:20
  • Yes, would be much less than 11 mm, that is 11 mm x 78125 = 0.859 375 km. – RobertBaron Jul 02 '19 at 10:41
  • I suppose that's correct. I've added an edit - we've discussed this in office too - and will mark this answer as the correct one. It might not have solved the problem, but it has been important in the process. Thank you all! – Diamundo Jul 02 '19 at 10:56
0

If you cannot change the data type into a DECIMAL for whatever reasons, you have to cast it into a DECIMAL every time you need the value. It's that simple. And you can either do it on the SQL Server side or in VB.NET, but you need a DECIMAL. DOUBLEs are imprecise.

By the way, it is not the SQL Server that rounds towards the nearest number it recognizes by adding a bunch of digits - it's the processor that does it. That's also why you may get slightly different DOUBLE values after restoring your database on another server.

And never ever even think of using them as an ID: I know an application that uses FLOAT values containing the timestamp (<creation day since whatever>.<time as fractals of the day>) as part of the primary key (of nearly every table!). Every 10000th record or so cannot be addressed directly by its ID because the value differs somewhat on the client that sends the query and the server by some nanoseconds although the number looks exactly the same in SSMS on the client and the server.

Christoph
  • 3,322
  • 2
  • 19
  • 28
  • If you find a database where, on restoring, you find "slightly different `FLOAT` values" in *columns*, you have almost certainly found a bug (there have been some) that `DBCC CHECKDB WITH DATA_PURITY` should find. The SQL Server team is *very* keen on maintaining data integrity, and while it may occur that the results of *calculations* differ (due to the difference in floating-point instructions used and the x86 extended precision), stored values will *not* have different values depending on what processor the server happens to have. And all this is irrelevant for `geography`, of course. – Jeroen Mostert Jul 02 '19 at 11:22