0

I ran into a database where I was looking for a particular value to SUM... in this case it was distance in miles. I had a hell of a time using my "search all values in database" procedure to find how the frontend displayed value was stored in the backend database.

That's until I realized they stored miles as an integer and displayed it as a decimal.

For instance:

  • Distance in DB: 2368 (as an INT)

  • Distance in Frontend: 23.68

Why would someone do this? Is there some sort of performance advantage?

Edit: Also, this is a major globally used piece of software used for delivery routing... so I'm wondering if they know something I don't.

Another note... they do store other values as FLOAT with the same exact 2-point precision they are displaying distance as.

justiceorjustus
  • 2,017
  • 1
  • 19
  • 42
  • I think the more relevant question is not about `int` vs `float` but `int` vs `decimal`. And I can't imagine a GREAT reason why you would do this. Maybe to save a byte or two per record (Assuming here though). – JNevill Dec 15 '17 at 16:48
  • 1
    Most likely whoever wrote the code was confused about floats, database types, and/or using a language that didn't support decimals. In all databases except SQLite the `NUMERIC(s,p)` or equivalent type is used to specify how big a number can be and how many decimals it is allowed to have *without* loss of precision. If someone uses eg Javascript though, which has no decimal type, or uses string concatenation to pass parameters as floats instead of decimals, he/she may think that databases can lose precision – Panagiotis Kanavos Dec 15 '17 at 16:49
  • @PanagiotisKanavos This is an industry-specific software that was created by and is used by a $200+ billion dollar company globally... which is why I'm taken back a bit and figured there must be something I don't know. – justiceorjustus Dec 15 '17 at 16:55
  • Could you elaborate a little bit on the backend software? What language is it? What numeric representation format is it using? etc. – Adrien Brunelat Dec 15 '17 at 17:00
  • 1
    @justiceorjustus Unicenter TNG perhaps? Or something similar? Older software was made to work with *any* database, even those that didn't have datetime or decimal types. That's why you may see integers used instead of decimals or dates. There are worse cases, like [zoned decimals](https://www.ibm.com/support/knowledgecenter/en/ssw_ibm_i_72/rzasd/zodecfo.htm) - using a letter as the last digit to signify both precision and the sign of the number. That's still used by IATA files to report ticket sales between airlines. – Panagiotis Kanavos Dec 15 '17 at 17:00
  • @justiceorjustus in fact using a 32-bit "unix timestamps" to store dates is going to cause [a bit of a problem](https://en.wikipedia.org/wiki/Year_2038_problem) in a few years, as using 1970 as a base means there will be an overflow by 2038 – Panagiotis Kanavos Dec 15 '17 at 17:02
  • @PanagiotisKanavos The software is by Omnitracs and it is used by a major beverage distributor that I work for (including many more companies). Weird thing is... it uses `FLOAT` for other things that are displayed even on the same panel... such as delivery volumes. Displayed in the same 2-point precision, as well. – justiceorjustus Dec 15 '17 at 17:06
  • 1
    @justiceorjustus something tells me the software also runs on AS/400 ... – Panagiotis Kanavos Dec 15 '17 at 17:08
  • Size would be a reason. Buy disk and memory is cheap. – paparazzo Dec 15 '17 at 17:16
  • 2
    The real answer? Whoever wrote this probably never thought the miles would be used with decimals. Probably many years ago. Then someone working on the front end said "Hey we need more precision, let's start storing the value with decimals too." Now it's ingrained in every aspect of the software and it would be a huge undertaking to change so they just roll with it. Pretty much how every weird database or software spaghetti was created ever. – Jacob H Dec 15 '17 at 17:16
  • Maybe they're storing hundredths of miles. And use an integer routing algorithm. – 1010 Dec 16 '17 at 17:40

2 Answers2

3

FLOAT is terrible because it has precision and conversion issues, but I think you're more focused on why INT was used instead of DECIMAL or NUMERIC

There are a few possible reasons I can think of:

  1. INT is (negligibly) smaller:

    • DECIMAL and NUMERIC are 5-17 bytes depending on precision
    • INT is 4 bytes
  2. Some people believe INT is faster, but good luck proving that

  3. You mention this is used globally, so it may make more sense for the format of a decimal value to be handled by the front end anyway. For instance, 12.89 in the US is 12,89 in many European countries.

Aaron Dietz
  • 10,137
  • 1
  • 14
  • 26
  • I started writing about decimals having no format, but googling for `Omnitracs` and AS400 return a few hits. Data on AS440/ DB2 can be stored as flat files and queried either as a delimited file (often using COBOL) or SQL over DB2. Field sizes and separators *do* play a role there and quite often the number of decimals is part of the field definition, not the actual value. – Panagiotis Kanavos Dec 15 '17 at 17:11
  • I'm going to rule out 3 because they do store decimal values (same precision) as `float` for other things... still... not sure if 1 or 2 is enough to convince me that's why they did it. – justiceorjustus Dec 15 '17 at 17:12
  • @justiceorjustus I did encounter such unexpected data when pulling data from DB2 on AS400 as files a decade ago. The data may be stored this way to avoid rewriting code for different databases, even though most databases support precise decimals nowadays – Panagiotis Kanavos Dec 15 '17 at 17:15
  • @PanagiotisKanavos So are you saying you think they just updated old legacy code to make the current software? Because still, like I said, they are using `float` for hundreds of other columns. – justiceorjustus Dec 15 '17 at 17:16
  • @PanagiotisKanavos Thanks for the comments - one billing system I work with is in DB2 on an AS400 and often have it in mind (mostly against my will) – Aaron Dietz Dec 15 '17 at 19:21
  • I'm going to take this as the most logical explanation, I guess. The world may never know. – justiceorjustus Dec 17 '17 at 21:49
-1

With floats it is possible to lose some data when casting. You will end up with unexpected results. See: Converting SQL FLOAT to SQL INT, lost data

Jan Wytze
  • 3,307
  • 5
  • 31
  • 51
  • Which is why everyone uses `NUMERIC(s,p)` instead of FLOAT. Never integer with unspecified decimals. Quite likely whoever wrote the codd didn't understand the difference and/or used a language that didn't support decimals, like Javascript – Panagiotis Kanavos Dec 15 '17 at 16:45