
I'm building this API, and the database will store values that represent one of the following:

  • percentage
  • average
  • rate

I honestly have no idea how to represent, in numbers, something whose range is between 0 and 100%. Should it be

  • 0.00 - 1.00
  • 0.00 - 100.00
  • any other alternative that I don't know

Is there a clear choice for that? A standard way of representing, in databases, something that goes from 0 to 100%? Going further, what's the correct data type for it, float or decimal?

Thank you.

GMB
Amanda Ferrari
    Does this answer your question? [Storing statistical data, do I need DECIMAL, FLOAT or DOUBLE?](https://stackoverflow.com/questions/19601975/storing-statistical-data-do-i-need-decimal-float-or-double) – Marty Jan 22 '20 at 00:14
    Numbers may be stored in many ways. There is nothing inherently wrong with either storing a percentage using 0-100 or using 0-1. What matters is what you need to do with the numbers, what accuracy you need, and so on. You must explain more context before a good answer can be given. Do you need to store numbers that are exactly representable with a small number of decimal digits? If you average things, you get fractions like thirds or sevenths. Do you need to store those exactly? Or just approximately? How approximately? What will you do with them? – Eric Postpischil Jan 22 '20 at 00:36
    If the values are 0.00 to 100.00 in steps of 0.01 that is 10001 different values. Simply use an `int` to represent _hundredths_ or in units of _Permyriad_ or ‱. – chux - Reinstate Monica Jan 22 '20 at 04:17
  • @chux-ReinstateMonica - Yes, "scaled integers" are possible, but clumsy. – Rick James Jan 31 '20 at 19:15
  • @RickJames Perhaps. I have not found scaled integers difficult. – chux - Reinstate Monica Jan 31 '20 at 21:34
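The scaled-integer idea from the comments above can be sketched in Python. The helper names here are hypothetical; the point is only that storing a percentage as an integer count of hundredths round-trips exactly, with no floating-point involved:

```python
# Store a percentage as an integer number of hundredths,
# so 12.34% is kept as the exact integer 1234 (range 0..10000).

def to_hundredths(percent_str: str) -> int:
    """Parse a decimal string like '12.34' into integer hundredths."""
    whole, _, frac = percent_str.partition(".")
    frac = (frac + "00")[:2]  # pad/truncate to exactly two digits
    return int(whole) * 100 + int(frac)

def to_display(hundredths: int) -> str:
    """Format integer hundredths back into a '12.34' string."""
    return f"{hundredths // 100}.{hundredths % 100:02d}"

print(to_hundredths("12.34"))  # 1234
print(to_display(1234))        # 12.34
```

In MySQL terms this would be a `SMALLINT UNSIGNED` (0 to 10000) rather than `TINYINT`, since 10001 distinct values don't fit in one byte.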

7 Answers


I'll take the opposite stance.

FLOAT is for approximate numbers, such as percentages, averages, etc. You should do formatting as you display the values, either in app code or using the FORMAT() function of MySQL.

Don't ever test `float_value = 1.3`; there are many reasons why that will fail.
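The same pitfall is easy to reproduce outside MySQL. A quick Python sketch of why exact equality on floats fails, and what to do instead:

```python
import math

# 0.1, 0.2 and 0.3 have no exact binary representation,
# so the sum drifts slightly away from the literal 0.3.
x = 0.1 + 0.2
print(x == 0.3)              # False
print(math.isclose(x, 0.3))  # True: compare with a tolerance instead
```

In SQL the equivalent of `math.isclose` is comparing with an explicit delta, e.g. `ABS(float_value - 1.3) < 0.0001`.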

DECIMAL should be used for monetary values. DECIMAL avoids a second rounding when a value needs to be rounded to dollars/cents/euros/etc. Accountants don't like fractions of cents.

MySQL's implementation of DECIMAL allows 65 significant digits; FLOAT gives about 7 and DOUBLE about 16. 7 is usually more than enough for sensors and scientific computations.

As for "percentage" -- Sometimes I have used TINYINT UNSIGNED when I want to consume only 1 byte of storage and don't need much precision; sometimes I have used FLOAT (4 bytes). There is no datatype tuned specifically for percentage. (Note also, that DECIMAL(2,0) cannot hold the value 100, so technically you would need DECIMAL(3,0).)

Or sometimes I have used a FLOAT that held a value between 0 and 1. But then I would need to make sure to multiply by 100 before displaying the "percentage".
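As a sketch of that display step, Python's percent format specifier does the multiply-by-100 for you (the stored value here is just an illustrative fraction):

```python
stored = 0.875  # fraction kept in the database as a FLOAT between 0 and 1

# The '%' format type multiplies by 100 and appends the percent sign.
print(f"{stored:.1%}")   # 87.5%
print(f"{stored:.0%}")   # 88% (rounded for display only)
```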

More

All three of "percentage, average, rate" smell like floats, so that would be my first choice.

One criterion for deciding on datatype... How many copies of the value will exist?

If you have a billion-row table with a column for a percentage, consider that TINYINT would take 1 byte (1GB total), but FLOAT would take 4 bytes (4GB total). OTOH, most applications do not have that many rows, so this may not be relevant.

As a 'general' rule, "exact" values should use some form of INT or DECIMAL. Inexact things (scientific calculations, square roots, division, etc) should use FLOAT (or DOUBLE).

Furthermore, the formatting of the output should usually be left to the application front end. That is, even though an "average" may compute to "14.6666666...", the display should show something like "14.7"; this is friendlier to humans. Meanwhile, you have the underlying value to later decide that "15" or "14.667" is preferable output formatting.
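For example, rounding only at display time, sketched in Python (the same idea applies in any front end, or with MySQL's `FORMAT()`):

```python
average = 44 / 3  # 14.666666... computed and kept as a float

# Round only when displaying; the precise value stays available.
print(f"{average:.1f}")  # 14.7
print(f"{average:.3f}")  # 14.667
print(round(average))    # 15
```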

The range "0.00 - 100.00" could be done either with FLOAT and use output formatting or with DECIMAL(5,2) (3 bytes) with the pre-determination that you will always want the indicated precision.

Rick James

I would generally recommend against using float. Floating-point numbers represent values in base-2, and some exact decimal numbers simply cannot be stored accurately in base-2, so they are rounded when stored or operated on. This may lead to surprising behaviors.

Consider the following example:

create table t (num float);
insert into t values(1.3);

select * from t;

| num |
| --: |
| 1.3 |

select * from t where num = 1.3;

| num |
| --: |

The comparison against 1.3 fails because the stored base-2 value is not exactly 1.3. This is tricky.

In comparison, decimal stores numbers with a fixed number of decimal digits exactly, within its declared precision and scale. If you change float to decimal(2, 1) in the above example, you do get the expected results.
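The same contrast can be seen with Python's `decimal` module, which stores digits in base-10 much like MySQL's DECIMAL does (this is a sketch of the principle, not MySQL itself):

```python
from decimal import Decimal

# In binary floating point, 1.3 is only approximated, so scaling it drifts:
print(1.3 * 3 == 3.9)                        # False
# A base-10 representation holds 1.3 exactly, so this arithmetic is exact:
print(Decimal("1.3") * 3 == Decimal("3.9"))  # True
```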

GMB
  • Please elaborate on the downvote so I possibly can improve this answer... – GMB Jan 22 '20 at 00:30
    This answer is false in several regards. “In comparison, decimal have a smaller range but provide an exact representation of finite numbers within that range” is false: Decimal does not represent ⅓ exactly. “some (exact, finite) numbers are round-up” is incorrect; numbers are not “round-up”. Conversions and other operations may round. The default rounding mode is most commonly round-to-nearest-ties-to-even, not round-up. – Eric Postpischil Jan 22 '20 at 00:31
    Issues with accuracy are not due to “Floating point numbers” but simply due to numerical representations: **All** finite numerical representations have limited accuracy: Floating-point, fixed-point, integer, rational, decimal, binary, everything. – Eric Postpischil Jan 22 '20 at 00:32
  • @EricPostpischil: OK. I rephrased some parts of my answer according to your comments, most of which I agree with. I would still stand to the point that we generally want to avoid `float` in relational databases if we want accuracy and precision in numeric operations, as the example demonstrates for numeric comparison. – GMB Jan 22 '20 at 00:46
• @GMB I have not down-voted your answer, but I believe it rests on the strange assumption that everyone works with float values incorrectly. Well, this is partially true, as many people do this (comparison must always use a delta, printing must always use formatting), however this should not be a stopper when choosing the correct data type in a DBMS. So I believe the answer is misleading. – fifonik Jan 22 '20 at 00:51
  • I just up-voted your answer as I think whomever down-voted it needs to see the men in the white coats. (I got down-voted yesterday for actually providing the correct and accepted answer, just not for going into detail) – Rodney Ellis Jan 22 '20 at 01:05
    Sigh. What have you fixed? My comment says the answer is wrong because it says decimal provides an exact representation of numbers within its range, but in fact it does not because it does not provide an exact representation of ⅓. The change says “accurate” instead of “exact”, but then why is not binary floating-point just as good—neither is exact for ⅓, and both or neither are accurate, depending on what your threshold for accurate is and how much precision they have. The question indicates averages will be represented, and averaging three things gives you numbers like ⅓. – Eric Postpischil Jan 22 '20 at 01:27
    The comment says round-to-nearest-ties-to-even is most commonly used, but the answer still says round-up. The answer says comparisons may round-up, but comparison are perfect: Comparisons always return a mathematically correct result, with no rounding. (Some programming languages may convert operands before comparing, but those are separate operations.) – Eric Postpischil Jan 22 '20 at 01:28
    1/3 cannot be _exactly_ represented in either binary or decimal. A discount of 20% off of $14.99 will require rounding; fractional cents do not exist. – Rick James Jan 23 '20 at 06:47

I recommend using decimal(5,2) if you're going to store the value in the same form you'll display it, since decimal preserves exact precision. (See https://dev.mysql.com/doc/refman/8.0/en/fixed-point-types.html)

Because floating-point values are approximate and not stored as exact values, attempts to treat them as exact in comparisons may lead to problems. They are also subject to platform or implementation dependencies.

(https://dev.mysql.com/doc/refman/8.0/en/floating-point-types.html)

A floating-point value as written in an SQL statement may not be the same as the value represented internally.

For DECIMAL columns, MySQL performs operations with a precision of 65 decimal digits, which should solve most common inaccuracy problems.
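As a rough analogue of those 65 digits, Python's `decimal` module lets you set the working precision to 65 significant digits and see how far that carries a repeating fraction (this only illustrates the precision, not MySQL's exact semantics):

```python
from decimal import Decimal, localcontext

with localcontext() as ctx:
    ctx.prec = 65  # mimic MySQL's 65 significant decimal digits
    third = Decimal(1) / Decimal(3)

print(third)  # "0." followed by 65 threes; precise, but still not exactly 1/3
```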

https://dev.mysql.com/doc/refman/8.0/en/problems-with-float.html

jhoanna

Decimal: for financial applications it is better to use decimal types, because they give you a high level of accuracy and make it easy to avoid rounding errors.

Double: double is probably the most commonly used data type for real values, except when handling money.

Float: it is used mostly in graphics libraries because of their very high demands for processing power, and also in situations that can tolerate rounding errors.

Reference: http://net-informations.com/q/faq/float.html

Prabodha

The difference between float and decimal is precision. Decimal can exactly represent any number within the precision and scale of its declared format, whereas float cannot exactly represent all numbers.

Use decimal for, e.g., financial values and float for, e.g., graphics-related values.

mysql> create table numbers (a decimal(10,2), b float);
mysql> insert into numbers values (100, 100);
mysql> select @a := (a/3), @b := (b/3), @a + @a + @a, @b + @b + @b from numbers \G

*************************** 1. row ***************************

@a := (a/3): 33.333333333
@b := (b/3): 33.333333333333
@a + @a + @a: 99.999999999000000000000000000000
@b + @b + @b: 100

The decimal did exactly what it's supposed to do in this case: it truncated the rest, thus losing part of the 1/3.

So for sums decimal is better, but for divisions float is better, up to a point, of course. That is, using DECIMAL will not give you "fail-proof arithmetic" by any means.
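The 1/3 loss is easy to confirm with Python's `decimal` module as well; at any finite precision, the truncated thirds never add back up to the original:

```python
from decimal import Decimal

third = Decimal(100) / 3  # 33.33333333333333333333333333 (28 digits by default)
print(third * 3)                  # 99.99999999999999999999999999, not 100
print(third * 3 == Decimal(100))  # False
```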

I hope this will help.

vishpatel73

In T-SQL: float stores 0.0 as 0, and it doesn't require you to define the digits after the decimal point, e.g. you don't need to write float(4,2). Decimal stores 0.0 as 0.0, and it can be defined like decimal(4,2). I would suggest 0.00 - 1.00; by doing this you can calculate with the percentage value without multiplying by 100, and if you report on it you can set that column's format to percent, so that MS Excel and other platforms display 0.5 as 50%.

Arsalan Khan