17

I'm trying to calculate percentage of certain items in Shell Script. I would like to round off the value, that is, if the result is 59.5, I should expect 60 and not 59.

item=30
total=70
percent=$((100*$item/$total))

echo $percent

This gives 42.

But actually, the result is 42.8 and I would like to round it off to 43. "bc" does the trick, is there a way without using "bc"?

I'm not authorized to install any new packages. "dc" and "bc" are not present in my system. It should be purely Shell; I cannot use perl or python scripts either.

user2354302
  • You can use [dc](http://en.wikipedia.org/wiki/Dc_(computer_program)) instead. Bash only supports integer arithmetic (true for most shells) – Fredrik Pihl Jun 18 '14 at 11:35
  • Possible duplicate of [How do I use floating-point division in bash?](https://stackoverflow.com/questions/12722095/how-do-i-use-floating-point-division-in-bash) – kvantour Mar 21 '19 at 14:30

6 Answers

25

Use AWK (no bash-isms):

item=30
total=70
percent=$(awk "BEGIN { pc=100*${item}/${total}; i=int(pc); print (pc-i<0.5)?i:i+1 }")

echo $percent
43
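
As a minimal sketch of the variant suggested in the comments, the same calculation can pass the shell values through awk's -v option so that the awk script itself stays single-quoted. The rounding logic is unchanged from the answer above and, like it, assumes a non-negative percentage:

```shell
item=30
total=70

# -v hands the shell values to awk; the single-quoted script is never
# touched by shell expansion.
percent=$(awk -v item="$item" -v total="$total" 'BEGIN {
    pc = 100 * item / total          # floating-point percentage
    i  = int(pc)                     # integer part (truncated)
    print ((pc - i < 0.5) ? i : i + 1)
}')

echo "$percent"   # 43
```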
Michael Back
  • 1
    While it's good to have an `awk` alternative - even though it contradicts the "purely shell" premise of the question - it is ill-advised to use a _double-quoted_ string with _shell-variable expansion_ as the `awk` script, because it leads to confusion over what is expanded by the shell up front vs. what `awk` interprets later. The cleaner solution is to use a _single-quoted_ `awk` script to which (shell variable) values are passed with `awk`'s `-v` option. Also, given that floating-point arithmetic is used anyway, using `awk`'s `printf` function with format string `%.0f` is simpler. – mklement0 May 31 '16 at 02:55
  • @mklement0 - using -v is more acceptable and cleaner.... but in this case, unnecessary. True, I did use the loophole that he did not mention awk (only python and perl were listed as unacceptable)... I provided the shell examples later. ;) – Michael Back May 31 '16 at 04:51
  • There's nothing to be gained from _not_ using `-v` in this case - except promoting ill-advised practices, which includes implementing custom rounding after having performed floating-point arithmetic already. As for the loophole: it's perfectly fine to provide alternative solutions, as long as they're declared as such; even though the question doesn't explicitly rule out `awk`, it does ask for a "purely Shell" solution, so that part of my comment was simply meant to make it explicit that this solution doesn't qualify as such (while _potentially_ still having value) - no more and no less. – mklement0 Jun 02 '16 at 05:25
  • @mklement0 - Well... actually... to me there was a gain... I initially used -v, but that made the script extend enough to the right that I wanted to compress the line a bit... I still also maintain that using **awk** here is probably a good idea to look at... I presume the "user" thought it was cool too and accepted the answer -- thinking it better than the (much faster) "pure shell" answers I submitted a bit later... so, all is cool... :) Happy I could help "user." – Michael Back Jun 02 '16 at 17:18
  • Again, my intent was not to discount an `awk` answer, but to make it explicit - again, as guidance to future readers - that this answer is not for someone looking for a _pure shell_ solution (for whatever reason). Clearly, at the very least 5-6 people have found value in your answer, and that's great. – mklement0 Jun 02 '16 at 17:31
  • As for the space issue: it's a great idea to avoid horizontal scrolling - there are far too many answers here that try to cram solutions into a single, scrolling line. However, I suggest not letting space concerns guide what solution to offer (my comment was about the _fundamental approach_); POSIX-like shells support multi-line strings, allowing you to easily spread an `awk` script across multiple lines for readability. – mklement0 Jun 02 '16 at 17:34
10

Taking 2 * the original percent calculation and getting the modulo 2 of that provides the increment for rounding.

item=30
total=70
percent=$((200*$item/$total % 2 + 100*$item/$total))

echo $percent
43

(tested with bash, ash, dash and ksh)

This is a faster implementation than spawning an AWK process for each calculation:

$ pa() { for i in `seq 0 1000`; do pc=$(awk "BEGIN { pc=100*${item}/${total}; i=int(pc); print (pc-i<0.5)?i:i+1 }"); done; }
$ time pa

real    0m24.686s
user    0m0.376s
sys     0m22.828s

$ pb() { for i in `seq 0 1000`; do pc=$((200*$item/$total % 2 + 100*$item/$total)); done; }
$ time pb

real    0m0.035s
user    0m0.000s
sys     0m0.012s
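
To see why the modulo-2 term in this answer supplies the rounding increment, the expression can be split into its two parts. This is only an illustrative sketch: round_pct_mod2 is a hypothetical helper name, and the inputs are assumed to be plain integers with a non-zero total.

```shell
# round_pct_mod2 ITEM TOTAL  (hypothetical helper, for illustration only)
round_pct_mod2() {
    doubled=$(( 200 * $1 / $2 ))   # truncated value of twice the percentage
    trunc=$(( 100 * $1 / $2 ))     # truncated percentage
    # doubled is odd exactly when the dropped fraction was 0.5 or more
    echo $(( doubled % 2 + trunc ))
}

round_pct_mod2 30 70   # doubled=85, 85 % 2 = 1, trunc=42  -> 43
round_pct_mod2 1 3     # doubled=66, 66 % 2 = 0, trunc=33  -> 33
round_pct_mod2 2 3     # doubled=133, 133 % 2 = 1, trunc=66 -> 67
```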
Michael Back
  • Not sure why people are so obsessed with benchmarking the built-ins. If code is going to be slow, it will be because of the developer or the algorithm, not this. Right? – Mike Q May 20 '18 at 02:48
  • The original answer that I wrote was in **awk**... which is good because once we start calculating in floating point, **awk** starts to look like the best tool to get the job done. The point of this particular answer was to show that if this is the limit of the math that is needed in our script, rounding can also be calculated faster with integer math in shell. – Michael Back May 21 '18 at 17:03
  • It should be pointed out that "integer math" here means integer arithmetic with truncation toward `0`, which is defined in the ISO C standard. In no way does this code show that we can compute a percentage using ordinary integer math with no truncation, only modulo, etc. Of course we can, but that's not what this code does. It uses the truncation toward 0 that is done by the shell. – Dominic108 Dec 20 '19 at 22:58
  • @Dominic108 -- Good point... But I'm focusing these answers on the practical for the questioner (who was just frustrated with not being able to do floating point math)... with various levels of readability, verbosity and speed. **awk** is nice if the questioner wants to adapt by using floating point and a fast flexible language. If on the other hand the questioner wants to use shell exclusively and wants a really fast solution... he can likely get by using integer math operations. – Michael Back Dec 23 '19 at 12:47
  • Truncation toward 0 is not a high level concept. It is hidden behind the casting of floats to integers. It's not so natural, because otherwise it would be offered at a high level together with floor and ceil. It's because it's a bit weird from a high level perspective that it only became a standard for casting in 1999. The idea of using this to define a high level concept might seem like an unreliable hack to many. I would not say that these people are not practical. So, it's important to emphasize that it's a standard. – Dominic108 Dec 26 '19 at 18:37
7

A POSIX-compliant shell is only required to support integer arithmetic in the shell language ("only signed long integer arithmetic is required"), so a pure shell solution must emulate floating-point arithmetic[1]:

item=30
total=70

percent=$(( 100 * item / total + (1000 * item / total % 10 >= 5 ? 1 : 0) ))
  • 100 * item / total yields the truncated result of the integer division as a percentage.
  • 1000 * item / total % 10 >= 5 ? 1 : 0 calculates the 1st decimal place, and if it is equal to or greater than 5, adds 1 to the integer result in order to round it up.
  • Note how there's no need to prefix variable references with $ inside an arithmetic expansion $((...)).
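
As a rough sketch, the expression above can be wrapped in a function and tried on a few inputs, including an exact .5 case. round_pct_tenths is a hypothetical name, and the function assumes non-negative integer arguments with a non-zero total; note that it sets the global variables item and total.

```shell
# round_pct_tenths ITEM TOTAL  (hypothetical helper, for illustration only)
round_pct_tenths() {
    item=$1 total=$2
    # truncated percentage, plus 1 if the first decimal digit is 5 or more
    echo $(( 100 * item / total + (1000 * item / total % 10 >= 5 ? 1 : 0) ))
}

round_pct_tenths 30 70   # 42.857... -> 43
round_pct_tenths 1 8     # 12.5      -> 13
round_pct_tenths 1 3     # 33.333... -> 33
```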

If - in contradiction to the premise of the question - use of external utilities is acceptable:


  • awk offers a simple solution, which, however, comes with the caveat that it uses true double-precision binary floating point values and may therefore yield unexpected results in decimal representation (e.g., try printf '%.0f\n' 28.5, which yields 28 rather than the expected 29):
awk -v item=30 -v total=70 'BEGIN { printf "%.0f\n", 100 * item / total }'
  • Note how -v is used to define variables for the awk script, which allows for a clean separation between the single-quoted and therefore literal awk script and any values passed to it from the shell.

  • By contrast, even though bc is a POSIX utility (and can therefore be expected to be present on most Unix-like platforms) and performs arbitrary-precision arithmetic, it invariably truncates its results, so rounding must be performed by another utility. printf, however, even though it is a POSIX utility in principle, is not required to support floating-point format specifiers (such as used inside awk above). The following therefore may or may not work, and is not worth the trouble given the simpler awk solution and given that precision problems due to floating-point arithmetic are back in the picture:
# !! This MAY work on your platform, but is NOT POSIX-compliant:
# `-l` tells `bc` to set the precision to 20 decimal places, `printf '%.0f\n'`
# then performs the rounding to an integer.
item=20 total=70
printf '%.0f\n' "$(bc -l <<EOF
100 * $item / $total
EOF
)"

[1] However, POSIX allows non-integer support: "The shell may use a real-floating type instead of signed long as long as it does not affect the results in cases where there is no overflow." In practice, ksh and zsh support floating-point arithmetic if you request it, but bash and dash do not. If you want to be POSIX-compliant (run via /bin/sh), stick with integer arithmetic. Across all shells, integer division works as usual: the quotient is returned, that is, the result of the division with any fractional part truncated (removed).
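
The truncation described in the footnote is easy to check directly. The following is a small sketch, assuming a shell whose integer arithmetic follows the ISO C rules discussed in this thread (bash and dash behave this way):

```shell
# Division drops the fractional part, i.e. it truncates toward zero.
pos_quot=$((  7 / 2 ))   # 3, not 4
neg_quot=$(( -7 / 2 ))   # -3, not -4
# The remainder takes the sign of the dividend.
neg_rem=$(( -7 % 2 ))    # -1

echo "$pos_quot $neg_quot $neg_rem"
```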

mklement0
  • You keep saying what you know/believe shells do when they do integer arithmetic. I don't want to be rude, but my comment-question was: where is this documented? It feels to me that your position is that it is a practical fact and it is the way it is and it does not need to be documented. Sincerely, I would not build an application on that premise. I have provided documentation that points in the right direction in my answer. If you want to help along, please provide extra documentation. I know what you say: all shells truncate ... toward zero, but that does not replace documentation. – Dominic108 Dec 19 '19 at 21:13
  • @Dominic108: I personally think that the explanation in the answer is sufficient - it describes de facto behavior that should match everyone's expectation of how integer division works. We know that POSIX doesn't spell out the behavior, but that it is aligned with ISO C, whose versions since C99 (1999), from what I understand, mandate the truncate-the-fractional-part ("round toward zero") behavior. The behavior is easily verified, and is highly unlikely to change. If you additionally want to go looking for explicit documentation for each indiv. shell, feel free - I don't see the need. – mklement0 Dec 19 '19 at 21:21
  • There is a big difference between one guy, no matter how important he might be, saying "all (or almost all) shells work that way." and documentation written by a committee saying "almost all shells work that way ... and we can rely on the fact that all shells claiming compliance with this standard do so." – Dominic108 Dec 19 '19 at 21:38
  • The fact that POSIX refers to C99, etc., exactly as you wrote above: you did not say that before. It came after my question. I am reasonably happy with that, now. But, if anyone were to add extra documentation on this subject, it would be welcome. – Dominic108 Dec 19 '19 at 21:57
0

The following is based upon my second answer, but with expr and back-ticks -- which (while perhaps abhorrent to most) can be adapted to work in even the most archaic shells natively:

item=30
total=70
percent=`expr 200 \* $item / $total % 2 + 100 \* $item / $total`

echo $percent
43
Michael Back
  • Given that arithmetic expansion (`$((...))`) has been [part of the POSIX shell command language](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_04) since [at least 1997; SUS v2](http://pubs.opengroup.org/onlinepubs/7908799/xcu/chap2.html#tag_001_006_004), it's fair to assume that unless truly ancient shells must be supported, `expr` is not needed. – mklement0 May 31 '16 at 03:04
  • 1
    @mklement0 - I have old Solaris instances still being used at work... and they still use tcsh for everyone's login shell (because setup scripts were never translated to sh). Sue my employer (please). – Michael Back May 31 '16 at 04:34
  • I see, but your snippet doesn't actually run in `tcsh` (you'd have to modify the variable assignment statements), and the `shell` tag is generally understood to refer to _POSIX-like_ shells. – mklement0 May 31 '16 at 04:41
  • And while our _comments_ now cover the cases where you _do_ still need `expr` (pre-'97 Bourne-like shells and shells that don't support arithmetic, such as `tcsh`/`csh` (I'll take your word for it)), I encourage you to add this information directly to your _answer_. – mklement0 May 31 '16 at 04:43
  • 1
    @mklement0 - I meant "*can be adapted to work in all shells natively*" to meet your objection(s)... I will try to be more clear in the future. – Michael Back Jun 01 '16 at 20:58
  • Can I offer a shift in perspective? My comments weren't _objections_; they were meant to give _context_ and offer _clarifications_ to your answer in order to provide additional guidance to future readers. Assuming that you agree with the content, such guidance is better provided as part of the answer itself rather than being buried in a comments thread that far fewer people are likely to read. – mklement0 Jun 02 '16 at 05:18
  • @mklement0 - Are you happier with the rewording? – Michael Back Jun 02 '16 at 17:10
  • Yes, thanks for updating; _personally_, I'd put _all_ the (more detailed) findings we've worked out in these comments directly into the answer, but that's obviously your call. – mklement0 Jun 02 '16 at 17:40
0

With a natural restriction to positive percentages (which covers almost all applications), we have a much simpler solution:

echo $(( ($item*1000/$total+5)/10 ))

It uses the automatic truncation toward 0 that is done by the shell instead of an explicit modulo 2 as in the second answer of Michael Back, which I have upvoted.

BTW, it might be obvious to many, but it was not obvious to me at first that the truncation toward 0 done in evaluating this code is fixed in POSIX, which says that it must respect the C standard for integer arithmetic:

Arithmetic operators and control flow keywords shall be implemented as equivalent to those in the cited ISO C standard section, as listed in Selected ISO C Standard Operators and Control Flow Keywords. -- see https://pubs.opengroup.org/onlinepubs/9699919799/

For the C standard, see https://mc-stan.org/docs/2_21/functions-reference/int-arithmetic.html or section 6.3.1.4 in http://www.open-std.org/jtc1/sc22/wg14/www/C99RationaleV5.10.pdf. The first reference is for C++, but it, of course, defines the same integer arithmetic as the second reference which refers to the C99 ISO C standard.

Note that a restriction to positive percentages does not mean that we cannot have a percentage of, say, 80%, which is a decrease of 20%. A negative percentage corresponds to a decrease of more than 100%. Of course, it can happen, but not in typical applications.

In accordance with the C standard, to cover negative percentages, we must test whether the intermediate value $item*1000/$total is negative and, in that case, subtract 5 instead of adding 5, but we lose the simplicity.
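
The sign test described above might look like the following sketch. round_pct_signed is a hypothetical helper name; it assumes integer arguments, a non-zero total, and a shell that truncates division toward zero.

```shell
# round_pct_signed ITEM TOTAL  (hypothetical helper, for illustration only)
round_pct_signed() {
    tenths=$(( $1 * 1000 / $2 ))       # percentage in tenths, truncated toward 0
    if [ "$tenths" -lt 0 ]; then
        echo $(( (tenths - 5) / 10 ))  # negative: push away from 0 before truncating
    else
        echo $(( (tenths + 5) / 10 ))  # non-negative: the one-liner from this answer
    fi
}

round_pct_signed 30 70    # tenths=428,  (428 + 5) / 10  ->  43
round_pct_signed -30 70   # tenths=-428, (-428 - 5) / 10 -> -43
```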

Dominic108
0

Here is a pure bash answer without modulo 2 that also works with negative percentages:

echo "$(( 200*item/total - 100*item/total ))"

We could define a generic integer division that rounds to the nearest integer and then apply it.

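function rdiv {
   # Integer division rounded to nearest: trunc(2*a/b) - trunc(a/b).
   # The positional parameters are left unquoted inside $(( )) for portability.
   echo $(( 2 * $1 / $2 - $1 / $2 ))
}
rdiv "$((100 * item))" "$total"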

In general, to compute the nearest integer using float to int conversion, one can use, say in Java : n = (int) (2 * x) - (int) x;

Explanation (based on binary expansion):

Let fbf(x) be the first bit of the fractional part of x, keeping the sign of x. Let int(x) be the integer part, which is x truncated toward 0, again keeping the sign of x. Rounding to the nearest integer is

round(x) = fbf(x) + int(x).

For example, if x = 100*-30/70 = -42.857, then fbf(x) = -1 and int(x) = -42.

We can now understand the second answer of Michael Back, which is based on modulo 2, because:

fbf(x) = int(2*x) % 2 

We can understand the answer here, because:

fbf(x) = int(2*x) - 2*(int(x))

An easy way to look at this formula is to see a multiplication by 2 as shifting the binary representation to the left. When we first shift x, then truncate, we keep the fbf(x) bit that we lose if we first truncate, then shift.

The key point is that, in accordance with POSIX and the C standard, a shell does a "rounding" toward 0, but we want a rounding toward the closest integer. We just need to find a trick to do that and there is no need for modulo 2.
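
As a quick sanity check of the two formulas for fbf(x) above, the following sketch compares the modulo-2 form with the subtraction form over a range of positive and negative items. The range and the value of total are arbitrary choices for illustration, and truncation toward 0 is assumed.

```shell
total=70
item=-100
mismatches=0

while [ "$item" -le 100 ]; do
    # round(x) = int(2*x) % 2 + int(x)
    by_mod=$(( 200 * item / total % 2 + 100 * item / total ))
    # round(x) = int(2*x) - int(x)
    by_sub=$(( 200 * item / total - 100 * item / total ))
    [ "$by_mod" -eq "$by_sub" ] || mismatches=$(( mismatches + 1 ))
    item=$(( item + 1 ))
done

echo "mismatches: $mismatches"
```

Both forms should agree for every item in the range, since int(2*x) always equals 2*int(x) + fbf(x).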

Dominic108