
Here is an MWE, in which I calculate the distribution of (in this case) 181 balls thrown into 997 different buckets chosen at random.

> hthord
function(tprs=100,lower=0,upper=5,limits=0.95,ords=997){
    p = dbinom(seq(lower,upper),size=tprs,prob=1/ords)
    ll = qnorm(0.5*(1+limits))
    pe = ords*p
    pl = pe - ll*sqrt(ords*p*(1-p))
    pu = pe + ll*sqrt(ords*p*(1-p))
    cbind(seq(lower,upper),pl,pe,pu,deparse.level=0)
}

> hthord(181)
     [,1]         [,2]         [,3]        [,4]
[1,]    0 808.37129927 8.314033e+02 854.4353567
[2,]    1 128.89727212 1.510884e+02 173.2794395
[3,]    2   6.46037329 1.365256e+01  20.8447512
[4,]    3  -0.95391946 8.178744e-01   2.5896682
[5,]    4  -0.33811535 3.654158e-02   0.4111985
[6,]    5  -0.06933517 1.298767e-03   0.0719327
> 

Can anyone explain why column [,3], only, is displayed in exponential notation?

It occurs to me that pl and pu are being coerced into a different class from pe, but the details escape me. Please help!

Brent.Longborough

2 Answers


You are running a function that returns a matrix. To display a matrix, print.default() gets called. It tries to find a good (succinct) way to represent the values in each column while taking into account R's global options.

If you type options() or ?options, you will see the global options include several display and print settings. One is digits, which controls the number of significant digits to print when printing numeric values. Another is scipen, short for "scientific (notation) penalty", which help(options) explains is:

scipen:  integer. A penalty to be applied when deciding to print numeric values 
         in fixed or exponential notation. Positive values bias towards fixed 
         and negative towards scientific notation: fixed notation will be 
         preferred unless it is more than scipen digits wider.

In your case, column 3 contains one value (about 0.0013) that is much smaller than the rest of the column, so showing it to 7 significant digits in fixed notation would make the column wider than scientific notation would. print.default() formats a vector or column consistently, so the whole column is switched to scientific notation.
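A minimal sketch of that width comparison, assuming R's default settings (digits = 7, scipen = 0). The vector x is an illustrative copy of column 3 rounded to 7 significant digits, not the original pe, and the variable names are made up for this example:

```r
# Illustrative copy of column 3, rounded to 7 significant digits
x <- c(831.4033, 151.0884, 13.65256, 0.8178744, 0.03654158, 0.001298767)

op <- options(digits = 7, scipen = 0)  # assume R's default settings

# Width of the column when forced into each notation
fixed_width <- max(nchar(format(x, scientific = FALSE)))
sci_width   <- max(nchar(format(x, scientific = TRUE)))

# Fixed notation is kept only if fixed_width <= sci_width + scipen
uses_sci <- any(grepl("e", format(x), fixed = TRUE))

options(op)  # restore the previous settings
```

Comparing fixed_width with sci_width shows which notation is narrower, and uses_sci reports what format() actually picked for the whole vector.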

As pedrosaurio stated, you can set scipen to a really high value and ensure that you never see scientific notation.

You can play around with the settings for hands-on learning:

> op <- options() # store current settings

> options("digits","scipen")
$digits
[1] 7

$scipen
[1] 0

> print(pi); print(1e5)
[1] 3.141593
[1] 1e+05
> print(c(pi, 1e5)) # uses consistent format for whole vector
[1] 3.141593e+00 1.000000e+05

> options(digits = 15)
> print(pi)
[1] 3.14159265358979
> options(digits = 5)
> print(pi)
[1] 3.1416

> print(pi/100000); print(1e5)
[1] 3.1416e-05
[1] 1e+05
> options(scipen=3) #set scientific notation penalty
> print(pi/100000); print(1e5)
[1] 0.000031416
[1] 100000

> options(op)     # reset (all) initial options
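
If you only want the penalty to apply to a single print call, one possible approach is to set scipen inside a small wrapper and restore it on exit. The helper name print_fixed is hypothetical, and 999 is just an arbitrarily large penalty:

```r
# Hypothetical helper: print x with a large scipen, then restore the old value
print_fixed <- function(x, penalty = 999) {
  op <- options(scipen = penalty)  # options() returns the previous settings
  on.exit(options(op))             # restore them even if print() fails
  print(x)
  invisible(x)
}

scipen_before <- getOption("scipen")
out <- capture.output(print_fixed(c(pi / 100000, 1e5)))
scipen_after <- getOption("scipen")
```

Because the option is restored by on.exit(), the global scipen is the same before and after the call.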

See also: stackoverflow.com/questions/9397664/

MattBagg

Change the scipen option so that every column is printed in the same format. This does not affect calculations; it only changes how the values are displayed.

options(scipen=9999)

Run this command and all columns should be displayed in the same notation.

Why only column 3? I don't know, and it shouldn't matter unless you are exporting the output to another program that doesn't recognize scientific notation.
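
For the export case, a sketch that avoids touching global options is to convert the values to fixed-notation strings with formatC(). The vector pe here is an illustrative copy of column 3, and 4 decimal places is an arbitrary choice:

```r
# Illustrative copy of column 3, rounded to 7 significant digits
pe <- c(831.4033, 151.0884, 13.65256, 0.8178744, 0.03654158, 0.001298767)

# format = "f" forces fixed notation; digits = number of decimal places
pe_fixed <- formatC(pe, format = "f", digits = 4)
```

The result is a character vector, so pe itself stays numeric for further calculations.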

pedrosaurio
  • print.default was called and is trying to find a succinct way to represent the values in column 3, which span a broader range than in the other cols. This is further explained here: http://stackoverflow.com/questions/9397664/force-r-not-to-use-exponential-notation-e-g-e10 – MattBagg Nov 15 '12 at 23:11
  • @mb3041023 : thanks. It's the small value of the last item in column 3 that does it. If you want to make that into an answer, I'll be happy to feed you some SE-Karma. – Brent.Longborough Nov 18 '12 at 14:18