
I have a dataframe which shows a number of shops that have had a health and safety test. Within this dataframe I have the name of the shop and a factor that shows the outcome of the test on a certain day.

head(facttab)
    new_table.dba_name new_table.results
1            QUICK SUB   Out of Business
2             BAR BARI              Pass
3   FOOD FIRST CHICAGO              Pass
4   TRATTORIA ISABELLA              Pass
5    DELI-TIME, L.L.C.              Pass
6 GREAT AMERICAN BAGEL              Fail

facttab <- data.frame(new_table$dba_name, new_table$results)
head(table(facttab))

new_table.dba_name                Fail No Entry Not Ready Out of Business Pass Pass w/ Conditions
  1 2 3 EXPRESS                      1        0         0               0    0                  0
  1155 CAFETERIA                     0        0         0               0    1                  0
  16TH ST FOOD MART                  0        0         0               1    0                  0
  194  RIB  JOYNT                    0        1         0               0    0                  0
  24HR MINI MART & CELLAR FOR YOU    1        0         0               0    0                  0
  7-ELEVEN                           0        0         0               0    4                  2

I would like to build another table or dataframe that shows, for each shop, the percentage of its tests that ended in each outcome, so I can see which shops have the largest % of fails and the largest % of passes.

The resulting table would be similar to the one above. For example, 7-ELEVEN would be: Fail - 0%, No Entry - 0%, Not Ready - 0%, Out of Business - 0%, Pass - 67% and Pass w/ Conditions - 33%.

Grabdegood
  • Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610). This will make it much easier for others to help you. – Jaap May 07 '16 at 09:17
  • Thanks, I've looked at that and made some changes – Grabdegood May 07 '16 at 14:29
  • `prop.table(table(facttab), 1)` – bouncyball May 07 '16 at 15:33
  • thanks that works great. Do you know if there is a similar function for a dataframe? – Grabdegood May 08 '16 at 09:06

1 Answer


I thought I would whip up an answer. This is how to convert the output of prop.table() into a data.frame. There's probably a quicker way of doing this. Note that I'm using a dataset I created myself. It would probably be helpful to look at ?reshape.

set.seed(123)
#create some dummy data
df <- data.frame(store = sample(c('a','b','c'), 100, replace = TRUE),
                 status = sample(c('foo','bar','haz'), 100, replace = TRUE))
#convert to prop.table
(prop.t <- prop.table(table(df$store, df$status), 1))

          bar       foo       haz
  a 0.4242424 0.2121212 0.3636364
  b 0.4117647 0.4117647 0.1764706
  c 0.3636364 0.3030303 0.3333333

#coerce to data.frame
(prop.t.df <- data.frame(prop.t))

  Var1 Var2      Freq
1    a  bar 0.4242424
2    b  bar 0.4117647
3    c  bar 0.3636364
4    a  foo 0.2121212
5    b  foo 0.4117647
6    c  foo 0.3030303
7    a  haz 0.3636364
8    b  haz 0.1764706
9    c  haz 0.3333333

#use reshape()
(reshape(prop.t.df, direction = 'wide', idvar = 'Var1', v.names = 'Freq', timevar = 'Var2'))

  Var1  Freq.bar  Freq.foo  Freq.haz
1    a 0.4242424 0.2121212 0.3636364
2    b 0.4117647 0.4117647 0.1764706
3    c 0.3636364 0.3030303 0.3333333

Obviously, you'd probably want to play around with the names a bit, but this is one way of getting at what you want.
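
For what it's worth, the renaming could look something like the sketch below. It uses a tiny hand-made data frame (the values are made up so the result is easy to check by eye) and the name `wide` is just something I picked for illustration.

```r
# Small hand-made data (hypothetical values, not the data from above)
df <- data.frame(store  = c("a", "a", "a", "b", "b", "c"),
                 status = c("foo", "foo", "bar", "bar", "haz", "foo"))

# Row-wise proportions -> long data.frame -> wide data.frame, as above
prop.t.df <- data.frame(prop.table(table(df$store, df$status), 1))
wide <- reshape(prop.t.df, direction = "wide", idvar = "Var1",
                v.names = "Freq", timevar = "Var2")

# Strip the "Freq." prefix and give the id column a meaningful name
names(wide) <- sub("^Freq\\.", "", names(wide))
names(wide)[names(wide) == "Var1"] <- "store"
wide
```

After this the columns are simply `store`, `bar`, `foo` and `haz`.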

PS Another way of getting at it is:

prop.t.df2 = as.data.frame.matrix(prop.t)

Note: the store names end up as the row names of prop.t.df2, so you'd probably want to copy them into a new column called Store:

prop.t.df2$Store = row.names(prop.t.df2)
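
Put together for data shaped like the question's, a minimal sketch might look like this. The `facttab` below is a made-up stand-in (the column names `dba_name` and `results` and all the counts are assumptions), and `pct` and `ranked` are names I chose for illustration.

```r
# Hypothetical stand-in for the question's facttab: one row per inspection
facttab <- data.frame(
  dba_name = c("7-ELEVEN", "7-ELEVEN", "7-ELEVEN",
               "BAR BARI", "QUICK SUB", "QUICK SUB"),
  results  = c("Pass", "Pass", "Pass w/ Conditions",
               "Pass", "Fail", "Pass")
)

# Row-wise shares as percentages: one row per shop, one column per outcome
pct <- as.data.frame.matrix(100 * prop.table(table(facttab), 1))

# The shop names are only row names so far; copy them into a column
pct$Store <- row.names(pct)

# Shops with the largest share of fails first
ranked <- pct[order(pct$Fail, decreasing = TRUE), ]
ranked
```

Sorting on `pct$Pass` instead would give the shops with the largest share of passes. Column names containing spaces need backticks, e.g. ``pct$`Pass w/ Conditions` ``.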
bouncyball