0

I have a data frame that I read from a file: sassign. I created a frequency table using prop.table(). Here's what I used:

prop.table(table(sassign$state))

Output is:

    AE      CT      DC      DE      MA      MD      ME      NH      NJ      NY      PA      RI      VA      VI      VT 
0.00010 0.05024 0.00678 0.01422 0.08504 0.08344 0.00686 0.01330 0.22136 0.33060 0.17436 0.00804 0.00054 0.00090 0.00422 

This output is very clunky. Is there any way I can organize this as a column, include the number of occurrences, and then sort it?

I also tried CrossTable(), but it's even clunkier.

CrossTable(sassign$state)

  Cell Contents
|-------------------------|
|                       N |
|         N / Table Total |
|-------------------------|


Total Observations in Table:  50000 


          |        AE |        CT |        DC |        DE |        MA | 
          |-----------|-----------|-----------|-----------|-----------|
          |         5 |      2512 |       339 |       711 |      4252 | 
          |     0.000 |     0.050 |     0.007 |     0.014 |     0.085 | 
          |-----------|-----------|-----------|-----------|-----------|


          |        MD |        ME |        NH |        NJ |        NY | 
          |-----------|-----------|-----------|-----------|-----------|
          |      4172 |       343 |       665 |     11068 |     16530 | 
          |     0.083 |     0.007 |     0.013 |     0.221 |     0.331 | 
          |-----------|-----------|-----------|-----------|-----------|


          |        PA |        RI |        VA |        VI |        VT | 
          |-----------|-----------|-----------|-----------|-----------|
          |      8718 |       402 |        27 |        45 |       211 | 
          |     0.174 |     0.008 |     0.001 |     0.001 |     0.004 | 
          |-----------|-----------|-----------|-----------|-----------|

I'm a beginner and have been working with R for the past 4 days. I've spent about 4 hours on this problem, so I'd appreciate any help. Thanks in advance.

watchtower

2 Answers

5

When posting, it is best to provide an example that actually works. You provided the output, but not the data that was used to produce it, so your output can't be recreated. I will create a portion of the data for this example (first 4 states):

state <- c(rep("AE", 5), rep("CT", 2512), rep("DC", 339), rep("DE", 711))

Now, you should save the result of prop.table() in an object so you can continue working with it.

tab <- prop.table(table(state))

Then you might want to create a data.frame with the state as one column and the proportion as another, like this:

df <- data.frame(state=names(tab), proportion=as.numeric(tab))

The content of df is now:

  state  proportion
1    AE 0.001401738
2    CT 0.704233249
3    DC 0.095037847
4    DE 0.199327166

If you also want to sort the rows by the proportion, then you can do

df <- df[order(df$proportion),]
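
The question also asked for the number of occurrences next to each proportion. A minimal sketch of one way to get both, using the same sample vector as above; the column names count and proportion are only illustrative, and the descending sort order is an assumption about what is wanted:

```r
# Sample data from this answer (first 4 states only)
state <- c(rep("AE", 5), rep("CT", 2512), rep("DC", 339), rep("DE", 711))

# table() gives the number of occurrences per state;
# as.data.frame() turns it into columns named "state" and "Freq"
df <- as.data.frame(table(state), stringsAsFactors = FALSE)
names(df) <- c("state", "count")

# Proportion of each state relative to all observations
df$proportion <- df$count / sum(df$count)

# Sort by count, largest first (assumed preference; drop decreasing = TRUE for ascending)
df <- df[order(df$count, decreasing = TRUE), ]

print(df, row.names = FALSE)
```

Keeping the raw counts in the data frame and deriving the proportion from them means both columns stay in sync if rows are filtered later.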
mrbrich
  • Thanks mrbrich. I will include the data in future posts. My data has 50K rows, so do you think it's okay to post a snippet? – watchtower Jul 29 '16 at 14:54
  • Yes, I didn't mean that you need to post the whole data, just a small example. – mrbrich Jul 29 '16 at 15:05
  • 2
    @watchtower [see here](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610) for some guidance for future posts – Jaap Jul 29 '16 at 15:13
  • 2
    you can simplify it a bit `as.data.frame(tab)` – user20650 Jul 29 '16 at 15:52
0

You can do that inside the table call: table(data$x, dnn = "Frecuencia")[order(table(data$x))]

The dnn argument sets the name shown for the table's dimension.

So using prop.table it looks like this:

prop.table(table(data$x, dnn = "Frecuencia")[order(table(data$x))])
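
A runnable sketch of the same idea, split into steps. The data frame data and its column x are hypothetical stand-ins built from the sample counts in the other answer, and sort() is used here in place of the explicit order() indexing:

```r
# Hypothetical data frame standing in for `data` in this answer
data <- data.frame(
  x = c(rep("AE", 5), rep("CT", 2512), rep("DC", 339), rep("DE", 711))
)

# Frequency table; dnn labels the table's dimension
counts <- table(data$x, dnn = "Frecuencia")

# Sort ascending by count (equivalent to counts[order(counts)])
sorted_counts <- sort(counts)

# Proportions, in the same sorted order
sorted_props <- prop.table(sorted_counts)

print(sorted_counts)
print(round(sorted_props, 3))
```

Computing the table once and reusing it avoids calling table() twice on a large column.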
TarangP
Frish Vanmol