I have the following data frame (this is an example and the dataframe can contain more columns)

SelectVar 
     a    b    c    l    p    v   aa   ff
1 Dxa2 Dxa2 Dxa2 Dxa2 Dxa2 Dxb2 Dxb2 Dxb2
2 Dxb2 Dxb2 Dxb2 Dxd2 Dxi2 Dxc2 Dxd2 Dxi2
3 Dxc2 Dxd2 Dxi2 Dxi2 tneg Dxd2 Dxi2 tneg

I would like to count the frequency of the elements without converting the data frame into a vector and using table, and without naming each element explicitly, as in

length(SelectVar[SelectVar=="Dxa2"])

Is there any other way to count the frequencies of the elements in the sample data frame, apart from the two approaches mentioned above?

FXQuantTrader
Barnaby
  • What's wrong with `table`? – Blue Magister Feb 05 '14 at 16:13
  • What is your ultimate goal? If you can explain what you're going to do with the frequency counts, we may be able to suggest a different approach. – Carl Witthoft Feb 05 '14 at 16:18
  • I repeatedly produce data frames like the one above, all with three rows, where the column names can differ and there can be more or fewer columns. I would rather not have to manipulate the data every time, because I want to link the three most frequent elements to another expression. To create a vector for use with table I have to manipulate the data every time, since both the elements and the column names may differ. – Barnaby Feb 05 '14 at 16:20
  • Isn't constructing a simple function helpful for what you describe in your comment? Something like `f = function(mydf) sort(table(unlist(mydf)))`. And, also, maybe you could, somehow, have all your dataframes in a list and manipulate them with `lapply`? – alexis_laz Feb 05 '14 at 16:30
  • Out of curiosity, why are using `data.frame` for this? If the columns all contain the same data type, `matrix` seems more appropriate and perhaps easier to manipulate. – dg99 Feb 05 '14 at 16:39
  • I think you are right a matrix is more appropriate – Barnaby Feb 05 '14 at 17:02

1 Answer

I think you asked the same question yesterday: "Counting the frequency of an element in a data frame".

This answer is modified from dickoa's answer to the previous question. If you convert the data frame to a matrix, table() does not need the data turned into a vector first, and it should work.

df <- read.table(text = "   b    c    e    f    g    h    j 
1 Dxa2 Dxa2 Dxa2 Dxa2 Dxa2 Dxa2 Dxa2
2 Dxb2 Dxb2 Dxb2 Dxb2 Dxc2 Dxc2 Dxc2
3 Dxd2 Dxi2 tneg tpos Dxd2 Dxi2 tneg", header = TRUE, row.names = 1)

# Count every cell of the matrix; the result has columns Var1 and Freq
ll <- data.frame(table(as.matrix(df)))

Now you can sort by Freq and select the top 3:

head(ll[order(ll$Freq, decreasing = TRUE), ], 3)
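
Since the data frames are regenerated with varying column names and widths, the two steps above could be wrapped in a small function, along the lines suggested in the comments. This is only a sketch: `top_values` is a hypothetical name (not part of base R or of the previous answer), and ties between equally frequent values are not handled specially.

```r
# Hypothetical helper: the n most frequent values across all cells of a
# data frame or matrix, whatever its column names or number of columns.
top_values <- function(x, n = 3) {
  counts <- table(as.matrix(x))             # count each distinct cell value
  head(sort(counts, decreasing = TRUE), n)  # keep the n largest counts
}

df <- read.table(text = "   b    c    e    f    g    h    j
1 Dxa2 Dxa2 Dxa2 Dxa2 Dxa2 Dxa2 Dxa2
2 Dxb2 Dxb2 Dxb2 Dxb2 Dxc2 Dxc2 Dxc2
3 Dxd2 Dxi2 tneg tpos Dxd2 Dxi2 tneg", header = TRUE, row.names = 1)

top <- top_values(df)
names(top)  # "Dxa2" "Dxb2" "Dxc2"

# The same helper applied to several data frames kept in a list
results <- lapply(list(first = df, second = df[, 1:4]), top_values)
```

`names(top)` gives the three values as character strings, which can then be passed on to whatever expression needs them.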
Ananta