
I'd like to count how many times each numeric value occurs in a given column. My data frame, mydata, looks like this:

mydata <- structure(list(VARIABLE = c(1, 1, 2, 3, 3, 3, 4, 4)),
  .Names = "VARIABLE", row.names = c(NA, -8L), class = "data.frame")

mydata
##     VARIABLE
## 1        1
## 2        1
## 3        2
## 4        3
## 5        3
## 6        3
## 7        4
## 8        4

I'd like to count the number of 1s, 2s, 3s and 4s in column VARIABLE (two 1s, one 2, three 3s, two 4s). Is there any way to do this without installing an additional package?

mnel
Pirate
  • Welcome to Stack Overflow! You will find that you get better answers if you take the time to make your question reproducible. Please follow the guidelines (http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example), paying special attention to the part about `dput()`. Thanks! – Ari B. Friedman Sep 11 '12 at 01:47
  • Okay, thank you! I'll learn how to ask questions in the proper format! – Pirate Sep 11 '12 at 01:52

2 Answers


Yes, use table, which is part of base R, so no additional package is needed:

mydata <- data.frame(VARIABLE = c(1, 1, 2, 3, 3, 3, 4, 4))
table(mydata$VARIABLE)

# 1 2 3 4 
# 2 1 3 2 
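
If you need a single count rather than the whole table, the result can be indexed by name. The following is a minimal sketch, assuming the same mydata as above; the variable names counts and n_threes are only illustrative:

```r
mydata <- data.frame(VARIABLE = c(1, 1, 2, 3, 3, 3, 4, 4))
counts <- table(mydata$VARIABLE)

# The distinct values are stored as character names, not as numbers
names(counts)

# Look up the count for one value by its name (note the quotes around 3)
n_threes <- counts[["3"]]
n_threes

# The counts always add up to the number of rows (no NAs in this column)
sum(counts) == nrow(mydata)
```

Indexing with the unquoted number 3 would select by position rather than by value, which only happens to agree here because the values are 1 to 4 with none missing.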

As suggested by Maiasaura, you can turn the output into a nice data.frame:

data.frame(table(mydata$VARIABLE))
#   Var1 Freq
# 1    1    2
# 2    2    1
# 3    3    3
# 4    4    2
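
Note that the Var1 column produced this way is a factor, not a number. If you want named columns and numeric values, something along these lines should work; the column name COUNT and the variable counts_df are assumptions chosen for illustration:

```r
mydata <- data.frame(VARIABLE = c(1, 1, 2, 3, 3, 3, 4, 4))

# Naming the argument to table() names the value column;
# responseName names the count column
counts_df <- as.data.frame(table(VARIABLE = mydata$VARIABLE),
                           responseName = "COUNT",
                           stringsAsFactors = FALSE)

# With stringsAsFactors = FALSE the values come back as character,
# so convert them to numeric explicitly
counts_df$VARIABLE <- as.numeric(counts_df$VARIABLE)
str(counts_df)
```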
flodel

As an alternative to ?table, you could use ?rle, which detects "runs" of consecutive repeated values in a vector. Here it gives the same counts as table, but only because equal values sit next to each other in the column. If the same value appeared in two separate runs, rle would report each run separately, whereas table would report the total.

mydata <- data.frame(VARIABLE = c(1, 1, 2, 3, 3, 3, 4, 4))
rle(mydata$VARIABLE)

Result:

Run Length Encoding
  lengths: int [1:4] 2 1 3 2
  values : num [1:4] 1 2 3 4

You can also extract the individual components of the rle result, like so:

rle(mydata$VARIABLE)$values
[1] 1 2 3 4

rle(mydata$VARIABLE)$lengths
[1] 2 1 3 2
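
To illustrate when rle and table disagree, here is a small sketch using a made-up vector x (not part of the question's data) in which the value 1 occurs in two separate runs. Sorting first is one way to make the run lengths match the totals:

```r
# Hypothetical vector: the value 1 appears in two separate runs
x <- c(1, 1, 2, 1)

# table() counts total occurrences regardless of position
totals <- table(x)
totals

# rle() counts consecutive runs, so 1 is reported twice
r <- rle(x)
r$values
r$lengths

# After sorting, equal values are adjacent and the run lengths
# agree with the totals from table()
sorted_runs <- rle(sort(x))
sorted_runs$values
sorted_runs$lengths
```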
thelatemail
  • You can create a `data.frame` from the results by `do.call(data.frame,rle(mydata$VARIABLE))` – mnel Sep 11 '12 at 02:21
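
As a rough sketch of the suggestion in the comment above, the rle result is a list with components lengths and values, so it can be passed to data.frame through do.call. The variable names runs and runs_df are only illustrative:

```r
mydata <- data.frame(VARIABLE = c(1, 1, 2, 3, 3, 3, 4, 4))

# rle() returns a list with two components: lengths and values
runs <- rle(mydata$VARIABLE)

# do.call() passes those components to data.frame() as named arguments,
# giving one row per run
runs_df <- do.call(data.frame, runs)
runs_df

# Equivalent, with the columns spelled out and reordered
data.frame(values = runs$values, lengths = runs$lengths)
```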