For example, a data frame "housing" has a column "street" with the different street names as levels. I want to return a data frame with the number of houses on each street (level), i.e. how many times each level is repeated. Which functions do I use in R?
3 Answers
0
This should help:
library(dplyr)
housing %>% group_by(street) %>% summarise(Count=n())

Duck
- This saved me, thanks! In addition, how do I ignore NA values in housing while applying these functions? – bverc Jul 07 '20 at 15:32
- @bverc You could add another step to the pipe, such as `filter(!is.na(yourvariable))`. Let me know if I can help more! – Duck Jul 07 '20 at 15:36
- Thank you, it's exactly what I needed. – bverc Jul 07 '20 at 16:05
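Following up on the comments, here is a minimal sketch of the pipe with the NA filter added. The toy `housing` data and the street names are made up for illustration, since the real data frame is not shown in the question.

```r
library(dplyr)

# Hypothetical toy data; the real `housing` data frame is not shown in the question.
housing <- data.frame(street = c("Elm", "Oak", NA, "Elm", "Oak", "Elm", NA))

counts <- housing %>%
  filter(!is.na(street)) %>%  # drop rows where the street name is missing
  group_by(street) %>%        # one group per street
  summarise(Count = n())      # number of rows (houses) in each group

counts
```

If the NA rows should be counted as their own group instead, leave out the `filter()` step.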
0
For a factor column, `summary` shows frequencies for at most 100 levels by default. If there are more, try:
table(housing$street)
For example, let's generate one hundred one-letter street names and summarise them with `table`.
set.seed(1234)
housing <- data.frame(street = sample(letters, size = 100, replace = TRUE))
x <- table(housing$street)
x
# a b c d e f g h i j k l m n o p q r s t u v w x y z
# 1 3 5 6 4 6 2 6 5 3 1 3 1 2 5 5 4 1 5 5 3 7 4 5 3 5
As per the OP's comment: to use the result in further analyses, it needs to be stored in a variable, here `x`. The class of the variable is `table`, and most base R functions treat it like a named vector. For example, to find the most frequent street name, use `which.max`.
which.max(x)
# v
# 22
The result says that the 22nd position in `x` has the maximum value and that it is called `v`.
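Since the question asks for a data frame rather than a table, the result can also be converted. This is a sketch in base R only; the column names `street` and `count` are chosen here for illustration and are not part of the original example.

```r
set.seed(1234)
housing <- data.frame(street = sample(letters, size = 100, replace = TRUE))
x <- table(housing$street)

# Convert the table to a data frame; a one-dimensional table yields a
# first column called "Var1", which is renamed here.
counts <- as.data.frame(x, responseName = "count", stringsAsFactors = FALSE)
names(counts)[1] <- "street"

# Sort streets from most to least frequent.
counts <- counts[order(counts$count, decreasing = TRUE), ]
head(counts, 3)

# Name of the most frequent street (the first one if there are ties).
names(which.max(x))
```

Note that `table` drops NA values by default; passing `useNA = "ifany"` would count them as well.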

nya
- Also, I'm not able to perform any analysis using this method, e.g. to find the highest value. – bverc Jul 07 '20 at 15:44
- Simply assign the table result to a variable and use it as a vector. I added an example to the answer. – nya Jul 08 '20 at 07:09
0
This can be done in multiple ways, for instance with base R using `table()`:
table(housing$street)
It can also be done through dplyr, as illustrated by Duck.
Another option (my preference) is using data.table.
library(data.table)
setDT(housing)
housing[, .N, by = street]
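As a rough sketch of how this could be extended, the same call can skip missing street names and sort the counts. The toy data below is hypothetical and only stands in for the real `housing` table.

```r
library(data.table)

# Hypothetical toy data standing in for the real `housing` table.
housing <- data.table(street = c("Elm", "Oak", NA, "Elm", "Oak", "Elm"))

# Count rows per street, skipping missing names, most frequent first.
counts <- housing[!is.na(street), .N, by = street][order(-N)]
counts
```

The first row of `counts` is then the most frequent street, and the result is already a data.table (which is also a data frame).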

ljwharbers