For example, a data frame "housing" has a column "street" with different street names as levels. I want to return a data frame with the number of houses on each street (level), i.e. the number of times each level occurs. Which functions do I use in R?

bverc

3 Answers

This should help:

library(dplyr)

housing %>% group_by(street) %>% summarise(Count=n())
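
As a possible shorthand, dplyr's count() combines the group_by() and summarise() steps and also returns a data frame. The sketch below uses a small made-up "housing" data frame (the street names are assumptions for illustration, not the asker's data):

```r
library(dplyr)

# Hypothetical toy data standing in for the asker's "housing" data frame
housing <- data.frame(street = c("Elm", "Oak", "Elm", "Pine", "Elm", "Oak"))

# count() groups by street and counts the rows in each group;
# name = sets the name of the count column,
# sort = TRUE orders the result by descending count
counts <- housing %>% count(street, name = "Count", sort = TRUE)
counts
```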
Duck

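summary, applied to a factor, reports the counts of at most 100 levels by default (its maxsum argument); any remaining levels are lumped together as "(Other)". To get the count of every level, use table: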

table(housing$street)

For example, let's generate one hundred one-letter street names and summarise them with table.

set.seed(1234)
housing <- data.frame(street = sample(letters, size = 100, replace = TRUE))
x <- table(housing$street)
x
# a b c d e f g h i j k l m n o p q r s t u v w x y z 
# 1 3 5 6 4 6 2 6 5 3 1 3 1 2 5 5 4 1 5 5 3 7 4 5 3 5 
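
Since the question asks for a data frame, one way to get there from table is as.data.frame. This is a minimal sketch on made-up data; the street names and the column name "Count" are assumptions for illustration:

```r
# Hypothetical toy data standing in for the asker's "housing" data frame
housing <- data.frame(street = c("Elm", "Oak", "Elm", "Pine", "Elm", "Oak"))

# Naming the argument to table() carries "street" through as the column name;
# responseName sets the name of the count column (the default is "Freq")
counts <- as.data.frame(table(street = housing$street), responseName = "Count")
counts
```

The result has one row per street, with the street in the first column and its count in the second.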

As per the OP's comment: to use the result in further analyses, assign it to a variable (here, x). The object has class table, and most base R functions treat it like a named vector. For example, to find the most frequent street name, use which.max.

which.max(x)
#  v 
# 22 

The result says that the maximum count is at position 22 of x, and that this element is named v.
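
To get just the street name rather than its position, or to rank all streets, something like the following should work. It is a sketch on a small made-up table, not the output above. Note that which.max returns only the first maximum when there are ties.

```r
# Hypothetical toy counts
x <- table(c("Elm", "Oak", "Elm", "Pine", "Elm", "Oak"))

# Name of the most frequent street (first one in case of ties)
top <- names(which.max(x))

# All streets that share the maximum count
ties <- names(x)[x == max(x)]

# Counts ordered from most to least frequent
ranked <- sort(x, decreasing = TRUE)
ranked
```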

nya

This can be done in multiple ways, for instance with base R using table():

table(housing$street)

It can also be done through dplyr, as illustrated by Duck.

Another option (my preference) is using data.table.

library(data.table)
setDT(housing)                # convert housing to a data.table by reference (no copy)
housing[, .N, by = street]    # .N is the number of rows in each group
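
If the original data frame should stay untouched, or the counts should be named and sorted, a variant along these lines may help. It is a sketch on made-up data; the street names and the column name "Count" are assumptions for illustration:

```r
library(data.table)

# Hypothetical toy data standing in for the asker's "housing" data frame
housing <- data.frame(street = c("Elm", "Oak", "Elm", "Pine", "Elm", "Oak"))

# as.data.table() makes a copy, so housing itself remains a plain data.frame
dt <- as.data.table(housing)

# Count rows per street, name the column Count, then sort by descending count
counts <- dt[, .(Count = .N), by = street][order(-Count)]
counts
```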
ljwharbers