R get all categories in column

Question

I have a large dataset (a data frame) and I want to find the number and the names of the categories in one column.

For example, suppose my df looks like this:

 A    B
 1    car
 2    car
 3    bus
 4    car
 5    plane
 6    plane
 7    plane
 8    plane
 9    plane
 10   train

I want to find:

  car
  bus
  plane
  train
  4

How would I do that?

What do you mean by `number and names`? What number? For instance, where does the 4 come from? If you mean frequencies, you may want to use something like `table(df$B)`. – coffeinjunky, Sep 02 '17 at 20:25

score 25 · Accepted Answer · answered Sep 02 '17 at 20:28


categories <- unique(yourDataFrame$yourColumn) 
numberOfCategories <- length(categories)

Pretty painless.
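As a quick check against the example in the question, here is a minimal sketch. The data frame `df` is rebuilt by hand from the question's table, and `stringsAsFactors = FALSE` is an assumption about how the column was read in:

```r
# Rebuild the example data from the question (column B as character).
df <- data.frame(
  A = 1:10,
  B = c("car", "car", "bus", "car", "plane",
        "plane", "plane", "plane", "plane", "train"),
  stringsAsFactors = FALSE
)

# Distinct values, in order of first appearance.
categories <- unique(df$B)
# How many distinct values there are.
numberOfCategories <- length(categories)

print(categories)          # "car" "bus" "plane" "train"
print(numberOfCategories)  # 4
```

Note that unique() keeps NA as a value of its own, so a column with missing data would report one extra "category".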


CCD


score 11 · Answer 2 · answered Sep 02 '17 at 20:30


This gives the unique values, the number of unique values, and the frequency of each:

table(df$B)
  bus   car plane train 
    1     3     5     1 

length(table(df$B))
[1] 4
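If you need the names and the counts as separate objects, something along these lines should work. This is only a sketch: `df` is rebuilt from the question's example, and the variable names are made up for illustration:

```r
# Example data from the question (column B as character).
df <- data.frame(
  A = 1:10,
  B = c("car", "car", "bus", "car", "plane",
        "plane", "plane", "plane", "plane", "train"),
  stringsAsFactors = FALSE
)

counts <- table(df$B)

# The category names are the names of the table (sorted, not in data order).
category_names <- names(counts)
# The number of categories is the length of the table.
n_categories <- length(counts)
# The count for a single category can be looked up by name.
plane_count <- counts[["plane"]]

print(category_names)  # "bus" "car" "plane" "train"
print(n_categories)    # 4
print(plane_count)     # 5
```

By default table() leaves NA values out; passing useNA = "ifany" would count them as well.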


score 8 · Answer 3 · answered Sep 02 '17 at 20:18


You can simply use unique:

x <- unique(df$B)

This extracts the unique values in the column. You can also combine it with an apply-style function such as lapply to get the unique values of every column.
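For the per-column case, a minimal sketch could look like the following. It uses lapply() and vapply() rather than apply(), because apply() converts the data frame to a matrix first; that is a choice made for this example, not necessarily what the answer had in mind. `df` is rebuilt from the question's example:

```r
# Example data from the question (column B as character).
df <- data.frame(
  A = 1:10,
  B = c("car", "car", "bus", "car", "plane",
        "plane", "plane", "plane", "plane", "train"),
  stringsAsFactors = FALSE
)

# A named list holding the unique values of each column.
unique_per_column <- lapply(df, unique)

# A named integer vector holding the number of unique values per column.
n_unique_per_column <- vapply(df, function(col) length(unique(col)), integer(1))

print(unique_per_column$B)   # "car" "bus" "plane" "train"
print(n_unique_per_column)   # A = 10, B = 4
```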


sconfluentus


Thanks, you made my day!! – Sohail Aug 20 '19 at 15:17

score 2 · Answer 4 · answered Sep 02 '17 at 20:34

I would recommend you use factors here, if you are not already. It's straightforward and simple.

levels() gives the unique categories and nlevels() gives the number of them. Running droplevels() on the data first discards any levels that no longer occur in the data.

with(droplevels(df), list(levels = levels(B), nlevels = nlevels(B)))
# $levels
# [1] "bus"   "car"   "plane" "train"
#
# $nlevels
# [1] 4
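To see why droplevels() matters, here is a small sketch in which one category is filtered out. `df` is rebuilt from the question's example with B stored as a factor, and `no_planes` and `cleaned` are names made up for illustration:

```r
# Example data from the question, with B stored as a factor.
df <- data.frame(
  A = 1:10,
  B = factor(c("car", "car", "bus", "car", "plane",
               "plane", "plane", "plane", "plane", "train"))
)

# Subsetting removes the rows but keeps the now unused level "plane".
no_planes <- df[df$B != "plane", ]
print(levels(no_planes$B))   # "bus" "car" "plane" "train"
print(nlevels(no_planes$B))  # 4

# droplevels() discards levels that no longer occur in the data.
cleaned <- droplevels(no_planes)
print(levels(cleaned$B))     # "bus" "car" "train"
print(nlevels(cleaned$B))    # 3
```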

edited Sep 02 '17 at 20:54

answered Sep 02 '17 at 20:34

Rich Scriven



Thankfully there's a data frame method for `droplevels` – Rich Scriven Sep 02 '17 at 20:56

score 1 · Answer 5 · answered Jan 03 '19 at 02:29


Additionally, to see sorted values you can use the following:

sort(table(df$B), decreasing = TRUE)

This lists the categories in decreasing order of frequency.
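As a sketch of how the sorted table can be used, the most frequent category is simply its first name. `df` is rebuilt from the question's example, and `sorted_counts` and `most_frequent` are illustrative names:

```r
# Example data from the question (column B as character).
df <- data.frame(
  A = 1:10,
  B = c("car", "car", "bus", "car", "plane",
        "plane", "plane", "plane", "plane", "train"),
  stringsAsFactors = FALSE
)

# Frequencies, largest first.
sorted_counts <- sort(table(df$B), decreasing = TRUE)

# The first name belongs to the most frequent category.
most_frequent <- names(sorted_counts)[1]

print(sorted_counts)
print(most_frequent)  # "plane"
```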


V C


score 0 · Answer 6 · answered May 19 '22 at 08:47

First, check that your column has the appropriate data type. R has most likely read it in as 'chr', which you can check with 'str(df)'. For data like your example, you will want to convert it to a 'factor':

df$column <- as.factor(df$column)

Once the column is a factor, 'levels(df$column)' returns the levels present in the dataset.
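Put together, the conversion could look like this sketch. `df` is rebuilt from the question's example, and the column is called B here rather than the placeholder name `column`:

```r
# Example data from the question, read in as character.
df <- data.frame(
  A = 1:10,
  B = c("car", "car", "bus", "car", "plane",
        "plane", "plane", "plane", "plane", "train"),
  stringsAsFactors = FALSE
)

type_before <- class(df$B)   # "character"

# Convert the column to a factor, then read off its levels.
df$B <- as.factor(df$B)
category_levels <- levels(df$B)
n_levels <- nlevels(df$B)

print(type_before)      # "character"
print(category_levels)  # "bus" "car" "plane" "train"
print(n_levels)         # 4
```

The conversion is only needed for levels() and nlevels(); unique() and table() also work directly on a character column.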
