R: How to get the top 2 values in a column depending on another column?

Question

I have a data set of zip codes and house codes.

 df = data.frame(zip = c(2900,2900,2900,3200,3100,3200),
                 house_code = c('abc','cde','efg','ghi','ijk','klm'))

I need to find the top 2 zip codes by number of house_code entries.

you forgot quotes around your strings...my edits are full or else I'd edit it for you. — Cyrus Mohammadian, Aug 30 '16 at 04:48
also, your question isn't clear (at least to me). what does "in terms of number of `house_code` mean"? your `house_code` isnt numeric so you cant mean in terms of which has the highest number and they also dont repeat (in your given example), so you must not mean by count either. So what do you mean? — Cyrus Mohammadian, Aug 30 '16 at 04:49

score 0 · Answer 1 · answered Aug 30 '16 at 04:56

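I think it could be: head(df[df$house_code == 'some value', ]$zip, 2) where 'some value' is a house_code entry. Note the comma after the condition: it selects the matching rows, whereas without it R would try to select columns.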

Josseline Perdomo

score 0 · Answer 2 · answered Aug 30 '16 at 05:07

First, use table to cross-tabulate zip against house_code.

> dftable <- table(df)

      house_code
zip    abc cde efg ghi ijk klm
  2900   1   1   1   0   0   0
  3100   0   0   0   0   1   0
  3200   0   0   0   1   0   1

Then use rowSums to find the number of house_codes for each zip_code.

> numHouse <- rowSums(dftable)

2900 3100 3200 
   3    1    2

Finally use order to find the top 2.

> names(numHouse)[order(numHouse, decreasing = TRUE)[1:2]]

[1] "2900" "3200"
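
The three steps above can also be collapsed into a short script. This is only a sketch, assuming each row of df is one house, so that counting rows per zip is the same as counting house_code entries; the names zip_counts and top2 are made up for illustration.

```r
# Sample data from the question
df <- data.frame(zip = c(2900, 2900, 2900, 3200, 3100, 3200),
                 house_code = c('abc', 'cde', 'efg', 'ghi', 'ijk', 'klm'))

# Count the rows for each zip (assumes one row per house_code)
zip_counts <- table(df$zip)

# Sort the counts from largest to smallest and keep the first two zip labels
top2 <- head(names(sort(zip_counts, decreasing = TRUE)), 2)
print(top2)
```

The result is a character vector of zip labels, so wrap it in as.numeric() if numbers are needed. If the same house_code could appear more than once under a zip, counting unique(df) instead of df would probably be the safer choice.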

Just call `table(df$zip)`, i.e. `names(sort(table(df$zip), TRUE)[1:2])` — alistaire, Aug 30 '16 at 05:36
