I am new to R and working with the dataset below.

I have a data frame called zippopinc.

Repex:

head(zippopinc)

  Year         Zip     Total_Population Median_Income   City State
1 1 2017 ZCTA5 00601            17599         11757  Adjuntas    PR
2 2 2017 ZCTA5 00602            39209         16190    Aguada    PR
3 3 2017 ZCTA5 00603            50135         16645 Aguadilla    PR
4 4 2017 ZCTA5 00606             6304         13387   Maricao    PR
5 5 2017 ZCTA5 00610            27590         18741    Anasco    PR
6 6 2017 ZCTA5 00612            62566         17744   Arecibo    PR
  Poptoincomeratio
       1.4968955
       2.4218036
       3.0120156
       0.4709046
       1.4721733
       3.5260370

Poptoincomeratio is Total_Population / Median_Income.

My objective is to find which zip code has the highest Poptoincomeratio:

My input:

max(sapply(zippopinc$Poptoincomeratio, max))

Output:

4.454182

So I tried,

zippopinc$Zip[demograph_ratio$Poptoincomeratio == 4.454182]

But this gave me:

factor(0)
30956 Levels

I then tried to convert zippopinc to a factor but got the error below:

> as.factor(zippopinc)
Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

How can I fix this?

Z.Lin

1 Answer

To find the zip code with the highest Poptoincomeratio, use:

zippopinc$Zip[which.max(zippopinc$Poptoincomeratio)]

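Your current approach returns nothing because exact comparison of floating-point values with == is unreliable. The 4.454182 you typed is only the rounded value that R prints (7 significant digits by default); the stored number has more digits, so the equality test is FALSE for every row.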
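
To illustrate the rounding issue, here is a minimal sketch that recomputes the ratio for the Arecibo row of the example (62566 / 17744). The variable names `ratio` and `typed` are only for illustration, and the tolerance `1e-6` is an assumed value, not something from the question.

```r
# Ratio for the Arecibo row of the example data
ratio <- 62566 / 17744

# R prints a rounded version of the stored value (7 significant digits by default)
print(ratio)
# Asking for more digits reveals the part that the default print hides
print(ratio, digits = 15)

# The value copied from the console is not the stored value
typed <- 3.526037
exact_match <- ratio == typed                 # FALSE

# Comparing with a tolerance works (1e-6 is an assumed tolerance)
close_match <- abs(ratio - typed) < 1e-6      # TRUE
close_match2 <- isTRUE(all.equal(ratio, typed, tolerance = 1e-6))  # TRUE
```

A tolerance-based comparison is a workable fallback, but for locating a maximum it is simpler to avoid typing the value at all.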

Even in the shared example, 3.5260370 is clearly the highest value in the Poptoincomeratio column, yet comparing the column against it gives:

zippopinc$Poptoincomeratio == 3.5260370
#[1] FALSE FALSE FALSE FALSE FALSE FALSE

but which.max returns the position of the largest Poptoincomeratio, so indexing Zip with it gives the matching zip code:

zippopinc$Zip[which.max(zippopinc$Poptoincomeratio)]
#[1] 612
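
As a self-contained sketch of the same idea, the code below builds a small stand-in for zippopinc from three rows of the example. Storing Zip as a character column is an assumption made here so the leading zeros survive; in the question it is a factor. The names `best`, `best_zip` and `tied_zips` are hypothetical.

```r
# Three-row stand-in for zippopinc (values copied from the example rows);
# Zip is kept as character here, which is an assumption
zippopinc <- data.frame(
  Zip = c("00601", "00606", "00612"),
  Total_Population = c(17599, 6304, 62566),
  Median_Income = c(11757, 13387, 17744),
  stringsAsFactors = FALSE
)
zippopinc$Poptoincomeratio <-
  zippopinc$Total_Population / zippopinc$Median_Income

# which.max() returns the position of the first maximum and skips NA values
best <- which.max(zippopinc$Poptoincomeratio)

# as.character() also covers the case where Zip is a factor
best_zip <- as.character(zippopinc$Zip[best])

# Whole row for the winning zip code
zippopinc[best, ]

# To get every row tied for the maximum, compare against the computed
# maximum rather than a number typed from the console
top <- max(zippopinc$Poptoincomeratio, na.rm = TRUE)
tied_zips <- zippopinc$Zip[which(zippopinc$Poptoincomeratio == top)]
```

Comparing against `max(...)` with == is safe in this case because both sides are the same stored value, not a rounded copy.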
Ronak Shah
  • Thanks. This is just a reproducible example, not the entire dataset, which is why you don't see 4.454182 @Ronak Shah. I mentioned #Repex. –  Mar 23 '19 at 02:41
  • @KidCode I know. Can you run `zippopinc$Zip[which.max(zippopinc$Poptoincomeratio)]` on your real data and compare the answer? – Ronak Shah Mar 23 '19 at 02:41
  • Yes, I got it and marked your answer correct. :) thanks –  Mar 23 '19 at 02:43