I have a data set from a package, and I want to create a new data frame containing only the cities whose crime rate is above 30%.

The data set has a column, Crime, which has the crime rates for cities. The values are in decimal form.

df2 <- Cities[,"Crime" > .30]

But it isn't returning only the cities with crime rates above 0.30; it's returning all of them. I'm not sure why, since I've specified > 0.30 in the code. I spent some time looking for help on subsetting and creating data frames, but none of it addressed this kind of problem; it only covered general subsetting where you select a whole column.

I feel like I'm very close and I've tried other things but I'm getting frustrated.

zinger001
    Does this answer your question? [Filter data.frame rows by a logical condition](https://stackoverflow.com/questions/1686569/filter-data-frame-rows-by-a-logical-condition) – user438383 Oct 07 '20 at 16:14
  • `df2 <- Cities["Crime" > .30]` will work. the comma in the brackets makes it into a logical operation that returns true/false. without the comma it reduces the df in the way you want – D.J Oct 07 '20 at 16:18
  • Hm. I got rid of the comma and it's still giving me the whole data set. – zinger001 Oct 07 '20 at 16:33

1 Answer

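You need to index at the right level. In your code the condition sits after the comma, which is the column position, and `"Crime" > .30` compares the string "Crime" with a number instead of comparing the column's values. To filter rows, put a condition built from the column itself (`Cities$Crime > .30`) before the comma. Here is an example: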

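set.seed(123)
# Example data: 100 random crime rates between 0 and 1
Cities <- data.frame(Crime = runif(100, 0, 1))
# Keep the rows where Crime exceeds 0.30; drop = FALSE keeps a data frame
df2 <- Cities[Cities$Crime > .30, , drop = FALSE]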

Number of rows in each data frame:

#Original
nrow(Cities)
 [1] 100
#Filtered
nrow(df2)
 [1] 65

Or using dplyr:

library(dplyr)
#Code
df2 <- Cities %>% filter(Crime > .30)

Or base R subset():

#Code 2
df2 <- subset(Cities,Crime > .30)
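
As a rough sketch of why the original attempt returned everything: the condition `"Crime" > .30` never looks at the column. The toy `Cities` data frame below, including the `City` column and its values, is made up for illustration and is not the package data set. The string comparison is expected to give `TRUE` in typical locales, where digits sort before letters.

```r
# Toy stand-in for the package data set (hypothetical values)
Cities <- data.frame(City = c("A", "B", "C"), Crime = c(0.10, 0.35, 0.50))

# The string "Crime" is compared with the number 0.30, not with the column.
# R coerces 0.30 to the string "0.3" and compares the two strings,
# which yields a single logical value.
cond <- "Crime" > .30
print(cond)

# A single TRUE in the column position is recycled, so every column is kept
all_cols <- Cities[, cond]
print(ncol(all_cols) == ncol(Cities))

# Referring to the column itself gives one TRUE/FALSE per row
keep <- Cities$Crime > .30
df2 <- Cities[keep, , drop = FALSE]
print(df2)
```

The same reasoning applies without the comma: `Cities["Crime" > .30]` also receives a single logical value and selects columns, not rows.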
Duck
  • Ahhh thank you so much! I knew it was going to be something really small. Although, can I ask what the drop=F is for in the filter code? – zinger001 Oct 07 '20 at 16:37
  • @zinger001 Yes, that option is used when you only have one column in your dataframe so that when you filter you will keep the dataframe structure. I hope that was helpful for you! – Duck Oct 07 '20 at 16:39
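
To illustrate the `drop` behaviour discussed in these comments, here is a minimal sketch with a one-column data frame. The values are made up; only the structure matters.

```r
# One-column data frame, mirroring the answer's example (illustrative values)
Cities <- data.frame(Crime = c(0.10, 0.35, 0.50))

# With the default drop behaviour, selecting rows from a one-column
# data frame collapses the result to a plain vector
as_vector <- Cities[Cities$Crime > .30, ]
print(class(as_vector))

# drop = FALSE keeps the data frame structure
as_df <- Cities[Cities$Crime > .30, , drop = FALSE]
print(class(as_df))
```

With more than one column, row filtering already returns a data frame, so `drop = FALSE` mainly matters in the single-column case.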