Factors created when subsetting data frame

Question

When I use subset on a data frame, the resulting data frame behaves oddly. df is a subset of a larger data frame:

>df

   buy_sell_count                 trt                 sector
1               1            0.023957              Apartment
2               1            0.026739           Strip Center
3               1  0.0705979999999999                   Mall
4               1  0.0595650000000001                 Office
5               1  0.0290539999999999             Industrial

I've tried the various drop-level practices shown in this question, but none have worked.

When I do mean(df$trt), I get the warning "argument is not numeric or logical: returning NA". When I do as.numeric(df$trt), I get

 [1]  8  9 12 11 10  1  4  6  3  5  7  2

I think it has to do with the levels: df$trt produces

 [1] 0.023957            0.026739            0.0705979999999999  0.0595650000000001  0.0290539999999999 
 [6] -0.01607            -0.188538           0.00279700000000016 -0.022502           0.00178300000000009
[11] 0.00770099999999996 -0.0191330000000001
12 Levels: -0.01607 -0.0191330000000001 -0.022502 -0.188538 0.00178300000000009 ... 0.0705979999999999

The problem isn't dropping levels, it's that your `trt` column shouldn't be a factor at all. Try `as.numeric(as.character())` on it, but really you should trace back upstream in your code to find out where it became a factor (e.g. when reading it in from file). — joran, Apr 18 '18 at 19:23
try `as.numeric(as.character(df$trt))`. This should solve the issue at hand, but your bigger issue is happening either before or during subsetting — Dave Gruenewald, Apr 18 '18 at 19:23
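
To illustrate what both comments describe, here is a minimal sketch in R. The data frame is a hypothetical reconstruction of the first five rows shown in the question, with `trt` deliberately built as a factor. The CSV text and the stray "n/a" entry are assumptions made up for the example; the actual cause in the asker's upstream code is not known.

```r
# Hypothetical reconstruction of the first five rows from the question.
# trt is deliberately created as a factor to reproduce the symptom.
df <- data.frame(
  buy_sell_count = rep(1, 5),
  trt = factor(c("0.023957", "0.026739", "0.0705979999999999",
                 "0.0595650000000001", "0.0290539999999999")),
  sector = c("Apartment", "Strip Center", "Mall", "Office", "Industrial")
)

# mean() on a factor warns and returns NA, as in the question.
m_bad <- suppressWarnings(mean(df$trt))

# as.numeric() on a factor returns the internal integer level codes,
# not the numbers printed as labels.
codes <- as.numeric(df$trt)

# Going through character first parses the labels themselves.
trt_num <- as.numeric(as.character(df$trt))
m_good <- mean(trt_num)

# Equivalent idiom: convert the levels once, then index by the factor.
trt_num2 <- as.numeric(levels(df$trt))[df$trt]

# Upstream check: one way a numeric column can become a factor is a
# non-numeric entry in the input file (assumed here to be "n/a").
csv_text <- "trt,sector\n0.023957,Apartment\nn/a,Mall\n"
raw <- read.csv(text = csv_text, stringsAsFactors = TRUE)
is.factor(raw$trt)

# Declaring the placeholder as missing lets the column parse as numeric.
clean <- read.csv(text = csv_text, na.strings = "n/a",
                  stringsAsFactors = FALSE)
is.numeric(clean$trt)
```

Running `sapply(df, is.factor)` on the larger data frame before subsetting is a quick way to see which columns were read in as factors.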

