I am using the following code to try to replace cells in the column of a dataframe:

    DMS_Sync_Report$Valid.Last.Sync...Temp. <- as.numeric(DMS_Sync_Report$Valid.Last.Sync...Temp.)
    DMS_Sync_Report$Valid.Last.Sync...Temp.[which(DMS_Sync_Report$Valid.Last.Sync...Temp. > 3)] <- ">3days"
    DMS_Sync_Report$Valid.Last.Sync...Temp.[which(DMS_Sync_Report$Valid.Last.Sync...Temp. <= 3)] <- "<3days"
    DMS_Sync_Report$Valid.Last.Sync...Temp.[which(is.na(DMS_Sync_Report$Valid.Last.Sync...Temp.))] <- ">3days"

Data set: I need to replace the values in this column with ">3days" and "<3days".

The above code is not giving the right results.

Todd Burus
Sree
  • Columns need to have a single type, like numeric or character. As soon as you do your first `<- ">3days"`, the column is forced to become a `character` (string) class, and then numeric operations like `<=` can't be counted on anymore. – Gregor Thomas Jun 09 '20 at 06:16
  • Instead, I'd suggest the [How to bin data](https://stackoverflow.com/q/5570293/903061) FAQ as a duplicate. From the top answer there, you can find `cut`, which will do this all at once (assuming you start with a numeric column): `DMS_Sync_Report$Valid.Last.Sync...Temp. <- cut(DMS_Sync_Report$Valid.Last.Sync...Temp., breaks = 3, labels = c("<3days", ">3days"))`. This approach is nice because it scales well - if you had 3, 4, or 10 categories, the code would not get much longer. – Gregor Thomas Jun 09 '20 at 06:19
  • @ToddBurus In the result, "NA" and "<3days" are captured properly. The error is in ">3days": certain cells that should be in <3days are showing up in >3days. – Sree Jun 09 '20 at 06:25
  • @GregorThomas The same error is repeating: >3days and <3days are not being captured properly... – Sree Jun 09 '20 at 06:28
  • Please add data using `dput` and show the expected output for the same. Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). – Ronak Shah Jun 09 '20 at 06:32
  • Oops, I forgot that the extreme breaks must be included. Looks like you figured it out though. To be *very* safe, you could use `breaks = c(-Inf, 3, Inf)` inside `cut()`. – Gregor Thomas Jun 09 '20 at 06:45
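
To make the two points from the comments concrete, here is a minimal sketch. The vector `days` is made-up sample data standing in for the real column (the actual data was never posted), so treat it as an illustration rather than a reproduction of the problem. It shows that one string assignment turns a numeric vector into character, and that `cut()` with explicit outer breaks does the binning in a single step. Note that a single number passed to `breaks` is read as the number of intervals, not as a cut point, which is why the cut points have to be spelled out.

```r
# Hypothetical sample values; the real column was not shown in the question
days <- c(1, 2.5, 3, 4, 10, 25, NA)

# One string assignment coerces the whole vector to character
x <- days
x[which(x > 3)] <- ">3days"
class(x)  # "character"

# cut() bins everything at once; with the default right = TRUE the
# intervals are (-Inf, 3] and (3, Inf], so 3 itself lands in "<3days"
bins <- cut(days, breaks = c(-Inf, 3, Inf), labels = c("<3days", ">3days"))
as.character(bins)
# "<3days" "<3days" "<3days" ">3days" ">3days" ">3days" NA
```

Missing values stay `NA` after `cut()`, so they still need to be handled separately if they should count as ">3days".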

1 Answer


The issue was a binning problem, and the following code helped me resolve it:

    DMS_Sync_Report$bins <- cut(DMS_Sync_Report$Valid.Last.Sync...Temp, breaks=c(-1,3,1000), labels=c("<3 days",">3days"))
    DMS_Sync_Report$Valid.Last.Sync...Temp[which(is.na(DMS_Sync_Report$Valid.Last.Sync...Temp))] <- ">3days"
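
A possible variation, sketched on a small made-up data frame `df` (hypothetical; it only borrows the column name from the question): if missing values should count as ">3days", as in the original code, the replacement can be done on the new factor column instead of the numeric one. That way the numeric column stays numeric. This is only a sketch under that assumption, not necessarily what the real data needs.

```r
# Hypothetical stand-in for DMS_Sync_Report; values are invented
df <- data.frame(Valid.Last.Sync...Temp. = c(0, 2, 3, 3.5, 40, NA))

# Open-ended outer breaks avoid guessing a minimum or maximum
df$bins <- cut(df$Valid.Last.Sync...Temp.,
               breaks = c(-Inf, 3, Inf),
               labels = c("<3days", ">3days"))

# ">3days" is already a level of the factor, so the assignment is valid
df$bins[is.na(df$bins)] <- ">3days"

table(df$bins)
```

Using `-Inf` and `Inf` instead of `-1` and `1000` means values outside that range are still binned rather than silently becoming `NA`.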

Sree