
I am working with this dataframe in a Jupyter notebook (screenshot: "Dataframe with issue").

As you can see, the column is a chr datatype. I want to remove all instances of "12:00:00 AM" from this column. I have tried using the gsub function:

bbData_sleepDay_clean <- gsub("12:00:00 AM", "", bbData_sleepDay_clean)

but it turns my data frame into this (screenshot: "Messed up output").

I know I can also convert this column into datetime, for which I tried this:

bbData_sleepDay_clean <- as.POSIXct(bbData_sleepDay_clean[[col2]], format="%Y/%/%d")

But I am not experienced with this conversion and it did not work.

Does anyone know a way to approach this?

timschlum
  • Please provide a minimal reproducible example. See here for how to do this https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example You can use `dput()` for your data. Instead of telling us it "did not work," show us what it did using the reproducible example - we should see error messages etc. – socialscientist Jul 29 '22 at 18:54

1 Answer


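You are not referring to the date column correctly. In both attempts the result is assigned to the whole data frame instead of to the date column, and gsub() is also given the whole data frame as its input.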

In the first instance, if you want to use gsub() to simply replace the time part with "", then you can do this:

bbData_sleepDay_clean$date <- gsub(" 12:00:00 AM", "", bbData_sleepDay_clean$date)

In the second instance, you can use as.Date() directly like this:

bbData_sleepDay_clean$date <- as.Date(bbData_sleepDay_clean$date, format="%m/%d/%Y")

Output (from the gsub() approach; with as.Date() the dates print as 2016-04-12, 2016-04-13, and so on):

   id      date TotalMinutesAsleep TotalTimeInBed
1 p17 4/12/2016                415            818
2 p17 4/13/2016                463            118
3 p17 4/15/2016                179            299
4 p17 4/16/2016                526            229
5 p17 4/17/2016                195            244
6 p17 4/19/2016                938             14

Note that the date column will be class character after the first approach, whereas the second approach converts it to class Date.
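To make the class difference concrete, here is a minimal sketch that applies both approaches side by side. The data frame df and the columns date_chr and date_parsed are made-up names for illustration, not part of the original data.

```r
# Small illustrative data frame (hypothetical; mirrors the date strings in the question)
df <- data.frame(
  id = c("p17", "p17"),
  date = c("4/12/2016 12:00:00 AM", "4/13/2016 12:00:00 AM"),
  stringsAsFactors = FALSE
)

# Approach 1: strip the literal time text; the result is still a character vector
df$date_chr <- gsub(" 12:00:00 AM", "", df$date, fixed = TRUE)

# Approach 2: parse to Date; trailing characters after the matched format are ignored
df$date_parsed <- as.Date(df$date, format = "%m/%d/%Y")

class(df$date_chr)     # "character"
class(df$date_parsed)  # "Date"
format(df$date_parsed) # "2016-04-12" "2016-04-13"
```

Writing to new columns keeps the original strings around for comparison; in practice you would probably overwrite the date column as shown above.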

Input:

structure(list(id = c("p17", "p17", "p17", "p17", "p17", "p17"
), date = c("4/12/2016 12:00:00 AM", "4/13/2016 12:00:00 AM", 
"4/15/2016 12:00:00 AM", "4/16/2016 12:00:00 AM", "4/17/2016 12:00:00 AM", 
"4/19/2016 12:00:00 AM"), TotalMinutesAsleep = c(415L, 463L, 
179L, 526L, 195L, 938L), TotalTimeInBed = c(818L, 118L, 299L, 
229L, 244L, 14L)), class = "data.frame", row.names = c(NA, -6L
))
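If you would rather parse the full timestamp, as the as.POSIXct() attempt in the question was trying to do, the format string has to describe the whole value, including the 12-hour clock and the AM/PM marker. The sketch below assumes an English-like locale (parsing %p depends on the locale) and uses UTC purely for illustration; x, dt and d are hypothetical names.

```r
# Two of the date strings from the input, as a plain character vector
x <- c("4/12/2016 12:00:00 AM", "4/13/2016 12:00:00 AM")

# %I is the hour on a 12-hour clock and %p the AM/PM marker (locale-dependent)
dt <- as.POSIXct(x, format = "%m/%d/%Y %I:%M:%S %p", tz = "UTC")

# Drop the time component once the value is a proper date-time
d <- as.Date(dt, tz = "UTC")

format(dt, "%Y-%m-%d %H:%M:%S")
format(d)
```

If every time really is midnight, the as.Date() call above is simpler and avoids any time zone questions.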
langtang