Hi everyone I am fairly new here. I m having trouble figuring out how to combine the same dates in the same column. Some Ids contain both the same dates and different ones. My goal was to only combine the ones with the same dates. Also, let me say that this df is a result of some basic cleaning and grouping of the originals.
library(dplyr)
str(notimesw)
grouped_df [480 × 5] (S3: grouped_df/tbl_df/tbl/data.frame)
$ Id : num [1:480] 1.5e+09 1.5e+09 1.5e+09 1.5e+09 1.5e+09 ...
$ date : Date[1:480], format: "2016-04-12" "2016-04-13" ...
$ WeightPounds : num [1:480] NA NA NA NA NA NA NA NA NA NA ...
$ TotalSleepRecords : num [1:480] 1 2 1 2 1 1 1 1 1 1 ...
$ TotalMinutesAsleep: num [1:480] 327 384 412 340 700 304 360 325 361 430 ...
- attr(*, "groups")= tibble [26 × 2] (S3: tbl_df/tbl/data.frame)
..$ Id : num [1:26] 1.50e+09 1.64e+09 1.84e+09 1.93e+09 2.03e+09 ...
..$ .rows: list<int> [1:26]
.. ..$ : int [1:27] 1 2 3 4 5 6 7 8 9 10 ...
.. ..$ : int [1:4] 26 27 28 29
.. ..$ : int [1:3] 30 31 32
.. ..$ : int [1:6] 33 34 35 36 37 416
.. ..$ : int [1:28] 38 39 40 41 42 43 44 45 46 47 ...
.. ..$ : int 66
.. ..$ : int [1:15] 67 68 69 70 71 72 73 74 75 76 ...
.. ..$ : int [1:2] 417 418
.. ..$ : int [1:28] 82 83 84 85 86 87 88 89 90 91 ...
.. ..$ : int [1:8] 110 111 112 113 114 115 116 117
.. ..$ : int [1:28] 118 119 120 121 122 123 124 125 126 127 ...
.. ..$ : int [1:24] 144 145 146 147 148 149 150 151 152 153 ...
.. ..$ : int [1:28] 168 169 170 171 172 173 174 175 176 177 ...
.. ..$ : int [1:10] 196 197 198 199 200 421 422 423 424 425
.. ..$ : int [1:28] 201 202 203 204 205 206 207 208 209 210 ...
.. ..$ : int [1:31] 229 230 231 232 233 234 235 236 237 238 ...
.. ..$ : int [1:27] 260 261 262 263 264 265 266 267 268 269 ...
.. ..$ : int [1:18] 286 287 288 289 290 291 292 293 294 295 ...
.. ..$ : int [1:3] 304 305 306
.. ..$ : int [1:61] 307 308 309 310 311 312 313 314 315 316 ...
.. ..$ : int [1:2] 338 339
.. ..$ : int [1:24] 340 341 342 343 344 345 346 347 348 349 ...
.. ..$ : int [1:3] 364 365 366
.. ..$ : int [1:32] 367 368 369 370 371 372 373 374 375 376 ...
.. ..$ : int [1:15] 399 400 401 402 403 404 405 406 407 408 ...
.. ..$ : int [1:24] 457 458 459 460 461 462 463 464 465 466 ...
If the above is a little too much here is a carbon copy of the 1st 12 lines.
notime <-data.frame (
Id=c(1503960366,1503960366,1503960366,4319703577,4319703577,4319703577,5553957443,5553957443,5553957443,6117666160,6117666160,6117666160),
date=c("2016-04-27", "2016-04-12","2016-04-27", "2016-04-12", "2016-04-27","2016-04-27","2016-5-16","2016-4-16","2016-4-16","2016-04-24","2016-04-24","2016-04-28"),
WeightPounds= c(115,NA, NA,NA,NA,135,NA,137,NA,NA,185,184),
TotalSleepRecords=c(NA,1,1,1,2,NA,2,1,1,1,2,NA),
TotalMinutesAsleep=c(NA,405,555,214,355, NA,495 ,222,542,228,350,NA))
I tried the following code chunk
notimesw%>% group_by(date,Id) %>%
summarize(WeightPounds=mean(WeightPounds,na.rm=TRUE),
TotalSleepRecords=mean(TotalSleepRecords,na.rm=TRUE),
TotalMinutesAsleep=mean(TotalMinutesAsleep,na.rm=TRUE))
And received the following
WeightPounds TotalSleepRecords TotalMinutesAsleep
1 158.8118 1.118644 419.4673
I have tried another suggestion
aggregate(df_merge[-1], list(df_merge$Id), FUN = mean, na.rm = TRUE)
Which combine the Ids with the same dates but the other dates are missing.
Group.1 date WeightPounds TotalSleepRecords TotalMinutesAsleep
1 1503960366 2016-04-27 115.9631 1.080000 360.2800
2 1644430081 2016-05-02 NaN 1.000000 294.0000
3 1844505072 2016-04-25 NaN 1.000000 652.0000
4 1927972279 2016-04-17 294.3171 1.600000 417.0000
5 2026352035 2016-04-27 NaN 1.000000 506.1786
6 2320127002 2016-04-23 NaN 1.000000 61.0000
7 2347167796 2016-04-21 NaN 1.000000 446.8000
8 2873212765 2016-05-01 125.6635 NaN NaN
9 3977333714 2016-04-25 NaN 1.142857 293.6429
10 4020332650 2016-04-30 NaN 1.000000 349.3750
11 4319703577 2016-04-27 159.5045 1.038462 476.6538
12 4388161847 2016-04-27 NaN 1.291667 403.1250
13 4445114986 2016-04-27 NaN 1.392857 385.1786
14 4558609924 2016-04-29 153.5299 1.000000 127.6000
15 4702921684 2016-04-26 NaN 1.071429 421.1429
16 5553957443 2016-04-27 NaN 1.225806 463.4839
17 5577150313 2016-04-24 199.9593 1.038462 432.0000
18 6117666160 2016-04-26 NaN 1.222222 478.7778
19 6775888955 2016-04-14 NaN 1.000000 349.6667
20 6962181067 2016-04-27 135.7019 1.096774 448.0000
21 7007744171 2016-04-23 NaN 1.000000 68.5000
22 7086361926 2016-04-27 NaN 1.000000 453.1250
23 8053475328 2016-04-26 NaN 1.000000 297.0000
24 8378563200 2016-04-26 NaN 1.125000 443.3438
25 8792009665 2016-04-24 NaN 1.000000 435.6667
26 8877689391 2016-04-26 187.7144 NaN NaN
>
Hopefully, I explained it correctly. Thanks In advance for your help. It is greatly appreciated.