1

I have a data frame with multiple time series . I would like to remove all data in date that have NA values.

The data frame looks as follows,

Date    Time    Value
1/1/2014    0:00    30
1/1/2014    1:00    20
1/1/2014    2:00    12
1/1/2014    3:00    NA
    .   
    .   
    .   
1/1/2014    23:00   23
2/1/2014    0:00    12
2/1/2014    1:00    23
2/1/2014    2:00    34
2/1/2014    3:00    43
    .   
    .   
    .   
2/1/2014    23:00   30
3/1/2014    0:00    34
3/1/2014    1:00    NA
3/1/2014    2:00    NA
3/1/2014    3:00    23
    .   
    .   
    .   
3/1/2014    23:00   45

I would remove all data in date that have NA values so that the data frame look as follows,

Date    Time    Value
2/1/2014    0:00    12
2/1/2014    1:00    23
2/1/2014    2:00    34
2/1/2014    3:00    43
    .   
    .   
    .   
2/1/2014    23:00   30
H.Nattha
  • 31
  • 5
  • Ok, my bad, it is not a duplicate. Here you want to remove all rows, for which at least one row was NA. – Rumid Jul 26 '17 at 05:10
  • 1
    It's not a duplicate. – tushaR Jul 26 '17 at 05:25
  • 1
    I reopened as it is not a duplicate of that. All values need to be removed from group if one is NA – Sotos Jul 26 '17 at 05:26
  • @RonakShah - the question is not about removing rows where any value is missing, it is about removing whole groups where the date is ever associated with a missing value. – thelatemail Jul 26 '17 at 05:31
  • 1
    From the duplicated link : `library(dplyr); df %>% group_by(Date) %>% filter(!any(is.na(Value)))` – Ronak Shah Jul 26 '17 at 05:40

2 Answers2

3

It is not clear about the wording in the question. If this is meant to remove 'date' rows having any 'NA' in the 'Value' column

library(data.table)
setDT(df)[, .SD[!any(is.na(Value))] , Date]
#       Date  Time Value
#1: 2/1/2014  0:00    12
#2: 2/1/2014  1:00    23
#3: 2/1/2014  2:00    34
#4: 2/1/2014  3:00    43
#5: 2/1/2014 23:00    30
akrun
  • 874,273
  • 37
  • 540
  • 662
  • 4
    I think they want to remove whole `Date`s where any `Value` is `NA` for that date. I.e. `1/1` and `3/1` seem to have been removed totally. – thelatemail Jul 26 '17 at 05:02
  • Thanks for comment. @akrun You're correct. – H.Nattha Jul 26 '17 at 05:18
  • @akrun Please, can you check your answer? I've tried it with OP's data but it doesn't remove a single row. – Uwe Jul 26 '17 at 07:40
  • @UweBlock The OP said `Thanks for comment. @akrun You're correct.` – akrun Jul 26 '17 at 07:45
  • 1
    @akrun Would you mind to post the result of your code then, please? – Uwe Jul 26 '17 at 08:13
  • @UweBlock The question title is `Remove all data in date that have NA values` I don't find any NA in the data you updated in the OP's post. If this is a group by operation, then I updated the post – akrun Jul 26 '17 at 08:33
  • There are NA values in the `Value` column. Please, re-read the [comment of thelatemail](https://stackoverflow.com/questions/45317737/remove-all-data-in-date-that-have-na-values#comment77597572_45317806) and see what the OP has posted as expected result. – Uwe Jul 26 '17 at 08:38
  • @UweBlock Updated the post if that is what the OP meant – akrun Jul 26 '17 at 08:39
  • Thank you, @akrun But please check the result of `df[complete.cases(df$Date),]`. It has no effect as there are no `NA` in the `Date` column. – Uwe Jul 26 '17 at 08:48
2

Probably many ways to do this, but here's one. Find the unique Date values associated with a missing Value, and remove those rows from the final selection:

dat[!dat$Date %in% unique(dat[is.na(dat$Value),"Date"]),]

       Date  Time Value
6  2/1/2014  0:00    12
7  2/1/2014  1:00    23
8  2/1/2014  2:00    34
9  2/1/2014  3:00    43
10 2/1/2014 23:00    30
thelatemail
  • 91,185
  • 12
  • 128
  • 188