0

First I want to say I am new to R. This problem is frustrating beyond belief. I have tried apply, lapply, and mapply. All with errors. I am lost.

What I want to do is take the time from "Results" and place it in the time in "Records" IF Records does not have a time (where it is NA).

I have already done this in a traditional for-loop but it makes the code hard to read. I have read the apply functions can make this easier.

Data Frame "Results"
ID Time(sec)    
1  1.7169811
2  1.9999999
3  2.3555445
4  3.4444444

Data Frame "Records"
ID Time(sec) Date
1  NA        1/1/2018
2  1.9999999 1/1/2018
3  NA        1/1/2018
4  3.1111111 1/1/2018

Data Frame 'New' Records
ID Time(sec)   Date
1  1.7169811 1/1/2018
2  1.9999999 1/1/2018
3  2.3555445 1/1/2018
4  3.1111111 1/1/2018
Karolis Koncevičius
  • 9,417
  • 9
  • 56
  • 89
Shawn
  • 3,583
  • 8
  • 46
  • 63

2 Answers2

3

No need to use apply in this situation. A pattern of conditionally choosing between two values based on some predicate is ifelse():

ifelse(predicate, value_a, value_b)

In this case you said you also have to make sure the values are matched by ID between the two dataframes. A function that achieves this in R is appropriately named match()

match(target_values, values_to_be_matched)

match returns indices that match values_to_be_matched to target_values when used like so: target_values[indices].

Combining this together:

inds <- match(records$ID, results$ID)
records$time <- ifelse(is.na(records$time), results$time[inds], records$time)

is.na() here is a predicate that checks if the value is NA for every value in the vector.

Karolis Koncevičius
  • 9,417
  • 9
  • 56
  • 89
3

Inspired by this answer.

From the help: Given a set of vectors, coalesce() finds the first non-missing value at each position. This is inspired by the SQL COALESCE function which does the same thing for NULLs

    library(tidyverse)

    txt1 <- "ID Time(sec)    
    1  1.7169811
    2  1.9999999
    3  2.3555445
    4  3.4444444"

    txt2 <- "ID Time(sec) Date
    1  NA        1/1/2018
    2  1.9999999 1/1/2018
    3  NA        1/1/2018
    4  3.1111111 1/1/2018"

    df1 <- read.table(text = txt1, header = TRUE)
    df2 <- read.table(text = txt2, header = TRUE)

    df1 %>% 
      left_join(df2, by = "ID") %>% 
      mutate(Time.sec. = coalesce(Time.sec..x, Time.sec..y)) %>% 
      select(-Time.sec..x, -Time.sec..y)

    #>   ID     Date Time.sec.
    #> 1  1 1/1/2018  1.716981
    #> 2  2 1/1/2018  2.000000
    #> 3  3 1/1/2018  2.355545
    #> 4  4 1/1/2018  3.444444

Created on 2018-03-10 by the reprex package (v0.2.0).

Tung
  • 26,371
  • 7
  • 91
  • 115