
I have hockey data, called df:

structure(list(event_index = 1:57, coords_x = c(80, 53, 31, -56, 
-34, -33, -40, 30, -66, -36, 45, 17, -6, 47, -51, -31, -69, -86, 
-70, 80, 65, -76, -71, 81, -57, 80, 75, 77, -71, -40, -83, 62, 
77, 76, NA, -61, 69, -45, 68, 31, 58, 61, 80, 34, 80, -85, -37, 
-57, 76, 14, 49, -82, -34, -36, -83, -84, -55), coords_y = c(-1, 
14, -30, 17, 26, -23, -37, 17, -32, -18, 25, 17, -38, 21, 28, 
22, 17, 13, 10, -37, -17, 9, 18, -11, 21, -7, 3, 3, -38, 31, 
8, -30, -2, 4, NA, -5, 15, 10, -30, -34, 20, 27, -4, 8, -18, 
19, 32, -21, 0, 40, -4, -30, -24, -28, -2, -3, 34), event_rinkside = c("R", 
"R", "R", "L", "L", "L", "L", "R", "L", "L", "R", "N", "N", "R", 
"L", "L", "L", "L", "L", "R", "R", "L", "L", "R", "L", "R", "R", 
"R", "L", "L", "L", "R", "R", "R", NA, "L", "R", "L", "R", "R", 
"R", "R", "R", "R", "R", "L", "L", "L", "R", "N", "R", "L", "L", 
"L", "L", "L", "L")), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -57L))

How do I insert a new row after every existing row, leaving me with 57 * 2 = 114 rows, where the values in each newly created row depend on the event_rinkside column?

  • If event_rinkside equals R, I want to insert 82 into coords_x and 0 into coords_y.
  • If event_rinkside equals L, I want to insert -82 into coords_x and 0 into coords_y.

I feel like the solution to this SO question is a good starting point, but I don't know how to incorporate my own conditions.

Here is the solution I'm talking about:

library(purrr)
df %>%
  group_by(id) %>%
  map_dfr(rbind, NA) %>%
  mutate(id = rep(df$id, each = 2))

4 Answers


Here's a solution with dplyr:

library(dplyr)

df %>%
  mutate(coords_x = 82 * ifelse(event_rinkside == "L", -1, 1),
         coords_y = 0) %>%
  rbind(df, .) %>%
  arrange(event_index)

How it works:

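In the first step, mutate is used to modify an unassigned copy of df. The column coords_x gets the value 82, multiplied by -1 if event_rinkside == "L" and by 1 otherwise. Note that "otherwise" includes rows where event_rinkside is "N", which therefore also get 82; where event_rinkside is NA, ifelse returns NA. The column coords_y gets the value 0.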

In the next step, the unchanged original data frame df and the current unassigned and modified copy of it are combined with rbind. Here, . represents the result of the mutate step above. The result of rbind has the rows of the original version above the rows of the modified version.

In the last step, arrange is used to sort the rows along the values of event_index. In this way, each original row is directly followed by the corresponding modified row.

The result:

# A tibble: 114 x 4
   event_index coords_x coords_y event_rinkside
         <int>    <dbl>    <dbl> <chr>         
 1           1       80       -1 R             
 2           1       82        0 R             
 3           2       53       14 R             
 4           2       82        0 R             
 5           3       31      -30 R             
 6           3       82        0 R             
 7           4      -56       17 L             
 8           4      -82        0 L             
 9           5      -34       26 L             
10           5      -82        0 L             
# … with 104 more rows
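
To follow the three steps one at a time, here is a minimal sketch on a made-up three-row data frame. The names toy, new_rows, stacked and result are illustrative and not part of the original answer, and the sample values are invented.

```r
library(dplyr)

# Made-up sample with the same columns as df (values are illustrative only)
toy <- data.frame(
  event_index    = 1:3,
  coords_x       = c(80, -56, 17),
  coords_y       = c(-1, 17, 17),
  event_rinkside = c("R", "L", "N"),
  stringsAsFactors = FALSE
)

# Step 1: a modified copy that holds the values for the new rows
new_rows <- toy %>%
  mutate(coords_x = 82 * ifelse(event_rinkside == "L", -1, 1),
         coords_y = 0)

# Step 2: stack the original rows on top of the modified copy
stacked <- rbind(toy, new_rows)

# Step 3: sort by event_index so each original row is followed by its new row
result <- arrange(stacked, event_index)

result
```

The "N" row also receives a new row with coords_x = 82 here, because ifelse treats everything that is not "L" the same way. If only "R" and "L" events should get a new row, one option is to filter new_rows before stacking.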
Sven Hohenstein
  • Could you explain the logic leading up to this? – NelsonGon Jan 18 '19 at 21:27
  • I understood it after going through it step by step. Sven `mutate`d `coords_x` and `coords_y` with the desired values: (82, 0) for R and (-82, 0) for L. Then he attached the original dataset with `rbind` and `arrange`d by event_index to put the dataset in the format I wanted. Really simple, but brilliant – Jan 19 '19 at 00:55
  • @NelsonGon I added an explanation. – Sven Hohenstein Jan 19 '19 at 06:03
  • @JasonBaik I added an explanation. – Sven Hohenstein Jan 19 '19 at 06:04

I'm not too familiar with R, but my algorithm should work regardless. Counting rows from 1, you want original row n to end up at row 2n-1 of the output, with the new row right after it at row 2n. I would create a second array and place the rows at those specific indexes.

Some pseudocode for you (I usually write in Python, so it is written that way, with 0-based indexes):

def reinsert(rows, make_row):
    # make_row is a placeholder for your event_rinkside logic (a function or an if/else)
    out = [None] * (2 * len(rows))      # initialize to the desired length
    for n, row in enumerate(rows):      # n counts from 0 here
        out[2 * n] = row                # original row: 1st, 3rd, 5th, ... position
        out[2 * n + 1] = make_row(row)  # new row: 2nd, 4th, 6th, ... position
    return out

You can insert the newly created rows in the loop, or add them afterwards, knowing they will all be at even-numbered positions (counting from 1).
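
In R, the same index-based idea might look like the sketch below. It runs on a made-up three-row data frame and assumes event_rinkside contains only "R" and "L" (the question's data also has "N" and NA, which would need their own handling). The names toy, new_rows, stacked, idx and out are illustrative.

```r
# Made-up sample with the same columns as df (values are illustrative only)
toy <- data.frame(
  event_index    = 1:3,
  coords_x       = c(80, -56, 31),
  coords_y       = c(-1, 17, -30),
  event_rinkside = c("R", "L", "R"),
  stringsAsFactors = FALSE
)
n <- nrow(toy)

# Build the rows to insert: same event, fixed coordinates chosen by rink side
# (assumes only "R" and "L" occur)
new_rows <- toy
new_rows$coords_x <- ifelse(toy$event_rinkside == "R", 82, -82)
new_rows$coords_y <- 0

# Stack the originals (rows 1..n) above the new rows (rows n+1..2n)
stacked <- rbind(toy, new_rows)

# Read them back in the order 1, n+1, 2, n+2, ...
# rbind() on the two index vectors gives a 2 x n matrix; c() flattens it column by column
idx <- c(rbind(seq_len(n), n + seq_len(n)))
out <- stacked[idx, ]
rownames(out) <- NULL

out
```

Computing the interleaved positions explicitly avoids sorting, so it does not depend on event_index being unique or ordered.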

Sam

This is similar to Sven's answer, using case_when to distinguish between the possibilities within event_rinkside:

new_df <- df %>% bind_rows(
  df %>% mutate(
    coords_x = case_when(
      event_rinkside == 'R' ~  82,
      event_rinkside == 'L' ~ -82,
      TRUE                  ~ coords_x
    ),
    coords_y = case_when(
      event_rinkside == 'R' ~ 0,
      event_rinkside == 'L' ~ 0,
      TRUE                  ~ coords_y
    )
  )
) %>% arrange(
  event_index
)

If you know the full set of values event_rinkside can take, this could be simplified to if_else calls.
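
As a minimal sketch of that simplification, assuming event_rinkside contains only "R" and "L" (which is not true of the question's data, where "N" and NA also occur), the two case_when calls could collapse to a single if_else and a constant. The data frame toy is a made-up sample.

```r
library(dplyr)

# Made-up sample with the same columns as df (values are illustrative only)
toy <- data.frame(
  event_index    = 1:3,
  coords_x       = c(80, -56, 31),
  coords_y       = c(-1, 17, -30),
  event_rinkside = c("R", "L", "R"),
  stringsAsFactors = FALSE
)

new_df <- toy %>%
  bind_rows(
    toy %>% mutate(
      # Assumption: every value is "R" or "L"; anything else would get -82
      coords_x = if_else(event_rinkside == "R", 82, -82),
      coords_y = 0
    )
  ) %>%
  arrange(event_index)

new_df
```

With values such as "N" in the column, the if_else above would silently assign -82, so the case_when version is the safer choice when the data is not known to be clean.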

Werner

My attempt, which is pretty similar to the other answers:

library(dplyr)

# Duplicate every row so that each original row is followed by a copy of itself
df <- df[rep(seq_len(nrow(df)), each = 2), ]

# Overwrite the coordinates in the even-numbered (duplicated) rows
dup <- seq(2, nrow(df), by = 2)
df[dup, ] <- df[dup, ] %>%
  mutate(
    coords_x = case_when(event_rinkside == "R" ~ 82,
                         event_rinkside == "L" ~ -82,
                         TRUE ~ coords_x),
    coords_y = case_when(event_rinkside == "R" ~ 0,
                         event_rinkside == "L" ~ 0,
                         TRUE ~ coords_y)
  )
Nautica