0

I am trying to transform my data from wide to long but keep getting a few error:

Current data structure

Country       Month        New_users Returning_users  Active_users  Sum_byyear Sum_bycoun
Angola        2020-06-01   5         0                5             17         47  
Angola        2020-09-01   6         2                8             17         47
Angola        2020-10-01   3         1                4             17         47
Angola        2021-01-01   9         7                16            30         47
Angola        2021-03-01   5         3                8             30         47            
Angola        2021-05-01   5         1                6             30         47
South Africa  2020-06-01   40        9                49            157        814
South Africa  2020-09-01   65        13               78            157        814
South Africa  2020-10-01   20        10               30            157        814
South Africa  2021-01-01   47        37               84            657        814
South Africa  2021-03-01   224       66               290           657        814
South Africa  2021-05-01   206       77               283           657        814
 

Desired data structure

Country    Month        Users            Count      Sum_by_year     Sum_by_country
Angola     2020-06-01   New_users        5          17              47
Angola     2020-06-01   Returning_users  0          17              47
Angola     2020-06-01   Active_users     5          17              47
Angola     2020-09-01   New_users        6          17              47
Angola     2020-09-01   Returning_users  2          17              47
Angola     2020-09-01   Active_users     8          17              47
Angola     2020-10-01   New_users        3          17              47
Angola     2020-10-01   Returning_users  1          17              47
Angola     2020-10-01   Active_users     4          17              47
Angola     2021-01-01   New_users        9          30              47
Angola     2021-01-01   Returning_users  7          30              47
Angola     2021-01-01   Active_users     16         30              47
Angola     2021-03-01   New_users        5          30              47
Angola     2021-03-01   Returning_users  3          30              47
Angola     2021-03-01   Active_users     8          30              47
Angola     2021-05-01   New_users        5          30              47
Angola     2021-05-01   Returning_users  1          30              47
Angola     2021-05-01   Active_users     6          30              47
Ghana     2020-06-01   New_users        40          157              814
Ghana     2020-06-01   Returning_users  9           157              814
Ghana     2020-06-01   Active_users     49          157              814
Ghana     2020-09-01   New_users        65          157              814
Ghana     2020-09-01   Returning_users  13          157              814
Ghana     2020-09-01   Active_users     78          157              814
Ghana     2020-10-01   New_users        20          157              814
Ghana     2020-10-01   Returning_users  10          157              814
Ghana     2020-10-01   Active_users     30          157              814
Ghana     2020-09-01   Active_users     78          157              814
Ghana     2021-01-01   New_users        47          657              814
Ghana     2021-01-01   Returning_users  37          657              814
Ghana     2021-01-01   Active_users     84          657              814
Ghana     2021-03-01   New_users        224         657              814
Ghana     2021-03-01   Returning_users  66          657              814
Ghana     2021-03-01   Active_users     283         657              814
Ghana     2021-05-01   New_users        206         657              814
Ghana     2021-05-01   Returning_users  77          657              814
Ghana     2021-05-01   Active_users     283         657              814

Essentially it should be converted to long varying by both country and date. I have converted date and every other variable is in the correct format.

What I've tried:

reshape (df, dir = "long", sep = ","
                  idvar = "Country",
                  varying= c("Month"))
Error in guess (varying):
   Failed to guess time-varying variables from their names 

Another method

df%>%pivot_longer (Country, names_to = c(".value", "User"), names_pattern = "(.*)(\\d+)")

Error in 'stop_vctrs()':
!Can't recycle '..1' (size 656) to match '...3' (size 0)

Please assist.

Thandi
  • 225
  • 1
  • 2
  • 9
  • What were you expecting that regex to do? It needs to match the pattern in your column names, but it doesn't match any, since none of your columns have digits like your regex is looking for. You also don't need `".value"` in your names—you're doing a pretty simple reshaping where just setting `names_to = "User"` is sufficient – camille Mar 21 '22 at 22:12

1 Answers1

2
df <- read.table(text = "Country       Month        New_users Returning_users  Active_users  Sum_byyear Sum_bycoun
Angola        2020-06-01   5         0                5             17         47  
Angola        2020-09-01   6         2                8             17         47
Angola        2020-10-01   3         1                4             17         47
Angola        2021-01-01   9         7                16            30         47
Angola        2021-03-01   5         3                8             30         47            
Angola        2021-05-01   5         1                6             30         47
SouthAfrica  2020-06-01   40        9                49            157        814
SouthAfrica  2020-09-01   65        13               78            157        814
SouthAfrica  2020-10-01   20        10               30            157        814
SouthAfrica  2021-01-01   47        37               84            657        814
SouthAfrica  2021-03-01   224       66               290           657        814
SouthAfrica  2021-05-01   206       77               283           657        814", h = T)

library(tidyverse)
df %>% 
  pivot_longer(c(New_users, Returning_users, Active_users), names_to = "Users", values_to = "Count") %>% 
  select(Country, Month, Users, Count, Sum_byyear, Sum_bycoun)

# A tibble: 36 x 6
   Country Month      Users           Count Sum_byyear Sum_bycoun
   <chr>   <chr>      <chr>           <int>      <int>      <int>
 1 Angola  2020-06-01 New_users           5         17         47
 2 Angola  2020-06-01 Returning_users     0         17         47
 3 Angola  2020-06-01 Active_users        5         17         47
 4 Angola  2020-09-01 New_users           6         17         47
 5 Angola  2020-09-01 Returning_users     2         17         47
 6 Angola  2020-09-01 Active_users        8         17         47
 7 Angola  2020-10-01 New_users           3         17         47
 8 Angola  2020-10-01 Returning_users     1         17         47
 9 Angola  2020-10-01 Active_users        4         17         47
10 Angola  2021-01-01 New_users           9         30         47
# ... with 26 more rows
Lennyy
  • 5,932
  • 2
  • 10
  • 23