1

I need to obtain the following "output" using the "table" below. My code below does not work. Appreciate if someone can give me a help. I tried the method mentioned in here but no luck to obtain the output I wanted.

Thanks.

table:
    
    GRID_CODE    sat_date       pol       my_id     acres
    0.1234       05_24_2019     12345     5         94.5
    0.5678       05_24_2019     12345     5         94.5
    0.1456       05_24_2019     12345     5         94.5
    0.5895       05_24_2019     12345     5         94.5
    1.2535       05_24_2019     12345     5         94.5
    0.4878       05_24_2019     12345     5         94.5
    0.2134       06_30_2019     12345     5         94.5
    0.6178       06_30_2019     12345     5         94.5
    0.1556       06_30_2019     12345     5         94.5
    0.5895       06_30_2019     12345     5         94.5
    0.2675       06_30_2019     12345     5         94.5
    0.7188       07_15_2019     12345     5         94.5
    0.8123       07_15_2019     12345     5         94.5
    0.1788       07_15_2019     12345     5         94.5
    0.9852       07_15_2019     12345     5         94.5
    0.4528       07_15_2019     12345     5         94.5
    0.7861       07_15_2019     12345     5         94.5
    0.0541       07_15_2019     12345     5         94.5
    

    

My code below does not work. Appreciate if someone can give me a help.

    library(data.table)
    setDT(table)
    output1 <- dcast.data.table(table, pol + my_id + acres  ~ sat_date, fun.aggregate = identity, fill = NA_real_, value.var = "GRID_CODE")

pol     my_id   acres     05_24_2019     06_30_2019      07_15_2019 
12345   5       94.5      0.1234         0.2134          0.7188 
Aggregate function missing, defaulting to 'length'

I also tried this but it does not produce the desired output.

output2 <- reshape(out_p_sel, idvar = "pol", timevar = "sat_date", direction = "wide")


    output:
    
    pol       my_id    acres    05_24_2019       06_30_2019      07_15_2019
    12345     5        94.5     0.1234           0.2134          0.7188
    12345     5        94.5     0.5678           0.6178          0.8123
    12345     5        94.5     0.1456           0.1556          0.1788
    12345     5        94.5     0.5895           0.5895          0.9852
    12345     5        94.5     1.2535           0.2675          0.4528
    12345     5        94.5     0.4878           N/A             0.7861
    12345     5        94.5     N/A              N/A             0.0541
NUdu
  • 173
  • 6

2 Answers2

1

I think you need to specify the row number where each value goes. Try :

library(dplyr)

df %>%
  group_by(pol, my_id, sat_date) %>%
  mutate(row = row_number()) %>%
  tidyr::pivot_wider(names_from = sat_date, values_from = GRID_CODE) %>%
  select(-row)

#    pol my_id acres `05_24_2019` `06_30_2019` `07_15_2019`
#  <int> <int> <dbl>        <dbl>        <dbl>        <dbl>
#1 12345     5  94.5        0.123        0.213       0.719 
#2 12345     5  94.5        0.568        0.618       0.812 
#3 12345     5  94.5        0.146        0.156       0.179 
#4 12345     5  94.5        0.590        0.590       0.985 
#5 12345     5  94.5        1.25         0.268       0.453 
#6 12345     5  94.5        0.488       NA           0.786 
#7 12345     5  94.5       NA           NA           0.0541
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
0

I can get the desired output if I use a temporary variable:

table[, g1:=1:.N, by=sat_date]
output <- dcast(table, pol + my_id + acres + g1  ~ sat_date, fun.aggregate = identity, fill = NA_real_, value.var = "GRID_CODE")
output$g1 <- NULL
Vasily A
  • 8,256
  • 10
  • 42
  • 76