I'm trying to use the spread() function to make my data.frame wide, but I get some errors that I don't understand.

Part of my data frame looks like this:

> df
    NO2 Month
1    23    01
2    27    01
3    16    01
4    13    01
5    26    01
6    23    01
7    51    01
8    46    01
9    21    01
10   18    01
11   13    01
12   22    01
13   47    01
14   60    01
15   49    01
16   76    01
17   38    01
18   24    01
19   15    01
20   20    01
21   33    01
22   17    01
23   19    01
24   20    01
25   25    01
26   46    01
27   53    01
28   41    01
29   54    01
30   28    01
31   28    01
32   51    02
33   61    02
34   56    02
35   57    02
36   30    02
37   12    02
38   27    02
39   13    02
40   35    02
41   40    02
42   40    02
43   47    02
44   72    02
45   55    02
46   30    02
47   10    02
48   29    02
49   50    02
50   39    02
51   61    02
52   56    02
53   44    02
54   46    02
55   35    02
56   34    02
57   41    02
58   39    02
59   39    02
60   27    03
61   48    03
62   36    03
63   40    03
64   41    03
65   45    03
66   46    03
67   43    03
68   55    03 (...)

So I simply have a value for each day of the year, and I want to spread them and use boxplot() for each month to make the data easier to read. But since I can't even spread it, I can't show it the right way.

I have tried spread() and also reshape(), but I get errors:

df=data.frame(data)
df$Month=as.numeric(format(data$date,format="%m"))
df=df%>%select(c("NO2","Month"))

df=reshape(df,idvar=c("NO2","Month"),direction="wide",timevar="Month")
warnings() ## the first problem appears here (shown below)

df=spread(df,Month,NO2)  ## this fails as well

df=spread(df,df$Month,df$NO2)  ## and so does this

The first problem, with the reshape() function, is a warning like this for each "Month":

1: In reshapeWide(data, idvar = idvar, timevar = timevar,  ... :
  multiple rows match for Month=1: first taken

The second attempt gives this error:

Error in eval_tidy(enquo(var), var_env) : object 'Month' not found

and the third attempt gives this:

Error: "var" must evaluate to a single number or a column name, not NULL

Can someone help me? I don't really get it. I've used spread() before, and this is the first time I've run into this problem.

Potato
  • What is your expected output? Maybe you need `df %>% group_by(Month) %>% mutate(row = row_number()) %>% spread(Month, NO2)` ? – Ronak Shah Jun 02 '19 at 07:23
  • @RonakShah I think you're a genius... Thank you very much. I didn't want to group by month because I thought it might destroy the values inside my data frame. How did you figure it out? I would probably never have guessed that I need to group them... – Potato Jun 02 '19 at 07:37

1 Answer

You probably need

library(dplyr)
library(tidyr)

df %>%
   group_by(Month) %>%
   mutate(row = row_number()) %>%
   spread(Month, NO2)

which gives you this output

#     row   `1`   `2`   `3`
#   <int> <int> <int> <int>
# 1     1    23    51    27
# 2     2    27    61    48
# 3     3    16    56    36
# 4     4    13    57    40
# 5     5    26    30    41
# 6     6    23    12    45
# 7     7    51    27    46
# 8     8    46    13    43
#.....

Or

df %>%
   group_by(Month) %>%
   mutate(row = row_number()) %>%
   spread(row, NO2)

which gives you this

#  Month   `1`   `2`   `3`   `4`   `5`   `6`   `7`   `8` ....
#  <int> <int> <int> <int> <int> <int> <int> <int> <int> ....
#1     1    23    27    16    13    26    23    51    46 ....
#2     2    51    61    56    57    30    12    27    13 ....
#3     3    27    48    36    40    41    45    46    43 ....

The point is that we need a unique identifier for each output row when we cast a data frame from long to wide. Because your original data frame does not have one, we create it by grouping by Month and numbering the rows within each month using row_number().
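To make the role of the identifier concrete, here is a minimal sketch on a small made-up data frame (`toy` is a hypothetical name; its six values are copied from the first rows of each month in the question). It also shows pivot_wider(), which tidyr documents as the newer replacement for spread(); that part assumes tidyr 1.0.0 or later is installed.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: two months with three readings each
toy <- data.frame(
  NO2   = c(23, 27, 16, 51, 61, 56),
  Month = c(1, 1, 1, 2, 2, 2)
)

# Number the readings within each month so that every
# (row, Month) combination is unique
toy_id <- toy %>%
  group_by(Month) %>%
  mutate(row = row_number()) %>%
  ungroup()

# One column per month, one row per within-month position
wide <- spread(toy_id, Month, NO2)

# Same layout with pivot_wider() (assumes tidyr >= 1.0.0)
wide2 <- pivot_wider(toy_id, names_from = Month, values_from = NO2)

print(wide)
```

Both calls should return a table with the columns row, `1` and `2`, where each row holds the n-th reading of every month.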


If you want to achieve the same result as the second layout (one row per month) with base R reshape(), you can add the same unique identifier using ave() with seq_along as the FUN argument.

df$row <- with(df, ave(NO2, Month, FUN = seq_along))
reshape(df,direction="wide",idvar ="Month", timevar = "row")
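As a rough, self-contained check of the base R route, the sketch below runs the same two steps on a made-up data frame (`toy` is a hypothetical name with values copied from the question). The last line is an aside under the assumption that the end goal is only the per-month boxplot mentioned in the question: boxplot() accepts a formula, so the long format may already be enough for that.

```r
# Hypothetical toy data: two months with three readings each
toy <- data.frame(
  NO2   = c(23, 27, 16, 51, 61, 56),
  Month = c(1, 1, 1, 2, 2, 2)
)

# Within-month counter that serves as the unique identifier
toy$row <- with(toy, ave(NO2, Month, FUN = seq_along))

# One row per month, one NO2.<row> column per reading
wide <- reshape(toy, direction = "wide", idvar = "Month", timevar = "row")
print(wide)

# Per-month boxplot straight from the long data, no reshaping needed
boxplot(NO2 ~ Month, data = toy)
```

Here reshape() names the new columns by pasting the value column and the counter together, so the result has the columns Month, NO2.1, NO2.2 and NO2.3.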
Ronak Shah