
I have a data frame with monthly data and I want to add a column that gives the season for each month. Months 3 (Mar) to 5 (May) are defined as spring, 6 (Jun) to 8 (Aug) as summer, 9 (Sep) to 11 (Nov) as autumn, and 12 (Dec) to 2 (Feb) as winter.

Sample data:

MONTH <- sample(1:12, 10, replace = TRUE)
SALES <- sample(30:468, 10, replace = TRUE)
df <- data.frame(MONTH, SALES)

   MONTH SALES
1      9   209
2      3   273
3      9   249
4      7    99
5      9   442
6      6   202
7      7   347
8      3   428
9      1    67
10     2   223

I reached my goal by using nested ifelse:

df$SEASON<-ifelse(df$MONTH>=3 & df$MONTH<=5,"SPRING",
                 ifelse(df$MONTH>=6 & df$MONTH<=8,"SUMMER",
                       ifelse(df$MONTH>=9 & df$MONTH<=11,"AUTUMN",
                              ifelse(df$MONTH>=12 | df$MONTH<=2,"WINTER",NA))))

   MONTH SALES SEASON
1      9   209 AUTUMN
2      3   273 SPRING
3      9   249 AUTUMN
4      7    99 SUMMER
5      9   442 AUTUMN
6      6   202 SUMMER
7      7   347 SUMMER
8      3   428 SPRING
9      1    67 WINTER
10     2   223 WINTER

However, nested ifelse is not very elegant, and it becomes laborious when there are more than four values to assign (for example, adding names to twenty different IDs). What would be a more elegant way to solve this kind of problem?

Jonathan Hall
LuckyLuke
  • Instead of mentioning conditions one by one use `cut` with `labels` - https://stackoverflow.com/questions/13559076/convert-continuous-numeric-values-to-discrete-categories-defined-by-intervals – Ronak Shah Oct 21 '20 at 14:45
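
The `cut` approach suggested in this comment could look roughly like the sketch below. Because winter wraps around the end of the year, the sketch assumes it is acceptable to map December to 0 first with `MONTH %% 12`, so that each season becomes one contiguous interval. The helper name `month_to_season` and the small data frame are made up for illustration.

```r
# Hypothetical helper: map month numbers 1..12 to season labels with cut().
month_to_season <- function(month) {
  # 12 %% 12 is 0, so winter becomes the contiguous range 0..2.
  shifted <- month %% 12
  # Intervals are right-closed: (-1,2], (2,5], (5,8], (8,11].
  as.character(cut(
    shifted,
    breaks = c(-1, 2, 5, 8, 11),
    labels = c("WINTER", "SPRING", "SUMMER", "AUTUMN")
  ))
}

# Illustrative data only, not the data from the question.
df <- data.frame(MONTH = c(9, 3, 7, 1, 12), SALES = c(209, 273, 99, 67, 223))
df$SEASON <- month_to_season(df$MONTH)
```

Adding a further category then only means adding one break and one label, rather than another level of nesting.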

3 Answers


Does this work?

> library(dplyr)
> df %>% mutate(SEASON = case_when(MONTH %in% 3:5 ~ 'Spring', MONTH %in% 6:8 ~ 'Summer', MONTH %in% 9:11 ~ 'Autumn', TRUE ~ 'Winter'))
# A tibble: 10 x 4
      X1 MONTH SALES SEASON
   <dbl> <dbl> <dbl> <chr> 
 1     1     9   209 Autumn
 2     2     3   273 Spring
 3     3     9   249 Autumn
 4     4     7    99 Summer
 5     5     9   442 Autumn
 6     6     6   202 Summer
 7     7     7   347 Summer
 8     8     3   428 Spring
 9     9     1    67 Winter
10    10     2   223 Winter
> 
Karthik S

What you're looking for is dplyr's mutate combined with case_when:

library(dplyr)
df <- df %>% mutate(new_season = case_when(MONTH == 1 ~ "January"))  # compare with ==, not =

hachiko

You can create a vector of your expected values, then index off of it.

# Position i holds the season for month i (1 = Jan, ..., 12 = Dec).
seasons <- c(
  "WINTER", "WINTER",
  rep(c("SPRING", "SUMMER", "AUTUMN"), each = 3),
  "WINTER"
)

df$SEASON <- seasons[df$MONTH]
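
For the "names for twenty different IDs" case from the question, where the IDs are not the consecutive integers 1 to n, the same indexing idea may carry over to a named vector. This is only a sketch: the IDs, the names, and the variable `id_names` are invented for illustration.

```r
# Hypothetical lookup table: the vector names are the IDs, the values are the labels.
id_names <- c("101" = "Alice", "205" = "Bob", "317" = "Carol")

# Illustrative IDs; 999 is deliberately missing from the lookup table.
ids <- c(205, 101, 999, 317)

# Index by the ID converted to character; IDs without an entry come back as NA.
labels <- unname(id_names[as.character(ids)])
```

Keeping the mapping in one vector means a new ID is a one-line addition, and unmatched IDs surface as NA instead of being silently assigned to a default branch.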
bcarlsen