4

I have dates in the format mm/yyyy in column 1, and then results in column 2.

  1. month Result
  2. 01/2018 96.13636
  3. 02/2018 96.40000
  4. 3/2018 94.00000
  5. 04/2018 97.92857
  6. 05/2018 95.75000
  7. 11/2017 98.66667
  8. 12/2017 97.78947

How can I order by month such that it will start from the first month (11/2017) and end (05/2018).

I have tried a few 'orders', but none seem to be ordering by year and then by month

leerssej
  • 14,260
  • 6
  • 48
  • 57
pablo144
  • 117
  • 2
  • 8

2 Answers2

3

In tidyverse (w/ lubridate added):

library(tidyverse)
library(lubridate)

dfYrMon <- 
    df1 %>% 
    mutate(date = parse_date_time(month, "my"),
           year = year(date),
           month = month(date)
           ) %>% 
    arrange(year, month) %>% 
    select(date, year, month, result)

With data:

df1 <- tibble(month = c("01/2018", "02/2018", "03/2018", "04/2018", "05/2018", "11/2017", "12/2017"), 
              result = c(96.13636, 96.4, 94, 97.92857, 95.75, 98.66667, 97.78947))

Will get you this 'dataframe':

# A tibble: 7 x 4
        date  year month   result
      <dttm> <dbl> <dbl>    <dbl>
1 2017-11-01  2017    11 98.66667
2 2017-12-01  2017    12 97.78947
3 2018-01-01  2018     1 96.13636
4 2018-02-01  2018     2 96.40000
5 2018-03-01  2018     3 94.00000
6 2018-04-01  2018     4 97.92857
7 2018-05-01  2018     5 95.75000

Making your data values atomic (year in its own column, month in its own column) generally improves the ease of manipulation.

Or if you want to use base R date manipulations instead of lubridate's:

library(tidyverse)

dfYrMon_base <- 
    df1 %>% 
    mutate(date = as.Date(paste("01/", month, sep = ""), "%d/%m/%Y"),
           year = format(as.Date(date, format="%d/%m/%Y"),"%Y"),
           month = format(as.Date(date, format="%d/%m/%Y"),"%m")
          ) %>%
    arrange(year, month) %>%
    select(date, year, month, result)

dfYrMon_base

Note the datatypes created.

# A tibble: 7 x 4
        date  year month   result
      <date> <chr> <chr>    <dbl>
1 2017-11-01  2017    11 98.66667
2 2017-12-01  2017    12 97.78947
3 2018-01-01  2018    01 96.13636
4 2018-02-01  2018    02 96.40000
5 2018-03-01  2018    03 94.00000
6 2018-04-01  2018    04 97.92857
7 2018-05-01  2018    05 95.75000
leerssej
  • 14,260
  • 6
  • 48
  • 57
2

We can convert it to yearmon class and then do the order

library(zoo)
out <- df1[order(as.yearmon(df1$month, "%m/%Y"), df1$Result),]
row.names(out) <- NULL
out
#    month   Result
#1 11/2017 98.66667
#2 12/2017 97.78947
#3 01/2018 96.13636
#4 02/2018 96.40000
#5 03/2018 94.00000
#6 04/2018 97.92857
#7 05/2018 95.75000

data

df1 <- structure(list(month = c("01/2018", "02/2018", "03/2018", "04/2018", 
"05/2018", "11/2017", "12/2017"), Result = c(96.13636, 96.4, 
94, 97.92857, 95.75, 98.66667, 97.78947)), .Names = c("month", 
"Result"), class = "data.frame", 
row.names = c("1", "2", "3", 
"4", "5", "6", "7"))
akrun
  • 874,273
  • 37
  • 540
  • 662
  • 1
    or perhaps `z <- read.zoo(df1, FUN = as.yearmon, format = "%m/%Y"); replace(df1, TRUE, fortify.zoo(z))` (You can just use `z` and omit the second line if a zoo object is sufficient.) – G. Grothendieck Jun 01 '18 at 15:04
  • Thanks! Another q... plotting this now I get the error Error in plot.window(...) : need finite 'xlim' values In addition: Warning messages: 1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion 2: In min(x) : no non-missing arguments to min; returning Inf 3: In max(x) : no non-missing arguments to max; returning -Inf I do not understand since x on y should be fine ... – pablo144 Jun 01 '18 at 17:00