I need to sort df by country, following the order of the countries given in target_cc. How can I achieve this? See the MWE below:


country <- rep(c("AT","BE","CY","DE","EE"),10)
value <- seq(1, 50)

target_cc <- data.frame("DE","CY","BE","AT","EE")


df <- data.frame(country, value)
df

OTStats
Jordan_b

4 Answers

The best way to do this is to make your country variable a factor with the levels in the order you want. Then any standard solution for sorting/ordering will work on it:

# First, it's weird that target_cc is a data frame with these columns
# I'm hoping that was a typo in your question, and we can use it as a
# vector instead. If not, we can create the vector from the data frame
# with unlist():

target_cc
#  X.DE. X.CY. X.BE. X.AT. X.EE.
#1    DE    CY    BE    AT    EE

# useless as data frame, useful as vector
target_cc_v = unlist(target_cc)
# or fix the definition
target_cc_v = c("DE","CY","BE","AT","EE")


# Make country a factor with the levels in this order:
df$country = factor(df$country, levels = target_cc_v)

# Any standard sort/order solution should now work
df[order(df$country, df$value), ]
#    country value
# 4       DE     4
# 9       DE     9
# 14      DE    14
# 19      DE    19
# 24      DE    24
# 29      DE    29
# 34      DE    34
# 39      DE    39
# 44      DE    44
# 49      DE    49
# 3       CY     3
# 8       CY     8
# ...
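
A quick way to check this approach, and to see one pitfall, is to inspect the levels after the conversion. The following is a minimal sketch that rebuilds the question's data; the code "FR" is a made-up value, used only to show that anything missing from the levels becomes NA.

```r
# Rebuild the data from the question
country <- rep(c("AT", "BE", "CY", "DE", "EE"), 10)
value <- seq(1, 50)
df <- data.frame(country, value)
target_cc_v <- c("DE", "CY", "BE", "AT", "EE")

# Convert to a factor and confirm the level order
df$country <- factor(df$country, levels = target_cc_v)
levels(df$country)

# Sort and list the countries in the order they now appear
sorted <- df[order(df$country, df$value), ]
unique(as.character(sorted$country))

# Hypothetical value "FR" is not among the levels, so it turns into NA
with_unknown <- factor(c("DE", "FR"), levels = target_cc_v)
is.na(with_unknown)
```

Since order() places NA values last by default, rows whose country is not listed in the target vector would end up at the bottom of the sorted data frame.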
Gregor Thomas

You can make country a factor whose levels are in the order you want, then sort on it.

library(dplyr)

country <- rep(c("AT","BE","CY","DE","EE"),10)
value <- seq(1, 50)

# changed this to a vector rather than a data frame
target_cc <- c("DE","CY","BE","AT","EE")

# build the data frame that is sorted below
df <- data.frame(country, value)

df %>% 
  mutate(country = factor(country, levels = target_cc)) %>% 
  arrange(country)
  • All factors have an order of their levels -- `ordered = TRUE` is really only needed if you want polynomial contrasts in a modeling function, or you want to do comparisons with `<` and `>`. – Gregor Thomas Feb 07 '20 at 15:10
  • Good point. I thought it made it a bit more explicit here what was happening. Because I don't know... does it hurt here? –  Feb 07 '20 at 15:10
  • It doesn't hurt *at this point*, but if OP fits a linear model at some later point they may be very confused by the polynomial contrasts if they were expecting the usual contrasts with a reference level. And it probably wouldn't occur to them to go back to this step to debug that. – Gregor Thomas Feb 07 '20 at 15:12
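
The difference discussed in these comments can be seen by comparing the contrast matrices of a plain factor and an ordered one. This is a small sketch that assumes R's default `contrasts` option; the variable names `f_plain` and `f_ord` are illustrative only.

```r
lv <- c("DE", "CY", "BE")

# Plain factor: custom level order, treatment contrasts by default
f_plain <- factor(lv, levels = lv)
# Ordered factor: same levels, polynomial contrasts by default
f_ord <- factor(lv, levels = lv, ordered = TRUE)

colnames(contrasts(f_plain))  # named after the non-reference levels
colnames(contrasts(f_ord))    # linear and quadratic terms

# Both sort the same way
identical(order(f_plain), order(f_ord))

# Comparisons with < and > are only meaningful for the ordered factor
f_ord[1] < f_ord[2]
```

For sorting alone the two behave identically, which is why a plain factor with custom levels is enough here.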

I'm not sure if I understand your request correctly, so please let me know if this isn't what you're looking for.

You can build a data frame that lists the countries in the target order and join it with df; in the output below, the rows follow the order of the countries in target.

Using dplyr:

library(dplyr)

country <- rep(c("AT","BE","CY","DE","EE"), 10)

value <- seq(1, 50)

df <- data.frame(country, value)

target <- data.frame(
  country = rep(c("DE","CY","BE","AT","EE"), times = 5)
)

df2 <- df %>%
  right_join(target, by = "country") %>%
  distinct()

head(df2)
#>   country value
#> 1      DE     4
#> 2      DE     9
#> 3      DE    14
#> 4      DE    19
#> 5      DE    24
#> 6      DE    29

tail(df2)
#>    country value
#> 45      EE    25
#> 46      EE    30
#> 47      EE    35
#> 48      EE    40
#> 49      EE    45
#> 50      EE    50

Created on 2020-02-07 by the reprex package (v0.3.0)

Because each country appears five times in target, the join returns every matching row of df five times; distinct() keeps only the unique rows.
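
The row order returned by a join can depend on the implementation, so a variant that makes the order explicit may be easier to reason about. The sketch below swaps in base R's merge() and an explicit position column; the column name `pos` and the one-row-per-country lookup table are assumptions, not part of the answer above.

```r
country <- rep(c("AT", "BE", "CY", "DE", "EE"), 10)
value <- seq(1, 50)
df <- data.frame(country, value)

# One row per country, with its position in the target order
target <- data.frame(
  country = c("DE", "CY", "BE", "AT", "EE"),
  pos = 1:5
)

# Attach the position to every row, then sort on it explicitly
merged <- merge(df, target, by = "country")
df2 <- merged[order(merged$pos, merged$value), c("country", "value")]
rownames(df2) <- NULL

head(df2, 3)
```

With one row per country in the lookup table, no duplicates are created, so no deduplication step is needed.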

Fleur De Lys

A base R solution using order() and match(), which works with target_cc as defined in the question:

dfout <- df[order(match(df$country, unlist(target_cc))), ]

which gives

> dfout
   country value
4       DE     4
9       DE     9
14      DE    14
19      DE    19
24      DE    24
29      DE    29
34      DE    34
39      DE    39
44      DE    44
49      DE    49
3       CY     3
8       CY     8
13      CY    13
18      CY    18
23      CY    23
28      CY    28
33      CY    33
38      CY    38
43      CY    43
48      CY    48
2       BE     2
7       BE     7
12      BE    12
17      BE    17
22      BE    22
27      BE    27
32      BE    32
37      BE    37
42      BE    42
47      BE    47
1       AT     1
6       AT     6
11      AT    11
16      AT    16
21      AT    21
26      AT    26
31      AT    31
36      AT    36
41      AT    41
46      AT    46
5       EE     5
10      EE    10
15      EE    15
20      EE    20
25      EE    25
30      EE    30
35      EE    35
40      EE    40
45      EE    45
50      EE    50
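
To see why this works, it can help to look at the intermediate values on a tiny input. In this sketch the vectors `target_v` and `x` are illustrative names; match() returns each element's position in the target vector, and order() turns those positions into a row permutation.

```r
target_v <- c("DE", "CY", "BE", "AT", "EE")
x <- c("AT", "DE", "EE", "DE")

# Position of each element of x within target_v
pos <- match(x, target_v)
pos

# Permutation that sorts x into the target order; ties keep their original order
idx <- order(pos)
x[idx]
```

Because order() leaves unresolved ties in their original order, rows within the same country keep the order they had in df.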
ThomasIsCoding