I have a data table listing data by country, and I want to add region information. I made the default "Europe" so I then need to correct those rows which are not in the "Europe" region. Here's what I need to assign:
ordered_tally[ , region :="Europe"]
AsiaRows = c(5, 12, 17, 24, 25, 26, 34, 36, 45, 56, 61, 64, 75, 76, 81, 88, 90, 91)
NorthAmericaRows = c(7, 18, 42, 47)
SouthAmericaRows = c(11, 13, 32, 38, 41, 46, 48, 52, 53, 57, 58, 68, 77, 78, 83)
AfricaRows = c(22, 66, 67, 70, 74, 82, 84, 86, 95)
OceaniaRows = c(31, 62)
MiddleEastRows = c(20, 51, 71, 72, 87, 89, 92)
region = c("Asia", "NorthAmerica", "SouthAmerica", "Africa", "Oceania", "MiddleEast")
region_indices = c(AsiaRows, NorthAmericaRows, SouthAmericaRows, AfricaRows, OceaniaRows, MiddleEastRows)
Here's what worked for me:
lapply(c(1:6), function(x) ordered_tally[region_indices[[x]], "region":= eval(region[x]
)])
Is there a better way to do this? I have the sense lapply
shouldn't be necessary and that there must be a native data.table
operation I'm missing.
Here's ordered_tally
:
V1 country freq
1: 1 Poland 27718
2: 2 Spain 9800
3: 3 Denmark 7551
4: 4 United Kingdom 4910
5: 5 Thailand 4069
6: 6 Norway 3995
7: 7 United States 3837
8: 8 Netherlands 3075
9: 9 France 2866
10: 10 Italy 2429
11: 11 Brazil 2109
12: 12 India 1804
13: 13 Argentina 1610
14: 14 Germany 1437
15: 15 Sweden 1409
16: 16 Lithuania 1339
17: 17 Taiwan 1241
18: 18 Mexico 1218
19: 19 Latvia 1152
20: 20 Israel 1076
21: 21 Czech Republic 1064
22: 22 South Africa 1036
23: 23 Portugal 897
24: 24 Malaysia 889
25: 25 Indonesia 851
26: 26 Hong Kong 722
27: 27 Russia 670
28: 28 Switzerland 635
29: 29 Finland 613
30: 30 Ireland 603
31: 31 Australia 588
32: 32 Chile 573
33: 33 Estonia 540
34: 34 China 514
35: 35 Belgium 498
36: 36 Singapore 488
37: 37 Austria 464
38: 38 Colombia 361
39: 39 Belarus 330
40: 40 Romania 330
41: 41 Guatemala 311
42: 42 Canada 295
43: 43 Bulgaria 288
44: 44 Greece 246
45: 45 South Korea 231
46: 46 Costa Rica 218
47: 47 Puerto Rico 218
48: 48 Venezuela 204
49: 49 Serbia 203
50: 50 Turkey 184
51: 51 United Arab Emirates 175
52: 52 Ecuador 172
53: 53 Peru 161
54: 54 Hungary 136
55: 55 Montenegro 131
56: 56 Japan 127
57: 57 El Salvador 95
58: 58 Uruguay 84
59: 59 Cyprus 70
60: 60 Iceland 59
61: 61 Bangladesh 57
62: 62 New Zealand 47
63: 63 Moldova 42
64: 64 Philippines 41
65: 65 Slovakia 36
66: 66 Mauritius 32
67: 67 Nigeria 30
68: 68 Bolivia 25
69: 69 Croatia 25
70: 70 Zimbabwe 24
71: 71 Saudi Arabia 20
72: 72 Oman 19
73: 73 Slovenia 19
74: 74 Egypt 15
75: 75 Vietnam 14
76: 76 Sri Lanka 10
77: 77 Dominican Republic 9
78: 78 Nicaragua 9
79: 79 Macedonia (FYROM) 5
80: 80 Ukraine 5
81: 81 Azerbaijan 4
82: 82 Namibia 4
83: 83 Barbados 3
84: 84 Botswana 3
85: 85 Malta 3
86: 86 Ghana 2
87: 87 Iraq 2
88: 88 Kazakhstan 2
89: 89 Lebanon 2
90: 90 Macau 2
91: 91 Nepal 2
92: 92 Qatar 2
93: 93 Albania 1
94: 94 Faroe Islands 1
95: 95 Morocco 1
and here's the dump
code to reproduce:
ordered_tally <-
data.table(V1 = c("1", "2", "3", "4", "5", "6", "7", "8",
"9", "10", "11", "12", "13", "14", "15", "16", "17", "18", "19",
"20", "21", "22", "23", "24", "25", "26", "27", "28", "29", "30",
"31", "32", "33", "34", "35", "36", "37", "38", "39", "40", "41",
"42", "43", "44", "45", "46", "47", "48", "49", "50", "51", "52",
"53", "54", "55", "56", "57", "58", "59", "60", "61", "62", "63",
"64", "65", "66", "67", "68", "69", "70", "71", "72", "73", "74",
"75", "76", "77", "78", "79", "80", "81", "82", "83", "84", "85",
"86", "87", "88", "89", "90", "91", "92", "93", "94", "95"),
country = c("Poland", "Spain", "Denmark", "United Kingdom",
"Thailand", "Norway", "United States", "Netherlands", "France",
"Italy", "Brazil", "India", "Argentina", "Germany", "Sweden",
"Lithuania", "Taiwan", "Mexico", "Latvia", "Israel", "Czech Republic",
"South Africa", "Portugal", "Malaysia", "Indonesia", "Hong Kong",
"Russia", "Switzerland", "Finland", "Ireland", "Australia",
"Chile", "Estonia", "China", "Belgium", "Singapore", "Austria",
"Colombia", "Belarus", "Romania", "Guatemala", "Canada",
"Bulgaria", "Greece", "South Korea", "Costa Rica", "Puerto Rico",
"Venezuela", "Serbia", "Turkey", "United Arab Emirates",
"Ecuador", "Peru", "Hungary", "Montenegro", "Japan", "El Salvador",
"Uruguay", "Cyprus", "Iceland", "Bangladesh", "New Zealand",
"Moldova", "Philippines", "Slovakia", "Mauritius", "Nigeria",
"Bolivia", "Croatia", "Zimbabwe", "Saudi Arabia", "Oman",
"Slovenia", "Egypt", "Vietnam", "Sri Lanka", "Dominican Republic",
"Nicaragua", "Macedonia (FYROM)", "Ukraine", "Azerbaijan",
"Namibia", "Barbados", "Botswana", "Malta", "Ghana", "Iraq",
"Kazakhstan", "Lebanon", "Macau", "Nepal", "Qatar", "Albania",
"Faroe Islands", "Morocco"), freq = c(27718L, 9800L, 7551L,
4910L, 4069L, 3995L, 3837L, 3075L, 2866L, 2429L, 2109L, 1804L,
1610L, 1437L, 1409L, 1339L, 1241L, 1218L, 1152L, 1076L, 1064L,
1036L, 897L, 889L, 851L, 722L, 670L, 635L, 613L, 603L, 588L,
573L, 540L, 514L, 498L, 488L, 464L, 361L, 330L, 330L, 311L,
295L, 288L, 246L, 231L, 218L, 218L, 204L, 203L, 184L, 175L,
172L, 161L, 136L, 131L, 127L, 95L, 84L, 70L, 59L, 57L, 47L,
42L, 41L, 36L, 32L, 30L, 25L, 25L, 24L, 20L, 19L, 19L, 15L,
14L, 10L, 9L, 9L, 5L, 5L, 4L, 4L, 3L, 3L, 3L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 1L, 1L, 1L), user_count = c(4126L, 1533L,
1312L, 870L, 620L, 867L, 668L, 493L, 448L, 401L, 530L, 314L,
246L, 318L, 338L, 227L, 205L, 201L, 170L, 237L, 166L, 165L,
173L, 135L, 171L, 116L, 97L, 139L, 123L, 94L, 102L, 100L,
162L, 122L, 86L, 73L, 101L, 79L, 49L, 68L, 28L, 37L, 41L,
50L, 49L, 28L, 27L, 35L, 47L, 59L, 22L, 35L, 34L, 16L, 29L,
30L, 14L, 24L, 16L, 15L, 29L, 2L, 15L, 6L, 10L, 8L, 6L, 3L,
3L, 7L, 5L, 6L, 3L, 4L, 6L, 7L, 3L, 1L, 1L, 4L, 2L, 2L, 1L,
1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L))