I am new to R. I have loaded a well-known dataset into R and now want to run several experiments on it. Below is what I have executed so far.

I have a battles data frame:

str(battles)

'data.frame':   38 obs. of  25 variables:
 $ name              : Factor w/ 38 levels "Battle at the Mummer's Ford",..: 13 1 7 14 18 10 25 5 3 17 ...
 $ year              : int  298 298 298 298 298 298 298 299 299 299 ...
 $ battle_number     : int  1 2 3 4 5 6 7 8 9 10 ...
 $ attacker_king     : Factor w/ 5 levels "","Balon/Euron Greyjoy",..: 3 3 3 4 4 4 3 2 2 2 ...
 $ defender_king     : Factor w/ 7 levels "","Balon/Euron Greyjoy",..: 6 6 6 3 3 3 6 6 6 6 ...
 $ attacker_1        : Factor w/ 11 levels "Baratheon","Bolton",..: 10 10 10 11 11 11 10 9 9 9 ...
 $ attacker_2        : Factor w/ 8 levels "","Bolton","Frey",..: 1 1 1 1 8 8 1 1 1 1 ...
 $ attacker_3        : Factor w/ 3 levels "","Giants","Mormont": 1 1 1 1 1 1 1 1 1 1 ...
 $ attacker_4        : Factor w/ 2 levels "","Glover": 1 1 1 1 1 1 1 1 1 1 ...
 $ defender_1        : Factor w/ 13 levels "","Baratheon",..: 12 2 12 8 8 8 6 11 11 11 ...
 $ defender_2        : Factor w/ 3 levels "","Baratheon",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ defender_3        : logi  NA NA NA NA NA NA ...
 $ defender_4        : logi  NA NA NA NA NA NA ...
 $ attacker_outcome  : Factor w/ 3 levels "","loss","win": 3 3 3 2 3 3 3 3 3 3 ...
 $ battle_type       : Factor w/ 5 levels "","ambush","pitched battle",..: 3 2 3 3 2 2 3 3 5 2 ...
 $ major_death       : int  1 1 0 1 1 0 0 0 0 0 ...
 $ major_capture     : int  0 0 1 1 1 0 0 0 0 0 ...
 $ attacker_size     : int  15000 NA 15000 18000 1875 6000 NA NA 1000 264 ...
 $ defender_size     : int  4000 120 10000 20000 6000 12625 NA NA NA NA ...
 $ attacker_commander: Factor w/ 32 levels "","Asha Greyjoy",..: 8 6 9 22 16 18 6 30 2 28 ...
 $ defender_commander: Factor w/ 29 levels "","Amory Lorch",..: 7 4 10 28 12 14 15 1 1 1 ...
 $ summer            : int  1 1 1 1 1 1 1 1 1 1 ...
 $ location          : Factor w/ 28 levels "","Castle Black",..: 8 13 17 9 27 17 4 12 5 23 ...
 $ region            : Factor w/ 7 levels "Beyond the Wall",..: 7 5 5 5 5 5 5 3 3 3 ...
 $ note              : Factor w/ 6 levels "","Greyjoy's troop number based on the Battle of Deepwood Motte, in which Asha had 1000 soldier on 30 longships. That comes out to"| __truncated__,..: 1 1 1 1 1 1 1 1 1 2 ...

I want to know how many wins and losses each king has had as an attacker across all the battles in GOT so far.

select(battles,attacker_outcome,attacker_king)
   attacker_outcome            attacker_king
1               win Joffrey/Tommen Baratheon
2               win Joffrey/Tommen Baratheon
3               win Joffrey/Tommen Baratheon
4              loss               Robb Stark
5               win               Robb Stark
6               win               Robb Stark
7               win Joffrey/Tommen Baratheon
8               win      Balon/Euron Greyjoy
9               win      Balon/Euron Greyjoy
10              win      Balon/Euron Greyjoy
11              win               Robb Stark
12              win      Balon/Euron Greyjoy
13              win      Balon/Euron Greyjoy
14              win Joffrey/Tommen Baratheon
15              win               Robb Stark
16              win        Stannis Baratheon
17             loss Joffrey/Tommen Baratheon
18              win               Robb Stark
19              win               Robb Stark
20             loss        Stannis Baratheon
21              win               Robb Stark
22             loss               Robb Stark
23              win                         
24              win Joffrey/Tommen Baratheon
25              win Joffrey/Tommen Baratheon
26              win Joffrey/Tommen Baratheon
27              win               Robb Stark
28             loss        Stannis Baratheon
29              win Joffrey/Tommen Baratheon
30              win                         
31              win        Stannis Baratheon
32              win      Balon/Euron Greyjoy
33              win      Balon/Euron Greyjoy
34              win Joffrey/Tommen Baratheon
35              win Joffrey/Tommen Baratheon
36              win Joffrey/Tommen Baratheon
37              win Joffrey/Tommen Baratheon
38                         Stannis Baratheon

I need two more columns, named "number of wins" and "number of loss", for each attacker king.

Note: Please excuse me if my question breaks Stack Overflow's question policy in any way; this is my first question about R.

zx8754
Soumyaansh

2 Answers

You can use table from base R (the row and column for the blank level "" are left out of the output shown):

table(battles$attacker_king, battles$attacker_outcome)

#                           loss win
#  Balon/Euron Greyjoy         0   7
#  Joffrey/Tommen Baratheon    1  13
#  Robb Stark                  2   8
#  Stannis Baratheon           2   2
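
Since the question asks for columns named "number of wins" and "number of loss", here is a minimal base R sketch of one way to get there from the table. It assumes a stand-in battles data frame holding only the two relevant columns, rebuilt from the 38 rows printed in the question; the names keep, clean, counts and result are illustrative. Blank strings are dropped first because they are factor levels in their own right and table would otherwise count them.

```r
# Stand-in for the two relevant columns of `battles`, rebuilt from the
# listing in the question (the real data frame has 25 columns).
J <- "Joffrey/Tommen Baratheon"; R <- "Robb Stark"
B <- "Balon/Euron Greyjoy";      S <- "Stannis Baratheon"
king <- c(J, J, J, R, R, R, J, B, B, B, R, B, B, J, R, S, J, R, R, S,
          R, R, "", J, J, J, R, S, J, "", S, B, B, J, J, J, J, S)
outcome <- rep("win", 38)
outcome[c(4, 17, 20, 22, 28)] <- "loss"
outcome[38] <- ""
battles <- data.frame(attacker_king = factor(king),
                      attacker_outcome = factor(outcome))

# Keep only rows where both the king and the outcome are recorded,
# then drop the now-unused "" levels so table() does not report them.
keep  <- battles$attacker_king != "" & battles$attacker_outcome != ""
clean <- droplevels(battles[keep, ])

# Cross-tabulate and turn the table into a data frame: one row per king,
# one column per outcome (levels sort as "loss", "win").
counts <- table(clean$attacker_king, clean$attacker_outcome)
result <- as.data.frame.matrix(counts)
names(result) <- c("number of loss", "number of wins")
result
```

The kings end up as row names of result; if a regular column is preferred, something like cbind(attacker_king = rownames(result), result) would add one.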
Ronak Shah

One option is dplyr. After grouping by 'attacker_king', we summarise into two columns ('NoWins', 'NoLoss'), each the sum of a logical vector that is TRUE where 'attacker_outcome' is "win" or "loss" respectively, and then filter out the rows where 'attacker_king' is blank.

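library(dplyr)
battles %>%
      group_by(attacker_king) %>%
      summarise(NoWins = sum(attacker_outcome == "win"),
                NoLoss = sum(attacker_outcome == "loss")) %>%
      # as.character() lets nzchar() work when attacker_king is a factor,
      # as it is in the str() output of the question
      filter(nzchar(as.character(attacker_king)))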
#            attacker_king NoWins NoLoss
#                 <chr>  <int>  <int>
#1      Balon/Euron Greyjoy      7      0
#2 Joffrey/Tommen Baratheon     13      1
#3               Robb Stark      8      2
#4        Stannis Baratheon      2      2

Or we can use dplyr/tidyr. After grouping, we get the frequency count with tally, filter (as above) and then spread (from tidyr) to convert the 'long' to 'wide' format.

library(tidyr)
battles %>%
     group_by(attacker_king, attacker_outcome) %>%
     tally() %>% 
     filter(nzchar(as.character(attacker_king)) & nzchar(as.character(attacker_outcome))) %>%
     spread(attacker_outcome, n)
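
In current tidyr, spread has been superseded by pivot_wider. The following is a sketch of the same long-to-wide step with that function, assuming tidyr 1.1.0 or later. The data frame toy is a hypothetical stand-in made of a few rows copied from the listing in the question, and wide is an illustrative name.

```r
library(dplyr)
library(tidyr)

# A few rows copied from the listing in the question, kept as character
# columns for simplicity (the real `battles` has 38 rows and 25 columns).
toy <- data.frame(
  attacker_king = c("Robb Stark", "Robb Stark", "Robb Stark",
                    "Joffrey/Tommen Baratheon", "Balon/Euron Greyjoy",
                    "Joffrey/Tommen Baratheon", "", "Stannis Baratheon"),
  attacker_outcome = c("loss", "win", "win", "win", "win", "loss", "win", ""),
  stringsAsFactors = FALSE
)

wide <- toy %>%
  # drop rows with a blank king or a blank outcome
  filter(attacker_king != "", attacker_outcome != "") %>%
  # one row per king/outcome pair, with the frequency in column n
  count(attacker_king, attacker_outcome) %>%
  # one column per outcome; missing king/outcome pairs become 0
  pivot_wider(names_from = attacker_outcome, values_from = n,
              values_fill = 0L)
wide
```

The values_fill argument matters here: without it, a king with no losses would get NA rather than 0 in the loss column.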

Or use dcast from data.table. This is simpler because dcast takes a fun.aggregate argument, so we can specify the aggregation function (in this case length) while reshaping to 'wide' format.

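library(data.table)
# Count rows per king/outcome, keep the non-blank kings, then drop column 2
# (the counts for the blank outcome level)
dcast(setDT(battles), attacker_king ~ attacker_outcome, length)[
                        nzchar(as.character(attacker_king))
                        ][, -2, with = FALSE]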
#                attacker_king loss win
#1:      Balon/Euron Greyjoy    0   7
#2: Joffrey/Tommen Baratheon    1  13
#3:               Robb Stark    2   8
#4:        Stannis Baratheon    2   2

Or use table from base R

table(battles[c("attacker_king", "attacker_outcome")])[-1,-1]
#                          attacker_outcome
#  attacker_king              loss win
#  Balon/Euron Greyjoy         0   7
#  Joffrey/Tommen Baratheon    1  13
#  Robb Stark                  2   8
#  Stannis Baratheon           2   2
akrun