
I am trying to use lists in R as dictionaries to compute winning percentages for basketball teams. For each game, I'd like to increment the winner's entry in a wins dictionary and both teams' entries in a games-played dictionary. The answers I'm getting look reasonable but are incorrect, and I can't see why the program doesn't produce the expected output. Any suggestions or tips would be appreciated. The code I'm using is below:

games <- read.csv(game_pathname, header = FALSE)

names(games) <- c("GameDate", "DateCount", "HomeID", "AwayID", "HomePts", "AwayPts", "HomeAbbr", "AwayAbbre", "HomeName", "AwayName")

wins = list()
total = list()

for (team in unique(games$HomeName)) {
    wins[team] <- 0
    total[team] <- 0
}

for (i in 1:nrow(games)) {
    if (games$HomePts[i] > games$AwayPts[i]) {
        wins[games$HomeName[i]] <- wins[[games$HomeName[i]]] + 1
    } else {
        wins[games$AwayName[i]] <- wins[[games$AwayName[i]]] + 1
    }
    total[games$HomeName[i]] <- total[[games$HomeName[i]]] + 1
    total[games$AwayName[i]] <- total[[games$AwayName[i]]] + 1
}

for (team in unique(games$HomeName)) {
    print(paste(team, wins[[team]] / total[[team]]))
}
Artem
user1230611

1 Answer


Looking at the code and running it on a toy example, I found no problem with the algorithm itself. In the simulation below I used three teams: one loses every game, one breaks even, and one wins every game.

games <- data.frame(HomeName = c("a", "b", "c"),
                    HomePts = c(1, 2, 3),
                    AwayPts = c(3, 1, 2),
                    AwayName = c("c", "a", "b"))
wins = list()
total = list()

for (team in unique(games$HomeName)) {
  wins[team] <- 0
  total[team] <- 0
}

for (i in 1:nrow(games)) {
  if (games$HomePts[i] > games$AwayPts[i]) {
    wins[games$HomeName[i]] <- wins[[games$HomeName[i]]] + 1
  } else {
    wins[games$AwayName[i]] <- wins[[games$AwayName[i]]] + 1
  }
  total[games$HomeName[i]] <- total[[games$HomeName[i]]] + 1
  total[games$AwayName[i]] <- total[[games$AwayName[i]]] + 1
}

for (team in unique(games$HomeName)) {
  print(paste(team, wins[[team]] / total[[team]]))
}

games
wins
total

The output of your algorithm is below:

[1] "a 0"
[1] "b 0.5"
[1] "c 1"

> games
  HomeName HomePts AwayPts AwayName
1        a       1       3        c
2        b       2       1        a
3        c       3       2        b

> wins
$`a`
[1] 0

$b
[1] 1

$c
[1] 2

> total
$`a`
[1] 2

$b
[1] 2

$c
[1] 2
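
One caveat, offered as a guess since the original CSV isn't shown: if read.csv returned the name columns as factors (the default before R 4.0), then indexing a list with games$HomeName[i] uses the factor's integer code rather than the team name, and HomeName and AwayName can assign different codes to the same team. The loop also only initialises teams that appear at least once as the home side. A minimal sketch that guards against both, using a made-up toy data set (not your data) in which team "d" only plays away:

```r
# Toy data (hypothetical): team "d" never plays at home, and the name
# columns are deliberately factors to mimic pre-4.0 read.csv defaults.
games <- data.frame(
  HomeName = factor(c("a", "b", "c")),
  HomePts  = c(1, 2, 3),
  AwayPts  = c(3, 1, 2),
  AwayName = factor(c("d", "a", "b"))
)

# Convert to character so list lookups go by name, not by factor code.
home <- as.character(games$HomeName)
away <- as.character(games$AwayName)

# Initialise every team that appears in either column.
teams <- union(home, away)
wins  <- setNames(as.list(rep(0, length(teams))), teams)
total <- setNames(as.list(rep(0, length(teams))), teams)

for (i in seq_len(nrow(games))) {
  # As in the original loop, anything other than a home win counts for the away team.
  winner <- if (games$HomePts[i] > games$AwayPts[i]) home[i] else away[i]
  wins[[winner]]   <- wins[[winner]] + 1
  total[[home[i]]] <- total[[home[i]]] + 1
  total[[away[i]]] <- total[[away[i]]] + 1
}

# Named numeric vector of winning percentages, one entry per team.
win_pct <- unlist(wins) / unlist(total)
print(win_pct)
```

If your columns are already character vectors and every team has at least one home game, this should give the same numbers as your loop.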

However, this is not very idiomatic R: explicit for loops and direct manipulation of list indices are generally considered not "comme il faut" :)

You can get the same results with, e.g., the dplyr package, which is part of the tidyverse. The code below compares the scores in each game, splits the result into two data frames (home and away), and binds them row-wise. Finally, it groups by team name and calculates the mean win rate:

library(dplyr)
df <- games %>% mutate(hwins = (HomePts > AwayPts), awins = !hwins)
df_home <- df %>% select(HomeName, hwins) %>% rename(name = HomeName, wins = hwins)
df_away <- df %>% select(AwayName, awins) %>% rename(name = AwayName, wins = awins)
df <- bind_rows(df_home, df_away) %>% group_by(name) %>% summarise(mean_wins = mean(wins))
df

Output:

# A tibble: 3 x 2
  name  mean_wins
  <fct>     <dbl>
1 a           0  
2 b           0.5
3 c           1  
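
For comparison, roughly the same calculation can be sketched in base R without any packages. This is an alternative to the dplyr pipeline above, not part of it; it assumes the same toy games columns (HomeName, HomePts, AwayPts, AwayName) and, like the original loop, counts a tie as an away win:

```r
# Same toy data as in the simulation above.
games <- data.frame(
  HomeName = c("a", "b", "c"),
  HomePts  = c(1, 2, 3),
  AwayPts  = c(3, 1, 2),
  AwayName = c("c", "a", "b")
)

home_won <- games$HomePts > games$AwayPts

# One row per team per game: stack the home and away results.
results <- data.frame(
  name = c(as.character(games$HomeName), as.character(games$AwayName)),
  won  = c(home_won, !home_won)
)

# The mean of a logical vector is the share of TRUE values, i.e. the win rate.
mean_wins <- tapply(results$won, results$name, mean)
print(mean_wins)
```

The as.character calls make the stacking safe even if the name columns are factors with different levels.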
Artem