I am brand new to R and to anything in the computer science field. Using an R script shared by my professor, I have gathered earnings data from the e-sports industry to see whether there is gender inequality among top earners. I scraped https://www.esportsearnings.com/ for the top 10 played games and all the prize money that has been awarded in those games. However, I have no way to classify each earner as "male", "female", "other", or "unable to tell". I can share the R code I used if need be. If anyone has code for this or can point me in the right direction, it would be greatly appreciated!
Hello again, everyone! Thank you all for replying so quickly. Below I will attach the two main scripts from my professor that I have used. As for the data I have collected, I am not sure of the best way to share it. It is essentially an Excel file with thousands of players ranked from highest to lowest earnings for the top 10 grossing games of all time. I will try to attach a screenshot of it.
R Script
library(dplyr)
library(lubridate)
library(jsonlite)
library(httr)
DataREAD <- read.csv("allplayersStar2.csv", na.strings = "NA")
PlayerID <- DataREAD$PlayerId
PlayerID <- PlayerID[1:2115] #Must change this if repeating scrape due to interruption
playerIDs <- as.numeric(gsub('[$,]', '', PlayerID))
#one row per player: PlayerID plus 24 yearly totals (1998-2021)
df <- data.frame(matrix(ncol = 25, nrow = 0))
for(i in seq_along(playerIDs)) {
  #assigns the i-th player ID
  playerID <- playerIDs[[i]]
  APILINK <- paste0("http://api.esportsearnings.com/v0/LookupPlayerTournaments?apikey=b0e0da7e58c715f8618fbf2bb0f01920395531a048ccc4857274c6ccd7c157f9&playerid=", playerID,"&offset=0")
  Sys.sleep(1)
  jsonplayer <- APILINK %>%
    httr::GET(config = httr::config(ssl_verifypeer = FALSE)) %>%
    content(as = "text") %>%
    fromJSON()
  Sys.sleep(1)
  #a full page of 100 rows means there may be more data left,
  #so request the next offset (this repeats up to offset 300)
  if(nrow(jsonplayer) == 100) {
    APILINK <- paste0("http://api.esportsearnings.com/v0/LookupPlayerTournaments?apikey=b0e0da7e58c715f8618fbf2bb0f01920395531a048ccc4857274c6ccd7c157f9&playerid=", playerID,"&offset=100")
    Sys.sleep(1)
    jsonplayer2 <- APILINK %>%
      httr::GET(config = httr::config(ssl_verifypeer = FALSE)) %>%
      content(as = "text") %>%
      fromJSON()
    jsonplayer <- rbind(jsonplayer, jsonplayer2)
    if(nrow(jsonplayer) == 200) {
      Sys.sleep(1)
      APILINK <- paste0("http://api.esportsearnings.com/v0/LookupPlayerTournaments?apikey=b0e0da7e58c715f8618fbf2bb0f01920395531a048ccc4857274c6ccd7c157f9&playerid=", playerID,"&offset=200")
      Sys.sleep(1)
      jsonplayer2 <- APILINK %>%
        httr::GET(config = httr::config(ssl_verifypeer = FALSE)) %>%
        content(as = "text") %>%
        fromJSON()
      jsonplayer <- rbind(jsonplayer, jsonplayer2)
      if(nrow(jsonplayer) == 300) {
        Sys.sleep(1)
        APILINK <- paste0("http://api.esportsearnings.com/v0/LookupPlayerTournaments?apikey=b0e0da7e58c715f8618fbf2bb0f01920395531a048ccc4857274c6ccd7c157f9&playerid=", playerID,"&offset=300")
        Sys.sleep(1)
        jsonplayer2 <- APILINK %>%
          httr::GET(config = httr::config(ssl_verifypeer = FALSE)) %>%
          content(as = "text") %>%
          fromJSON()
        jsonplayer <- rbind(jsonplayer, jsonplayer2)
        Sys.sleep(1)
      }
    }
  }
  #strip any "$" or "," before converting to numbers
  jsonplayer$Prize <- as.numeric(gsub('[$,]', '', jsonplayer$Prize))
  jsonplayer$ExchangeRate <- as.numeric(gsub('[$,]', '', jsonplayer$ExchangeRate))
  #prize times exchange rate, split evenly across the team's players
  earnings <- mutate(jsonplayer, Earnings = Prize * ExchangeRate / TeamPlayers)
  #be careful to change this to the correct GameId!
  earnings <- filter(earnings, GameId == "151")
  playerdata <- c(playerID)
  #j indexes the years 1998-2021 (kept distinct from the outer loop's i)
  for(j in 1:24) {
    eYear <- j+1997
    enddate <- paste0(eYear,"-12-31")
    startdate <- paste0(eYear,"-01-01")
    #inclusive bounds so tournaments ending on Jan 1 or Dec 31 are counted
    earningsyear <- filter(earnings, EndDate <= enddate & EndDate >= startdate)
    #insert right after PlayerID so the years end up newest-first,
    #matching the column names X21 ... X98 below
    if(nrow(earningsyear) > 0) {
      yearearnings <- sum(earningsyear$Earnings)
      playerdata <- append(playerdata, yearearnings, after = 1)
    } else {
      playerdata <- append(playerdata, NA, after = 1)
    }
  }
  df <- rbind(df, playerdata)
}
x <- c("PlayerID", "X21", "X20", "X19", "X18",
"X17", "X16", "X15", "X14",
"X13", "X12", "X11", "X10",
"X09", "X08", "X07", "X06",
"X05", "X04", "X03", "X02",
"X01", "X00", "X99", "X98")
colnames(df) <- x
write.csv(df, "playerearningsStar.csv")
#To repeat the scrape after an interruption:
#Change the index range in the PlayerID[1:2115] line near the top
#If you have already scraped 70 players, start from 71, e.g. [71:2115]
#Change the file name in write.csv so the earlier output is not overwritten
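
The repeated offset checks and the per-year loop in the script above could probably be condensed. Below is a minimal sketch of that idea, not the professor's code. The function names `fetch_all_pages`, `fetch_tournaments_page`, and `yearly_earnings`, and the environment variable `ESPORTS_API_KEY`, are my own hypothetical choices. It assumes `EndDate` is text that starts with a four-digit year and that httr and jsonlite are installed. The demo at the bottom uses made-up data, so it runs without touching the API.

```r
# Hypothetical helper: collect pages until one comes back short.
# `fetch_page` is any function that takes an offset and returns a data frame.
fetch_all_pages <- function(fetch_page, page_size = 100, max_pages = 4) {
  pages <- list()
  for (k in seq_len(max_pages)) {
    page <- fetch_page((k - 1) * page_size)
    # Stop when the response is empty or not a table
    if (!is.data.frame(page) || nrow(page) == 0) break
    pages[[length(pages) + 1]] <- page
    # A short page means there is nothing left to request
    if (nrow(page) < page_size) break
  }
  do.call(rbind, pages)  # NULL when nothing was returned
}

# Hypothetical fetcher for one page of a player's tournaments.
# Assumes the API key is stored in an environment variable named
# ESPORTS_API_KEY (a made-up name) instead of being pasted into the URL.
fetch_tournaments_page <- function(player_id, offset) {
  Sys.sleep(1)  # pause between requests, as the original script does
  resp <- httr::GET(
    "http://api.esportsearnings.com/v0/LookupPlayerTournaments",
    query = list(apikey = Sys.getenv("ESPORTS_API_KEY"),
                 playerid = format(player_id, scientific = FALSE),
                 offset = offset)
  )
  jsonlite::fromJSON(httr::content(resp, as = "text", encoding = "UTF-8"))
}

# Sum earnings per calendar year; years with no tournaments come back as NA.
# Assumes EndDate looks like "YYYY-MM-DD".
yearly_earnings <- function(tournaments, years = 1998:2021) {
  yr <- as.integer(substr(tournaments$EndDate, 1, 4))
  totals <- tapply(tournaments$Earnings, factor(yr, levels = years), sum)
  setNames(as.numeric(totals), as.character(years))
}

# Offline demo with made-up data (no network needed)
fake_page <- function(offset) {
  n <- max(0, min(100, 250 - offset))  # pretend the player has 250 rows
  data.frame(row = seq_len(n) + offset)
}
demo_rows <- fetch_all_pages(fake_page)

demo_tournaments <- data.frame(
  EndDate = c("2019-12-31", "2019-05-01", "2021-01-01"),
  Earnings = c(100, 50, 25)
)
demo_totals <- yearly_earnings(demo_tournaments, years = 2019:2021)
```

With a real key set, the two pieces would be combined as `fetch_all_pages(function(off) fetch_tournaments_page(playerID, off))`. Grouping by the year prefix also counts tournaments that end exactly on January 1 or December 31, which strict `<` and `>` date comparisons leave out.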