I am trying to read a CSV file into R. If I use
data = read.csv("2013_NBAseason.csv", header = T)
I get this error:
Error in read.table(file = file, header = header, sep = sep, quote = quote, :
  duplicate 'row.names' are not allowed
I believe this happens because the dates aren't unique, since multiple games are played every day. I tried removing the last column using this, but I still get the same error.
After reading this, I think the cause of the problem is that my last column doesn't have a header.
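For reference, here is a minimal sketch that reproduces the error under that assumption. The inline `csv_text` is a made-up miniature of the file, and the trailing comma on each data row is an assumption about how the unnamed extra column arises. When the header has exactly one field fewer than the data rows, `read.csv` uses the first column as row names, and the repeated dates then trigger the error.

```r
# Hypothetical miniature of the file: 3 header fields, but every data row
# ends with a trailing comma, so each data row has 4 fields.
csv_text <- 'Date,Away,Home
"Tue, Oct 30, 2012",Washington Wizards,Cleveland Cavaliers,
"Tue, Oct 30, 2012",Dallas Mavericks,Los Angeles Lakers,
'

# Capture the error message instead of stopping the script.
result <- tryCatch(
  read.csv(text = csv_text, header = TRUE),
  error = function(e) conditionMessage(e)
)
print(result)
```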
So I tried this:
data = read.csv("2013_NBAseason.csv", header = T,
                colClasses = c(rep(NA, 7), "NULL"), row.names = NULL)
Now I have a data frame with all of my column names shifted over and an empty column on the right:
head(data)
          row.names      Date          Box.Score Away         Away_Points Home Home_Points
1 Tue, Oct 30, 2012 Box Score Washington Wizards   84 Cleveland Cavaliers   94
2 Tue, Oct 30, 2012 Box Score   Dallas Mavericks   99  Los Angeles Lakers   91
3 Tue, Oct 30, 2012 Box Score     Boston Celtics  107          Miami Heat  120
4 Wed, Oct 31, 2012 Box Score   Sacramento Kings   87       Chicago Bulls   93
What is the best way to solve this, or to avoid the problem to start with?
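One workaround that might apply here is to read with `row.names = NULL`, move the column names back one position to the left, and drop the empty last column. This is only a sketch: `read_csv_extra_col` is a hypothetical helper name, the sample data is made up, and it assumes every data row has exactly one more field than the header.

```r
# Hypothetical helper: read a CSV whose data rows have one more field
# than the header line, then undo the resulting name shift.
read_csv_extra_col <- function(...) {
  # row.names = NULL keeps the first column as data; the names come back as
  # "row.names", <real names...>, i.e. shifted one position to the right.
  df <- read.csv(..., header = TRUE, row.names = NULL)
  n <- ncol(df)
  # Move the real names back to the left, then drop the empty last column.
  names(df)[-n] <- names(df)[-1]
  df[-n]
}

# Made-up sample with a trailing comma on each data row.
csv_text <- 'Date,Away,Away_Points,Home,Home_Points
"Tue, Oct 30, 2012",Washington Wizards,84,Cleveland Cavaliers,94,
"Tue, Oct 30, 2012",Dallas Mavericks,99,Los Angeles Lakers,91,
'
games <- read_csv_extra_col(text = csv_text)
print(names(games))
```

Any arguments passed to the helper are forwarded to `read.csv`, so a file path works in place of `text =`.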
If someone tells me how to attach the CSV, I can upload it so that you can see the raw data.
Manually editing the CSV isn't an option, because this needs to be applied to many more CSV files with something like this:
temp = list.files(pattern = "\\.csv$")
data = do.call("rbind", lapply(temp, read.csv, ...
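A possible way to scale this to many files, sketched under the same assumption of one extra unnamed field per data row, is to read the header line and the data rows separately so the names never shift. The helper name `read_one`, the temporary directory, and the file contents below are all made up for illustration.

```r
# Hypothetical helper: take the column names from the first line only,
# then read the remaining rows without a header.
read_one <- function(path) {
  header <- scan(path, what = "", sep = ",", nlines = 1, quiet = TRUE)
  body <- read.csv(path, header = FALSE, skip = 1, stringsAsFactors = FALSE)
  # Keep only as many columns as there are header fields.
  body <- body[seq_along(header)]
  names(body) <- header
  body
}

# Build two small demo files in a temporary directory (made-up data).
demo_dir <- tempfile("nba_demo")
dir.create(demo_dir)
rows <- c('Date,Away,Home',
          '"Tue, Oct 30, 2012",Washington Wizards,Cleveland Cavaliers,',
          '"Tue, Oct 30, 2012",Dallas Mavericks,Los Angeles Lakers,')
writeLines(rows, file.path(demo_dir, "season_a.csv"))
writeLines(rows, file.path(demo_dir, "season_b.csv"))

# Read every CSV in the directory and stack the results.
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
all_games <- do.call(rbind, lapply(files, read_one))
print(dim(all_games))
```

Note that names taken with `scan` are used verbatim, so a header such as `Box Score` keeps its space; wrapping `header` in `make.names()` would give `Box.Score` instead.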