I am trying to read a CSV into R. If I use

data = read.csv("2013_NBAseason.csv", header = T)

I get an error

Error in read.table(file = file, header = header, sep = sep, quote = quote,  : 
  duplicate 'row.names' are not allowed

I assumed this was because the dates aren't unique, since multiple games are played every day. I tried removing the last column using this, but I still get the same error.

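After reading this, I think the cause of the problem is that my last column doesn't have a header, so the header line has one fewer field than the data rows and read.csv uses the first column as row names.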
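A minimal sketch of that situation, using a made-up three-line file rather than the real season data (the column names and the trailing comma are assumptions for illustration only):

```r
# Hypothetical data: the header names 3 columns, but each data row has a
# 4th (empty) field because of the trailing comma.
csv_text <- "Date,Away,Home
2012-10-30,Wizards,Cavaliers,
2012-10-30,Mavericks,Lakers,"

# The header has one fewer field than the data rows, so read.csv() uses the
# first column as row names and fails because the dates repeat.
result <- tryCatch(
  read.csv(text = csv_text, header = TRUE),
  error = function(e) conditionMessage(e)
)
print(result)

# With row.names = NULL the rows are numbered instead, but the header names
# end up one column to the right of the data they describe.
shifted <- read.csv(text = csv_text, header = TRUE, row.names = NULL)
print(names(shifted))
```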

So I have done this:

data = read.csv("2013_NBAseason.csv", header = T, 
                 colClasses=c(rep(NA,7),"NULL"), row.names=NULL)

Now I have a data frame whose column names are all shifted over by one, with an empty column on the right:

head(data)
          row.names      Date          Box.Score Away          Away_Points Home  Home_Points
1 Tue, Oct 30, 2012 Box Score Washington Wizards   84  Cleveland Cavaliers   94
2 Tue, Oct 30, 2012 Box Score   Dallas Mavericks   99   Los Angeles Lakers   91
3 Tue, Oct 30, 2012 Box Score     Boston Celtics  107           Miami Heat  120
4 Wed, Oct 31, 2012 Box Score   Sacramento Kings   87        Chicago Bulls   93

What is the best way to solve this, or to avoid the problem to start with?

Also if someone tells me how to add the csv, I can upload it so that you guys can see the raw data.

Also, manually changing the csv won't work, because this needs to be extrapolated to many more csvs with something like this

temp = list.files(pattern="*.csv")
data = do.call("rbind", lapply(temp, read.csv, ...
qwertylpc
  • You might want to check out `fread` from the `data.table` package. I don't think it has this restriction when reading in data. – Mike.Gahan May 04 '15 at 03:18
  • Have you tried with tab separated, e.g. set in read.csv options `sep = "\t"`? The default separator is commas but your date variable seems to have commas, which might cause problems. – Antti Sep 19 '16 at 06:50

1 Answer

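Why don't you try reading the file without a header (header = FALSE)?

Do this:

#read the file with no header, so the first line becomes row 1 of the data
data <- read.csv("2013_NBAseason.csv", header = FALSE, stringsAsFactors = FALSE)

#enter the string "Home_Points" into the last column of row 1. I am assuming it is column 6.
data[1, 6] <- "Home_Points"

#make row 1 your column names (unlist gives a plain character vector)
colnames(data) <- as.character(unlist(data[1, ]))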

Does the above solve it?
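A fuller sketch of the same idea, which also drops the header row afterwards without losing the names. The helper `read_season`, the placeholder name "Extra", and the inline sample data are all made up for illustration; the real file's column count and names may differ.

```r
# Hypothetical stand-in for one season file: the header names 3 columns,
# each data row carries a 4th, unnamed field.
csv_text <- "Date,Away,Home
2012-10-30,Wizards,Cavaliers,x
2012-10-30,Mavericks,Lakers,y"

# Hypothetical helper; arguments are passed straight through to read.csv().
read_season <- function(...) {
  # Read everything as data so no column is mistaken for row names.
  raw <- read.csv(..., header = FALSE, stringsAsFactors = FALSE)
  # Row 1 holds the header; the unnamed column shows up as "" or NA.
  hdr <- as.character(unlist(raw[1, ]))
  hdr[is.na(hdr) | hdr == ""] <- "Extra"  # assumed name for the unnamed column
  # Drop the header row first, then assign the names to the result.
  out <- raw[-1, , drop = FALSE]
  names(out) <- hdr
  rownames(out) <- NULL
  # The columns were read as text alongside the header, so re-guess the types.
  out[] <- lapply(out, type.convert, as.is = TRUE)
  out
}

season <- read_season(text = csv_text)
print(season)
```

If this shape fits the real files, the same helper could be applied to each file name, for example `do.call(rbind, lapply(list.files(pattern = "\\.csv$"), read_season))`, assuming every file has the same columns.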

vagabond
  • I tried this, but when I remove the first row after naming my columns, it loses the column names. I might have been doing it wrong; if you could solve that issue, then I would do that. – qwertylpc May 04 '15 at 03:48
  • Actually, just replicate 2-3 lines of your data, as it is in the csv, into a data frame in R and paste it here. – vagabond May 04 '15 at 03:50