R: invalid 'sep' value: must be one byte

Question

I'm trying to read a file that uses :: as the column separator:

userID::MovieID::Rating::Timestamp
1::1193::5::978300760
1::661::3::978302109
1::914::3::978301968
1::3408::4::978300275

Here is my code:

tr = read.table("/home/user/ml-1m/ratings.dat", sep = ":")
print(tr)

The result is:

   V1 V2   V3 V4 V5 V6        V7
1   2 NA  318 NA  5 NA 978298413
2   2 NA 1207 NA  4 NA 978298478
3   2 NA 1968 NA  2 NA 978298881
4   2 NA 3678 NA  3 NA 978299250
5   2 NA 1244 NA  3 NA 978299143
6   2 NA  356 NA  5 NA 978299686
7   2 NA 1245 NA  2 NA 978299200

I don't want the NA columns.
But if I set sep = "::", I get the error: invalid 'sep' value: must be one byte. How can I fix this?

Did you check the content of `tr`? Are they the expected values? — , Apr 20 '15 at 06:52
Your error is not *reproducible*. It's very hard to help you when we can't run your example. Please see [this page](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) on the subject and modify accordingly. — Anders Ellern Bilgrau, Apr 20 '15 at 07:05

score 12 · Answer 1 · answered Apr 20 '15 at 07:55

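The text file import functions (read.table and its relatives) only support a single one-byte character as the column separator. If you split on a single ":", every "::" produces an empty column in between, but you can tell read.table to skip those columns with its colClasses parameter (see the help file). The vector c(NA, "NULL") is recycled across the columns, so the odd columns are read with automatic type detection and the even columns are dropped: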

read.table(text = "userID::MovieID::Rating::Timestamp
1::1193::5::978300760
1::661::3::978302109
1::914::3::978301968
1::3408::4::978300275", 
           sep = ":", colClasses = c(NA, "NULL"),
           header = TRUE)

#  userID MovieID Rating Timestamp
#1      1    1193      5 978300760
#2      1     661      3 978302109
#3      1     914      3 978301968
#4      1    3408      4 978300275
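
As an alternative to skipping columns, one could replace the two-character separator with a single character before parsing. This is only a sketch, not part of the approach above: the variable names are made up for illustration, the two sample lines stand in for the real file, and it assumes no field contains a tab.

```r
# Two sample lines standing in for readLines("ratings.dat") (hypothetical input)
raw_lines <- c("1::1193::5::978300760",
               "1::661::3::978302109")

# Replace every literal "::" with a tab, then parse with a one-byte separator
tabbed <- gsub("::", "\t", raw_lines, fixed = TRUE)
ratings_alt <- read.table(text = tabbed, sep = "\t",
                          col.names = c("user", "movie", "rating", "timestamp"))
print(ratings_alt)
```

Since no empty columns are created this way, col.names can be passed directly with just the four real names.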


Roland


Thanks. I still have a question: the raw data doesn't have the header `userID::MovieID::Rating::Timestamp`, so I want to use `col.names=c('user','movie','rating','timestamp')` in `read.table`. But it seems I need to use `col.names=c('user','NA','movie','NA','rating','NA','timestamp')` to account for the NA columns. How can I solve this? – user2492364 Apr 20 '15 at 08:17

I'd set the column names after import. – Roland Apr 20 '15 at 08:18
I mean that ratings.dat contains only numbers: `1::1193::5::978300760 1::661::3::978302109 1::914::3::978301968 1::3408::4::978300275` – user2492364 Apr 20 '15 at 08:24

I understood that. Just use `names(yourDF) <- c('user','movie','rating','timestamp')`. – Roland Apr 20 '15 at 08:26
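
Putting the answer and these comments together, a headerless file could be handled roughly like this. It is a minimal sketch: the inline `raw` string stands in for the real ratings.dat, and `ratings` is just an illustrative variable name.

```r
# Sample lines standing in for the headerless ratings.dat
raw <- "1::1193::5::978300760
1::661::3::978302109
1::914::3::978301968
1::3408::4::978300275"

# Split on ":" and drop every second (empty) column via the recycled colClasses
ratings <- read.table(text = raw, sep = ":",
                      colClasses = c(NA, "NULL"),
                      header = FALSE)

# Only the four kept columns remain, so four names are enough
names(ratings) <- c("user", "movie", "rating", "timestamp")
print(ratings)
```

For the real file, `text = raw` would be replaced by the file path as the first argument.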
