I load a data table with fread the way I always do. The file has ~2M records and is tab-delimited.
The load is successful. I can print the head of the table and the column names, so far so good.
But then both renaming the first column and setting it as a key fail, complaining that the old column name cannot be found. I am sure there is no typo in the column name and no leading or trailing space; I tried many times, both with copy/paste and by retyping. I can apparently rename any other column.
The first column holds long integer IDs, so I had to load the bit64 package to get rid of a warning in 'fread', but that did not seem to help. Is it a clue?
Does anyone have any idea what could cause such a symptom? How can I debug it?
I use R 3.1.0 on Windows 64, latest version of all packages.
Edit: more details
The data load command:
txnData <- fread(txnInDataPathFileName, header=TRUE, sep="\t", na.strings="NA")
The column names:
colnames(txnData)
[1] "txn_ext_id" "txn_desc" "txn_type_id" "site_id" "date_id" "device_id" "cust_id"
[8] "empl_id" "txn_start_time" "txn_end_time" "total_sales" "total_units" "gross_margin"
The column rename that fails (setkey fails the same way):
setnames(txnData, "txn_ext_id", "txnId")
Error in setnames(txnData, "txn_ext_id", "txnId") :
Items of 'old' not found in column names: txn_ext_id
And finally the requested dput command:
dput(head(txnData))
structure(list(`txn_ext_id` = structure(c(4.88536962440272e-311,
1.10971996159584e-311, 9.9460266389845e-312, 1.0227644072435e-311,
1.10329710699982e-311, 1.01930594588518e-311), class = "integer64"),
txn_desc = c("checkout transaction", "checkout transaction",
"checkout transaction", "checkout transaction", "checkout transaction",
"checkout transaction"), txn_type_id = c(0L, 0L, 0L, 0L,
0L, 0L), site_id = c(982L, 982L, 982L, 982L, 982L, 982L),
date_id = c("2012-12-24", "2013-11-27", "2013-04-08", "2013-06-04",
"2013-11-14", "2013-05-28"), device_id = c(8L, 7L, 8L, 53L,
8L, 5L), cust_id = structure(c(2.02600292130833e-313, 2.02572944866119e-313,
2.02583815970388e-313, 2.02580527009968e-313, 2.02568405005593e-313,
2.02736582767668e-313), class = "integer64"), empl_id = c("?",
"?", "?", "?", "?", "?"), txn_start_time = c("2012-12-24T08:35:56",
"2013-11-27T12:43:30", "2013-04-08T11:48:29", "2013-06-04T15:27:47",
"2013-11-14T12:57:38", "2013-05-28T11:03:21"), txn_end_time = c("2012-12-24T08:38:00",
"2013-11-27T12:47:00", "2013-04-08T11:49:00", "2013-06-04T15:35:00",
"2013-11-14T13:00:00", "2013-05-28T11:05:00"), total_sales = c(48.86,
69.7, 8.53, 33.46, 39.19, 35.56), total_units = c(12L, 44L,
3L, 4L, 14L, 17L), gross_margin = c(0, 0, 0, 0, 0, 0)), .Names = c("txn_ext_id",
"txn_desc", "txn_type_id", "site_id", "date_id", "device_id",
"cust_id", "empl_id", "txn_start_time", "txn_end_time", "total_sales",
"total_units", "gross_margin"), class = c("data.table", "data.frame"
), row.names = c(NA, -6L), .internal.selfref = <pointer: 0x00000000002c0788>)
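
One detail in the dput output above may be a clue: `txn_ext_id` is the only name printed in backticks, which dput normally does only for names that are not syntactically valid. A possible explanation (an assumption, not something confirmed here) is that the name carries invisible characters, for example a UTF-8 byte order mark picked up from the start of the file. The base R sketch below shows how such a name could be inspected byte by byte; `suspect` is a hypothetical stand-in for `names(txnData)[1]`.

```r
# Hypothetical stand-in for names(txnData)[1]: the visible name preceded by
# a byte order mark (U+FEFF). This prefix is an assumption for illustration.
suspect  <- "\ufefftxn_ext_id"
expected <- "txn_ext_id"

# The two strings can look the same when printed but do not compare equal
same_name <- identical(suspect, expected)

# Byte counts differ when hidden characters are present
suspect_bytes  <- nchar(suspect, type = "bytes")
expected_bytes <- nchar(expected, type = "bytes")

# Raw bytes and the first code point expose the hidden prefix
suspect_raw      <- charToRaw(suspect)
first_code_point <- utf8ToInt(suspect)[1]

print(same_name)
print(c(suspect = suspect_bytes, expected = expected_bytes))
print(suspect_raw)
print(first_code_point)
```

On the real table, the same checks would be `charToRaw(names(txnData)[1])` and `nchar(names(txnData)[1], type = "bytes")`. If extra bytes do show up, `setnames` also accepts a column position for `old`, so `setnames(txnData, 1, "txnId")` should sidestep the name lookup.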