I have a data.table with the structure given below:
structure(list(GVKEY1 = c(2721, 113609, 62634, NA, 62599, 15855,
15855, NA, NA, NA), GVKEY2 = c(NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_
), GVKEY3 = c(NA_real_, NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_), GVKEY4 = c(NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_), GVKEY5 = c(NA_real_, NA_real_, NA_real_,
NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_, NA_real_
)), .Names = c("GVKEY1", "GVKEY2", "GVKEY3", "GVKEY4", "GVKEY5"
), class = c("data.table", "data.frame"), row.names = c(NA, -10L
))
I want to create a new column that holds, for each row, the maximum value across all five columns. Note that the data contains many NAs.
I wrote the following line:
patent <- patent[, GVKEY := lapply(.SD, max, na.rm = TRUE), .SDcols = c('GVKEY1', 'GVKEY2', 'GVKEY3', 'GVKEY4', 'GVKEY5')]
I get the following warnings:
Warning messages:
1: In `[.data.table`(patent, , `:=`(GVKEY, lapply(.SD, max, na.rm = TRUE)), :
  Supplied 5 items to be assigned to 3280338 items of column 'GVKEY' (recycled leaving remainder of 3 items).
2: In `[.data.table`(patent, , `:=`(GVKEY, lapply(.SD, max, na.rm = TRUE)), :
  Coerced 'list' RHS to 'double' to match the column's type. Either change the target column to 'list' first (by creating a new 'list' vector length 3280338 (nrows of entire table) and assign that; i.e. 'replace' column), or coerce RHS to 'double' (e.g. 1L, NA_[real|integer]_, as.*, etc) to make your intent clear and for speed. Or, set the column type correctly up front when you create the table and stick to it, please.
I am not sure what I am doing wrong. It would be great if someone could help me.
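For reference, here is a small reproducible sketch of what I think is going on and what I am after. It assumes that lapply(.SD, max, ...) works column by column (which would explain the "Supplied 5 items" warning) and that a row-wise maximum is the goal. The table `dt`, its values, and the names `cols` and `col_max` are made up for illustration and are not my real `patent` data; the use of pmax() is one possible approach, not necessarily the best one.

```r
library(data.table)

# Toy table with hypothetical values (not the real `patent` data)
dt <- data.table(
  GVKEY1 = c(2721, NA, 15855, NA),
  GVKEY2 = c(NA, 62634, 100, NA),
  GVKEY3 = c(NA_real_, NA_real_, 20000, NA_real_)
)
cols <- c("GVKEY1", "GVKEY2", "GVKEY3")

# lapply(.SD, max) returns one value per column (a one-row result),
# not one value per row
col_max <- dt[, lapply(.SD, max, na.rm = TRUE), .SDcols = cols]

# pmax() compares the columns element-wise, giving one value per row;
# with na.rm = TRUE a row is NA only when every column is NA in that row
dt[, GVKEY := do.call(pmax, c(as.list(.SD), list(na.rm = TRUE))), .SDcols = cols]
```

On the toy table, `col_max` has one row and three columns, while `dt$GVKEY` has one entry per row of `dt`, which is the shape I want for the new column.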