I am trying to move some old code from a data.frame implementation to data.table. I initially read my data from a .csv file in which some cells contain arrays; fread converts these into character strings, like so:
> mydata$sport[1]
[1] "[24, 18, 24, 18]"
I want to parse these strings into numeric arrays. Here is the first step, which I have partly working: it strips the brackets and splits on the commas. (Step 2, converting the pieces to numeric, is not shown here.)
> name = "ascent"
> paste0(name, ":=strsplit(gsub('^\\[|\\]$','',", name, "),',')")
[1] "ascent:=strsplit(gsub('^\\[|\\]$','',ascent),',')"
# Here I manually copy the result of paste0 into the data.table call.
# I want to automate this step so that it can run in a for loop
# over many names.
> mydata[, ascent:=strsplit(gsub('^\\[|\\]$','',ascent),',')]
> mydata$ascent[10]
[[1]]
[1] "-999" " -999"
So the command I generate does what I want, but I have many names to process, so I don't want to copy and paste each one by hand as above. I tried the eval trick discussed in the question "dynamic column names in data.table, R", but once I introduce eval the code no longer works and the call just prints the command string:
> name = "ascent"
> mydata[, eval(paste0(name, ":=strsplit(gsub('^\\[|\\]$','',", name, "),',')"))]
[1] "ascent:=strsplit(gsub('^\\[|\\]$','',ascent),',')"
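The same behaviour can be reproduced outside data.table, which suggests that eval is simply being handed a character string rather than an expression. A minimal base R sketch (the variable `cmd` is a hypothetical stand-in for the pasted command):

```r
# Hypothetical stand-in for a command assembled with paste0()
cmd <- "1 + 2"

# eval() on a character vector returns the string itself, unevaluated
res_string <- eval(cmd)

# Parsing the text first yields an expression that eval() can run
res_parsed <- eval(parse(text = cmd))

print(res_string)
print(res_parsed)
```

So a string built with paste0 would at least need to go through parse(text = ...) before eval has anything to evaluate.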
So how can I make this work for an arbitrary name, without having to create a command by hand for each desired name via paste0? I have an entire vector of names to which I would like to apply this modification.
Here's the data.table right after fread and before making any modifications:
> mydata[1:10, .(sport, ascent)]
                       sport                     ascent
 1:         [24, 18, 24, 18] [-999, 140.0, -999, 140.0]
 2:            [2, 2, 2, 22]   [-999, -999, -999, -999]
 3: [-999, -999, -999, -999]   [-999, -999, -999, -999]
 4:             [-999, -999]             [173.0, 173.0]
 5:                 [18, 18]               [-999, -999]
 6:                   [-999]                     [-999]
 7:                   [-999]                     [-999]
 8:                   [-999]                     [-999]
 9:             [-999, -999]               [-999, -999]
10:             [-999, -999]               [-999, -999]
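For the loop over names described above, one direction that may avoid building command strings altogether is sketched here. It relies on two data.table features: a parenthesised character variable on the left of `:=` is read as a column name, and `get()` looks a column up by name inside `j`. The three-row table and the `cols` vector are hypothetical stand-ins for the real data and the real vector of names.

```r
library(data.table)

# Hypothetical stand-in for the table returned by fread (a few rows only)
mydata <- data.table(
  sport  = c("[24, 18, 24, 18]", "[-999, -999]", "[-999]"),
  ascent = c("[-999, 140.0, -999, 140.0]", "[173.0, 173.0]", "[-999]")
)

# Hypothetical vector of column names to process
cols <- c("sport", "ascent")

for (name in cols) {
  # (name) tells := to use the value of `name` as the target column;
  # get(name) fetches the current contents of that column
  mydata[, (name) := strsplit(gsub("^\\[|\\]$", "", get(name)), ",")]
}

print(mydata$ascent[2])
```

Each processed column becomes a list column of character vectors, the same shape as the hand-written command produces, so step 2 (conversion to numeric) could follow unchanged.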