I want to assign values to a new column based on a regex match in another column, using python-datatable syntax, something along the lines of:
DT[get rows by regex , assign value to new column, ]
import re
import datatable as dt
from datatable import f, Frame
DT = dt.Frame({'a' : [1,2,3,4], 'b' : ['hi', 'foo', 'fat', 'cat']})
# Start from a copy of column b
DT['new_col'] = DT[:, f.b]
# First pass: replace everything from the first 'f' onwards
DT['new_col'] = Frame([re.sub(r'f.*', 'words starting with f', s) for s in DT[:, "new_col"].to_list()[0]])
DT.head()
# Second pass: same for 'c'
DT['new_col'] = Frame([re.sub(r'c.*', 'words starting with c', s) for s in DT[:, "new_col"].to_list()[0]])
DT.head()
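For reference, the two passes above could probably be folded into a single pass over the column. This is only a sketch of the current workaround, not a datatable-native solution: `RULES` and `label_by_regex` are hypothetical names that are not part of datatable, and the helper works on a plain Python list. Unlike `re.sub`, it only matches at the start of the string, which gives the same result for the sample data.

```python
import re

# Hypothetical rule table (not part of datatable): (pattern, label) pairs,
# checked in order; the first matching rule wins.
RULES = [
    (re.compile(r"f.*"), "words starting with f"),
    (re.compile(r"c.*"), "words starting with c"),
]

def label_by_regex(values, rules=RULES):
    """Return a new list in which each value is replaced by the label of the
    first rule whose pattern matches at the start of the string.
    Values that match no rule are kept unchanged."""
    labelled = []
    for value in values:
        for pattern, label in rules:
            if pattern.match(value):
                labelled.append(label)
                break
        else:
            # No rule matched: keep the original value.
            labelled.append(value)
    return labelled

new_col = label_by_regex(["hi", "foo", "fat", "cat"])
print(new_col)
```

With the frame from above, the list would still come from `DT[:, "b"].to_list()[0]` and be assigned back via `Frame(...)`, so this shortens the workaround but does not remove the conversion.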
Is there another solution that avoids the conversion with "to_list()" and stays within the datatable package (without a loop)?
The regex approach in this question does not allow operations on a whole column: Python data.table row filter by regex. This one is for pandas, not datatable: How to filter rows in pandas by regex.