I have a large dataframe (~1 million rows) with 20 string columns that I'm trying to concatenate into a single column with a separator, dropping NA values on the way. (Each row has a variable number of valid entries and NA values.)
Based on the solution here, I can get the output I need using df.apply, but it is very slow:
raw['combined'] = raw.loc[:, 'record_1':'record_20'].apply(lambda x: '|'.join(x.dropna().values), axis=1)
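For reference, here is a minimal reproducible sketch of the same approach on made-up data. The three-column frame and its values are placeholders I invented for illustration (the real frame has 20 columns, record_1 through record_20), but the apply call is the same one as above:

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real data: 3 columns instead of 20, values are made up.
raw = pd.DataFrame({
    'record_1': ['a', np.nan, 'x'],
    'record_2': [np.nan, 'b', 'y'],
    'record_3': ['c', np.nan, 'z'],
})

# Row-wise: drop the NA entries, then join what is left with '|'.
raw['combined'] = raw.loc[:, 'record_1':'record_3'].apply(
    lambda x: '|'.join(x.dropna().values), axis=1
)

print(raw['combined'].tolist())
```

Each row ends up with only its non-NA values joined by the separator, which is the output I want; the problem is only the speed at ~1 million rows.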
Is there a faster way to do this concatenation, or am I stuck with df.apply?