
Suppose I have a DataFrame like the following:

import pandas as pd

df_data = pd.DataFrame({'l_name': [['ABC', 'DOS', 'TRES'], ['XYZ', 'MORTGAGE', 'SOLUTIONS']],
                        'o_name': ['ABC DOS TRES', 'XYZ MORTGAGE SOLUTIONS']})

where l_name holds the tokens of a name as a list and o_name holds the full name as a string. I'd like to get output like this:

    word        o_name
0   ABC         ABC DOS TRES
1   DOS         ABC DOS TRES
2   TRES        ABC DOS TRES
3   XYZ         XYZ MORTGAGE SOLUTIONS
4   MORTGAGE    XYZ MORTGAGE SOLUTIONS
5   SOLUTIONS   XYZ MORTGAGE SOLUTIONS

I tried something like this:

from itertools import chain

# 'word' has 6 entries but 'o_name' only 2, so the lengths do not match
df_words = pd.DataFrame({'word': list(chain.from_iterable(df_data['l_name'])),
                         'o_name': df_data.o_name})

I have been looking for a way to expand df_data.o_name so that each word is paired with the name it belongs to.

Thanks a lot for your help :)

John Barton
  • What was wrong with the solutions in your other [post](https://stackoverflow.com/questions/59382676/pandas-chain-from-iterable-error-object-of-type-itertools-chain-has-no-len)? Additionally, `df.explode('word')` will work. – Umar.H Dec 17 '19 at 23:19
  • Can't you just explode the original `df_data`, e.g. `df_data.explode('l_name').reset_index(drop=True)`? – AChampion Dec 17 '19 at 23:31
  • That's right, my bad. I had an older Pandas version which was not allowing me to use the function. I updated to 0.25.1. Thanks a lot guys :) – John Barton Dec 17 '19 at 23:36
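
For reference, the `explode` suggestion from the comments can be sketched as follows. It assumes pandas 0.25 or newer, where `DataFrame.explode` is available. The rename to `word` is only there to match the desired output from the question. The second variant is an assumed fallback for older pandas versions, not something proposed in the thread: it repeats each `o_name` once per token and flattens the token lists with `chain.from_iterable`. The names `counts` and `df_words_old` are illustrative.

```python
from itertools import chain

import pandas as pd

df_data = pd.DataFrame({
    'l_name': [['ABC', 'DOS', 'TRES'], ['XYZ', 'MORTGAGE', 'SOLUTIONS']],
    'o_name': ['ABC DOS TRES', 'XYZ MORTGAGE SOLUTIONS'],
})

# pandas >= 0.25: one row per token; o_name is repeated for each token
df_words = (
    df_data.explode('l_name')
           .rename(columns={'l_name': 'word'})
           .reset_index(drop=True)
)

# Assumed fallback for older pandas: repeat each o_name once per token,
# then flatten the token lists so both columns have the same length
counts = df_data['l_name'].str.len()
df_words_old = pd.DataFrame({
    'word': list(chain.from_iterable(df_data['l_name'])),
    'o_name': df_data['o_name'].repeat(counts).values,
})

print(df_words)
```

Both variants should give six rows, one per token, each carrying the full name it came from.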
