Converting list of strings in pandas column into string

Question

I have a df with a column like this:

                       words
1                     ['me']
2                   ['they']
4         ['it', 'we', 'it']
5                         []
6         ['we', 'we', 'it']

I want it to look like this:

                     words
1                     'me'
2                   'they'
4               'it we it'
5                       ''          
6               'we we it'

I have tried both of these options, but each yields a result identical to the original series.

def join_words(df):
    words_string = ''.join(df.words)
    return words_string

master_df['words_string'] = master_df.apply(join_words, axis=1)

and...

master_df['words_String'] = master_df.words.str.join(' ')

Both these result in the original df. What am I doing wrong?

Edit

Using master_df['words_string'] = master_df['words'].apply(' '.join), I got:

1                                     [ ' m e ' ]
2                                 [ ' t h e y ' ]
4             [ ' i t ' ,   ' w e ' ,   ' i t ' ]
5                                             [ ]
6             [ ' w e ' ,   ' w e ' ,   ' i t ' ]

Hmm, maybe it is not an actual list? Otherwise `master_df.words.str.join(' ')` should work. Check `ast.literal_eval` if the values are just the string repr of a list. It would also help to include `df.head().to_dict()` in your question. – anky, Feb 20 '20 at 19:32
`df['words'].apply(literal_eval).agg(' '.join)` if it's a string, not a list – Umar.H, Feb 20 '20 at 19:33
Does this answer your question? [Pandas DataFrame stored list as string: How to convert back to list?](https://stackoverflow.com/questions/23111990/pandas-dataframe-stored-list-as-string-how-to-convert-back-to-list) – AMC, Feb 20 '20 at 19:48
Also: https://stackoverflow.com/questions/1894269/convert-string-representation-of-list-to-list. – AMC, Feb 20 '20 at 19:49
Please provide a proper [mcve], especially since we discovered that the contents of your Series are **strings, not lists** as your post currently implies. The formatting in your post needs some editing, but I am unable to do so as we're lacking some accessible and easy to use examples of your data. – AMC, Feb 20 '20 at 19:51

score 4 · Accepted Answer · edited Jun 20 '20 at 09:12

Edit:

As your edit shows, the cells are not actually lists but strings that look like lists. We can use eval to parse each string into a real list and then perform the join. It seems your sample data is the following:

import pandas as pd

df = pd.DataFrame({'index':[0,1,2,3,4],
                   'words':["['me']","['they']","['it','we','it']","[]","['we','we','it']"]})
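One way to confirm this diagnosis is to look at the Python type of the cells. The snippet below is a minimal sketch on a two-row stand-in for the data (the name `cell_types` is only illustrative). It also shows why `' '.join` produced the spaced-out result in the question's edit: joining a string iterates over its characters.

```python
import pandas as pd

# Stand-in data: the cells are strings that merely look like lists.
df = pd.DataFrame({'words': ["['me']", "[]"]})

# Inspect the Python type of every cell in the column.
cell_types = df['words'].map(type).unique()
print(cell_types)

# Joining a string with ' ' puts a space between each character,
# which reproduces the output shown in the question's edit.
spaced = ' '.join(df['words'].iloc[0])
print(spaced)
```

If the only type reported is `str`, the column needs to be parsed before joining.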

How about this? Use apply with eval to turn each string into a list, then apply ' '.join to each resulting list:

df['words'] = df['words'].apply(eval).apply(' '.join)
print(df)

Output:

   index     words
0      0        me
1      1      they
2      2  it we it
3      3          
4      4  we we it
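As the comments suggest, `ast.literal_eval` can stand in for `eval` here. It only parses Python literals, so it will not execute arbitrary code that happens to be stored in a cell. A minimal sketch, assuming the same sample data as above (the column name `words_string` is taken from the question):

```python
import ast

import pandas as pd

df = pd.DataFrame({'words': ["['me']", "['they']", "['it','we','it']",
                             "[]", "['we','we','it']"]})

# Parse each string into a real list, then join the list items with spaces.
df['words_string'] = df['words'].apply(ast.literal_eval).apply(' '.join)
print(df['words_string'].tolist())
```

The empty list becomes an empty string, which matches the output requested in the question.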

answered Feb 20 '20 at 19:29

Celius Stingher

Why not `apply(' '.join)`? – Quang Hoang Feb 20 '20 at 19:30
You are right! For some reason I thought it'd be safer to use lambda with lists. Thanks, edited – Celius Stingher Feb 20 '20 at 19:31
That didn't work, I got the result seen in the edit above – connor449 Feb 20 '20 at 19:32
@connor449 looks like your cells are `string`, not `list` type. – Quang Hoang Feb 20 '20 at 19:36
Why use `eval()` instead of `ast.literal_eval()` ? – AMC Feb 20 '20 at 19:48

score 1 · Answer 2 · answered Feb 20 '20 at 19:38

Generally I'd advise against eval. Here's another approach for when the elements are strings, not lists:

master_df['words'].str.extractall(r"'(\w*)'").groupby(level=0)[0].agg(' '.join)

Output (row 5, the empty list, has no matches and is therefore dropped):

1          me
2        they
4    it we it
6    we we it
Name: 0, dtype: object
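The output above has no row 5, because `extractall` only returns rows that contain at least one match. If the empty rows should be kept as empty strings, one option is to reindex the result against the original index. This is a sketch under the assumption that the column holds string representations indexed as in the question; the names `words` and `joined` are illustrative.

```python
import pandas as pd

# String representations of lists, indexed like the question's data.
words = pd.Series(
    ["['me']", "['they']", "['it', 'we', 'it']", "[]", "['we', 'we', 'it']"],
    index=[1, 2, 4, 5, 6],
)

joined = (
    words.str.extractall(r"'(\w*)'")    # one row per quoted word
         .groupby(level=0)[0]           # group matches by original row label
         .agg(' '.join)                 # join each row's words with spaces
         .reindex(words.index, fill_value='')  # restore rows with no matches
)
print(joined)
```

Note that the pattern `\w*` only captures word characters, so quoted items containing spaces or punctuation would need a different pattern.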

Quang Hoang

score 0 · Answer 3 · answered Feb 24 '20 at 12:52

Another idea, for when the cells are actual lists, is to use DataFrame.explode (available since pandas 0.25.0) together with the groupby/aggregate methods.

import pandas as pd

# create a list of list of strings
values = [
    ['me'],
    ['they'],
    ['it', 'we', 'it'],
    [],
    ['we', 'we', 'it']
]

# convert to a data frame
df = pd.DataFrame({'words': values})

# explode the cells (with lists) into separate rows having the same index
df2 = df.explode('words')
df2

This creates a long-format table with the following output:

  words
0    me
1  they
2    it
2    we
2    it
3   NaN
4    we
4    we
4    it

Now the long-format table needs to be aggregated back to one row per original index:

# make sure the dtype is string
df2['words'] = df2['words'].astype(str)

# group by the index aggregating all values to a single string
df2.groupby(level=0).agg(' '.join)

giving the output (note that the empty list ends up as the string 'nan' rather than an empty string):

      words
0        me
1      they
2  it we it
3       nan
4  we we it
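In the output above, the row that held an empty list ends up as the string 'nan', because `astype(str)` converts the missing value produced by `explode` into text. A possible variation, sketched here under the assumption that the cells are real lists, fills the missing value with an empty string before aggregating (the names `exploded` and `joined` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'words': [['me'], ['they'], ['it', 'we', 'it'],
                             [], ['we', 'we', 'it']]})

# Explode the lists into one row per word; an empty list yields a missing
# value, which is replaced by an empty string here.
exploded = df['words'].explode().fillna('')

# Group by the original index and join each group's words with spaces.
joined = exploded.groupby(level=0).agg(' '.join)
print(joined.tolist())
```

With this change the empty list maps to '' as requested in the question.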
