Unravel list within list

Question

I have a Jupyter notebook containing a pandas DataFrame with a column PAR (dtype = object).

+------+------------------+
|      | PAR              |
+------+------------------+
| 0    | [[1.2.3, 2.3.4]] |
+------+------------------+
| 1    | [[3.2, 3.2]]     |
+------+------------------+

I do not understand how to 'clean' each [[list]] in each row into something like [list].

I can print row contents:

print(df['PAR'][1])
print(', '.join(df['PAR'][1][0]))

This outputs:

[['3.2', '3.2']] 
3.2, 3.2

I can also 'strip' each cell into a string:

# df['PAR'] = df['PAR'].astype(str)
df['PAR'].replace(r'\[','', regex=True, inplace=True) 
df['PAR'].replace(r'\]','', regex=True, inplace=True) 
df['PAR'].replace(r'\'','', regex=True, inplace=True)

This gives a clean-ish string, although this is not the format that I need:

3.2, 3.2

But, what I'm looking for is a 1-level list in each row of my df, something like this:

+------+------------------+------------------+
|      | PAR              | PAR list         |
+------+------------------+------------------+
| 0    | [[1.2.3, 2.3.4]] | [1.2.3, 2.3.4]   |
+------+------------------+------------------+
| 1    | [[3.2, 3.2]]     | [3.2, 3.2]       |
+------+------------------+------------------+

(The spaces after the commas are only there to make the table above easier to read.)

What would be a common approach to do this?

My next step is converting each new list into a list with only unique elements, following this thread: Get unique values from a list in python

mylist = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']
myset = set(mylist)
mynewlist = list(myset)

So I'd appreciate some help to 'unlist' the lists in each row. A solution with a lambda function (.map or .join?) would be easy for me to handle.

For the "unlisting", why not just do `df["PAR"] = [item[0] for item in df["PAR"]]` — Charles Dupont, May 14 '21 at 20:03
Using a list comprehension to assign to a column requires that the column has no `NaN` values, or else it won't work. — SeaBean, May 14 '21 at 20:53
@pljvp. I updated my answer for the second part of your problem. — Corralien, May 14 '21 at 21:11

Corralien · Accepted Answer · 2021-05-14T21:09:50.607

Input data:

>>> df
                PAR
0  [[1.2.3, 2.3.4]]
1      [[3.2, 3.2]]

Unlist* and remove duplicates in one step:

import numpy as np
df["PAR"] = df["PAR"].str[0].apply(np.unique)

Output data:

>>> df
              PAR
0  [1.2.3, 2.3.4]
1           [3.2]

* Corrected with help from @SeaBean
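
Note that `np.unique` returns a sorted NumPy array rather than a plain list. If a regular Python list in the original element order is preferred, a lambda-style `.map` along the lines asked for in the question could look like the sketch below. The sample data and the helper name `unlist_unique` are made up for illustration; it assumes every cell holds exactly one inner list and that there are no missing values.

```python
import pandas as pd

# Hypothetical sample data: each cell is a list wrapping one inner list.
df = pd.DataFrame({"PAR": [[["2.3.4", "1.2.3", "2.3.4"]], [["3.2", "3.2"]]]})

def unlist_unique(cell):
    """Take the single inner list and drop duplicates, keeping first-seen order."""
    inner = cell[0]
    # dict keys are unique and preserve insertion order (Python 3.7+)
    return list(dict.fromkeys(inner))

df["PAR list"] = df["PAR"].map(unlist_unique)
print(df["PAR list"].tolist())
# [['2.3.4', '1.2.3'], ['3.2']]
```

Unlike `list(set(...))`, this keeps the elements in the order they first appear.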

edited May 14 '21 at 21:09

answered May 14 '21 at 20:08


It is the most informative & complete answer for me to learn. Thank you. And thank you @SeaBean & Corralien for the heads up! – pljvp May 14 '21 at 21:17

SeaBean · Answer 2 · 2021-05-14T20:57:16.000

You can simply use .str[0] to access the first and only element of the outer list, effectively removing one level of list, as follows:

df['PAR list'] = df['PAR'].str[0]

Test data preparation:

import pandas as pd

data = {'PAR': [
    [['1.2.3', '2.3.4']],
    [['3.2', '3.2']],
]}
df = pd.DataFrame(data)

print(df)

                PAR
0  [[1.2.3, 2.3.4]]
1      [[3.2, 3.2]]

Run new code:

df['PAR list'] = df['PAR'].str[0]

Result:

print(df)

                PAR        PAR list
0  [[1.2.3, 2.3.4]]  [1.2.3, 2.3.4]
1      [[3.2, 3.2]]      [3.2, 3.2]
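
Regarding the `NaN` point raised in the comments on the question, here is a small sketch of the difference, assuming a column that mixes nested lists with a missing value (the data is invented for the demonstration). `.str[0]` passes the missing value through, while indexing it inside a list comprehension raises a `TypeError`.

```python
import pandas as pd

# Hypothetical data: one nested list and one missing value.
df = pd.DataFrame({"PAR": [[["1.2.3", "2.3.4"]], float("nan")]})

# .str[0] skips missing values and leaves them as missing.
unlisted = df["PAR"].str[0]

# The list comprehension tries nan[0], which is not subscriptable.
try:
    [item[0] for item in df["PAR"]]
    comprehension_failed = False
except TypeError:
    comprehension_failed = True

print(unlisted.iloc[0])      # ['1.2.3', '2.3.4']
print(comprehension_failed)  # True
```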

edited May 14 '21 at 20:57

answered May 14 '21 at 20:26


I think `PAR` is not list but a string. – Corralien May 14 '21 at 20:29
@Corralien I just tested it: if `PAR` were a string, `print(', '.join(df['PAR'][1][0]))` would not give `3.2, 3.2`. I think the OP could use `.replace()` only because he/she used `.astype(str)` before that. Seeing that the column is `dtype = obj`, my first impression was also that it might be of string type, but list items are also listed as `dtype=obj` in `df.info()`. – SeaBean May 14 '21 at 20:40
You are probably right! I'll fix my answer. – Corralien May 14 '21 at 20:44
@Corralien Glad to have friendly discussion here. We are just answering for leisure :-) – SeaBean May 14 '21 at 20:47
Thanks again for your help. +1 – Corralien May 14 '21 at 21:12
@Corralien My pleasure! :-) – SeaBean May 14 '21 at 21:22
