How to Convert Pandas Dataframe to Single List

Question

Suppose I have a dataframe:

    col1    col2    col3
0    1       5       2
1    7       13
2    9       1
3            7

How do I convert it to a single list such as:

[1, 7, 9, 5, 13, 1, 7]

I have tried:

df.values.tolist()

However, this returns a list of lists rather than a single list:

[[1.0, 5.0, 2.0], [7.0, 13.0, nan], [9.0, 1.0, nan], [nan, 7.0, nan]]

Note the dataframe will contain an unknown number of columns. The order of the values is not important so long as the list contains all values in the dataframe.

I imagine I could write a function to unpack the values, but I'm wondering whether there is a simple built-in way of converting a dataframe to a series or list.

The desired outcome list is missing **2** from `col3`. Is that desired, or an oversight? — Trenton McKinney, Aug 23 '19 at 03:42
Sorry, yes that's a typo. Should be `[1, 7, 9, 5, 13, 1, 7, 2]` as you pointed out. — Alan, Aug 23 '19 at 06:25

busybear · Accepted Answer · 2019-08-23T03:37:18.023

15

Following your current approach, you can flatten the array before converting it to a list. If you need to drop NaN values, you can do that after flattening as well:

import numpy as np
arr = df.to_numpy().flatten()   # 2-D values flattened to a 1-D array
list(arr[~np.isnan(arr)])       # keep only the entries that are not NaN

Also, the pandas documentation recommends to_numpy() over .values.
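Here is a minimal, self-contained sketch of the flatten approach. The sample frame is my reconstruction of the one in the question (blank cells taken to be NaN), and the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the question; blank cells assumed to be NaN.
df = pd.DataFrame({
    "col1": [1, 7, 9, np.nan],
    "col2": [5, 13, 1, 7],
    "col3": [2, np.nan, np.nan, np.nan],
})

arr = df.to_numpy().flatten()          # 1-D float array, values read row by row
flat = arr[~np.isnan(arr)].tolist()    # drop NaN, convert to plain Python floats
print(flat)
```

The values come out row by row and as floats, because NaN forces a float dtype. Note that np.isnan raises a TypeError on object arrays, so this only suits purely numeric frames.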

An alternate, perhaps cleaner, approach is to 'stack' your dataframe:

df.stack().tolist()
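As a rough sketch on the question's data (again reconstructed by me, with blank cells taken to be NaN): stack() reads the values row by row, while unstack() on a frame with a plain index reads them column by column, which matches the order asked for. The explicit dropna() is a defensive assumption on my part, since whether stack() itself discards NaN depends on the pandas version.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the question; blank cells assumed to be NaN.
df = pd.DataFrame({
    "col1": [1, 7, 9, np.nan],
    "col2": [5, 13, 1, 7],
    "col3": [2, np.nan, np.nan, np.nan],
})

# Row-by-row order; dropna() is a no-op where stack() already removed NaN.
by_row = df.stack().dropna().tolist()

# Column-by-column order; unstack() keeps NaN, so dropna() is required here.
by_col = df.unstack().dropna().tolist()

print(by_row)
print(by_col)
```

Unlike the np.isnan version, this also works when the frame holds strings or other non-numeric values.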

edited Aug 23 '19 at 03:37

answered Aug 23 '19 at 03:28


Roushan · Answer 2 · 2019-08-23T03:47:34.643

2

You can use DataFrame.stack():

In [12]: df = pd.DataFrame({"col1":[np.nan,3,4,np.nan], "col2":['test',np.nan,45,3]})

In [13]: df.stack().tolist()
Out[13]: ['test', 3.0, 4.0, 45, 3]

edited Aug 23 '19 at 03:47

answered Aug 23 '19 at 03:36


score 1 · Answer 3 · answered Aug 23 '19 at 07:48

For an ordered list (as in the problem statement), assuming your data contains only integer values:

First collect all items in the dataframe column by column, then remove the NaN values from the list.

items = [item for sublist in [df[cols].tolist() for cols in df.columns] for item in sublist]
items = [int(x) for x in items if str(x) != 'nan']

For an unordered list, again assuming your data contains only integer values:

items = [int(x) for x in sum(df.values.tolist(),[]) if str(x) != 'nan']
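A possible variant of the same idea, sketched under the same integers-only assumption: itertools.chain flattens without the repeated list copying that sum(..., []) does, and pd.notna replaces the string comparison against 'nan'. This is a swapped-in alternative, not the code above, and the sample frame is my reconstruction of the one in the question.

```python
from itertools import chain

import numpy as np
import pandas as pd

# Sample frame reconstructed from the question; blank cells assumed to be NaN.
df = pd.DataFrame({
    "col1": [1, 7, 9, np.nan],
    "col2": [5, 13, 1, 7],
    "col3": [2, np.nan, np.nan, np.nan],
})

# Column-by-column order: chain the columns together, skip NaN, cast to int.
ordered = [int(x) for x in chain.from_iterable(df[c] for c in df.columns)
           if pd.notna(x)]

# Row-by-row order: chain the rows of the underlying array instead.
row_major = [int(x) for x in chain.from_iterable(df.to_numpy().tolist())
             if pd.notna(x)]

print(ordered)
print(row_major)
```

The int() cast truncates, so it is only safe when every non-missing value really is a whole number.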
