0

I have a pandas DataFrame in Python with dimensions 21392x1972. I would like to convert it into a list of lists in which the first column of the DataFrame becomes the first inner list, the second column becomes the second inner list, and so on.

I tried using tolist() to convert the DataFrame to a list of lists, but each row of the DataFrame becomes one inner list. What I want instead is for each column to become one inner list. I am new to pandas and Python, so any help is highly appreciated. Cheers!

import pandas as pd
mydataset = pd.read_csv('final_merged_data.csv')
mydataset_seq = mydataset.values.tolist() 
JChat

2 Answers

2

Loop through all columns in your dataframe, index that column, and convert it to a list:

lst = [df[i].tolist() for i in df.columns]

Example:

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4],
                   'b': [5, 6, 7, 8]})

print(df)
print('list', [df[i].tolist() for i in df.columns])

Output:

   a  b
0  1  5
1  2  6
2  3  7
3  4  8
list [[1, 2, 3, 4], [5, 6, 7, 8]]
Primusa
  • Thank you so much. Can you also kindly tell how I can retain my column names in dataframe while converting it into lists? Right now using tolist() removes the column names as per my code. Actually all my columns in dataframes have names and I would like to retain the column names in the list of lists. – JChat Jan 18 '19 at 08:45
  • I'm so sorry, I legitimately didn't see this comment. Do you still need this? – Primusa Mar 01 '19 at 22:13
  • it would be wonderful if you could help with this please. Basically, I want to know if there is a way to retain column names in the list of lists? – JChat Mar 01 '19 at 22:27
  • You really just have to get each column name from `df.columns`: `[[i] + df[i].tolist() for i in df.columns]` – Primusa Mar 02 '19 at 02:27
  • Thanks. Does this mean I am converting df to a list and assigning it to the columns of df? What is `df.columns` doing here? Can you please clarify? Sorry to ask, but I just wanted to know why the assignment operator = is not used here. – JChat Mar 02 '19 at 12:55
  • `columns` is an attribute of the dataframe that is accessed using `df.columns`. It holds the column names and can be iterated over like a list. What you're doing is getting the column name (`[i]`) and adding it to the column values (`df[i].tolist()`). To build the list you don't need the assignment operator, but if you want to assign the value of the list to a variable you will - `my_var = [[i] + df[i].tolist() for i in df.columns]` – Primusa Mar 02 '19 at 17:44
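
To make the comment thread above concrete, here is a minimal sketch of keeping the column names alongside the values. It reuses the small example frame from the answer; the variable names `with_names` and `by_name` are made up for illustration, and the `to_dict('list')` variant is an alternative not mentioned in the thread.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4],
                   'b': [5, 6, 7, 8]})

# Each inner list starts with the column name, followed by that column's values
with_names = [[name] + df[name].tolist() for name in df.columns]
print(with_names)

# Alternative: a dict that maps each column name to its list of values
by_name = df.to_dict('list')
print(by_name)
```

The dict form may be easier to work with if the names are needed for lookups later, since the name is no longer mixed in with the data.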
0

Just transpose what you have:

mydataset_seq = mydataset.T.values.tolist() 
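
As a quick check of the transpose approach, here is a small sketch on a toy frame (the frame and the names `rows` and `cols` are illustrative, not the asker's data). It compares the transposed result with the column-by-column conversion.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4],
                   'b': [5, 6, 7, 8]})

# Without transposing, each inner list is one row
rows = df.values.tolist()

# After transposing, each inner list is one column
cols = df.T.values.tolist()

print(rows)
print(cols)

# Same result as converting each column separately
print(cols == [df[c].tolist() for c in df.columns])
```

One thing to watch for: `.values` builds a single array for the whole frame, so if the columns have different numeric types (for example int and float) the values are upcast to a common type, whereas converting column by column keeps each column's own type.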
Ricky Kim