I am currently working with pandas in Python. I wrote a function that extracts data from a pandas DataFrame and stores it in lists. The code works, but I feel that one part could be written as a single for loop instead of four. The goal of this part of the code is to extract four columns from a DataFrame into four lists. I did it with four separate for loops, as in the example below, but I would like a single loop that does the same thing.

col1, col2, col3, col4 = [], [], [], []

for j in abc['col1']:
    col1.append(j)

for k in abc['col2']:
    col2.append(k)

for l in abc['col3']:
    col3.append(l) 

for n in abc['col4']:
    col4.append(n)

My idea is to write a single for loop that does all of this. I tried something like the following, but it doesn't work.

col1, col2, col3, col4 = [], [], [], []

for j, k, l, n in abc[['col1', 'col2', 'col3', 'col4']]:
    col1.append(j)
    col2.append(k)
    col3.append(l) 
    col4.append(n)

Can you help me combine the four for loops into one? I would appreciate your help!

Jakub Pluta

3 Answers

You don't need to use loops at all; you can just convert each column into a list directly.

list_1 = df["col"].to_list()

Have a look at this previous question.
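
Applied to the four columns from the question, this could look like the minimal sketch below. The frame `abc` and its values are made up purely for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the asker's frame `abc`.
abc = pd.DataFrame({
    "col1": [1, 2, 3],
    "col2": ["a", "b", "c"],
    "col3": [0.1, 0.2, 0.3],
    "col4": [True, False, True],
})

# Series.to_list() returns the values of one column as a plain Python list.
# The generator is unpacked into four names, one list per column.
col1, col2, col3, col4 = (
    abc[name].to_list() for name in ["col1", "col2", "col3", "col4"]
)
```

No explicit loop over the rows is needed, and each resulting list keeps the row order of the frame.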

Peritract

Treating a pandas DataFrame like a list usually works, but it is very bad for performance. I'd consider using the iterrows() method instead, as in the following example:

col1,col2,col3,col4 = [],[],[],[]

for index, row in df.iterrows():
    col1.append(row['col1'])
    col2.append(row['col2'])
    col3.append(row['col3'])
    col4.append(row['col4'])
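
If a single loop with four loop variables is still wanted, one possible variant is to iterate with `itertuples()`. Iterating over a DataFrame directly yields its column labels rather than its rows, which is why the loop in the question does not behave as intended. The sketch below uses a small made-up frame named `abc`; the data is an assumption for illustration only.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
abc = pd.DataFrame({
    "col1": [1, 2],
    "col2": [3, 4],
    "col3": [5, 6],
    "col4": [7, 8],
})

col1, col2, col3, col4 = [], [], [], []

# itertuples(index=False, name=None) yields one plain tuple per row,
# so each row can be unpacked into four loop variables.
rows = abc[["col1", "col2", "col3", "col4"]].itertuples(index=False, name=None)
for j, k, l, n in rows:
    col1.append(j)
    col2.append(k)
    col3.append(l)
    col4.append(n)
```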
SerAlejo

It's probably easier to use the .values attribute and then numpy.ndarray.tolist():

col = ['col1', 'col2', 'col3']
data = [None] * len(col)  # one slot per column
for i in range(len(col)):
    data[i] = df[col[i]].values.tolist()
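
The same idea can be sketched without index bookkeeping by keying the lists on the column names. This is only an illustrative variant, not part of the original answer; the frame `df` and its values are assumed sample data.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"col1": [1, 2], "col2": [3, 4], "col3": [5, 6]})

cols = ["col1", "col2", "col3"]

# Build a dict that maps each column name to that column's values as a list.
data = {name: df[name].tolist() for name in cols}
```

A single column is then available as, for example, `data["col2"]`.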

Partha Mandal