How to use dataframe column in for loop

Question

I am trying to implement a formula that creates a new column in a DataFrame from an existing column, where each new value is a summation from 0 up to the number present in that other column.

I was trying something like this:

dataset['B']=sum([1/i for i in range(dataset['A'])])

I know something like this would work: dataset['B']=sum([1/i for i in range(10)])

but I want to make this 10 dynamic, based on a different column.

I keep getting this error:

TypeError: 'Series' object cannot be interpreted as an integer

Ahsan · Answer 1 · 2019-07-02T09:03:06.587

First of all, I should admit that I could not understand your question completely. What I understood is that you want to iterate over the rows of a DataFrame and make a new column by doing some operation(s) on each value. If that is so, then I would recommend the following link.
Regarding TypeError: 'Series' object cannot be interpreted as an integer: range() takes integers as input, e.g. [i for i in range(10)] gives [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]. In your code, dataset['A'] is the whole column (a Series), not a single integer, and that is what triggers this error. A float value would fail in a similar way, with 'float' in the message instead of 'Series'. Moreover, if you notice, the first value produced by range() is zero, so 1/i should raise a different error (ZeroDivisionError). As a result, you might have to rewrite the code as [1/i for i in range(1, row_value_of_dataset['A'])], applied to one row value at a time.
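To make the two errors concrete, here is a minimal sketch. The sample DataFrame and the variable names (series_error, zero_error, partial_sum) are made up for illustration and are not from the question.

```python
import pandas as pd

dataset = pd.DataFrame({"A": [1, 2, 3, 4]})  # hypothetical sample data

# range() needs one integer, so passing the whole column fails.
try:
    range(dataset["A"])
    series_error = None
except TypeError as exc:
    series_error = str(exc)

# range(n) starts at 0, so 1/i fails on the very first term.
try:
    sum(1 / i for i in range(3))
    zero_error = None
except ZeroDivisionError as exc:
    zero_error = str(exc)

# Working on a single value and starting at 1 avoids both problems.
n = int(dataset["A"].iloc[2])                  # one scalar value, here 3
partial_sum = sum(1 / i for i in range(1, n))  # 1/1 + 1/2 = 1.5

print(series_error)
print(zero_error)
print(partial_sum)
```

Note that range(1, n) stops at n - 1. If the value itself should be included in the sum, range(1, n + 1) would be needed instead.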

It would be highly appreciated if you could give an example of what your DataFrame looks like and what your desired output is. Then it would perhaps be easier to post a solution.

BTW, I forgot to post what I understood from your question:

# assume the data:
>>> import pandas as pd
>>> data = pd.DataFrame({'A': (1, 2, 3, 4)})
# the data
>>> data
   A
0  1
1  2
2  3
3  4
# doing the operation on each of the rows
>>> data['B'] = data.apply(lambda row: sum([1/i for i in range(1, row.A)]), axis=1)
# column B is the newly added data
>>> data
   A         B
0  1  0.000000
1  2  1.000000
2  3  1.500000
3  4  1.833333

Thanks for the solution. This method is too slow as I am working with very big data. Can you please suggest a more optimized way? — Arun, Jul 02 '19 at 11:09
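One possible way to speed this up, sketched here under the assumption that column A holds only integers greater than or equal to 1, is to compute the running sums once with NumPy and then look each row up by index instead of looping per row. The name partial is made up for this example.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({"A": [1, 2, 3, 4]})  # same sample data as in the answer

# Assumption: every value in A is an integer >= 1.
a = data["A"].to_numpy()

# partial[k] = 1/1 + 1/2 + ... + 1/k, with partial[0] = 0 for the empty sum.
partial = np.concatenate(([0.0], np.cumsum(1.0 / np.arange(1, a.max()))))

# range(1, n) stops at n - 1, so a row with value n maps to partial[n - 1].
data["B"] = partial[a - 1]

print(data)
```

The lookup table has max(A) entries, so this trades some memory for avoiding one Python-level sum per row. Whether it is fast enough depends on the data.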

score 0 · Answer 2 · answered Jul 02 '19 at 09:05

Perhaps explicitly use cumsum, or even apply?

Anyway, you are trying to move an array/list item directly into a DataFrame and seem to be treating it as a dictionary. Try something like this (I've not tested it):

import numpy as np
import pandas as pd
# pair each value of column A with its reciprocal
array_x = [(x, 1/x) for x in dataset.A.tolist()]
df = pd.DataFrame(data=np.asarray(array_x))
df.columns = ['A', 'B']

Here the idea is to break the Series apart into a list and feed that list into a DataFrame. The same result can be obtained directly, without going Series->list->DataFrame, and the round trip is not very efficient.
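The direct route, without the list round trip, might look like the sketch below. The sample data is made up, and this computes only the reciprocal 1/x used in this answer, not the summation from the question.

```python
import pandas as pd

dataset = pd.DataFrame({"A": [1, 2, 4]})  # hypothetical sample data

# Arithmetic on a Series is applied element by element,
# so no intermediate list or second DataFrame is needed.
dataset["B"] = 1 / dataset["A"]

print(dataset)
```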
