I'm trying to write code that collects data from an online source in a loop and manipulates it with pandas inside each iteration. Initially I thought I should initialise a dict outside the loop, append the fetched data to it, convert the dict to a DataFrame inside the loop, and perform my operations on that. But it feels strange to build a dictionary instead of creating a DataFrame up front and appending to it in the loop. On the other hand, as I understand it, pandas is not really "designed" for cell-by-cell updates, but rather for vectorised (whole-column) operations. What would be the most efficient approach?
import numpy as np
import pandas as pd

# Accumulate the fetched values column by column
d = {'a': [], 'b': [], 'c': [], 'x': [], 'z': []}

for i in range(100):
    d['a'].append(f'some info {i}')
    d['b'].append(f'more info {i}')
    d['c'].append(i)
    d['x'].append(i * 2)
    d['z'].append(np.nan)  # ???

    # Rebuild the DataFrame from the dict on every iteration
    df = pd.DataFrame(d)
    # Some function that does calculations on df cols and returns df with new cols
    df['z'] = 1
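For comparison, here is a rough sketch of the alternative I'm weighing: collect each record as a plain dict in a list, build the DataFrame once, and compute the new column on the whole frame. `fetch_row` and `add_derived_columns` are hypothetical names standing in for my real data source and calculation, and the `z = c + x` formula is only a placeholder.

```python
import pandas as pd


def fetch_row(i):
    # Hypothetical stand-in for one record pulled from the online source.
    return {'a': f'some info {i}', 'b': f'more info {i}', 'c': i, 'x': i * 2}


def add_derived_columns(df):
    # Placeholder calculation (assumption): the real one is not shown above.
    return df.assign(z=df['c'] + df['x'])


rows = []
for i in range(100):
    # Appending to a Python list is cheap; no DataFrame is touched here.
    rows.append(fetch_row(i))

# Build the frame once, after the loop, then compute column-wise.
df = add_derived_columns(pd.DataFrame(rows))
```

If the calculation really is needed on every iteration, `pd.DataFrame(rows)` could be called inside the loop instead, at the cost of rebuilding the frame each time.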