How to update a df using a for loop and arrays in Python?

Question

Suppose that I create the following df:

import pandas as pd

#column names
column_names = ["Time", "Currency", "Volatility expected", "Event", "Actual", "Forecast", "Previous"]

#create a dataframe including the column names
df = pd.DataFrame(columns=column_names)

Then, I create the following array that will have the cell values to add to my df:

rows = ["2:00", "GBP", "", "Construction Output (MoM) (Jan)", "1.1%", "0.5%", "2.0%",
        "2:00", "GBP", "", "U.K. Construction Output (YoY) (Jan)", "9.9%", "9.2%", "7.4%"]

So, how can I use a for loop to update my df so it ends up like this:

|Time   |Currency  |Volatility expected    |Event                               |Actual   |Forecast   |Previous  |
------------------------------------------------------------------------------------------------------------------
|02:00  |GBP       |                       |Construction Output (MoM) (Jan)     |1.1%     |0.5%       |2.0%      |
|04:00  |GBP       |                       |U.K. Construction Output (YoY) (Jan)|9.9%     |9.2%       |7.4%      |

I tried:

column_name_location = 0
for row in rows:
    df.at['0', df[column_name_location]] = row
    column_name_location += 1

print(df)

But got:

KeyError: 0

May I get some advice here?

Is `rows` a list of two lists, each of length 7? Or is it a list of 14 strings? – Mar 11 '22 at 17:12

score 1 · Accepted Answer · answered Mar 11 '22 at 17:21

If rows is a flat list of items, you can convert it to a NumPy array and reshape it first.

Assuming rows is actually a list of sub-lists, each sub-list being a row, you can create a pd.Series from each row, using the dataframe's column names as the Series' index, and then use df.append to append them all:

# df.append returns a new DataFrame, so assign the result (append was removed in pandas 2.0)
df = df.append([pd.Series(r, index=df.columns) for r in rows])

If rows is actually just a flat list, you'll need to convert it to a numpy array to reshape it:

import numpy as np
rows = np.array(rows).reshape(-1, 7).tolist()
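
Note that `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0. On newer versions, something along these lines should give the same result. This is only a sketch that swaps in `pd.concat` for `df.append`, and it assumes `rows` is the flat list of 14 strings from the question:

```python
import numpy as np
import pandas as pd

column_names = ["Time", "Currency", "Volatility expected", "Event",
                "Actual", "Forecast", "Previous"]
df = pd.DataFrame(columns=column_names)

# Flat list of 14 strings, as in the question
rows = ["2:00", "GBP", "", "Construction Output (MoM) (Jan)", "1.1%", "0.5%", "2.0%",
        "2:00", "GBP", "", "U.K. Construction Output (YoY) (Jan)", "9.9%", "9.2%", "7.4%"]

# Reshape into sub-lists of 7 values, one per DataFrame row
rows = np.array(rows).reshape(-1, len(column_names)).tolist()

# Build a DataFrame from the new rows and concatenate it onto the existing one
new_rows = pd.DataFrame(rows, columns=df.columns)
df = pd.concat([df, new_rows], ignore_index=True)

print(df)
```

`ignore_index=True` renumbers the result 0, 1, ... instead of keeping the index labels of the pieces.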

I see, thank you so much, sir. Could you also please elaborate on the use of `.reshape(-1, 7)`? I understand that since my `df` has 7 columns, each row is expected to have `7` cells, but what is the `-1` used for? – NoahVerner Mar 11 '22 at 18:49

@Noah basically it just says to reshape the array into 7 columns and however many rows that requires; the `-1` tells NumPy to work out the row count itself. [See here.](https://stackoverflow.com/questions/18691084/what-does-1-mean-in-numpy-reshape) – Mar 11 '22 at 18:52
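
To make the `-1` concrete, here is a small illustrative sketch. The array of 14 integers is only a stand-in for the 14 strings in `rows`:

```python
import numpy as np

flat = np.arange(14)  # 14 values, standing in for the 14 strings in rows

# -1 asks NumPy to infer that dimension: 14 values / 7 columns = 2 rows
inferred = flat.reshape(-1, 7)
explicit = flat.reshape(2, 7)

print(inferred.shape)                      # (2, 7)
print(np.array_equal(inferred, explicit))  # True

# If the length is not a multiple of 7, NumPy cannot infer a row count
try:
    np.arange(15).reshape(-1, 7)
except ValueError as err:
    print("cannot reshape:", err)
```

Only one dimension may be given as `-1`, since NumPy needs the others to compute it.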

score 1 · Answer 2 · answered Mar 11 '22 at 17:45

It looks like you have created one list containing 14 items. You could instead make it a list containing 2 items, where each item is a list with 7 values.

rows = [["2:00", "GBP", "", "Construction Output (MoM) (Jan)", "1.1%", "0.5%", "2.0%"],
       ["2:00", "GBP", "", "U.K. Construction Output (YoY) (Jan)", "9.9%", "9.2%", "7.4%"]]

With this, we can create the dataframe directly, as shown below:

df = pd.DataFrame(rows, columns=column_names)
print(df)

This outputs 2 rows:

   Time Currency Volatility expected                                 Event Actual Forecast Previous
0  2:00      GBP                           Construction Output (MoM) (Jan)   1.1%     0.5%     2.0%
1  2:00      GBP                      U.K. Construction Output (YoY) (Jan)   9.9%     9.2%     7.4%
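
Since the question asked specifically for a for loop, here is a minimal sketch of one way to do that while keeping the original flat list. It assumes the list length is a multiple of the number of columns; the names `n_cols` and `start` are only illustrative. Each slice of 7 values is added as a new row with `df.loc[len(df)]`:

```python
import pandas as pd

column_names = ["Time", "Currency", "Volatility expected", "Event",
                "Actual", "Forecast", "Previous"]
df = pd.DataFrame(columns=column_names)

# Flat list of 14 strings, as in the question
rows = ["2:00", "GBP", "", "Construction Output (MoM) (Jan)", "1.1%", "0.5%", "2.0%",
        "2:00", "GBP", "", "U.K. Construction Output (YoY) (Jan)", "9.9%", "9.2%", "7.4%"]

n_cols = len(column_names)

# Step through the flat list 7 items at a time; each slice becomes one row
for start in range(0, len(rows), n_cols):
    df.loc[len(df)] = rows[start:start + n_cols]

print(df)
```

Growing a dataframe one row at a time gets slow for large inputs, so building it in a single `pd.DataFrame(...)` call as above is usually preferable when all the rows are already available.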

This one will work pretty well after implementing @richardec's answer, but since I didn't know how to reshape the array, I had to mark richardec's answer as the solution I was looking for. I also appreciate that you took some time to answer this post. – NoahVerner, Mar 11 '22 at 18:52
