Calculating new rows in a Pandas Dataframe on two different columns

Question

I'm a beginner at Python and I have a dataframe with the columns Country, avgTemp and year. What I want to do is calculate new rows for each country where the year is increased by 20 and avgTemp is multiplied by a variable called tempChange. I don't want to remove the previous values though; I just want to append the new values.

This is how the dataframe looks:

Preferably I would also like to create a loop that runs the code a certain number of times. Super grateful for any help!

If you need to copy the values from the dataframe as an example you can have it here:

   Country         avgTemp    year
0  Afghanistan     14.481583  2012
1  Africa          24.725917  2012
2  Albania         13.768250  2012
3  Algeria         23.954833  2012
4  American Samoa  27.201417  2012

243 rows × 3 columns

Welcome to StackOverflow. Please take the time to read this post on [how to provide a great pandas example](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and revise your question accordingly. Also see [Please don't post images of code/data (or links to them)](http://meta.stackoverflow.com/questions/285551/why-may-i-not-upload-images-of-code-on-so-when-asking-a-question) — anky, Jan 29 '20 at 18:34
What is the issue, exactly? Have you actually tried anything, done any research? — AMC, Jan 29 '20 at 20:03

score 1 · Accepted Answer · answered Jan 29 '20 at 18:34

If you want to repeat the rows, I'd create a copy of the dataframe, perform the operations on the copy (add 20 years, multiply the temperature by a constant or an array, etc.) and then use concat() to append it to the original dataframe:

import pandas as pd

tempChange = 1.15  # multiplier applied to avgTemp
data = {'Country': ['Afghanistan', 'Africa', 'Albania', 'Algeria', 'American Samoa'],
        'avgTemp': [14, 24, 13, 23, 27],
        'Year': [2012, 2012, 2012, 2012, 2012]}
df = pd.DataFrame(data)

# Work on a copy so the original rows are kept unchanged
df_2 = df.copy()
df_2['avgTemp'] = df['avgTemp'] * tempChange
df_2['Year'] = df['Year'] + 20

# Append the new rows; pass ignore_index=True if you do not want repeated index values
df = pd.concat([df, df_2])
print(df)

Output:

          Country  avgTemp  Year
0     Afghanistan    14.00  2012
1          Africa    24.00  2012
2         Albania    13.00  2012
3         Algeria    23.00  2012
4  American Samoa    27.00  2012
0     Afghanistan    16.10  2032
1          Africa    27.60  2032
2         Albania    14.95  2032
3         Algeria    26.45  2032
4  American Samoa    31.05  2032

This is great! Thank you so much. If you have the time, I have a second question: I'm currently trying to make a loop so I can do the same calculations on "df_2" to get "df_3", and keep doing this until I have a certain number of new dataframes that I can then concatenate together. Thank you for your help! :) — Timo, Jan 30 '20 at 13:19
Sure, open a new question with the issue and I'll help you out! (Leave a comment here with the link to the new question so I can check it out) — Celius Stingher, Jan 30 '20 at 13:23
Thank you! https://stackoverflow.com/questions/59987492/looping-this-code-to-get-new-dataframe-based-on-previous-calculated-dataframe — Timo, Jan 30 '20 at 13:47
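
The loop asked about in the question and in the comment above could be sketched roughly as follows. This is only an illustration built on the accepted answer's approach, not the solution from the linked follow-up question; the names `n_steps`, `frames` and `result`, the shortened sample data, and the choice of three repetitions are assumptions made for the example.

```python
import pandas as pd

temp_change = 1.15  # multiplier per step (value taken from the answer above)
n_steps = 3         # hypothetical number of 20-year projections

df = pd.DataFrame({
    'Country': ['Afghanistan', 'Africa', 'Albania'],
    'avgTemp': [14.0, 24.0, 13.0],
    'Year': [2012, 2012, 2012],
})

# Keep every generation in a list; each one is derived from the previous one
frames = [df]
for _ in range(n_steps):
    nxt = frames[-1].copy()
    nxt['avgTemp'] = nxt['avgTemp'] * temp_change
    nxt['Year'] = nxt['Year'] + 20
    frames.append(nxt)

# Concatenate once at the end; ignore_index=True gives a fresh 0..n-1 index
result = pd.concat(frames, ignore_index=True)
print(result)
```

Collecting the frames in a list and calling concat() once after the loop avoids re-copying the growing dataframe on every iteration.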

score 0 · Answer 2 · answered Jan 29 '20 at 18:27

where df is your data frame name:

df['tempChange'] = df['year'] + 20 * df['avgTemp']

This will add a new column to your df with the logic above. I'm not sure if I understood your logic correctly, so the math may need some work.
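
As a hedged note on the math: because multiplication binds tighter than addition in Python, the expression in this answer evaluates as year + (20 * avgTemp). A minimal sketch of that, next to the two separate calculations the question describes, is below. The column names `newYear` and `newTemp`, the two-row sample data and the multiplier value are assumptions for illustration, and like the answer itself this adds columns rather than rows.

```python
import pandas as pd

temp_change = 1.15  # assumed multiplier, same value as in the accepted answer
df = pd.DataFrame({'avgTemp': [14.0, 24.0], 'year': [2012, 2012]})

# What the one-line expression computes: year + (20 * avgTemp)
as_written = df['year'] + 20 * df['avgTemp']

# The two calculations from the question, kept in separate (hypothetical) columns
df['newYear'] = df['year'] + 20
df['newTemp'] = df['avgTemp'] * temp_change

print(as_written)
print(df)
```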

Alireza Tajadod


score 0 · Answer 3 · answered Jan 29 '20 at 18:33

I believe that what you're looking for is

dfName['newYear'] = dfName.apply(lambda x: x['year'] + 20, axis=1)
dfName['tempDiff'] = dfName.apply(lambda x: x['avgTemp'] * tempChange, axis=1)

This is how you apply to each row.
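
For simple arithmetic like this, plain column operations should give the same values as the row-wise apply() and are usually faster on larger dataframes. The comparison below is a small sketch under assumed sample data; the variable names `via_apply_*` and `vectorized_*` are made up for the example.

```python
import pandas as pd

tempChange = 1.15  # assumed multiplier, same value as in the accepted answer
dfName = pd.DataFrame({
    'Country': ['Afghanistan', 'Africa', 'Albania'],
    'avgTemp': [14.0, 24.0, 13.0],
    'year': [2012, 2012, 2012],
})

# Row-wise version from the answer above
via_apply_year = dfName.apply(lambda x: x['year'] + 20, axis=1)
via_apply_temp = dfName.apply(lambda x: x['avgTemp'] * tempChange, axis=1)

# Column-wise (vectorized) equivalents
vectorized_year = dfName['year'] + 20
vectorized_temp = dfName['avgTemp'] * tempChange

print(vectorized_year.tolist())
print(vectorized_temp.round(2).tolist())
```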

Gorlomi

