My dataframe looks like this. It has 3 columns. All I want to do is write a FUNCTION that takes the first two columns as inputs and returns the corresponding value from the third column (GHG intensity). I want to be able to input any property name and year and get the corresponding GHG intensity value. I cannot stress enough that this has to be written as a function using def. Please help!

                             Property Name  Data Year  GHG Intensity (kg CO2e/sq ft)
467                             GALLERY 37       2018                           7.50
477                        Navy Pier, Inc.       2016                          22.50
1057                            GALLERY 37       2015                           8.30
1491                       Navy Pier, Inc.       2015                          23.30
1576                            GALLERY 37       2016                           7.40
2469                   The Chicago Theatre       2016                           4.50
3581                       Navy Pier, Inc.       2014                          17.68
4060                        Ida Noyes Hall       2015                          11.20
4231               Chicago Cultural Center       2015                          13.70
4501                            GALLERY 37       2017                           7.90
5303                         Harpo Studios       2015                          18.70
5450                   The Chicago Theatre       2015                            NaN
5556               Chicago Cultural Center       2016                          10.30
6275   MARTIN LUTHER KING COMMUNITY CENTER       2015                          14.10
6409   MARTIN LUTHER KING COMMUNITY CENTER       2018                          12.70
6665                        Ida Noyes Hall       2017                           8.30
7621                        Ida Noyes Hall       2018                           8.40
7668   MARTIN LUTHER KING COMMUNITY CENTER       2017                          12.10
7792                   The Chicago Theatre       2018                           4.40
7819                        Ida Noyes Hall       2016                          10.20
8664   MARTIN LUTHER KING COMMUNITY CENTER       2016                          12.90
8701                   The Chicago Theatre       2017                           4.40
9575               Chicago Cultural Center       2017                           9.30
10066              Chicago Cultural Center       2018                           7.50

2 Answers

Here is an example, with a different data frame to test:

import pandas as pd

df = pd.DataFrame(data={'x': [3, 5], 'y': [4, 12]})

def func(df, arg1, arg2, arg3):
    ''' arg1 and arg2 are input columns; arg3 is output column.'''
    df = df.copy()
    df[arg3] = df[arg1] ** 2 + df[arg2] ** 2
    return df

Results are:

print(func(df, 'x', 'y', 'z'))

   x   y    z
0  3   4   25
1  5  12  169
jsmart
  • But in this example a new column is created. I don't want to create a new column at all. I want to print a pre-existing 3rd column with the first two as inputs. How do I do that? Thanks! – plzhelp Aug 10 '20 at 15:55
  • Please can you provide a small sample input data frame and the expected output? More info here: https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – jsmart Aug 10 '20 at 16:03
  • The entire dataframe is above. I want the inputs to be, for example, year = 2016 and property_name = GALLERY 37, and the output to then be 7.4 (the corresponding value in the 3rd column). Does that make sense? – plzhelp Aug 10 '20 at 16:07
  • I misunderstood. If you want to perform a look-up, then the response from @Kuldip Chaudhari would work. – jsmart Aug 10 '20 at 16:11
You can try this code:

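def GHG_Intensity(PropertyName, Year):
    # Build a mask for rows matching both the property name and the year.
    match = (df['Property Name'] == PropertyName) & (df['Data Year'] == Year)
    Intensity = df.loc[match, 'GHG Intensity (kg CO2e/sq ft)'].to_list()
    # Return the first matching value, or a message when no row matches.
    return Intensity[0] if len(Intensity) else 'GHG Intensity Not Available'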

print(GHG_Intensity('Navy Pier, Inc.', 2016))
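
For anyone who wants to run the lookup end to end, here is a minimal self-contained sketch. It assumes a few rows copied from the question's table rather than the full dataframe, and the name ghg_intensity and the choice to return None for a missing or NaN value are illustrative assumptions, not part of the original answer. It also takes the dataframe as an argument instead of reading a global df.

```python
import pandas as pd

# A handful of rows copied from the question's table (not the full dataframe).
df = pd.DataFrame({
    "Property Name": ["GALLERY 37", "Navy Pier, Inc.", "GALLERY 37", "The Chicago Theatre"],
    "Data Year": [2018, 2016, 2016, 2015],
    "GHG Intensity (kg CO2e/sq ft)": [7.50, 22.50, 7.40, float("nan")],
})


def ghg_intensity(df, property_name, year):
    """Return the GHG intensity for one property and year.

    Returns None when no row matches or the stored value is NaN
    (an assumption made for this sketch).
    """
    # Boolean mask: rows where both the property name and the year match.
    mask = (df["Property Name"] == property_name) & (df["Data Year"] == year)
    # Keep only the intensity column for matching rows and drop NaN entries.
    values = df.loc[mask, "GHG Intensity (kg CO2e/sq ft)"].dropna()
    if values.empty:
        return None
    return float(values.iloc[0])


print(ghg_intensity(df, "GALLERY 37", 2016))           # 7.4
print(ghg_intensity(df, "The Chicago Theatre", 2015))  # None (value is NaN)
```

Returning None rather than a string keeps the return type easy to check in calling code, and dropping NaN means a row such as The Chicago Theatre in 2015 is reported as unavailable instead of returning NaN.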
Kuldip Chaudhari
  • IT WORKED. I have been struggling with this for so long. Thank you. You have saved me. I hope your life brings you eternal joy and happiness. – plzhelp Aug 10 '20 at 16:14