I'm pretty new to Python and here's my issue. It's pretty basic!

I'm trying to create a new column (called "SectorCode") based on another column (called "Sector"). For example, if the "Sector" column contains "Energy", then the "SectorCode" column should show 10. If it contains "Industrials", then 20, and so on.

I found other topics, but they almost always use conditions based on numbers, like here, and not on non-numeric (text) values: https://www.dezyre.com/recipes/insert-new-column-based-on-condition-in-python (that's where I got the inspiration for my code).

Here's my failing code :

import pandas as pd
import numpy as np
sector = pd.read_csv (r"C:\Users\alexa\sector.csv")
sector

dframe = pd.DataFrame(sector)
dframe.columns
Index(['Ticker', 'Sector'], dtype='object')


Sectorcode = []
for row in dframe['Sector']:
    if row = ('Energy') : Sectorcode.append(10)
        elif row = ('Materials') : Sectorcode.append(15)
        elif row = ('Industrials') : Sectorcode.append (20)
        elif row = ('Consumer Discretionary') : Sectorcode.append (25)
        elif row = ('Consumer Staples') : Sectorcode.append (30)
        elif row = ('Health Care') : Sectorcode.append (35)
        elif row = ('Financials') : Sectorcode.append (40)
        elif row = ('Information Technology') : Sectorcode.append (45)
        elif row = ('Communication Services') : Sectorcode.append (50)
        elif row = ('Utilities'): Sectorcode.append (55)
        elif row = ('Real Estate'): Sectorcode.append (60)
        else : Sectorcode.append (0)
df['Sectorcode']= Sectorcode

I get this error message :

"  File "<ipython-input-8-98c65bbfd42a>", line 3
    if row = ('Energy') : Sectorcode.append(10)
       ^

SyntaxError: invalid syntax"

My actual table looks like this :

         Ticker         Sector               
0         MSFT     Information Technology       
2         AAPL     Information Technology       
3         AMZN      Consumer Discretionary      
4         FB       Communication Services       
5         BRK.B        Financials               
6         XOM             Energy                
7         JNJ             Health Care     

etc....

And I'd like to have something like this :

         Ticker         Sector                 SectorCode
0         MSFT     Information Technology       45
2         AAPL     Information Technology       45
3         AMZN      Consumer Discretionary      25
4         FB       Communication Services       50
5         BRK.B        Financials               40
6         XOM             Energy                10
7         JNJ             Health Care           35

etc....

Thank you for your help! :)

EDIT:

The following code worked:

Sectorcode = []
for row in dframe['Sector']:
    if row == ('Energy') : Sectorcode.append(10)
    elif row == ('Materials') : Sectorcode.append(15)
    elif row == ('Industrials') : Sectorcode.append (20)
    elif row == ('Consumer Discretionary') : Sectorcode.append (25)
    elif row == ('Consumer Staples') : Sectorcode.append (30)
    elif row == ('Health Care') : Sectorcode.append (35)
    elif row == ('Financials') : Sectorcode.append (40)
    elif row == ('Information Technology') : Sectorcode.append (45)
    elif row == ('Communication Services') : Sectorcode.append (50)
    elif row == ('Utilities'): Sectorcode.append (55)
    elif row == ('Real Estate'): Sectorcode.append (60)
    else : Sectorcode.append (0)
dframe['Sectorcode']= Sectorcode
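
For reference, the same lookup can also be sketched without a loop, using a dictionary and pandas' Series.map. This is only one possible way to write it. The small sample frame stands in for sector.csv (the "ZZZ" row is a made-up example of an unlisted sector), the name SECTOR_CODES is hypothetical, and the intended spelling is assumed to be "Consumer Staples".

```python
import pandas as pd

# Sector names and codes from the if/elif chain
# (SECTOR_CODES is a hypothetical name; "Consumer Staples" spelling is assumed).
SECTOR_CODES = {
    "Energy": 10,
    "Materials": 15,
    "Industrials": 20,
    "Consumer Discretionary": 25,
    "Consumer Staples": 30,
    "Health Care": 35,
    "Financials": 40,
    "Information Technology": 45,
    "Communication Services": 50,
    "Utilities": 55,
    "Real Estate": 60,
}

# Small stand-in for the CSV; the last row is an invented, unlisted sector.
dframe = pd.DataFrame(
    {
        "Ticker": ["MSFT", "XOM", "JNJ", "ZZZ"],
        "Sector": ["Information Technology", "Energy", "Health Care", "Unknown"],
    }
)

# map() looks each sector up in the dictionary; sectors that are not listed
# become NaN, which fillna(0) turns into the same default as the else branch.
dframe["Sectorcode"] = dframe["Sector"].map(SECTOR_CODES).fillna(0).astype(int)
print(dframe)
```

Adding a sector then only means adding one entry to the dictionary.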
alexnesov

1 Answer

You are using a single =, which is the assignment operator. To test a condition for equality you need the double ==.
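
As a minimal sketch of the difference (the function name sector_code and the three sample values are made up for illustration), a comparison inside an if needs ==, and each elif has to sit at the same indentation level as its if:

```python
def sector_code(sector):
    # "==" compares two values. Writing "if sector = 'Energy':" instead
    # is an assignment and raises "SyntaxError: invalid syntax".
    if sector == "Energy":
        return 10
    elif sector == "Materials":  # elif lines up with the if above it
        return 15
    else:
        return 0  # default for any sector not listed


codes = [sector_code(s) for s in ["Energy", "Materials", "Utilities"]]
print(codes)  # [10, 15, 0]
```

Only two sectors are listed here, so "Utilities" falls through to the default of 0.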

Dave Fol