I'm pretty new to Python, and my issue is pretty basic.
I'm trying to create a new column (called "SectorCode") based on another column (called "Sector"). For example, if the "Sector" column contains "Materials", then the "SectorCode" column should display "15". If it contains "Industrials", then "20", and so on.
I found other topics, but they almost always use conditions based on numbers, like this one, and not on non-numeric (text) values: https://www.dezyre.com/recipes/insert-new-column-based-on-condition-in-python (that's where I got the inspiration for my code).
Here's my failing code:
import pandas as pd
import numpy as np
sector = pd.read_csv(r"C:\Users\alexa\sector.csv")
sector
dframe = pd.DataFrame(sector)
dframe.columns
Index(['Ticker', 'Sector'], dtype='object')
Sectorcode = []
for row in dframe['Sector']:
    if row = ('Energy') : Sectorcode.append(10)
    elif row = ('Materials') : Sectorcode.append(15)
    elif row = ('Industrials') : Sectorcode.append(20)
    elif row = ('Consumer Discretionary') : Sectorcode.append(25)
    elif row = ('Consumer Staples') : Sectorcode.append(30)
    elif row = ('Health Care') : Sectorcode.append(35)
    elif row = ('Financials') : Sectorcode.append(40)
    elif row = ('Information Technology') : Sectorcode.append(45)
    elif row = ('Communication Services') : Sectorcode.append(50)
    elif row = ('Utilities') : Sectorcode.append(55)
    elif row = ('Real Estate') : Sectorcode.append(60)
    else : Sectorcode.append(0)
df['Sectorcode'] = Sectorcode
I get this error message:
  File "<ipython-input-8-98c65bbfd42a>", line 3
    if row = ('Energy') : Sectorcode.append(10)
           ^
SyntaxError: invalid syntax
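For reference, the error can be reproduced in isolation. This is only a minimal sketch: the helper name `parses` and the two sample strings are made up for illustration. It relies on the built-in `compile()`, which parses source text without running it, to show that a single `=` (assignment) is rejected inside an `if` condition while `==` (comparison) is accepted.

```python
def parses(source):
    """Return True if `source` is syntactically valid Python (hypothetical helper)."""
    try:
        # compile() only parses the text; nothing is executed.
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

bad = "if row = 'Energy': pass"    # '=' is assignment, not allowed in a condition
good = "if row == 'Energy': pass"  # '==' is the equality comparison

print(parses(bad))   # False
print(parses(good))  # True
```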
My actual table looks like this:
   Ticker  Sector
0  MSFT    Information Technology
2  AAPL    Information Technology
3  AMZN    Consumer Discretionary
4  FB      Communication Services
5  BRK.B   Financials
6  XOM     Energy
7  JNJ     Health Care
etc.
And I'd like to have something like this:
   Ticker  Sector                  SectorCode
0  MSFT    Information Technology  45
2  AAPL    Information Technology  45
3  AMZN    Consumer Discretionary  25
4  FB      Communication Services  50
5  BRK.B   Financials              40
6  XOM     Energy                  10
7  JNJ     Health Care             35
etc.
Thank you for your help! :)
EDIT:
The following code worked:
# Build the list of codes row by row; unknown sectors get 0.
Sectorcode = []
for row in dframe['Sector']:
    if row == 'Energy': Sectorcode.append(10)
    elif row == 'Materials': Sectorcode.append(15)
    elif row == 'Industrials': Sectorcode.append(20)
    elif row == 'Consumer Discretionary': Sectorcode.append(25)
    elif row == 'Consumer Staples': Sectorcode.append(30)
    elif row == 'Health Care': Sectorcode.append(35)
    elif row == 'Financials': Sectorcode.append(40)
    elif row == 'Information Technology': Sectorcode.append(45)
    elif row == 'Communication Services': Sectorcode.append(50)
    elif row == 'Utilities': Sectorcode.append(55)
    elif row == 'Real Estate': Sectorcode.append(60)
    else: Sectorcode.append(0)
# Attach the list as a new column.
dframe['Sectorcode'] = Sectorcode
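A more compact alternative, as a sketch rather than the definitive way to do it: the same lookup can be written with a dictionary instead of a long if/elif chain. The names `SECTOR_CODES` and `sector_code` are hypothetical. The sector names and codes are the ones from the question, and the sample list stands in for the "Sector" column.

```python
# Sector names and codes taken from the question.
SECTOR_CODES = {
    'Energy': 10,
    'Materials': 15,
    'Industrials': 20,
    'Consumer Discretionary': 25,
    'Consumer Staples': 30,
    'Health Care': 35,
    'Financials': 40,
    'Information Technology': 45,
    'Communication Services': 50,
    'Utilities': 55,
    'Real Estate': 60,
}

def sector_code(sector):
    """Return the code for a sector name, or 0 if the name is not in the table."""
    return SECTOR_CODES.get(sector, 0)

# Sample values from the question's table, plus one unknown name.
sectors = ['Information Technology', 'Consumer Discretionary', 'Energy', 'Unknown']
codes = [sector_code(s) for s in sectors]
print(codes)  # [45, 25, 10, 0]
```

With the DataFrame from the question, `dframe['Sectorcode'] = [sector_code(s) for s in dframe['Sector']]` should give the same column as the loop, and `dframe['Sector'].map(SECTOR_CODES).fillna(0).astype(int)` is the vectorized pandas equivalent.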