
This is my dataframe:

import pandas as pd

df = pd.DataFrame({'symbol': ['msft', 'amd', 'bac', 'citi'], 'close': [100, 30, 70, 80]})

I want to add another column called sector that looks at each value of symbol and assigns the sector I choose (for example, tech for amd and msft).

My desired outcome looks like this:

  symbol  close sector
0   msft    100   tech
1    amd     30   tech
2    bac     70   bank
3   citi     80   bank
Amir
  • do you have a heuristic defined for when it should add which `sector`? – nnolte Jul 06 '19 at 14:52
    So use `pandas.Series.map`. Where is your attempt? – roganjosh Jul 06 '19 at 14:52
  • Probably, https://stackoverflow.com/questions/20250771/remap-values-in-pandas-column-with-a-dict, just assign it to a new column – ALollz Jul 06 '19 at 14:53
  • @kawillzocken if symbol is equal to msft or amd sector is tech and for the other two I want it to be bank – Amir Jul 06 '19 at 14:54
  • @roganjosh can you post the answer please? – Amir Jul 06 '19 at 14:56
  • @Amir, yes that's obvious but what is the underlying logic? Questions and Answers on SO are better posed as generalizable solutions to a specific problem. In this case, do you have some grouping of symbols to different sectors (and how are these defined, in a list, stored in a dict, in another DataFrame)? Or do you need to assign a small subset of symbols to one thing and everything else to another? Solutions will vary depending upon this information. – ALollz Jul 06 '19 at 14:56
  • @ALollz I want to assign each subset to a value. – Amir Jul 06 '19 at 14:58
  • No, why should I just post an answer to a question for which you have shown _no_ effort yourself? I already told you one potential method to use, did you actually look it up? You've already got feedback on some parts of the question that aren't clear, so even if I did post an answer, there's a chance that it doesn't solve your problem anyway – roganjosh Jul 06 '19 at 14:59
  • @Amir there are several ways to achieve this, including assign, numpy.where, DataFrame.where, and conditional assignment such as df.loc[df["col"] == "value", "new_col"] = 1. – Biarys Jul 06 '19 at 15:01

2 Answers


In case the sector-symbol relation is a straightforward lookup, you may use something like:

symbol_sector = {
    'amd': 'tech',
    'msft': 'tech',
    'bac': 'bank',
    'citi': 'bank'
}

df['sector'] = df['symbol'].map(symbol_sector)
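A minimal runnable sketch of this lookup, using the question's data. The extra 'ibm' row and the 'unknown' placeholder are assumptions added only to show what happens to a symbol that is missing from the dict: Series.map leaves it as NaN.

```python
import pandas as pd

# Question's data plus one hypothetical symbol ('ibm') that has no dict entry.
df = pd.DataFrame({'symbol': ['msft', 'amd', 'bac', 'citi', 'ibm'],
                   'close': [100, 30, 70, 80, 50]})

symbol_sector = {'amd': 'tech', 'msft': 'tech', 'bac': 'bank', 'citi': 'bank'}

# Symbols absent from the dict ('ibm' here) come back as NaN.
df['sector'] = df['symbol'].map(symbol_sector)

# Optional: swap the NaN for a placeholder label (a hypothetical choice).
df['sector_filled'] = df['sector'].fillna('unknown')
```

Whether to fill, drop, or raise on unmapped symbols depends on the data; the placeholder is just one option.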

If the relation is one-to-many (one sector covers many symbols), you can build symbol_sector by inverting a sector-to-symbols dict:

sector_symbol = {
    'tech': {'msft', 'amd'},
    'bank': {'bac', 'citi'},
}

symbol_sector = {
    symbol: sector
    for sector, symbols in sector_symbol.items()
    for symbol in symbols
}
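Put together end to end, the inversion and the map step might look like the sketch below. It assumes the question's dataframe and the 'citi' spelling used there.

```python
import pandas as pd

df = pd.DataFrame({'symbol': ['msft', 'amd', 'bac', 'citi'],
                   'close': [100, 30, 70, 80]})

# One sector -> many symbols.
sector_symbol = {
    'tech': {'msft', 'amd'},
    'bank': {'bac', 'citi'},
}

# Invert it into one symbol -> one sector.
symbol_sector = {
    symbol: sector
    for sector, symbols in sector_symbol.items()
    for symbol in symbols
}

# Look up each symbol's sector and store it as a new column.
df['sector'] = df['symbol'].map(symbol_sector)
print(df)
```

If a symbol appeared under two sectors, the later entry would silently win in the inverted dict, so this form assumes each symbol belongs to exactly one sector.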
josoler

A simple heuristic, assuming every symbol that is not tech is a bank:

def assign_sector(sym): 
    if sym in ['msft', 'amd']: 
        return 'tech'
    return 'bank'

followed by:

df['sector'] = df['symbol'].apply(assign_sector)

apply calls the given function, here assign_sector, on every value in the Series df['symbol'] and returns a new Series of the results. Assigning that Series to df['sector'] creates the new column.
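For a two-way split like this, a vectorized alternative is numpy.where combined with Series.isin, which was mentioned in the comments but is not part of this answer. The sketch below assumes the same rule (msft and amd are tech, everything else is bank); the name tech_symbols is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'symbol': ['msft', 'amd', 'bac', 'citi'],
                   'close': [100, 30, 70, 80]})

# Hypothetical list of the symbols treated as tech.
tech_symbols = ['msft', 'amd']

# isin builds a boolean mask; np.where picks 'tech' where True, 'bank' elsewhere.
df['sector'] = np.where(df['symbol'].isin(tech_symbols), 'tech', 'bank')
```

This avoids a Python-level function call per row, which can matter on large frames; with more than two sectors, the dict-based map approach is likely easier to maintain.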

nnolte