I have a DataFrame called df with a column called Category. I would like to create a new column, New_Variable, with the values shown below (the DataFrame below already includes it to show the desired result). How can I do this in Python?

import pandas as pd

df = pd.DataFrame({
    'Category': [
        'Foreign Stocks', 'Stocks China', 'Stocks',
        'Bonds', 'Bonds USA', 'Bonds India'
    ],
    'New_Variable': [
        'Stocks', 'Stocks', 'Stocks',
        'Bonds', 'Bonds', 'Bonds'
    ]
})

df

         Category New_Variable
0  Foreign Stocks       Stocks
1    Stocks China       Stocks
2          Stocks       Stocks
3           Bonds        Bonds
4       Bonds USA        Bonds
5     Bonds India        Bonds

Perhaps I just cannot adapt other, similar answers to my case, but I have not been able to find this question elsewhere.

/Jonas

  • Hi @Jonas. Welcome to Stack Overflow! This question doesn't give us enough information and the way you've phrased things is confusing. It is possible that you are not aware of that and may need some guidance. Please read [mcve] to get a better idea. You can always come back and [edit] your question. Basic gist is that we want to see an example of data, verbalized logic, and expectations. – piRSquared Aug 23 '18 at 14:00
  • Also, having your question marked a duplicate is not a big deal. If it is a duplicate, then you should be able to get your answer from the questions that are marked as the dup targets. If it isn't, say so. I personally don't think your question is clear enough to judge if it is a duplicate. Please consider [edit]ing it in order to clarify. If it is a dup still, so be it. Otherwise, we'll remove the dup if appropriate. – piRSquared Aug 23 '18 at 14:07
  • Thank you for your comments. I have tried to rephrase my question. Does it make sense now? – Jonas Aug 23 '18 at 14:09
  • @piRSquared - I agree, so reopened. – jezrael Aug 23 '18 at 14:09
  • Yes, it's clearer. It may still be a dup. I'm searching for one now, but you should have some help below. – piRSquared Aug 23 '18 at 14:15
  • I know I've seen a question like this before but I can't find it. If anyone else finds it, please hammer it or ping me with a link. – piRSquared Aug 23 '18 at 14:33

1 Answer

pandas.Series.str.extract

df.assign(New_Variable=df.Category.str.extract('(Bonds|Stocks)'))

         Category New_Variable
0  Foreign Stocks       Stocks
1    Stocks China       Stocks
2          Stocks       Stocks
3           Bonds        Bonds
4       Bonds USA        Bonds
5     Bonds India        Bonds
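
As a hedged variation on the call above, the sketch below passes `expand=False` so that `str.extract` returns a Series for the single capture group, and it shows what happens to a row that contains neither keyword. The extra `'Cash'` row and the `'Other'` placeholder are assumptions added for illustration; they are not part of the original question.

```python
import pandas as pd

df = pd.DataFrame({
    'Category': [
        'Foreign Stocks', 'Stocks China', 'Stocks',
        'Bonds', 'Bonds USA', 'Bonds India',
        'Cash',  # hypothetical extra row with no matching keyword
    ]
})

# expand=False returns a Series (not a DataFrame) for a single capture group
extracted = df['Category'].str.extract(r'(Bonds|Stocks)', expand=False)

# Rows without a match come back as missing values;
# 'Other' is an arbitrary placeholder chosen for this sketch
result = df.assign(New_Variable=extracted.fillna('Other'))
print(result)
```

Without the `fillna` step, the unmatched row would simply hold a missing value in `New_Variable`, which may be preferable if unmatched categories should be easy to spot.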
piRSquared
  • I got it. Thank you! – Jonas Aug 23 '18 at 14:18
  • I use `assign` a lot because it creates a new copy of the dataframe. From what I've written, you can reassign to the name `df`... so `df = df.assign(New_Variable=df.Category.str.extract('(Bonds|Stocks)'))` – piRSquared Aug 23 '18 at 14:20
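
The point about `assign` returning a new copy can be checked with a small sketch. The names `original` and `with_new` and the two-row frame are hypothetical, chosen only to keep the example short.

```python
import pandas as pd

original = pd.DataFrame({'Category': ['Foreign Stocks', 'Bonds USA']})

# assign returns a new DataFrame; `original` itself is not modified
with_new = original.assign(
    New_Variable=original['Category'].str.extract(r'(Bonds|Stocks)', expand=False)
)

print(list(original.columns))  # still only the Category column
print(list(with_new.columns))  # Category plus New_Variable
```

To keep working under a single name, the result can be bound back to it, as in the comment above: `df = df.assign(...)`.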