
Hi, I am trying to assign new values to certain entries in a column of a dataframe.

# Count the occurrences of each title, grouped by sex
full.groupby(['Sex', 'Title']).Title.count()
Sex     Title        
female   Dona              1
         Dr                1
         Lady              1
         Miss            260
         Mlle              2
         Mme               1
         Mrs             197
         Ms                2
         the Countess      1
male     Capt              1
         Col               4
         Don               1
         Dr                7
         Jonkheer          1
         Major             2
         Master           61
         Mr              757
         Rev               8
         Sir               1
Name: Title, dtype: int64

The tail of my dataframe looks as follows:

    Age Cabin   Embarked    Fare    Name    Parch   PassengerId Pclass  Sex SibSp   Survived    Ticket  Title
413 NaN NaN S   8.0500  Spector, Mr. Woolf  0   1305    3   male    0   NaN A.5. 3236   Mr
414 39.0    C105    C   108.9000    Oliva y Ocana, Dona. Fermina    0   1306    1   female  0   NaN PC 17758    Dona
415 38.5    NaN S   7.2500  Saether, Mr. Simon Sivertsen    0   1307    3   male    0   NaN SOTON/O.Q. 3101262  Mr
416 NaN NaN S   8.0500  Ware, Mr. Frederick 0   1308    3   male    0   NaN 359309  Mr
417 NaN NaN C   22.3583 Peter, Master. Michael J    1   1309    3   male    1   NaN 2668    Master

My dataframe is named full, and I want to relabel some of the values in the Title column.

Here is the code I wrote:

# Create a list rare_title of the titles to relabel as "Rare Title"
rare_title = ['Dona', "Lady", "the Countess", "Capt", "Col", "Don", "Dr", "Major", "Rev", "Sir", "Jonkheer"]
# Also reassign Mlle, Ms, and Mme accordingly
full[full.Title == "Mlle"].Title = "Miss"
full[full.Title == "Ms"].Title = "Miss"
full[full.Title == "Mme"].Title = "Mrs"
full[full.Title.isin(rare_title)].Title = "Rare Title"

I also tried the following code in pandas:

full.loc[full['Title'] == "Mlle", ['Sex', 'Title']] = "Miss"

The dataframe is still unchanged. Any help is appreciated.

cs95
Jd Baba
  • Possible duplicate of [Update row values where certain condition is met in pandas](https://stackoverflow.com/questions/36909977/update-row-values-where-certain-condition-is-met-in-pandas) – Jesse Nov 24 '17 at 00:43
  • @JesseBarnett Can you find another duplicate please? That answer is a mess, and doesn't really address this question. – cs95 Nov 24 '17 at 00:51

1 Answer


Use loc-based indexing to set the values in the matching rows:

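# Titles to fold into "Miss"; "Mme" is mapped to "Mrs" separately, as in the question
miss = ['Mlle', 'Ms']
rare_title = ['Dona', "Lady", ...]

# Select the rows and the column in one .loc call so the assignment hits df itself
df.loc[df.Title.isin(miss), 'Title'] = 'Miss'
df.loc[df.Title == 'Mme', 'Title'] = 'Mrs'
df.loc[df.Title.isin(rare_title), 'Title'] = 'Rare Title'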
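
For context on why the question's first attempt has no effect, here is a minimal sketch. The five-row frame is made up for illustration and is not the asker's data. It contrasts chained indexing, which assigns to a temporary copy, with a single .loc call.

```python
import warnings
import pandas as pd

# Hypothetical miniature of the question's data (not the real Titanic frame)
full = pd.DataFrame({"Title": ["Mlle", "Ms", "Mme", "Dona", "Mr"]})

# Chained indexing: full[mask] builds a new DataFrame, so the assignment
# lands on that temporary object and `full` itself is left untouched.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # pandas may warn about setting on a copy
    full[full.Title == "Mlle"].Title = "Miss"
unchanged = full.Title.tolist()

# A single .loc call selects rows and column together and writes in place.
full.loc[full.Title.isin(["Mlle", "Ms"]), "Title"] = "Miss"
full.loc[full.Title == "Mme", "Title"] = "Mrs"
full.loc[full.Title.isin(["Dona"]), "Title"] = "Rare Title"
changed = full.Title.tolist()

print(unchanged)
print(changed)
```

Note that listing several columns in .loc, as in the question's second attempt with ['Sex', 'Title'], would write the new value into every listed column of the matching rows, so it is probably safer to name only 'Title'.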
cs95
  • Thank you coldspeed. I will check your solution and get back to you in a bit – Jd Baba Nov 24 '17 at 01:06
  • I applied your code and checked the head and tail of the dataframe, and it seems the title is not changed. – Jd Baba Nov 24 '17 at 01:26
  • @JaneshDevkota I have a few ideas as to why. How have you loaded your data? It appears a lot of them have leading white space characters. – cs95 Nov 24 '17 at 01:29
  • You are right. It looks like my code didn't work because of the whitespace characters. I removed them and it is working fine. – Jd Baba Nov 24 '17 at 01:37
  • Good going, I thought that was it. – cs95 Nov 24 '17 at 01:39
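
The whitespace fix mentioned in the comments might look like the sketch below. The thread does not show how Title was derived or how the asker removed the whitespace, so the sample values and the use of str.strip() here are assumptions.

```python
import pandas as pd

# Hypothetical data: each title carries a leading space (an assumption;
# the thread does not show the actual raw values)
full = pd.DataFrame({"Title": [" Mlle", " Mr", " Dona"]})

# No row matches, because " Mlle" != "Mlle"
matches_before = int((full.Title == "Mlle").sum())

# Strip surrounding whitespace, then the comparison behaves as expected
full["Title"] = full["Title"].str.strip()
full.loc[full.Title == "Mlle", "Title"] = "Miss"
result = full.Title.tolist()

print(matches_before, result)
```

Checking full.Title.unique() before and after stripping is one quick way to spot stray whitespace like this.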