
I'm working on the Titanic dataset (from the Kaggle website). The dataset has a variable called "Siblings/Spouses Aboard" (aka sibsp) with the values (0, 1, 2, 3, 4, 5, 8). I'm trying to find how many siblings were on board the ship. Age 20 is the threshold for interpreting the values. For example, if the age is 19, then interpret the number in Siblings/Spouses Aboard as siblings aboard. But if the age is 20, then interpret the number as spouse aboard unless the number is greater than 1, in which case interpret the number as siblings aboard. My code is given below, but I got stuck (I spent 2 days reading blogs and watching YouTube, but I'm missing something). Please advise!

df.loc[np.logical_and(df.Age>19, df['Siblings/Spouses Aboard'] >1), ['sibling']] = 1
df.loc[np.logical_and(df.Age < 20, df['Siblings/Spouses Aboard']>0),['sibling']] = 1
df.groupby(['sibling', 'Age'])['sibling'].count()
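For reference, the rule described above can be sketched in one vectorized step. This is only a minimal sketch, not a definitive solution: the toy rows are invented for illustration, the column names `siblings_aboard` and `has_sibling` are hypothetical helpers that are not in the Kaggle data, and it assumes the threshold means "under 20" versus "20 or older". Rows with a missing `Age` would fall on the "20 or older" side here, which may or may not be what is wanted.

```python
import numpy as np
import pandas as pd

# Toy rows standing in for the Kaggle data (values invented for illustration).
df = pd.DataFrame({
    "Age": [19, 20, 35, 8, 40, 22],
    "Siblings/Spouses Aboard": [1, 1, 2, 3, 0, 0],
})

sibsp = df["Siblings/Spouses Aboard"]

# Under 20: the whole number is read as siblings.
# 20 or older: only a number greater than 1 is read as siblings.
is_sibling_count = (df["Age"] < 20) | (sibsp > 1)

# Hypothetical helper columns, not part of the original dataset.
df["siblings_aboard"] = np.where(is_sibling_count, sibsp, 0)
df["has_sibling"] = (df["siblings_aboard"] > 0).astype(int)

# Number of passengers travelling with at least one sibling,
# and the sum of the sibling counts across those passengers.
passengers_with_siblings = int(df["has_sibling"].sum())
total_sibling_count = int(df["siblings_aboard"].sum())
print(passengers_with_siblings, total_sibling_count)
```

Filling the helper column with 0 instead of leaving it as NaN means every passenger stays in later `groupby` results, since `groupby` drops NaN keys by default.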

Thank you in advance for any help


Tarik
learner
  • "But if the age is 20, then interpret the number as spouse aboard" Only at exactly 20? – Tarik Dec 12 '20 at 02:32
  • There are many ways to do this. One way is: `df.loc[(df['Age'] >= 20) & (df['Siblings/Spouses Aboard'] > 0), ['Siblings/Spouses Aboard']].count()` – David Erickson Dec 12 '20 at 02:35
  • @DavidErickson - Thanks for your response! I was able to obtain the results I was looking for. – learner Dec 12 '20 at 03:44

0 Answers