import pandas as pd

# Read CSV data file:
df = pd.read_csv('~/nclab-data-read/titanic.csv')

# Port where most passengers embarked:
port = df['Embarked'].mode()[0]
# Count these passengers:
n_port = df[['Name']].loc[df['Embarked'] == 1].count()[0]

I believe something is wrong in the last line, but I can't figure out what.

  • `sum(df['Embarked'] == port)` is probably enough for what you want. – 9769953 May 15 '21 at 20:48
  • To improve this and future questions, please include a small subset of your data as a __copyable__ piece of code that can be used for testing, as well as your expected output. See [MRE - Minimal, Reproducible, Example](https://stackoverflow.com/help/minimal-reproducible-example), and [How to make good reproducible pandas examples](https://stackoverflow.com/q/20109391/15497888). – Henry Ecker May 15 '21 at 20:49
  • Geesh, so simple. Thank you very much! – NVsquirrel May 16 '21 at 00:04
  • `Series.value_counts` works well here: `df['Embarked'].value_counts().head(1)` – Vaishali May 16 '21 at 00:26
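
The two suggestions in the comments can be tried side by side on a small stand-in DataFrame. This is only a sketch: the rows below are made up for illustration and are not the Titanic file from the question; only the `Embarked` column name comes from the question.

```python
import pandas as pd

# Hypothetical stand-in data; not the Titanic file from the question.
df = pd.DataFrame({
    "Name": ["Ann", "Bob", "Cid", "Dee", "Eve"],
    "Embarked": ["S", "C", "S", "Q", "S"],
})

# Most common port, as in the question.
port = df["Embarked"].mode()[0]

# Summing a boolean mask counts the True entries.
n_by_sum = int((df["Embarked"] == port).sum())

# value_counts() sorts by frequency, so the first entry is the most common port.
top = df["Embarked"].value_counts().head(1)

print(port, n_by_sum, top.index[0], int(top.iloc[0]))  # S 3 S 3
```

Both approaches give the same count here. `value_counts()` has the side benefit of returning the port label and its count together, so the separate `mode()` call is not needed.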

1 Answer


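`count()` returns the number of non-null values. Applied to a DataFrame, it returns a Series with one value per column, which is why you need to take the first element.

Applied to a Series, it returns the number directly. Note also that the comparison should be against `port`, the value computed in the previous step, rather than the literal `1`.

n_port = df.loc[df['Embarked'] == port, 'Name'].count()

With the comparison fixed, the DataFrame form and the Series form return the same result.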
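
A minimal sketch of the difference between `DataFrame.count()` and `Series.count()`, assuming a made-up DataFrame in place of the Titanic file (the names and values below are hypothetical). The comparison uses `port`, as suggested in the comments.

```python
import pandas as pd

# Hypothetical stand-in data; one missing name shows that count() skips nulls.
df = pd.DataFrame({
    "Name": ["Ann", None, "Cid", "Dee"],
    "Embarked": ["S", "S", "C", "S"],
})

port = df["Embarked"].mode()[0]
mask = df["Embarked"] == port

# DataFrame.count() returns a Series indexed by column label.
per_column = df[["Name"]].loc[mask].count()
n_from_frame = int(per_column.iloc[0])  # explicit positional access

# Series.count() returns the scalar directly.
n_from_series = int(df.loc[mask, "Name"].count())

# Number of matching rows, regardless of missing names.
n_rows = int(mask.sum())

print(n_from_frame, n_from_series, n_rows)  # 2 2 3
```

The two `count()` forms agree, but both count only non-null `Name` values. If the goal is the number of passengers who embarked at `port`, summing the mask does not depend on whether `Name` is filled in.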

simon