0

I am able to run a count on the entire dataset easily using

import pandas as pd
data['eventcode'].value_counts()

which produces counts for all the unique values in the column 'eventcode'. Now I want to run the same count process but only where a different column has value 1. How should I go about this? Thanks in advance.

Carcigenicate
  • 43,494
  • 9
  • 68
  • 117
wizman243
  • 3
  • 1
  • Does this answer your question? [Python Pandas Counting the Occurrences of a Specific value](https://stackoverflow.com/questions/35277075/python-pandas-counting-the-occurrences-of-a-specific-value) – William Baker Morrison Nov 19 '20 at 17:38

4 Answers

2

You can first filter on the other column and then call value_counts(), like so:

data[data['othercolumn'] == 1]['eventcode'].value_counts()
ApplePie
  • 8,814
  • 5
  • 39
  • 60
2

You can use df.loc:

data.loc[data['othercolumn'] == 1,'eventcode'].value_counts()
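
As a minimal sketch on made-up data (the sample frame is an assumption for illustration; the column names follow the answers above), the boolean-mask filter and the .loc form select the same rows, so they give the same counts:

```python
import pandas as pd

# Hypothetical sample data, not from the original question
data = pd.DataFrame({
    'eventcode': ['a', 'b', 'a', 'c', 'a', 'b'],
    'othercolumn': [1, 0, 1, 1, 0, 1],
})

# Boolean mask: True for rows where the other column equals 1
mask = data['othercolumn'] == 1

# Filter the rows first, then pick the column
counts_filtered = data[mask]['eventcode'].value_counts()

# Select rows and column in a single step with .loc
counts_loc = data.loc[mask, 'eventcode'].value_counts()

print(counts_loc)
```

The .loc form avoids chained indexing, which is usually the safer habit if you later assign to the selection.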
Wasif
  • 14,755
  • 3
  • 14
  • 34
1

If you need to do this for multiple unique values, you can use groupby + size and then select the counts for the subset you need from the result.

import pandas as pd
import numpy as np

np.random.seed(410112)
df = pd.DataFrame({'othercol': np.random.choice(range(3), 100),
                   'eventcode': np.random.choice(list('abc'), 100)})

s = df.groupby(['othercol', 'eventcode']).size()
#othercol  eventcode
#0         a            10
#          b            10
#          c             9
#1         a            17
#          b            15
#          c            10
#2         a            10
#          b            12
#          c             7

# Where `df['othercol'] == 1`
s.loc[1]
#eventcode
#a    17
#b    15
#c    10
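
As a hedged cross-check, the sketch below rebuilds the same random frame and compares the slice of the grouped result with a direct filter-then-count. The names `direct`, `same` and `table` are introduced here for illustration only:

```python
import pandas as pd
import numpy as np

np.random.seed(410112)
df = pd.DataFrame({'othercol': np.random.choice(range(3), 100),
                   'eventcode': np.random.choice(list('abc'), 100)})

# Counts for every (othercol, eventcode) pair
s = df.groupby(['othercol', 'eventcode']).size()

# Direct filter-then-count for othercol == 1
direct = df.loc[df['othercol'] == 1, 'eventcode'].value_counts()

# Compare as label -> count mappings, so ordering does not matter
same = s.loc[1].to_dict() == direct.to_dict()
print(same)

# unstack() lays the counts out as a table: one row per othercol value,
# one column per eventcode; fill_value=0 covers pairs that never occur
table = s.unstack(fill_value=0)
print(table)
```

The grouped approach computes all subsets in one pass, which tends to pay off only when you need counts for more than one value of the other column.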
ALollz
  • 57,915
  • 7
  • 66
  • 89
0

Basically, iterate over the column's values and increment a counter:

cnt = 0

def set_count(value):
    # Series.apply passes each cell value, not a whole row
    global cnt  # needed to rebind the module-level counter
    if value == 1:
        cnt += 1

data['different_column_name'].apply(set_count)
Vahalaru
  • 380
  • 4
  • 15
  • Careful with this code as it can only be executed once or cnt must be reset manually. Also, using apply() is generally less performant than using the pandas built-in methods and functions. – ApplePie Nov 19 '20 at 17:59
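
A hedged sketch of the built-in alternative that the comment alludes to, on made-up data (the sample values are an assumption; the column name is taken from the answer): comparing the column to 1 gives a boolean Series, and summing it counts the matches without a global counter or apply().

```python
import pandas as pd

# Hypothetical sample data for illustration
data = pd.DataFrame({'different_column_name': [1, 0, 1, 1, 2]})

# True counts as 1 when summed, so this is the number of rows equal to 1
cnt = int((data['different_column_name'] == 1).sum())
print(cnt)  # 3
```

Because nothing is accumulated in external state, this can be re-run any number of times and always gives the same result.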