Calculate the number of non empty cells when the name of a column contains 'XXX'

Question

I have 59 columns whose name is in the format: nn: xxxxxx (ttttttt), where tttttt is some name which is repeated for some particular columns. Now I want to calculate the sum of non-empty cells when tttttt='XXXXXX'. I know how to calculate the number of non-empty cells in a column, but how do I add the condition of ttttt being XXXXXX in the name of a column?

import pandas as pd
df = pd.read_csv("dane.csv", sep=';')
shape = list(df.shape)
nonempty=df.apply(lambda x: shape[0]-x.isnull().sum())

Input:

1: Brandenburg (Post-Panamax)               2: Acheron (Feeder)                        5: Fenton (Feeder)
ES-NL-10633096/1938/X1@hkzydbezon.dk/6749   DE-JP-20438082/2066/A2@qwinfhcaer.cu/68849 NL-LK-02275406/2136/A1@ozmmfdpfts.de/73198
BE-BR-61613986/3551/B1@oqk.bf/39927         NL-LK-02275406/2136/A1@ozmmfdpfts.de/73198
PH-SA-39552610/2436/A1@venagi.hr/80578
PA-AE-59814691/4881/X1@zhicvzvksl.cl/25247  OM-PH-31303222/3671/Z1@jtqy.ml/52408

So for instance for this input, lets say I want to calculate the number of non empty cells for the name in the column 'Feeder'

Could you share some of the data from the CSV, so that we can see the headers and execute your code? — AMC, Oct 29 '19 at 19:06
are you nulls properly defined as `NaN` or just a string with spaces? you might need to convert them first — Umar.H, Oct 29 '19 at 19:07
@AlexanderCécile the headers are as in the input example I included. and the nulls are defined fine — ryszard eggink, Oct 29 '19 at 19:20

score 2 · Accepted Answer · answered Oct 29 '19 at 19:05

You can use filter:

df.filter(like='(Feeder)').isna().sum()

or a more precise version, which requires (Feeder) to appear at the end of the column:

df.filter(regex='.*(\(Feeder\))$').isna().sum()

Output:

2: Acheron (Feeder)    1
5: Fenton (Feeder)     3
dtype: int64

Calculate the number of non empty cells when the name of a column contains 'XXX'

1 Answers1