I want to count how often a regex pattern occurs across multiple DataFrame columns. The preceding and following characters are needed to identify the pattern, so I use lookarounds. I found a solution, but it seems a little slow. Is there a more efficient way?
column_A | column_B | column_C |
---|---|---|
Test • test abc | winter • sun | snow rain blank |
blabla • summer abc | break • Data | test letter • stop. |
So far I have this solution, which is slow:
print(df["column_A"].str.count("(?<=[A-Za-z]) • (?=[A-Za-z])").sum() + df["column_B"].str.count("(?<=[A-Za-z]) • (?=[A-Za-z])").sum() + df["column_C"].str.count("(?<=[A-Za-z]) • (?=[A-Za-z])").sum())
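For reference, here is a minimal reproducible sketch of the same count, assuming the DataFrame contains exactly the two example rows from the table above. It shows the per-column version written as a loop, plus one possible alternative that joins all cells into a single string and scans it once with a compiled pattern. The names `cols`, `total_per_column`, and `total_joined` are only illustrative, and whether the joined version is actually faster has not been measured here.

```python
import re

import pandas as pd

# Example data reconstructed from the table in the question
df = pd.DataFrame({
    "column_A": ["Test • test abc", "blabla • summer abc"],
    "column_B": ["winter • sun", "break • Data"],
    "column_C": ["snow rain blank", "test letter • stop."],
})

# Same pattern as in the question, compiled once
pattern = re.compile(r"(?<=[A-Za-z]) • (?=[A-Za-z])")
cols = ["column_A", "column_B", "column_C"]

# Per-column count, equivalent to the chained expression in the question
total_per_column = sum(df[c].str.count(pattern.pattern).sum() for c in cols)

# Alternative sketch: join every cell with a newline and scan once.
# A newline is not a letter, so the lookarounds cannot match across cells.
# fillna("") is an assumption to guard against missing values.
joined = "\n".join(df[cols].fillna("").to_numpy().ravel())
total_joined = len(pattern.findall(joined))

print(total_per_column, total_joined)
```

Both variants should give the same total; the joined one trades the pandas string accessor for a single pass with Python's `re` module.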