I use the following Python code to read a CSV file with 50K rows. Every row contains a 4-digit code, for example '1234'.
import csv
import re
from collections import Counter  # needed for Counter(...) further down
import pandas as pd
df = pd.read_csv('Parkingtickets.csv', sep=';', encoding='ISO-8859-1')
df['Parking tickets']
I would like to count how often each code occurs and get the 5 most common codes together with their counts.
codes = df['Parking tickets']
Counter(codes).most_common(5)
With this I get roughly what I'm looking for, but it counts whole cell values rather than only the 4-digit codes, and some rows may contain two codes. How can I use "re.findall(r'\d{4}')" here? I think I need it, but I don't understand how to apply it to the column.
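
To make the goal concrete, here is a minimal sketch of the kind of result I'm after. It rests on some assumptions: the `cells` list is made-up sample data standing in for df['Parking tickets'], and the `\b` word boundaries are my guess at how to avoid matching part of a longer number.

```python
import re
from collections import Counter

# Hypothetical sample standing in for df['Parking tickets'];
# looping over the real pandas column should work the same way.
cells = ["1234", "ticket 1234 and 5678", "5678", "no code", float("nan"), 1234]

# \b is an assumption: it stops the pattern from matching 4 digits inside a
# longer digit run. Drop it to match any 4 consecutive digits.
pattern = re.compile(r"\b\d{4}\b")

counts = Counter()
for cell in cells:
    # str() copes with cells that were parsed as numbers or left as NaN
    counts.update(pattern.findall(str(cell)))

top5 = counts.most_common(5)
print(top5)  # [('1234', 3), ('5678', 2)]
```

Because `findall` returns a list of every match in a cell, a row holding two codes contributes both of them to the counter. Something like `df['Parking tickets'].astype(str).str.findall(r'\d{4}').explode().value_counts().head(5)` might do the same within pandas, but I have not verified that on my data.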