I have a very large dataset (10 GB) in CSV format with many rows and columns. One of the columns holds IDs (stored as strings) for some class of individuals. The IDs appear in no particular order, and each ID may occur more than once. I'd like to find the ID that occurs most frequently in the data. Ideally, I would like a count of how many times each ID occurs in the dataset. Eventually I'd also like to do statistical analysis on the individual IDs. What's the fastest way to accomplish this? I did try groupby, but I don't know how to get the ID corresponding to each group, or the size of each group.
import pandas as pd
# Load the entire CSV into memory
df = pd.read_csv('file')
# Group the rows by the ID column
user_groups = df.groupby('IDs')
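
One possible way to get per-ID counts without holding all 10 GB in memory is to read only the ID column in chunks and add up the counts as you go. This is a minimal sketch, assuming the column is named `IDs` as in the snippet above. The helper name `count_ids`, the chunk size, and the small inline sample are illustrative and not part of the original data.

```python
import io
from collections import Counter

import pandas as pd


def count_ids(source, id_column="IDs", chunksize=1_000_000):
    """Count how often each ID occurs, reading the CSV in chunks."""
    counts = Counter()
    reader = pd.read_csv(
        source,
        usecols=[id_column],      # skip every column except the IDs
        dtype={id_column: str},   # keep IDs as strings
        chunksize=chunksize,      # yields DataFrames of at most this many rows
    )
    for chunk in reader:
        # value_counts() gives ID -> occurrences within this chunk;
        # Counter.update adds them to the running totals.
        counts.update(chunk[id_column].value_counts().to_dict())
    # Index is the ID, value is its total count, most frequent first.
    return pd.Series(counts, name="count").sort_values(ascending=False)


# Tiny stand-in for the real file (hypothetical data, for illustration only).
sample = io.StringIO("IDs,value\na,1\nb,2\na,3\nc,4\na,5\nb,6\n")
id_counts = count_ids(sample, chunksize=2)
most_common_id = id_counts.idxmax()
print(id_counts)
print("Most frequent ID:", most_common_id)
```

If the ID column alone fits in memory, `df['IDs'].value_counts()` or `df.groupby('IDs').size()` on the loaded frame should give the same counts in a single call. In both cases the result is a Series indexed by ID, so `idxmax()` returns the most frequent ID and the Series itself can feed later statistics.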