You can achieve this by using the groupby method in conjunction with the head method in pandas. Here's a solution to keep only the first 100 duplicates for each pair of 'a' and 'b':
import pandas as pd
# Your example DataFrame
data = {'a': [1, 1, 2, 2, 1], 'b': [2, 2, 3, 3, 2], 'c': [3, 3, 4, 4, 3]}
df = pd.DataFrame(data)
# Set the number of duplicates you want to keep
num_dups_to_keep = 100
# Group the DataFrame by columns 'a' and 'b', and keep only the first 'num_dups_to_keep' rows for each group
result = df.groupby(['a', 'b']).head(num_dups_to_keep)
# Reset the index
result = result.reset_index(drop=True)
print(result)
This code snippet will group the DataFrame by the 'a' and 'b' columns, and then keep only the first 100 rows for each group. If you have less than 100 duplicates for a specific pair, it will keep all of them.