I have a dataset which tracks when a user reads a website. A user can read a website and at anytime therefore the user will appear multiple time. I want to create a column which tracks the number of times a user reads a specific website. But since it is a time series, the count should be incremental. I have about 28gbs so pandas will not be able to handle the work load, so I have to write it in sql.
Sample data below:
Date ID WebID
201901 Bob X-001
201902 Bob X-002
201903 Bob X-001
201901 Sue X-001
Expected Results:
Date ID WebID Count
201901 Bob X-001 1
201902 Bob X-002 1
201903 Bob X-001 2
201901 Sue X-001 1