I have two tables/dataframes: users and activity.
In users I have the following columns: UserId, Country, DOB, Gender, RegDate, WeekAfterRegDate
where:
UserId: the Id of each user (only appears once in this table), there only one row for each UserId in this dataframe/table --> It is also the key column that links both tables/dataframes
DOB: date of birth
RegDate: Registration date of the user
WeekAfterRegDate: The date after 7 days since registration
In activity I have the following columns: UserId, Date, Revenue
where:
UserId: the same column as in the users, but it can appear in more than one row here as there dare different revenues
I need to calculate the average revenue generated per user in the first week
And I have been given these clues, which might be useful:
- Merge the 2 datasets
- Calculate the days since registration for each user and date in the activity table
- Consider ALL REVENUE (not just the one generated by each user) generated in the first 7 days after registration for each user
In summary what I need to do is make a loop that sums Renevue between two Dates for each UserId. The period between the two dates is RegDate and WeekAfterRegDate.
I have been trying different methods, like groupby, etc, but I am a bit lost.