I have the following challenge:

After selecting the columns I need from an import, I have a dataframe that looks like this:

user_id    datetime
1          1473225887
1          1373225887
1          1673225887
2          1173225887
2          1573225887

What I would like to do is twofold:

(1) Convert the datetime values from Unix timestamps to a normal date notation. I have not managed to do this yet.

(2) Group the data on user_id and keep only the first datetime (that is, the earliest date) for every user_id.

The code that I have written so far is below. Note that I am a beginner in Python and have not yet learned classes, so I'd like to start off without them.

I hope you can help me out here! Thanks a lot in advance!

import pandas as pd

def run():
    engagement_dataset = import_engagements()
    engagement_dataset_first_event = first_engagement(engagement_dataset)

def import_engagements():
    df_engagements = pd.read_csv('df_engagements.csv',
                                 sep=';')
    required_columns = ['engagement_unix_timestamp', 'user_id']
    df_engagements = df_engagements[required_columns]
    df_engagements.rename(columns={'engagement_unix_timestamp': 'datetime'}, inplace=True)
    return df_engagements

def first_engagement(engagement_dataset):
    engagement_dataset_grouped = engagement_dataset.groupby(['user_id'])['datetime'].idxmin().reset_index()
    return engagement_dataset_grouped

run()
julien1337
[Here](https://stackoverflow.com/questions/19801727/convert-datetime-to-unix-timestamp-and-convert-it-back-in-python) is an answer that discusses Unix datetime conversions. For the second part, you should be able to use `groupby().min()` rather than `idxmin` to get started. – G. Anderson Oct 25 '18 at 15:36

1 Answer

(1) You can convert a unix formatted datetime with:

df['datetime_formatted'] = pd.to_datetime(df['datetime'], unit='s')
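
As a quick check, here is a minimal sketch of that conversion applied to the sample rows from the question. The frame is built by hand here (an assumption for illustration) instead of being read from df_engagements.csv, and it assumes the timestamps are in seconds.

```python
import pandas as pd

# Sample rows copied from the question; built inline instead of reading the CSV
df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2],
    "datetime": [1473225887, 1373225887, 1673225887, 1173225887, 1573225887],
})

# unit='s' tells pandas the integers are seconds since 1970-01-01 00:00:00 UTC
df["datetime_formatted"] = pd.to_datetime(df["datetime"], unit="s")

print(df)
```

If the timestamps were in milliseconds instead, `unit='ms'` would be the value to pass.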

(2) Then you can group by user_id and use agg to take the minimum, which is the earliest date for each user:

df.groupby('user_id').agg({'datetime_formatted':'min'})
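
If you want to keep the whole row for each user rather than only the aggregated column, one possible variant builds on the `idxmin` call from the question. `idxmin` returns row labels, not dates, so the labels have to be passed to `.loc` to fetch the rows. The sketch below is written without classes, reuses the question's function name `first_engagement`, and uses a hand-built frame as an assumption in place of the CSV import.

```python
import pandas as pd

def first_engagement(engagement_dataset):
    """Return one row per user_id holding that user's earliest timestamp."""
    df = engagement_dataset.copy()
    df["datetime_formatted"] = pd.to_datetime(df["datetime"], unit="s")
    # idxmin() yields the row label of each user's smallest timestamp;
    # .loc[] then selects those complete rows from the frame.
    earliest_labels = df.groupby("user_id")["datetime"].idxmin()
    return df.loc[earliest_labels].reset_index(drop=True)

# Sample rows copied from the question (stand-in for import_engagements())
sample = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2],
    "datetime": [1473225887, 1373225887, 1673225887, 1173225887, 1573225887],
})

result = first_engagement(sample)
print(result)
```

With only two columns, as in the question, this gives the same information as the `agg` approach; the difference matters once the frame carries extra columns you want to keep.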
Franco Piccolo