I recently got help with filtering my text file.
Example of the text file data:
user_1384 visit_2184 1330746454
user_1385 visit_2185 1330776888
user_1385 visit_2185 1330776913
user_1386 visit_2186 1330794280
user_1387 visit_2187 1330800094
user_1388 visit_2188 1330805203
user_1388 visit_2188 1330805217
The help came in this thread:
Filtering results of a Counter function
For my hobby project I picked the solution that filters the data with the pandas module, and it works like a charm.
The code:
import pandas as pd
df = pd.read_csv("zadanie_3_dane.txt", header=None, sep=r'\s+')  # raw string avoids the invalid-escape warning for \s
df.columns = ['users', 'visits', 'dates']
n = 1
print(df['users'].value_counts()[:n])
print(df['visits'].value_counts()[:n])
The next thing I want to learn is how to count the number of 'users' who started their 'visits' between certain hours (e.g. between 12:00 and 16:00).
The beginning of a 'visit' would be the first time a 'user' logs in. I want to count only unique 'users'; duplicates should not be counted.
I've read that I should (should I?) first convert my timestamps to a time-of-day format:
df.index = pd.to_datetime(df.index)
print((df['users'].between_time('12:00', '16:00')))
My puny attempt doesn't work, and once again I bow down to the knowledge of the mighty Stack.
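For reference, a minimal sketch of one way the counting could be approached. It rests on assumptions not confirmed above: that the third column holds Unix timestamps in seconds, and that the 12:00 to 16:00 window is meant in UTC (local time would need an extra `tz_localize`/`tz_convert` step). Note that `between_time` only works on a `DatetimeIndex`, and `df.index` in the attempt above is just the default row numbers, so the 'dates' column has to be converted and moved into the index first. The sample rows are the ones shown at the top.

```python
import io
import pandas as pd

# Sample rows copied from the top of the post; a real run would read the file instead.
sample = """\
user_1384 visit_2184 1330746454
user_1385 visit_2185 1330776888
user_1385 visit_2185 1330776913
user_1386 visit_2186 1330794280
user_1387 visit_2187 1330800094
user_1388 visit_2188 1330805203
user_1388 visit_2188 1330805217
"""

df = pd.read_csv(io.StringIO(sample), header=None, sep=r"\s+",
                 names=["users", "visits", "dates"])

# Assumption: the integers are Unix timestamps in seconds (interpreted as UTC).
df["dates"] = pd.to_datetime(df["dates"], unit="s")

# The start of a visit is taken to be its earliest timestamp.
starts = df.groupby(["users", "visits"], as_index=False)["dates"].min()

# between_time requires a DatetimeIndex, so index the visit starts by time.
in_window = starts.set_index("dates").between_time("12:00", "16:00")

# Count each user once, no matter how many visits started in the window.
unique_users = in_window["users"].nunique()
print(unique_users)
```

With these seven rows, only user_1385 has a visit starting in the window (12:14:48 UTC), so the count is 1.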
Once I understand the above, I'd also like to learn how to calculate the maximum number of 'visits' that happen at the same time.
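A possible lead for this second part, as a rough sketch only: treat each visit as an interval from its first to its last logged timestamp (an assumption, since the file does not say when a visit ends), then sweep over the start and end events while keeping a running count. The helper name `max_concurrent` and the toy rows below are made up for illustration, because the sample rows above never overlap.

```python
import io
import pandas as pd

def max_concurrent(spans: pd.DataFrame) -> int:
    """Peak number of overlapping intervals; expects 'start' and 'end' columns.

    Intervals are treated as closed, so a visit ending at time t still
    overlaps one starting at t. Assumes at least one interval.
    """
    events = pd.concat([
        pd.DataFrame({"t": spans["start"], "delta": 1}),   # visit opens
        pd.DataFrame({"t": spans["end"], "delta": -1}),    # visit closes
    ])
    # Sort by time; at equal times, handle openings (+1) before closings (-1).
    events = events.sort_values(["t", "delta"], ascending=[True, False])
    return int(events["delta"].cumsum().max())

# Hypothetical toy data in the same three-column layout (not from the real file).
toy = """\
user_a visit_a 100
user_a visit_a 200
user_b visit_b 150
user_b visit_b 300
user_c visit_c 180
user_c visit_c 400
user_d visit_d 500
"""

df = pd.read_csv(io.StringIO(toy), header=None, sep=r"\s+",
                 names=["users", "visits", "dates"])

# Assumption: a visit lasts from its earliest to its latest timestamp.
spans = df.groupby("visits")["dates"].agg(start="min", end="max")

peak = max_concurrent(spans)
print(peak)
```

In the toy data, visit_a, visit_b and visit_c are all open between 180 and 200, so the peak is 3. A visit with a single timestamp counts as a zero-length interval under this assumption.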
If anyone has any leads on the things I want to learn, your help will be greatly appreciated.
Cheers!