I'd like to ask for help with finishing my Python code.
I have a huge text file with 3 columns:
- The first has user names, for example: user_003
- The second has visit identifiers, for example: visit_456
- The third has the timestamps of these visits.
Example:
(...)
user_123 visit_188 1330796847
user_123 visit_188 1330797173
user_123 visit_189 1330802227
user_123 visit_189 1330802277
user_123 visit_190 1330806287
user_123 visit_190 1330806353
(...)
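To make the layout concrete, here is a minimal sketch of how a single row splits into its three fields. The helper name parse_line is hypothetical, and it assumes the columns are whitespace-separated and that the third column is a Unix timestamp in seconds (the values look like it, but that is an assumption):

```python
from datetime import datetime, timezone


def parse_line(line):
    """Split one row into (user, visit, timestamp).

    Hypothetical helper: assumes three whitespace-separated fields and
    a Unix timestamp (seconds since the epoch) in the third column.
    """
    user, visit, stamp = line.split()
    return user, visit, datetime.fromtimestamp(int(stamp), tz=timezone.utc)


user, visit, when = parse_line("user_123 visit_188 1330796847")
print(user, visit, when.isoformat())
```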
I've written a small script that counts the frequencies of ALL words in my text file: user names, visits and timestamps together.
I can easily print the most frequent words (for the moment I pass 10 to most_common(), so I get the top ten).
What I need now is to narrow the results down, so that instead of the whole list of word counts I show only:
- first: the name of the most common visit and how many times it appears
- second: the name of the user that appears most often in my text file
I've tried several things, but nothing has worked so far. I'll gladly accept any help. Thanks in advance.
My code:
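import re
from collections import Counter

# Read the whole file into memory
with open("bigfile.txt", "r") as f:
    data = f.read()

# Split into word tokens (user names, visits and timestamps alike)
words = re.findall(r'\w+', data)
# Ten most frequent tokens as (token, count) pairs
word_counts = Counter(words).most_common(10)
print(word_counts)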
output:
[('user_819', 27), ('user_356', 25), ('visit_637', 25), ('user_520', 24), ('user_1222', 24), ('user_191', 22), ('user_473', 22), ('user_542', 22), ('user_812', 22), ('visit_1383', 22)]
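
One possible direction, as a minimal sketch rather than a definitive solution: keep one Counter per column instead of a single Counter over every word, then ask each for most_common(1). The function name top_user_and_visit is hypothetical, the sample rows are the ones quoted above, and the sketch assumes every valid row has exactly three whitespace-separated fields and that visits are counted across all users:

```python
from collections import Counter


def top_user_and_visit(lines):
    """Return ((user, count), (visit, count)) for the most frequent entries.

    Hypothetical helper. Either element is None if no valid rows were seen.
    """
    user_counts = Counter()
    visit_counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank or malformed rows
        user, visit, _timestamp = fields
        user_counts[user] += 1
        visit_counts[visit] += 1
    # most_common(1) returns a list with at most one (item, count) pair
    top_user = user_counts.most_common(1)
    top_visit = visit_counts.most_common(1)
    return (top_user[0] if top_user else None,
            top_visit[0] if top_visit else None)


# Sample rows copied from the example in the post
sample = [
    "user_123 visit_188 1330796847",
    "user_123 visit_188 1330797173",
    "user_123 visit_189 1330802227",
    "user_123 visit_189 1330802277",
    "user_123 visit_190 1330806287",
    "user_123 visit_190 1330806353",
]
top_user, top_visit = top_user_and_visit(sample)
print("Most common user:", top_user)
print("Most common visit:", top_visit)

# For the real file, pass the open file object so rows are read one at a time:
# with open("bigfile.txt") as f:
#     top_user, top_visit = top_user_and_visit(f)
```

Iterating over the file line by line also avoids loading the whole file with f.read(), which may matter for a huge file. If the same visit name can occur under different users and should be counted per user, counting the pair (user, visit) instead of visit alone would be the natural change.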