How to calculate the number of occurrences of each URL

Asked Sep 08 '20 at 08:11

Active Sep 08 '20 at 08:11

Viewed 38 times

I'm having trouble calculating the number of occurrences of each URL in my log file. I have one working variant, but I'm sure it can be done better:

import pandas as pd
import numpy as np

df = pd.DataFrame({'url_id' : [1,2,3,4,2,2,4], 'url' : ['microsoft.com', 'yandex.ru', 'google.com', 'bbc.com', 'yandex.ru', 'yandex.ru', 'bbc.com']})
df['dummy'] = 1
print(df.groupby(['url_id', 'url'])['dummy'].sum())

Output is:

url_id  url
1       microsoft.com    1
2       yandex.ru        3
3       google.com       1
4       bbc.com          2
Name: dummy, dtype: int64

asked Sep 08 '20 at 08:11

Roman Kazmin

Use `df = df.groupby(['url_id', 'url']).size().reset_index(name='count')` – jezrael Sep 08 '20 at 08:12
second answer in dupe. – jezrael Sep 08 '20 at 08:12
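
For reference, here is a minimal sketch of what the comment above suggests, applied to the sample data from the question. The variable name `counts` is only an illustration; the comment itself assigns the result back to `df`.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'url_id': [1, 2, 3, 4, 2, 2, 4],
    'url': ['microsoft.com', 'yandex.ru', 'google.com', 'bbc.com',
            'yandex.ru', 'yandex.ru', 'bbc.com'],
})

# size() counts the rows in each (url_id, url) group directly,
# so the helper 'dummy' column is not needed.
# reset_index(name='count') turns the result into a flat DataFrame
# with the counts in a column named 'count'.
counts = df.groupby(['url_id', 'url']).size().reset_index(name='count')
print(counts)
```

This should give one row per (url_id, url) pair with its count in the `count` column, matching the sums produced by the `dummy` approach.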