I need help with this python code. I am making an application that will read the mailbox data (mbox.txt) and count the number of email messages per organization (i.e. domain name of the email address) using a database with the following schema to maintain the counts. The top organizational count is 536.
This is the Schema: CREATE TABLE Counts (org TEXT, count INTEGER)
I've tried so many times I just can't get the count of 536. Here's my code below:
import sqlite3
conn = sqlite3.connect('emaildb.sqlite')
cur = conn.cursor()
cur.execute('DROP TABLE IF EXISTS Counts')
cur.execute('''
CREATE TABLE Counts (org TEXT, count INTEGER)''')
fname = input('Enter file name: ')
if (len(fname) < 1): fname = 'mbox.txt'
fh = open(name)
for line in fh:
if not line.startswith('From: '): continue
pieces = line.split()
org = pieces[1]
cur.execute('SELECT count FROM Counts WHERE org = ? ', (org,))
row = cur.fetchone()
if row is None:
cur.execute('''INSERT INTO Counts (org, count)
VALUES (?, 1)''', (org,))
else:
cur.execute('UPDATE Counts SET count = count + 1 WHERE org = ?',
(org,))
conn.commit()
# https://www.sqlite.org/lang_select.html
sqlstr = 'SELECT org, count FROM Counts ORDER BY count DESC LIMIT 10'
for row in cur.execute(sqlstr):
print(str(row[0]), row[1])
cur.close()
The highest number that I got is 195. Here is the output of the code above:
Enter the file name:
zqian@umich.edu 195
mmmay@indiana.edu 161
cwen@iupui.edu 158
chmaurer@iupui.edu 111
aaronz@vt.edu 110
ian@caret.cam.ac.uk 96
jimeng@umich.edu 93
rjlowe@iupui.edu 90
dlhaines@umich.edu 84
david.horwitz@uct.ac.za 67
Here's the link where I got the text file and wrote it to a text file called mbox.txt (https://www.py4e.com/code3/mbox.txt)