I am trying to create a .xlsx log of an entire Outlook inbox. To do this, I use win32com to get the inbox object and iterate through each item.
My issue is how slow the process is, as it needs to deal with 10,000-100,000 emails. So far I have attempted it with 15,000 emails, and it takes over an hour.
I believe multiprocessing might be the solution, but I am unable to pass the win32com object to a worker function because it cannot be pickled.
import win32com.client
from multiprocessing import Pool, cpu_count


def create_email(data_input):
    this_email = [data_input.Subject, data_input.Body]
    return this_email


def init_pool():
    outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
    account = outlook.Folders.Item("EMAIL HERE")
    inbox = account.Folders.Item("Inbox").Items
    p = Pool(cpu_count())
    inbox_list = p.map(create_email, inbox)


if __name__ == '__main__':
    init_pool()
Replace EMAIL HERE with the email address of the account being used.
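
One possible way around the pickling problem, as an untested sketch rather than a confirmed fix: pass only plain integers (index ranges) to the workers and let each worker open its own connection to Outlook. The helper names `chunk_ranges` and `read_chunk` are hypothetical, and the sketch assumes the `Items` collection can be read by 1-based index through `Items.Item(i)`. Whether this is actually faster is an open question, since every worker still talks to the same Outlook instance.

```python
def chunk_ranges(total, n_chunks):
    """Split 1..total into at most n_chunks contiguous, inclusive (start, end) ranges."""
    if total <= 0 or n_chunks <= 0:
        return []
    size = -(-total // n_chunks)  # ceiling division
    return [(start, min(start + size - 1, total))
            for start in range(1, total + 1, size)]


def read_chunk(bounds):
    """Hypothetical worker: receives picklable integers, builds its own COM object."""
    # Imported inside the worker so nothing COM-related has to cross process boundaries.
    import win32com.client

    start, end = bounds
    outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
    inbox = outlook.Folders.Item("EMAIL HERE").Folders.Item("Inbox").Items
    rows = []
    for i in range(start, end + 1):  # assumption: Outlook collections are 1-based
        item = inbox.Item(i)
        rows.append([item.Subject, item.Body])  # plain strings can be pickled back
    return rows
```

With these in place, the parent process would read `inbox.Count` once and call something like `Pool(cpu_count()).map(read_chunk, chunk_ranges(count, cpu_count()))` under the `if __name__ == '__main__':` guard, then flatten the returned lists before writing the .xlsx file.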