1

I know this has been asked here several times, and I have tried what has apparently worked for others. I have more than 1,000 Outlook .msg files, each with .xlsx attachments, stored in folders on my desktop. I only need to extract the .xlsx files so I can combine them into a single dataframe.

I have tried a VBA macro, Python with Win32 (see "Parsing outlook .msg files with python"), and msg-extractor. The best I can do is extract a single attachment from a single .msg file.

Any advice is greatly appreciated. Thank you!

import os

import win32com.client

path = r'C:\Users\Me\Desktop\MyFiles\feb_2018'
files = [f for f in os.listdir(path) if '.msg' in f]
print(files)
for file in files:
    outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
    # Open the .msg file from disk as an Outlook item
    msg = outlook.OpenSharedItem(os.path.join(path, file))
    att = msg.Attachments
    for i in att:
        # Save each attachment next to the .msg files under its own file name
        i.SaveAsFile(os.path.join(path, i.FileName))


Vassilis M
  • The VBA macro seems to work best, but it only extracts one file. Is there a command I am missing somewhere in the code? – Vassilis M Oct 31 '19 at 16:55

3 Answers

3

I have not tried saving attachments with win32com, so I can't tell why only a single attachment from a single file is being saved. However, I was able to save multiple attachments using msg-extractor (the extract_msg package):

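import os

import extract_msg

path = r'C:\Users\Me\Desktop\MyFiles\feb_2018'  # folder containing the .msg files
attach_path = "path where the files have to be saved."
os.makedirs(attach_path, exist_ok=True)  # create the output folder once, up front

files = [f for f in os.listdir(path) if f.lower().endswith('.msg')]
for file in files:
    # os.listdir returns bare file names, so join each one with the folder
    msg = extract_msg.Message(os.path.join(path, file))
    for attachment in msg.attachments:
        attachment.save(customPath=attach_path)
    msg.close()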
anshu
0

I figured out a solution that extracts multiple files with Win32 by including a counter, which is prefixed to each saved file name:

import os

import win32com.client

path = r'C:\Users\filepath'  # change to the directory where your .msg files are located
files = [f for f in os.listdir(path) if f.lower().endswith('.msg')]
print(files)
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
counter = 0
for file in files:
    msg = outlook.OpenSharedItem(os.path.join(path, file))
    for att in msg.Attachments:
        counter += 1
        # The counter prefix keeps attachments with the same name from overwriting each other
        att.SaveAsFile(os.path.join(path, str(counter) + att.FileName))
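The counter presumably helps because attachments in different messages share a file name, so each save would otherwise replace the previous one. As an alternative, here is a minimal sketch of a helper that only renames when a clash actually occurs. The name unique_path and the _1, _2 suffix scheme are my own assumptions for illustration, not part of win32com or Outlook.

```python
import os


def unique_path(folder, filename):
    """Return a path inside `folder` that does not exist yet.

    Hypothetical helper: when `filename` is already taken, it appends
    _1, _2, ... before the extension until a free name is found.
    """
    stem, ext = os.path.splitext(filename)
    candidate = os.path.join(folder, filename)
    n = 1
    while os.path.exists(candidate):
        candidate = os.path.join(folder, f"{stem}_{n}{ext}")
        n += 1
    return candidate
```

Inside the attachment loop, the result of unique_path(path, <attachment FileName>) could be passed to SaveAsFile in place of the counter-prefixed path, which keeps the original names readable for the first copy of each file.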
Azhar Khan
Vassilis M
0

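This topic is quite old now, but there is no need for the counter. Note that this version skips an attachment when a file with the same name already exists in the folder: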

import os

import win32com.client

path = r'C:\Users\filepath'  # change to the directory where your .msg files are located
files = [f for f in os.listdir(path) if f.lower().endswith('.msg')]
print(files)
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")

for file in files:
    msg = outlook.OpenSharedItem(os.path.join(path, file))
    for att in msg.Attachments:
        full_path = os.path.join(path, att.FileName)
        # Only save when no file with this name exists yet
        if not os.path.isfile(full_path):
            att.SaveAsFile(full_path)
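The question mentions .msg files spread across several folders, while os.listdir only looks at one. Below is a minimal sketch, using only the standard library, of collecting them from a folder and all of its subfolders. The function name find_msg_files is hypothetical, and the check on the suffix (rather than '.msg' in the name) is my own choice so that names such as report.msg.bak are not picked up.

```python
from pathlib import Path


def find_msg_files(root):
    """Return every .msg file under `root`, including subfolders, sorted by path."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() == ".msg"
    )
```

Each result is a Path object, so it would need str(...) before being handed to OpenSharedItem, which expects a plain string path.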
Azhar Khan