Pretty new to Python. My goal is to download only .xls and .docx email attachments from certain senders to a specified folder. I have the sender conditions working but can't get the program to filter to the specific filetypes I want. The code below downloads all attachments from the listed senders, including signature images (not desired). The downloaded attachments contain data that will later be loaded into a df. I'd like to keep it within win32com since I have other working email scraping programs that use it. I appreciate any suggestions.

Partially working code:

import win32com.client

outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
inbox = outlook.GetDefaultFolder(6)

Items = inbox.Items
Item = Items.GetFirst()

def saveAttachments(email:object):
        for attachedFile in email.Attachments:
                try:
                        filename = attachedFile.FileName
                        attachedFile.SaveAsFile("C:\\Outputfolder\\" + filename)
                except Exception as e:
                        print(e)
for mailItem in inbox.Items:
        if mailItem.SenderName  == "John Smith" or mailItem.SenderName  == "Mike Miller":
                saveAttachments(mailItem)
kmrosan

2 Answers

Firstly, don't loop through all items in a folder. Use Items.Find/FindNext or Items.Restrict with a query on the SenderName property instead; see https://learn.microsoft.com/en-us/office/vba/api/outlook.items.restrict
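
As a rough illustration of the Restrict approach, the sketch below builds a Jet-style filter on SenderName and lets Outlook do the filtering. The helper names `build_sender_filter` and `iter_mail_from` are made up for this example, and doubling the apostrophe to escape it follows my reading of the Items.Restrict documentation, so treat that part as an assumption.

```python
def build_sender_filter(sender_names):
    """Build a Jet filter string that matches any of the given sender names."""
    clauses = []
    for name in sender_names:
        # Assumption: an apostrophe inside a single-quoted value is escaped
        # by doubling it.
        escaped = name.replace("'", "''")
        clauses.append(f"[SenderName] = '{escaped}'")
    return " OR ".join(clauses)


def iter_mail_from(folder, sender_names):
    """Yield only the items in `folder` whose SenderName matches the filter."""
    # `folder` is expected to be an Outlook folder object such as the
    # `inbox` from the question; Restrict returns a filtered Items collection.
    matches = folder.Items.Restrict(build_sender_filter(sender_names))
    for item in matches:
        yield item
```

With the `inbox` object from the question, something like `iter_mail_from(inbox, ["John Smith", "Mike Miller"])` would stand in for the `if mailItem.SenderName == ...` test inside the loop.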

As for the attachments, an image attachment is no different from any other attachment. You can check the file extension or the size. You can also read the PR_ATTACH_CONTENT_ID property (DASL name http://schemas.microsoft.com/mapi/proptag/0x3712001F) using Attachment.PropertyAccessor.GetProperty and check whether it is used in an img tag in the MailItem.HTMLBody property.
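
A minimal Python sketch of the content-ID check might look like the following. The function names and the regular expression are my own illustration, not part of the Outlook API, and the regex only covers the common `src="cid:..."` form, so unusual HTML could slip past it.

```python
import re

PR_ATTACH_CONTENT_ID = "http://schemas.microsoft.com/mapi/proptag/0x3712001F"

# Matches src="cid:..." (single or double quoted) inside an <img> tag.
_IMG_CID = re.compile(
    r"<img\b[^>]*\bsrc\s*=\s*[\"']cid:([^\"']+)[\"']", re.IGNORECASE
)


def referenced_content_ids(html_body):
    """Return the set of content IDs referenced by <img> tags in the body."""
    return {cid.lower() for cid in _IMG_CID.findall(html_body or "")}


def is_inline_image(attachment, html_body):
    """True if the attachment's content ID is used by an <img> tag."""
    try:
        cid = attachment.PropertyAccessor.GetProperty(PR_ATTACH_CONTENT_ID)
    except Exception:
        # Assumption: a failed lookup means the property is absent,
        # so the attachment is treated as a regular one.
        return False
    return bool(cid) and cid.lower() in referenced_content_ids(html_body)
```

Inside the save loop, this would be called as `is_inline_image(attachedFile, email.HTMLBody)` and the attachment skipped when it returns True.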

Dmitry Streblechenko

Currently you save all attached files on the disk:

for attachedFile in email.Attachments:
        try:
                filename = attachedFile.FileName
                attachedFile.SaveAsFile("C:\\Outputfolder\\" + filename)
        except Exception as e:
                print(e)

But what you want is to save "only email attachments from certain senders of .xls and .docx filetypes to a specified folder."

The Attachment.FileName property returns a string representing the file name of the attachment. So, parsing the filename by extracting the file extension will help you to filter files that should be saved on the disk.
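
For instance, the extension check could be sketched in Python like this. `ALLOWED_EXTENSIONS`, `has_allowed_extension` and `save_attachments` are illustrative names, and the output folder is whatever path you pass in. Note that an exact match on ".xls" will not accept ".xlsx"; add it to the set if you need it.

```python
import os

# Extensions to keep, lower-case and including the leading dot.
ALLOWED_EXTENSIONS = {".xls", ".docx"}


def has_allowed_extension(filename, allowed=ALLOWED_EXTENSIONS):
    """Compare the file's extension, case-insensitively, with the allowed set."""
    ext = os.path.splitext(filename)[1].lower()
    return ext in allowed


def save_attachments(email, output_folder, allowed=ALLOWED_EXTENSIONS):
    """Save only attachments with an allowed extension; return saved paths."""
    saved = []
    for attached_file in email.Attachments:
        filename = attached_file.FileName
        if not has_allowed_extension(filename, allowed):
            continue  # skips signature images and other unwanted types
        target = os.path.join(output_folder, filename)
        attached_file.SaveAsFile(target)
        saved.append(target)
    return saved
```

In the question's loop, `save_attachments(mailItem, "C:\\Outputfolder")` would take the place of `saveAttachments(mailItem)`.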

You may also want to skip hidden attachments used for inline images in the message body. Here is some example VBA code that counts the visible attachments (I am not familiar with Python, but the Outlook object model is the same for all programming languages):

Sub ShowVisibleAttachmentCount()
    Const PR_ATTACH_CONTENT_ID As String = "http://schemas.microsoft.com/mapi/proptag/0x3712001F"
    Const PR_ATTACHMENT_HIDDEN As String = "http://schemas.microsoft.com/mapi/proptag/0x7FFE000B"

    Dim m As MailItem
    Dim a As Attachment
    Dim pa As PropertyAccessor
    Dim c As Integer
    Dim cid as String

    Dim body As String

    c = 0

    Set m = Application.ActiveInspector.CurrentItem
    body = m.HTMLBody

    For Each a In m.Attachments
        Set pa = a.PropertyAccessor
        cid = pa.GetProperty(PR_ATTACH_CONTENT_ID)

        If Len(cid) > 0 Then
            If InStr(body, cid) Then
            Else
                'If PR_ATTACHMENT_HIDDEN does not exist, 
                'an error will occur. We simply ignore this error and
                'treat the property as False.
                On Error Resume Next
                If Not pa.GetProperty(PR_ATTACHMENT_HIDDEN) Then
                    c = c + 1
                End If
                On Error GoTo 0
            End If
        Else
            c = c + 1
        End If
    Next a
    MsgBox c
End Sub

In other words, check whether the message body (see the HTMLBody property of Outlook items) contains the PR_ATTACH_CONTENT_ID property value. If it does not, the attachment is visible to users unless the PR_ATTACHMENT_HIDDEN property is explicitly set.
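
A loose Python port of the VBA logic above might look like this sketch. The helper names are invented for the example. Unlike the VBA, it also treats a failed lookup of PR_ATTACH_CONTENT_ID as "no content ID", which is an assumption on my part.

```python
PR_ATTACH_CONTENT_ID = "http://schemas.microsoft.com/mapi/proptag/0x3712001F"
PR_ATTACHMENT_HIDDEN = "http://schemas.microsoft.com/mapi/proptag/0x7FFE000B"


def _get_property(attachment, name, default):
    """Read a MAPI property, falling back to `default` if the lookup fails."""
    try:
        return attachment.PropertyAccessor.GetProperty(name)
    except Exception:
        return default


def is_visible_attachment(attachment, html_body):
    """Mirror the VBA branches: content ID, body reference, hidden flag."""
    cid = _get_property(attachment, PR_ATTACH_CONTENT_ID, "")
    if not cid:
        return True  # no content ID: an ordinary attachment
    if cid in html_body:
        return False  # content ID appears in the body: inline image
    # Content ID present but unused: visible unless explicitly hidden.
    return not _get_property(attachment, PR_ATTACHMENT_HIDDEN, False)


def count_visible_attachments(mail_item):
    """Count the attachments a user would see in the attachment list."""
    body = mail_item.HTMLBody or ""
    return sum(
        1 for a in mail_item.Attachments if is_visible_attachment(a, body)
    )
```

Combined with an extension check, `is_visible_attachment(attachedFile, email.HTMLBody)` could be used to decide which attachments are worth saving.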

You may also find the thread "Sending Outlook Email with embedded image using VBS" helpful.

Eugene Astafiev