1

I want to extract a table from the body of a .msg file with Python. I can get the body content, but I need the table on its own, for example as a DataFrame.

I can read the whole body, but I can't separate the table from the rest of it:

import os
import win32com.client

msg_dir = r"C:\Users\Murilo\Desktop\Emails\030"

# Create the MAPI namespace once instead of once per file
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")

for file in os.listdir(msg_dir):
    if file.endswith(".msg"):
        msg = outlook.OpenSharedItem(os.path.join(msg_dir, file))
        print(msg.Body)

I need only the table that appears in the body, not the whole body.

Murilo Leite

3 Answers

1

If it is an HTML table, use MailItem.HTMLBody (instead of the plain text Body) and extract the table from HTML.
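
As a rough illustration of "extract the table from HTML", here is a minimal sketch that uses only the standard library. The names TableExtractor and extract_tables and the sample HTML are made up for this example; in the question's loop you would pass msg.HTMLBody instead of sample_html. It assumes reasonably well-formed markup and does not handle nested tables, which Outlook-generated HTML sometimes contains.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in an HTML string as a list of rows of cell text."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one list of rows per <table> seen so far
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None
            self._cell = None


def extract_tables(html):
    """Return all tables found in html as lists of rows of strings."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


# Made-up sample standing in for msg.HTMLBody
sample_html = """
<html><body>
<p>Hello, the figures are below.</p>
<table>
  <tr><th>Item</th><th>Qty</th></tr>
  <tr><td>Bolts</td><td>12</td></tr>
  <tr><td>Nuts</td><td>30</td></tr>
</table>
<p>Regards</p>
</body></html>
"""

tables = extract_tables(sample_html)
print(tables[0])
```

If pandas is installed, pd.read_html does the same job in one call and returns DataFrames directly, so the hand-written parser is mainly useful when you want to avoid the extra dependency.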

Dmitry Streblechenko
    Thanks! I used that and it works perfectly with pandas (read_html). It creates a list of DataFrames, one for each table in the e-mail body: data = pd.read_html(msg.HTMLBody) – Murilo Leite Jul 04 '19 at 19:45
0

I would look at the extract_msg library. It reads a .msg file directly, without needing Outlook, and exposes the message body, from which you can then extract the table.

import extract_msg

# fileLoc is the path to the .msg file
msg = extract_msg.Message(fileLoc)
msg_message = msg.body

content = 'Body: {}'.format(msg_message)
0

The Outlook object model provides three main ways for working with item bodies:

  1. Body.
  2. HTMLBody.
  3. The Word editor. The WordEditor property of the Inspector class returns an instance of the Word Document that represents the message body, so you can use the Word object model to do whatever you need with the message body. The Copy and Paste methods of the Document will do the trick.

See Chapter 17: Working with Item Bodies for more information.

But I think the easiest and cleanest way is to use the Word object model. You can read more about how to work with the Word object model, and how to use it to extract table content, in the How to read contents of an Table in MS-Word file Using Python? post.
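
A minimal sketch of the Word object model route, under a few assumptions: the helper names are made up for this example, the table is a regular grid (Table.Cell fails on merged cells), and GetInspector.WordEditor actually returns a document for the opened item, which may depend on the Outlook version and security settings. Word appends end-of-cell marker characters to each cell's text, so they are stripped here.

```python
def clean_cell_text(text):
    """Strip the end-of-cell markers Word appends to each cell's text."""
    return text.rstrip("\r\x07").strip()


def word_table_to_rows(table):
    """Convert a Word Table object into a list of rows of strings.

    Assumes a regular grid without merged cells.
    """
    rows = []
    for r in range(1, table.Rows.Count + 1):  # Word collections are 1-based
        row = []
        for c in range(1, table.Columns.Count + 1):
            row.append(clean_cell_text(table.Cell(r, c).Range.Text))
        rows.append(row)
    return rows


def first_table_from_msg(path):
    """Open a .msg file through Outlook and return its first body table as rows."""
    # Imported lazily: requires Windows with Outlook and pywin32 installed
    import win32com.client

    outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
    msg = outlook.OpenSharedItem(path)
    doc = msg.GetInspector.WordEditor  # Word Document representing the body
    return word_table_to_rows(doc.Tables(1))
```

The resulting list of rows can be passed to pandas.DataFrame, with the first row used as the header if the table has one.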

Eugene Astafiev