
Is there a way to scrape a large batch of daily emails (uniform in format) for content, for instance a published price, in R or Python? If so, what is the best method for extracting the data to a CSV file or to Excel?

Thank you.

user2554330
Adriel

1 Answer


I hope I can help.

See the post:

Parsing outlook .msg files with python

It shows how to read the fields of a .msg file:

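# Requires the pywin32 package and a local Outlook installation (Windows only).
import win32com.client
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
# Open a saved .msg file as an Outlook item.
msg = outlook.OpenSharedItem(r"C:\test_msg.msg")
# Plain-text body of the message.
text = msg.Body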

The text variable now holds the message body, which you can parse, for example with a regular expression.
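As a rough sketch of that parsing step, the snippet below pulls a price out of each message body and collects the results as CSV. The "Price: $1,234.56" line format, the file names, and the helpers extract_price and write_prices are all assumptions made up for illustration, since the real email layout has not been posted; the pattern would need adjusting to match the actual text.

```python
import csv
import io
import re

# Hypothetical pattern: assumes each email has a line like "Price: $1,234.56".
PRICE_RE = re.compile(r"Price:\s*\$?(\d[\d,]*(?:\.\d+)?)")

def extract_price(body):
    """Return the first price found in the body as a float, or None."""
    match = PRICE_RE.search(body)
    if match is None:
        return None
    # Drop thousands separators before converting.
    return float(match.group(1).replace(",", ""))

def write_prices(rows, out):
    """Write (source, price) pairs as CSV to an open text file object."""
    writer = csv.writer(out)
    writer.writerow(["source", "price"])
    writer.writerows(rows)

# Made-up sample bodies standing in for the msg.Body value of each email.
sample_bodies = {
    "day1.msg": "Good morning,\nPrice: $1,234.56\nRegards",
    "day2.msg": "Good morning,\nPrice: 98.70\nRegards",
}

rows = [(name, extract_price(body)) for name, body in sample_bodies.items()]

# An in-memory buffer keeps the sketch self-contained.
buffer = io.StringIO()
write_prices(rows, buffer)
csv_text = buffer.getvalue()
```

To write a real file instead of the in-memory buffer, open it with open("prices.csv", "w", newline="") and pass that file object to write_prices. Excel can open the resulting CSV directly.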

If you have further questions, post a sample of the text you want to parse.

Francisco Gonzalez