Is there a way to scrape a large batch of daily emails (uniform in format) for content (a published price, for instance) in R or Python? If so, what is the best way to export the data to a CSV or Excel file?
Thank you.
I hope I can help.
Take a look at this post:
Parsing outlook .msg files with python
It shows how to read the fields of a .msg file.
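# Requires the pywin32 package and a local Outlook installation (Windows only).
import win32com.client
outlook = win32com.client.Dispatch("Outlook.Application").GetNamespace("MAPI")
# Open a saved .msg file and read its plain-text body.
msg = outlook.OpenSharedItem(r"C:\test_msg.msg")
text = msg.Body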
Now that the message body is in the text variable, you can parse it, for example with a regular expression.
If you have further questions, post a sample of the text you want to parse.
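Without a sample of your emails I can only guess at the format, but here is a rough sketch of the regular-expression and CSV part. It assumes each body contains a line such as "Price: $1,234.56"; that pattern and the names PRICE_RE, extract_price and write_prices are made up for illustration, so adjust them to your real text.

```python
import csv
import re

# Hypothetical pattern: assumes the body has a line like "Price: $1,234.56".
PRICE_RE = re.compile(r"Price:\s*\$?([\d,]+(?:\.\d+)?)")


def extract_price(text):
    """Return the first price found in text as a float, or None if absent."""
    match = PRICE_RE.search(text)
    if match is None:
        return None
    # Drop thousands separators before converting to a number.
    return float(match.group(1).replace(",", ""))


def write_prices(rows, csv_path):
    """Write (source, price) pairs to a CSV file with a header row."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["source", "price"])
        writer.writerows(rows)
```

For a folder of .msg files, you could loop over the paths, call extract_price on the Body of each opened message, collect the (path, price) pairs in a list, and pass that list to write_prices. Excel opens the resulting CSV file directly.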