1

I need to create an application that parses a PST file and converts the mails into multiple EML files. Basically, I need to do the opposite of what's being asked in this question.

Is there any sample code or guidelines to achieve this feature?

TtT23
  • Did you try anything so far? Please read [faq] and [ask] – Soner Gönül Jan 04 '13 at 09:35
  • @SonerGönül Googling basically shows literally nothing on how to approach this problem. I suspect I need to work with the COM interop of outlook to achieve this feature. It's not that I'm asking for PLEAZE GIEV CODZ, I really don't know where to begin. – TtT23 Jan 04 '13 at 09:37

2 Answers

4

You could use the Outlook Redemption library, which can open a PST file and extract its messages as .EML (among other formats). Redemption is a COM object (32- or 64-bit) that can be used from C# without any problem. Here is a sample console application that demonstrates this:

using System;
using System.IO;
using System.Text;
using Redemption;

namespace DumpPst
{
    class Program
    {
        static void Main(string[] args)
        {
            // extract 'test.pst' in the 'test' folder
            ExtractPst("test.pst", Path.GetFullPath("test"));
        }

        public static void ExtractPst(string pstFilePath, string folderPath)
        {
            if (pstFilePath == null)
                throw new ArgumentNullException("pstFilePath");

            RDOSession session = new RDOSession();
            RDOPstStore store = session.LogonPstStore(pstFilePath);
            ExtractPstFolder(store.RootFolder, folderPath);
        }

        public static void ExtractPstFolder(RDOFolder folder, string folderPath)
        {
            if (folder == null)
                throw new ArgumentNullException("folder");

            if (folderPath == null)
                throw new ArgumentNullException("folderPath");

            if (folder.FolderKind == rdoFolderKind.fkSearch)
                return;

            if (!Directory.Exists(folderPath))
            {
                Directory.CreateDirectory(folderPath);
            }

            foreach(RDOFolder child in folder.Folders)
            {
                ExtractPstFolder(child, Path.Combine(folderPath, ToFileName(child.Name)));
            }

            foreach (var item in folder.Items)
            {
                RDOMail mail = item as RDOMail;
                if (mail == null)
                    continue;

                // Guard against a null Subject, which ToFileName would reject.
                mail.SaveAs(Path.Combine(folderPath, ToFileName(mail.Subject ?? string.Empty)) + ".eml", rdoSaveAsType.olRFC822);
            }
        }

      /// <summary>
      /// Converts a text into a valid file name.
      /// </summary>
      /// <param name="fileName">The file name.</param>
      /// <returns>
      /// A valid file name.
      /// </returns>
      public static string ToFileName(string fileName)
      {
          return ToFileName(fileName, null, null);
      }

      /// <summary>
      /// Converts a text into a valid file name.
      /// </summary>
      /// <param name="fileName">The file name.</param>
      /// <param name="reservedNameFormat">The reserved format to use for reserved names. If null '_{0}_' will be used.</param>
      /// <param name="reservedCharFormat">The reserved format to use for reserved characters. If null '_x{0}_' will be used.</param>
      /// <returns>
      /// A valid file name.
      /// </returns>
      public static string ToFileName(string fileName, string reservedNameFormat, string reservedCharFormat)
      {
          if (fileName == null)
              throw new ArgumentNullException("fileName");

          if (string.IsNullOrEmpty(reservedNameFormat))
          {
              reservedNameFormat = "_{0}_";
          }

          if (string.IsNullOrEmpty(reservedCharFormat))
          {
              reservedCharFormat = "_x{0}_";
          }

          if (Array.IndexOf(ReservedFileNames, fileName.ToLowerInvariant()) >= 0 ||
              IsAllDots(fileName))
              return string.Format(reservedNameFormat, fileName);

          char[] invalid = Path.GetInvalidFileNameChars();

          StringBuilder sb = new StringBuilder(fileName.Length);
          foreach (char c in fileName)
          {
              if (Array.IndexOf(invalid, c) >= 0)
              {
                  sb.AppendFormat(reservedCharFormat, (short)c);
              }
              else
              {
                  sb.Append(c);
              }
          }

          string s = sb.ToString();

          // directory limit is 255
          if (s.Length > 254)
          {
              s = s.Substring(0, 254);
          }

          if (string.Equals(s, fileName, StringComparison.Ordinal))
          {
              s = fileName;
          }
          return s;
      }

      private static bool IsAllDots(string fileName)
      {
          foreach (char c in fileName)
          {
              if (c != '.')
                  return false;
          }
          return true;
      }

      private static readonly string[] ReservedFileNames = new[]
      {
          "con", "prn", "aux", "nul",
          "com0", "com1", "com2", "com3", "com4", "com5", "com6", "com7", "com8", "com9",
          "lpt0", "lpt1", "lpt2", "lpt3", "lpt4", "lpt5", "lpt6", "lpt7", "lpt8", "lpt9"
      };
    }
}
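One thing the sample above does not address: two messages with the same subject in the same folder map to the same .eml path. A minimal sketch of one way to avoid that follows; `GetUniquePath` is a hypothetical helper (not part of Redemption or of the code above) that appends a counter until the path is free.

```csharp
using System.IO;

static class UniquePath
{
    // Hypothetical helper: returns 'path' if no file exists there yet,
    // otherwise "name (1).ext", "name (2).ext", ... until one is free.
    public static string GetUniquePath(string path)
    {
        if (!File.Exists(path))
            return path;

        string directory = Path.GetDirectoryName(path) ?? string.Empty;
        string name = Path.GetFileNameWithoutExtension(path);
        string extension = Path.GetExtension(path);

        for (int i = 1; ; i++)
        {
            string candidate = Path.Combine(
                directory, string.Format("{0} ({1}){2}", name, i, extension));
            if (!File.Exists(candidate))
                return candidate;
        }
    }
}
```

In the extraction loop, the path passed to `mail.SaveAs` could be wrapped in `UniquePath.GetUniquePath(...)`.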
Simon Mourier
  • I suggest the filename-cleanup code should use a known-good set of characters rather than the reverse, and should take a maximum length, since there is a limit on filenames but not Subject? – Ben Jan 04 '13 at 10:48
  • @ben - I don't know what "known-good" means in a Unicode international world. ToFileName follows (most of) the specification for a valid file name. It doesn't handle prehistoric things such as COM1 all right... Anyway, that was not the subject. Adapt as you see fit. – Simon Mourier Jan 04 '13 at 18:14
  • You don't have to worry about NUL and COM1 because you are putting a `.eml` extension on the end. By "Known-good" I mean things which are not going to surprise either the filesystem, or the user, or any further automatic processing steps. So you probably don't want quotes or whitespace either, even though they are permitted by the filesystem. So I would suggest allowing `[-a-zA-Z0-9_.]` and pretty much nothing else, unless you have a known requirement for it. That's what I do. – Ben Jan 05 '13 at 11:08
  • @ben - I don't agree with you (a folder could be called COM1) but, as I said, this is not the subject of the question. – Simon Mourier Jan 06 '13 at 08:47
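
The allow-list approach Ben describes in the comments above could be sketched roughly as follows. This is an illustration only: `ToSafeFileName` and its `maxLength` parameter are hypothetical names, and replacing every disallowed character with an underscore is an assumption, not something either commenter specified.

```csharp
using System;
using System.Text;

static class SafeFileName
{
    // Hypothetical helper: keeps only [-a-zA-Z0-9_.], replaces every other
    // character with '_', and truncates the result to maxLength characters.
    public static string ToSafeFileName(string text, int maxLength)
    {
        if (text == null)
            throw new ArgumentNullException("text");
        if (maxLength < 1)
            throw new ArgumentOutOfRangeException("maxLength");

        var sb = new StringBuilder(Math.Min(text.Length, maxLength));
        foreach (char c in text)
        {
            if (sb.Length == maxLength)
                break;

            bool allowed = (c >= 'a' && c <= 'z') ||
                           (c >= 'A' && c <= 'Z') ||
                           (c >= '0' && c <= '9') ||
                           c == '-' || c == '_' || c == '.';
            sb.Append(allowed ? c : '_');
        }

        // Never return an empty name.
        return sb.Length == 0 ? "_" : sb.ToString();
    }
}
```

The trade-off is the one raised in the thread: an allow-list is predictable for later processing steps, but it flattens every non-ASCII subject into underscores.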
1

You essentially need to do the inverse of what is being asked in that question.

  • Load the PST file using Outlook interop (or Redemption, as above).
  • Enumerate all the messages.
  • Use CDO, System.Net.Mail or similar to compose an EML file for each message in the PST.

The thing to note is that a PST doesn't contain EML files; it contains MSG files. You will therefore have to do some form of conversion, and you will not get back exactly what was originally sent.
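
For the "compose an EML file" step, one option is to let System.Net.Mail write the file itself by pointing `SmtpClient` at a pickup directory instead of a server. The sketch below assumes you have already read the sender, recipient, subject and body from the PST item; `EmlWriter` and `WriteEml` are hypothetical names, and only plain-text fields are handled. Attachments and the original headers would have to be copied separately, which is one reason the result will differ from the original message.

```csharp
using System.IO;
using System.Net.Mail;

static class EmlWriter
{
    // Hypothetical helper: composes a message and writes it as an .eml file
    // into outputDirectory. SmtpClient picks the file name itself.
    public static void WriteEml(string from, string to, string subject,
                                string body, string outputDirectory)
    {
        // The pickup directory must be an existing absolute path.
        string fullPath = Path.GetFullPath(outputDirectory);
        Directory.CreateDirectory(fullPath);

        using (var message = new MailMessage(from, to, subject, body))
        using (var client = new SmtpClient())
        {
            // Write to disk instead of sending over the network.
            client.DeliveryMethod = SmtpDeliveryMethod.SpecifiedPickupDirectory;
            client.PickupDirectoryLocation = fullPath;
            client.Send(message);
        }
    }
}
```

Because the generated file name is not under your control, you may want to rename the file afterwards if the output needs subject-based names.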

See also this question: Are there .NET Framework methods to parse an email (MIME)?

Ben