
I have a scheduled job that fetches recent emails from a folder and writes each one to a file (.eml). It was taking a long time to finish: 5 to 6 minutes to read a single 9 MB email. Since I am using the JavaMail API, I set the properties below in my code to improve performance, and the same read now takes far less time (about 20 seconds).

props.setProperty("mail.imaps.partialfetch","false");
props.setProperty("mail.imaps.fetchsize", "1048576");
  1. Does setting fetchsize to a larger value create any other issues in my application?
  2. Setting fetchsize to 1048576 means that my scheduled job will always take this much memory, and the remaining memory will be allocated to the rest of my application. Is my understanding correct? If not, could someone help me understand this better with an example?

The entire code is below:

package com.indiscover;

import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

import java.util.Properties;

import javax.mail.Flags;
import javax.mail.Folder;
import javax.mail.Message;
import javax.mail.MessagingException;
import javax.mail.NoSuchProviderException;
import javax.mail.Session;
import javax.mail.Store;
import javax.mail.search.FlagTerm;

public class ReadMail {

    public static void main(String[] args) throws InterruptedException, IOException {
        String protocol="imaps";
        String emailAddress = "email_id";
        String password = "password";

        Properties props = new Properties();
        props.setProperty("mail.store.protocol", protocol);
        props.setProperty("mail.imaps.socketFactory.class", "javax.net.ssl.SSLSocketFactory");
        props.setProperty("mail.imaps.socketFactory.fallback", "false");
        props.setProperty("mail.imaps.port", "993");
        props.setProperty("mail.imaps.socketFactory.port", "993");
        props.setProperty("mail.imaps.partialfetch","false");
        props.setProperty("mail.imaps.fetchsize", "1048576"); 

        Session session = Session.getInstance(props, null);

        try {

            Store store = session.getStore(protocol);
            store.connect("imap-mail.outlook.com", emailAddress, password);
            Folder inbox = store.getFolder("Archive/Test");
            inbox.open(Folder.READ_WRITE);

            // search for all messages that have the RECENT flag set
            Flags recent = new Flags(Flags.Flag.RECENT);
            FlagTerm recentFlagTerm = new FlagTerm(recent, true);
            Message[] messages = inbox.search(recentFlagTerm);


            for (int i = 0; i < messages.length; i++) {
                Message message = messages[i];
                String subject = message.getSubject();

                processSaveToFile(message,subject);
            }

            inbox.close(false);
            store.close();

        }catch (NoSuchProviderException ex) {
            System.out.println("No provider.");
            ex.printStackTrace();
        } catch (MessagingException ex) {
            System.out.println("Could not connect to the message store.");
            ex.printStackTrace();
        }

    }

    private static void processSaveToFile (Message msg, String subject) throws MessagingException, IOException
    {
       String whereToSave = "/Users/XXX/Documents/" + "some_random_name" + ".eml";

       // try-with-resources closes the stream even if writeTo throws
       try (OutputStream out = new FileOutputStream(new File(whereToSave))) {
           msg.writeTo(out);
       }
     }

}
user3742125

1 Answer


From the Jakarta Mail FAQ entry "Retrieving large message bodies seems inefficient at times":

If you are using the IMAP provider, you could try increasing the mail.imap.fetchsize property (the current default is 16k). This will cause data to be fetched from the server in larger chunks. Note that you risk the possibility of the JVM running out of memory when you do this.

So, as you pointed out, the main cost is memory: you just need enough heap space for the larger chunks.
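
To put rough numbers on the FAQ's point about larger chunks, here is a back-of-the-envelope sketch. It assumes each chunk costs one request/response round trip to the server and uses the 9 MB message size from the question. The class and method names are made up for illustration and are not part of the mail API.

```java
// Hypothetical helper: estimates how many chunked fetches a message needs.
class FetchRoundTrips {

    // Ceiling division: a partially filled last chunk still needs its own fetch.
    static long roundTrips(long messageBytes, long fetchSizeBytes) {
        if (messageBytes < 0 || fetchSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return (messageBytes + fetchSizeBytes - 1) / fetchSizeBytes;
    }

    public static void main(String[] args) {
        long message = 9L * 1024 * 1024;  // the 9 MB email from the question

        // Default chunk size quoted in the FAQ (16k)
        System.out.println("16 KB chunks: " + roundTrips(message, 16 * 1024));  // prints 576
        // The value set in the question
        System.out.println("1 MB chunks:  " + roundTrips(message, 1048576));    // prints 9
    }
}
```

Fewer, larger fetches mean far fewer waits on the network, which fits the speed-up you measured. The trade-off is that each chunk in flight is larger, which is where the heap warning in the FAQ comes from.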

Setting fetchsize to 1048576 means that my schedule job will take this much memory always and remaining memory will be allocated to rest of my application.

Digging through the source code of the IMAP package, it looks like fetchsize is used to allocate a growable byte array per IMAPInputStream. Heap usage should therefore depend on how long each IMAPInputStream lives and on how many of them are reachable in memory at the same time. Your code processes messages one at a time, so its heap usage should be fairly predictable.

Run a memory profile on your application to tune your heap settings.
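
If you want a quick first reading before reaching for a full profiler, a minimal sketch like the following samples heap usage around a piece of work using the standard `java.lang.management` API. `HeapProbe` and `usedDeltaDuring` are hypothetical names, and the 1 MB allocation only stands in for the per-message save call. The number is a rough indication, because the garbage collector can run at any time and skew the delta.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper: samples JVM heap usage before and after a task.
class HeapProbe {

    // Current heap usage snapshot as reported by the JVM.
    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    // Change in used heap (bytes) across the task; may be negative if a GC ran.
    static long usedDeltaDuring(Runnable task) {
        long before = heap().getUsed();
        task.run();
        return heap().getUsed() - before;
    }

    public static void main(String[] args) {
        byte[][] holder = new byte[1][];  // keeps the buffer reachable while measuring

        // Stand-in for processing one message: allocate a fetchsize-sized buffer.
        long delta = usedDeltaDuring(() -> holder[0] = new byte[1048576]);

        System.out.println("Used heap changed by " + delta + " bytes");
        System.out.println("Max heap (-1 if undefined): " + heap().getMax());
    }
}
```

Wrapping the per-message call in your loop this way gives a rough per-message figure to compare against the configured maximum heap; a profiler remains the better tool for tuning.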

jmehrens