I'm working on an IMAP client using Ruby and Rails. I can successfully import messages, mailboxes, and more... However, after the initial import, how can I detect any changes that have occurred since my last sync?

Currently I am storing the UIDs and UID validity values in the database, comparing them, and searching appropriately. This works, but it doesn't detect deleted messages or changes to message flags, etc.

Do I have to pull all messages every time to detect these changes? How do other IMAP clients (e.g. Apple Mail and Postbox) do it so quickly? My script already takes 10+ seconds per account, even with very few emails:

# select ourself as the current mailbox
@imap_connection.examine(self.location)

# grab all new messages and update them in the database
# if the UIDs are still valid, we will just fetch the newest UIDs
# otherwise, we need to search since we last synced, which is slower :(
if self.uid_validity.nil? || uid_validity == self.uid_validity
  # on some IMAP servers a uid_fetch fails if the mailbox is empty, so fall
  # back to a UID search and only fetch when it returns something
  begin
    messages = @imap_connection.uid_fetch(uid_range, ['UID', 'RFC822', 'FLAGS'])
  rescue
    # gmail cries if the folder is empty
    uids = @imap_connection.uid_search(['ALL'])
    messages = @imap_connection.uid_fetch(uids, ['UID', 'RFC822', 'FLAGS']) unless uids.empty?
  end

  messages.each do |imap_message|
    Message.create_from_imap!(imap_message, self.id)
  end unless messages.nil?
else
  query = self.last_synced.nil? ? ['ALL'] : ['SINCE', Net::IMAP.format_datetime(self.last_synced)]
  @imap_connection.search(query).each do |message_id|
    imap_message = @imap_connection.fetch(message_id, ['RFC822', 'FLAGS', 'UID'])[0]

    # don't mark the messages as read
    #@imap_connection.store(message_id, '-FLAGS', [:Seen])

    Message.create_from_imap!(imap_message, self.id)
  end
end

# now assume all UIDs are valid
self.uid_validity = uid_validity

# now remember that we just fetched all those messages
self.last_synced = Time.now
self.save!
sethvargo
  • possible dup? http://stackoverflow.com/questions/1084780/getting-only-new-mail-from-an-imap-server – John Douthat Apr 09 '12 at 19:26
  • related, but not a dupe. I already know how to fetch new messages. I need a way to fetch messages that have been deleted or "changed"... – sethvargo Apr 09 '12 at 19:35
  • The message is deleted when it's removed from the Trash folder (i.e. manually or after 30 days). Regarding other clients, I guess that they are doing FETCH 1:* [UID] which is quite fast, and then compare the sets. – Roman Apr 09 '12 at 22:06
  • So they are just completely comparing and/or replacing the sets on their server? – sethvargo Apr 10 '12 at 02:31

2 Answers

There is an IMAP extension for Quick Flag Changes Resynchronization (RFC-4551, also known as CONDSTORE). With this extension it is possible to search for all messages that have changed since the last synchronization, based on a per-message modification sequence number (MODSEQ) rather than a wall-clock timestamp. However, as far as I know this extension is not widely supported.

There is an informational RFC that describes how IMAP clients should do synchronization (RFC-4549, section 4.3). The text recommends issuing the following two commands:

tag1 UID FETCH <lastseenuid+1>:* <descriptors>
tag2 UID FETCH 1:<lastseenuid> FLAGS

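The first command fetches the required information for all unknown mails (without knowing how many mails there are). The second command synchronizes the flags of the mails you have already seen. Any locally stored UID that is missing from the second response has been expunged on the server, which is how deleted messages are detected.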

AFAIK this method is widely used. Therefore, many IMAP servers contain optimizations in order to provide this information quickly. Typically, the network bandwidth is the limiting factor.
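A rough sketch of those two commands in Ruby, to show how the responses can be turned into "new", "deleted" and "changed" sets. The method name `sync_plan` and the `known_flags` hash (UID => flags as stored locally) are made up for illustration. `imap` is assumed to behave like a connected `Net::IMAP` with the mailbox already selected, where `uid_fetch` accepts a Range and treats `-1` as `*`; treat it as a starting point, not a finished implementation.

```ruby
# Hypothetical helper, not part of Net::IMAP.
# imap        - object responding to uid_fetch(set, attrs) like Net::IMAP
# known_flags - Hash of UID => Array of flags from the previous sync
def sync_plan(imap, known_flags)
  last_seen = known_flags.keys.max || 0

  # tag1 UID FETCH <lastseenuid+1>:* <descriptors>
  # A range ending in "*" always matches the last message in the mailbox,
  # even when its UID is below the range start, so filter the result.
  # Array() covers servers/versions where an empty result comes back as nil.
  new_messages = Array(imap.uid_fetch((last_seen + 1)..-1, %w[UID FLAGS]))
                 .select { |msg| msg.attr['UID'] > last_seen }

  # tag2 UID FETCH 1:<lastseenuid> FLAGS
  # Only flags are requested, so this stays cheap even for large mailboxes.
  current = {}
  if last_seen > 0
    Array(imap.uid_fetch(1..last_seen, %w[UID FLAGS])).each do |msg|
      current[msg.attr['UID']] = msg.attr['FLAGS']
    end
  end

  # Flags can be symbols (:Seen) or strings (keywords), so sort by to_s.
  normalize = ->(flags) { flags.sort_by(&:to_s) }

  {
    new_uids: new_messages.map { |msg| msg.attr['UID'] },
    # Known locally but absent from the server response: expunged.
    deleted_uids: known_flags.keys - current.keys,
    # Still present, but the flag set differs from what was stored.
    changed_flags: current.select do |uid, flags|
      known_flags.key?(uid) && normalize.call(known_flags[uid]) != normalize.call(flags)
    end
  }
end
```

In the code from the question this would be called after `examine`, with `known_flags` built from the stored messages of that mailbox. The full `RFC822` body then only needs to be fetched for `new_uids`, and all of it is only meaningful while the stored UIDVALIDITY still matches the server's.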

nosid
The IMAP protocol is brain dead this way, unfortunately. IDLE really should be able to return this kind of stuff while connected, for example. The FETCH FLAGS suggestion above is the only way to do it.

One thing to be careful of, however, is that UIDs are only valid for a given session per the spec. You should not store them, even if some servers persist them.

Aaron Zinman
  • That's a super expensive operation though - especially for users who have 10000 emails in their inbox. There has to be a way to cache existing messages. – sethvargo Aug 27 '12 at 12:14
  • @Aaron Zinman UIDs aren't session-specific. The synchronization specs for IMAP4 strongly suggest using UIDs, since a UID remains constant for a message within a folder. RFC4549 (http://tools.ietf.org/html/rfc4549) says: "Since a disconnected client has no way of knowing what changes might have occurred to the mailbox while it was disconnected, message numbers are not useful to a disconnected client. All disconnected client operations should be performed using UIDs, so that the client can be sure that it and the server are talking about the same messages during the synchronization process." – Amith Koujalgi Apr 12 '14 at 07:48
  • @sethvargo Working on a similar project at the moment and considering the possibility of someone with tons of emails, pagination would be your best bet. You wouldn't show 10,000 emails on one page so deal with things for a single page at a time. When dealing with new messages, you can simply pull them using a preset maximum range. This is explained in RFC 4549. Same thing for flags, you can limit the number of flags you request at a time. – ReX357 May 28 '16 at 20:12