I would like to keep a cache of the emails sent to a group and periodically verify that the cache is not missing any UIDs. The first thing I tried was imaplib's search, which I expected to return the UIDs of all matching messages in the mailbox, so that I could check that each one has a cached counterpart.

import imaplib

con = imaplib.IMAP4_SSL("imap.gmail.com")
con.login(user, password)

>>> con.select('INBOX')
('OK', [b'15613'])

>>> con.search(None, "TO", "mail_group")
('OK', [b'13267 13277 13285 13286 13290 13306 15591 15612 15613'])

I then switched to imap_tools, which has a similar query but a nicer API.

import imap_tools
from imap_tools import AND

con = imap_tools.MailBox('imap.gmail.com').login(user, password, initial_folder='INBOX')

>>> con.folder.status('INBOX')
{'MESSAGES': 15615, 'RECENT': 0, 'UIDNEXT': 66119, 'UIDVALIDITY': 1, 'UNSEEN': 51}

>>> con.search(AND(to='mail_group'))
['15534', '15557', '15558', '15565', '15566', '15567', '15571', '15573', '15576', '15579', '15580', '15582', '15584', '15588', '15589', '15591', '15612', '15613']

Using this search, I then fetch the ids that are not on disk. The account is an audit account and never deletes messages, so I assumed that these ids would not change.

The problem I'm seeing with imap_tools is that when I fetch a message using an id returned by the search, msg.uid does not match that id.

>>> msg = list(con.fetch(uid))[0]
>>> msg.uid
'66268'
>>> uid
15765

I'm not sure how 66268 relates to 15765. Why is 66268 the imap_tools UID for this message, and how can I reconcile these two different ids? Am I approaching this the wrong way?

UPDATE:

For imap-tools >= 0.45.0, a new method was added: uids

Vladimir
Peter Moore
  • Your first query was using message sequence numbers (MSNs), not UIDs, since you didn't do a UID search. These are not reliable if you ever delete messages. – Max Jun 24 '21 at 15:09
  • @Max Oh, I see. How do you get a list of UIDs from the search method? – Peter Moore Jun 24 '21 at 15:36
  • For imaplib, there is no dedicated function for it, but you can use the .uid function and supply the command name as the first parameter: con.uid('SEARCH', 'TO', 'whatever'). This also works for 'FETCH', 'STORE', and so on. imap_tools probably has some sort of flag for using UIDs, but I do not know that library. – Max Jun 24 '21 at 16:36
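
A minimal sketch of the UID search described in the comment above, using imaplib. The helper name search_uids is hypothetical, and the connection is assumed to be logged in with a folder already selected. IMAP4.uid returns the same (status, [data]) shape as search, so the parsing is the same; only the meaning of the numbers differs.

```python
import imaplib


def search_uids(con, *criteria):
    """Return the UIDs (as ints) matching the criteria in the selected folder.

    `con` is assumed to be a logged-in imaplib.IMAP4 / IMAP4_SSL object
    on which select() has already been called.
    """
    # UID SEARCH returns unique identifiers, not message sequence numbers.
    status, data = con.uid('SEARCH', *criteria)
    if status != 'OK':
        raise imaplib.IMAP4.error(f'UID SEARCH failed: {status} {data!r}')
    # data looks like [b'66101 66105 66119']; it is [b''] or [None]
    # when nothing matches.
    if not data or not data[0]:
        return []
    return [int(uid) for uid in data[0].split()]
```

With the connection from the question, this would be called as search_uids(con, 'TO', 'mail_group').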

1 Answer

From the docs: first of all, read about UIDs in RFC 3501.

  1. UPDATE: For imap-tools >= 0.45.0, a new method was added: uids

    uids = mailbox.uids()

  2. A quote from the library README:

    "BaseMailBox.search - search mailbox for matching message numbers (this is not uids)"

  3. Read how UIDs work. A link is in the docs.

    Here is more info on this: https://datatracker.ietf.org/doc/html/rfc3501#section-2.3.1.1

    An excerpt from there: The unique identifier of a message MUST NOT change during the session, and SHOULD NOT change between sessions.

    So caching UIDs is a bad idea.

  4. Example of getting uids with imap-tools by fetch:

    uids = [i.uid for i in mailbox.fetch(headers_only=1, bulk=1)]
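
One possible way to reconcile an on-disk cache with UIDs obtained as above, sketched under assumptions: the cache is a directory holding one `<uid>.eml` file per message plus a small `uidvalidity` file. That layout and all helper names here are hypothetical and not part of imap-tools. The RFC section linked above pairs UIDs with the mailbox's UIDVALIDITY value, and a changed UIDVALIDITY means UIDs stored from earlier sessions can no longer be trusted, so the sketch checks that before comparing UIDs.

```python
from pathlib import Path


def uidvalidity_changed(cache_dir, current):
    """True if a UIDVALIDITY was stored earlier and differs from `current`."""
    marker = Path(cache_dir) / 'uidvalidity'
    if not marker.exists():
        return False  # first run: nothing cached yet, nothing to invalidate
    return marker.read_text().strip() != str(current)


def record_uidvalidity(cache_dir, current):
    """Store the server's current UIDVALIDITY beside the cached messages."""
    (Path(cache_dir) / 'uidvalidity').write_text(str(current))


def cached_uids(cache_dir):
    """UIDs already on disk, assuming one '<uid>.eml' file per message."""
    return {p.stem for p in Path(cache_dir).glob('*.eml')}


def missing_uids(server_uids, cache_dir):
    """UIDs present on the server but absent from the cache, ascending."""
    missing = {str(uid) for uid in server_uids} - cached_uids(cache_dir)
    return sorted(missing, key=int)
```

With imap-tools, the inputs would presumably come from mailbox.folder.status('INBOX')['UIDVALIDITY'] and mailbox.uids(), both of which appear earlier in this thread. If uidvalidity_changed returns True, the safe reaction is to discard the cache and download everything again.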
    
Vladimir
  • I have had this implemented for some months now and have not seen UIDs change. Thank you for the update on the API. The wording of the RFC is unclear. What does it mean to you that the UID "MUST NOT" and "SHOULD NOT" change between sessions? In English that is the same thing. Why does the term `should not` appear? When does a difference in UID appear? – Peter Moore Sep 17 '21 at 11:23
  • @PeterMoore, the RFC on MUST and SHOULD: https://datatracker.ietf.org/doc/html/rfc3501#section-1.2 – Vladimir Sep 17 '21 at 11:29
  • Got it, that is the definition of the terms :-) but I don't understand why caching UIDs is a bad idea. In English terms they are static by the specification. – Peter Moore Sep 17 '21 at 11:51
  • @PeterMoore, "SHOULD NOT change between sessions" means that they may change. – Vladimir Sep 20 '21 at 03:46
  • I am using this `uids = [i.uid for i in mailbox.fetch()]` to check if the mailbox is empty. lol – ARNON Oct 08 '21 at 11:20