I'm happily using imaplib to get the message IDs in a specific label:

connection.select("MyLabel")
connection.uid('SEARCH', None, 'ALL')

However, if that label contains chats, they aren't returned, so they appear to be invisible to IMAP. I've read "Accessing Chat Folder in Python Using Imaplib", but that covers searching in the Chats label rather than finding chats in another label, and it doesn't appear to make this case work.

I could perhaps perform a second search in "Chats" for messages labelled "MyLabel", but that is an extra query and asks for quite a bit of setup from users of my application.

mrooney
  • Could you please add the output of `sock.select("[Gmail]/Chats", True)` followed by `sock.uid('FETCH', '1:*', 'X-GM-LABELS')` to inspect the labels you got? You can also try `sock.debug = 4` to get the debug statements from `imaplib`. – dnozay Jul 12 '13 at 18:56

1 Answer

Gmail labels are exposed as top-level mailboxes, not the other way around. To search multiple mailboxes, you need to run multiple queries: call select() on each mailbox, then issue the search command (or uid in your case).

Configuring your Gmail account for access to Chats over IMAP:

The link you gave, "Accessing Chat Folder in Python Using Imaplib", is still very relevant, as users will need to allow IMAP access to their chat logs. You can also check the IMAP extensions used by Gmail, which describe X-GM-RAW and X-GM-LABELS.

If you are using Gmail for business, I am not sure whether it works (I don't have an account to verify), but this link may help you see whether the extensions are present: https://developers.google.com/gmail/imap_extensions#checking_for_the_presence_of_extensions
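
One way to run that check from Python is to look for the X-GM-EXT-1 capability, which is the name Gmail's extension documentation uses. The helper below is only a sketch: `has_gmail_extensions` is a hypothetical name, not part of imaplib, and it assumes it is handed the `capabilities` tuple that an imaplib connection exposes after connecting.

```python
def has_gmail_extensions(capabilities):
    """Return True if the capability list advertises X-GM-EXT-1.

    `capabilities` is assumed to be an iterable of str or bytes,
    such as the `capabilities` attribute of an imaplib.IMAP4 object.
    """
    normalized = set()
    for cap in capabilities:
        if isinstance(cap, bytes):
            cap = cap.decode("ascii", "replace")
        # Capability names are case-insensitive, so compare in upper case.
        normalized.add(cap.upper())
    return "X-GM-EXT-1" in normalized
```

With a live connection this would be called as has_gmail_extensions(sock.capabilities); if it returns False, X-GM-RAW and X-GM-LABELS searches are unlikely to be understood by the server.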

Modified UTF-7 encoding:

Most IMAP servers store mailbox names and labels in a modified version of UTF-7. You can't pass labels as-is to Gmail unless they are plain US-ASCII. IMAPClient knows how to encode and decode the modified UTF-7 used by most IMAP servers. There is a bug open against imaplib, so you may want to use the imapclient.imap_utf7 module to encode mailbox names and/or labels until imaplib supports modified UTF-7 on its own.

Another thing I found online: while you may be able to STORE labels successfully with a particular encoding, a SEARCH for them fails (also when XOAUTH is involved) unless you use modified UTF-7 or indicate the charset. Other projects already do most of the work for Gmail, e.g. BaGoMa (backup google mail), which ships with imap-utf7 support. So far, I've been able to create a label through the UI with a Latin-1 character and SEARCH for it using the UTF-8 charset.

Here is how to encode your label:

from imapclient import imap_utf7
label = imap_utf7.encode(u'yourlabel')

See also this question: IMAP folder path encoding (IMAP UTF-7) for Python
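
If pulling in IMAPClient is not an option, the encoding itself is small enough to sketch with the standard library. The function below is a minimal, encode-only illustration of modified UTF-7 as described in RFC 3501, section 5.1.3, written for Python 3. `encode_modified_utf7` is a hypothetical name, and imapclient.imap_utf7 remains the better-tested choice.

```python
import base64


def encode_modified_utf7(text):
    """Encode a str as IMAP modified UTF-7 (RFC 3501, section 5.1.3)."""
    out = []
    pending = []  # run of characters that must be base64-encoded

    def flush():
        # Emit the pending run as &<modified base64 of UTF-16BE>-
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = base64.b64encode(raw).decode("ascii")
            # Modified base64: no '=' padding, ',' instead of '/'.
            out.append("&" + b64.rstrip("=").replace("/", ",") + "-")
            del pending[:]

    for ch in text:
        if 0x20 <= ord(ch) <= 0x7E:
            # Printable US-ASCII is sent as-is, except '&' which becomes '&-'.
            flush()
            out.append("&-" if ch == "&" else ch)
        else:
            pending.append(ch)
    flush()
    return "".join(out)
```

With this sketch, the label français encodes to fran&AOc-ais, while a plain ASCII label such as MyLabel is returned unchanged.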

You can inspect your labels with:

>>> sock.select("[Gmail]/Chats", True)
>>> sock.uid('FETCH', '1:*', 'X-GM-LABELS')

This is useful to check what labels you have and for debugging encoding problems.

Example:

import imaplib
import getpass
import atexit
from imapclient import imap_utf7

def find_messages(sock, label):
    mailbox = imap_utf7.encode(label)
    label = imap_utf7.encode(label.encode('utf-8'))
    try:
        # process regular mailbox
        sock.select(mailbox)
    except sock.error:
        pass
    else:
        resp, data = sock.uid('SEARCH', None, '(ALL)')
        assert resp == 'OK'
        for uid in data[0].split():
            # because we do select, this uid will be valid.
            yield uid   
    try:
        # now process chats with that label
        sock.select("[Gmail]/Chats", True)
    except sock.error:
        # access to chats via IMAP is disabled most likely
        pass
    else:
        # resp, data = sock.uid('SEARCH', 'X-GM-RAW', 'label:%s' % label)
        sock.literal = label
        resp, data = sock.uid('SEARCH', 'CHARSET', 'UTF-8', 'X-GM-LABELS')
        assert resp == 'OK'
        for uid in data[0].split():
            # because we do select, this uid will be valid.
            yield uid

def test():
    email = "XXXXXXXX@gmail.com"
    label = u"français" # oui oui merci beaucoup.
    sock = imaplib.IMAP4_SSL("imap.gmail.com", 993)
    sock.login(email, getpass.getpass())
    for uid in find_messages(sock, label):
        # e.g.
        print sock.uid('FETCH', uid, '(BODY[HEADER])')
    sock.close()
    sock.logout()

Tested on my machine:

>>> test()
Password: 
('OK', [('1 (UID 14 BODY[HEADER] {398}', 'MIME-Version: 1.0\r\nReceived: by 10.XXX.XXX.XXX with HTTP; Thu, 11 Jul 2013 09:54:32 -0700 (PDT)\r\nDate: Thu, 11 Jul 2013 09:54:32 -0700\r\nDelivered-To: XXXXXXXX@gmail.com\r\nMessage-ID: <XXXXXXXX@mail.gmail.com>\r\nSubject: test email\r\nFrom: Damien <XXXXXXXX@gmail.com>\r\nTo: Damien <XXXXXXXX@gmail.com>\r\nContent-Type: text/plain; charset=ISO-8859-1\r\n\r\n'), ')'])
('OK', [('1 (UID 1 BODY[HEADER] {47}', 'From: Damien XXXXXXXX <XXXXXXXX@gmail.com>\r\n\r\n'), ')'])
('OK', [('2 (UID 2 BODY[HEADER] {46}', 'From: Vincent XXXXXXXX <XXXXXXXX@gmail.com>\r\n\r\n'), ')'])

Undocumented interface:

imaplib is able to send literals, which is particularly useful when using a different encoding. This works by setting the IMAP4.literal attribute before running the command.

sock.literal = label
resp, data = sock.uid('SEARCH', 'CHARSET', 'UTF-8', 'X-GM-LABELS')
dnozay
  • Thanks for the detailed response, though it isn't working for me. Are you sure you can match chats with specific labels, and just not regular emails? In your example, the second sock.uid isn't set to resp, data, so those are still the old variables from your normal search, so that could explain false positives. If I search for "label:foo is:chat" in Gmail I get a match, but doing a select("[Gmail]/Chats") followed by uid("SEARCH", "X-GM-RAW", "label:foo") returns no results. Doing uid("SEARCH", "X-GM-RAW", "") shows many, so I seem to have access to chats in general. Thanks for any thoughts! – mrooney Jul 10 '13 at 22:04
  • Thanks but look at your line "sock.uid('SEARCH', 'X-GM-LABELS', label)". This needs to be "resp, data = sock.uid('SEARCH', 'X-GM-LABELS', label)". Currently you are just re-using the old resp and data and not actually looking at the query with the label inside of chats. Mine is just a plain ascii label, by the way, but that was a good thought. – mrooney Jul 11 '13 at 02:00
  • :] So, does it still work for you when the second yield is actually showing the correct "resp" and "data" variables? If so, I'm really curious about what could be different in my setup. – mrooney Jul 11 '13 at 16:52
  • Thanks for all your help! I'm using OAUTH2 so instead of a sock.login I've got sock.authenticate("XOAUTH2", token_callable). It is the same API after that but maybe it makes a difference. – mrooney Jul 11 '13 at 18:08
  • Thanks, my label is just a plain, alphabetical label, but since I lose the bounty either way, you definitely deserve it for all your hard research, I'm sure this will be useful to others! – mrooney Jul 12 '13 at 16:58