I'm new to Python and Spark, and I'm trying to write PySpark code that sends an email from Databricks with an attachment taken from a mount point location. I'm using the code below:

import smtplib
from pathlib import Path
from email.mime.multipart import MIMEMultipart
from email.mime.base import MIMEBase
from email.mime.text import MIMEText
from email.utils import COMMASPACE, formatdate
from email import encoders


def send_mail(send_from = <from_email>, send_to = <to_email>, subject = "Test", message = "Test", files=["/mnt/<Mounted Point Directory>/"],
              server="<SMTP Host>", port=<SMTP Port>, username='<SMTP Username>', password='<SMTP Password>',
              use_tls=True):

    msg = MIMEMultipart()
    msg['From'] = send_from
    msg['To'] = COMMASPACE.join(send_to)
    msg['Date'] = formatdate(localtime=True)
    msg['Subject'] = subject

    msg.attach(MIMEText(message))

    for path in files:
        part = MIMEBase('application', "octet-stream")
        with open(path, 'rb') as file:
            part.set_payload(file.read())
        encoders.encode_base64(part)
        part.add_header('Content-Disposition',
                        'attachment; filename="{}"'.format(Path(path).name))
        msg.attach(part)

    smtp = smtplib.SMTP(server, port)
    if use_tls:
        smtp.starttls()
    smtp.login(username, password)
    smtp.sendmail(send_from, send_to, msg.as_string())
    smtp.quit()

But for some reason the code raises a "file or directory does not exist" exception.

Am I missing anything here?

Thanks

Alex Ott
Dipanjan Mallick

1 Answer

You need to modify the code to make it work with DBFS, because the open function knows nothing about DBFS or other file systems and works only with local files (see the documentation about DBFS).

You can do this in one of two ways:

  • If you're on "full" Databricks rather than Community Edition, prepend /dbfs to the file name, for example /dbfs/mnt/.... The /dbfs mount is how code that works with local files can access files on DBFS (with some limitations when writing to that location).
  • Or use the dbutils.fs.cp command to copy the file from DBFS to a local file, and attach that copy, like this:
dbutils.fs.cp("/mnt/...", "file:///tmp/local-name")
with open("/tmp/local-name", "rb") as f:  # binary mode, as attachments are read as bytes
    ...
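As a rough sketch of the first option, the helper below rewrites a DBFS path into its /dbfs form and then reads the file the same way the question's loop does. The names to_local_path and build_attachment, and the example path, are hypothetical and only for illustration; this assumes a cluster where the /dbfs mount is available.

```python
from email import encoders
from email.mime.base import MIMEBase
from pathlib import Path


def to_local_path(path: str) -> str:
    """Map a DBFS path such as /mnt/... to its /dbfs/... local form."""
    if path.startswith("dbfs:"):
        path = path[len("dbfs:"):]  # drop the URI scheme if present
    if path.startswith("/dbfs/"):
        return path  # already a local-style path
    return "/dbfs/" + path.lstrip("/")


def build_attachment(local_path: str) -> MIMEBase:
    """Read a local file in binary mode and wrap it as a base64 attachment."""
    part = MIMEBase("application", "octet-stream")
    with open(local_path, "rb") as fh:
        part.set_payload(fh.read())
    encoders.encode_base64(part)
    part.add_header("Content-Disposition", "attachment",
                    filename=Path(local_path).name)
    return part


# Hypothetical usage inside send_mail's loop:
# msg.attach(build_attachment(to_local_path("/mnt/some-dir/report.csv")))
```

Note that each entry in files should point at a file rather than a directory, since open cannot read a directory.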
Alex Ott
  • Is it possible to use open directly on DBFS? I don't want to copy the file to the local file system. – Dipanjan Mallick Apr 14 '21 at 18:11
  • Yes, as I wrote: just prepend `/dbfs` to the file name, like `/dbfs/mnt/.....` – Alex Ott Apr 14 '21 at 18:38
  • Thanks @AlexOtt. The compile-time issue is now resolved. However, for some reason, I'm not receiving the email, whether I use the SMTP details or localhost. Is there something you could help me with, or anything else that needs to be included or modified in this code? – Dipanjan Mallick Apr 15 '21 at 10:29
  • I'm not familiar with that library; maybe there are networking problems or something like that (port 25 is often blocked). Look into the documentation to see if it's possible to enable logging to see what the library is doing – Alex Ott Apr 15 '21 at 11:02
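On the logging suggestion in the last comment: the standard library's smtplib has a set_debuglevel method that prints the SMTP conversation to stderr. The sketch below is only a starting point for diagnosis, not a fix; the helper name new_debug_client is hypothetical, and the host, port, and credentials in the usage note are the question's placeholders.

```python
import smtplib


def new_debug_client(timeout: float = 10.0) -> smtplib.SMTP:
    """Create an unconnected SMTP client that echoes the protocol dialogue to stderr."""
    client = smtplib.SMTP(timeout=timeout)  # no host given, so nothing connects yet
    client.set_debuglevel(1)                # print commands sent and server replies
    return client


# Hypothetical usage in place of smtplib.SMTP(server, port) in send_mail:
# smtp = new_debug_client()
# smtp.connect(server, port)   # a timeout here suggests the port is blocked
# if use_tls:
#     smtp.starttls()
# smtp.login(username, password)
# refused = smtp.sendmail(send_from, send_to, msg.as_string())
# print(refused)               # dict of recipients the server refused, empty if none
# smtp.quit()
```

If the dialogue shows the server accepting the message, the problem is more likely on the delivery side (for example spam filtering) than in the code.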