
I'm using IntelliJ and Scala to program a Spark job.

I have an object that, when run on my local machine, fails with a Java heap size error. Locally I can fix this by increasing the heap size in the IntelliJ settings.

I have since spun up a Spark 2.2 cluster on Azure.

When I submit the job to Azure via IntelliJ, I get two errors that do not occur when running it locally:

1. YARN Diagnostics: User class threw exception: java.lang.OutOfMemoryError: Java heap space

How do I set the Java heap space on Spark sitting on Azure?

2. YARN Diagnostics: User class threw exception: java.lang.NoClassDefFoundError: org/apache/commons/mail/DefaultAuthenticator

I believe the issue is at this line of code:

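import org.apache.commons.mail.{DefaultAuthenticator, SimpleEmail}

// Configure the SMTP connection and credentials for the outgoing mail
val email = new SimpleEmail
email.setHostName("smtp.googlemail.com")
email.setSmtpPort(465)
email.setAuthenticator(new DefaultAuthenticator("MY EMAIL Address", "MyPassword"))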

How do I send an email from Spark on Azure? This code works fine locally. What do I need to do to get this working?

trincot
JetS79

1 Answer


How do I set the Java heap space on Spark sitting on Azure?

The NameNode Java heap size depends on many factors such as the load on the cluster, the numbers of files, and the numbers of blocks. The default size of 1 GB works well with most clusters, although some workloads can require more or less memory.

To modify the NameNode Java heap size:

HDFS => Config => Advanced => NameNode Java heap size = 2048 MB => Save

To modify the YARN Java heap size:

YARN => Config => Advanced => ResourceManager Java heap size = 2048 MB => Save
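After changing a heap setting, it can help to confirm how much heap the JVM that runs your code actually received, since a "User class threw exception" error is raised in the JVM executing the user class. The following is a minimal sketch using only the standard `java.lang.Runtime` API; the object name `HeapReport` and its methods are hypothetical and not part of Spark or the original job.

```scala
// Hypothetical diagnostic helper; names are illustrative only.
object HeapReport {
  private val BytesPerMb = 1024L * 1024L

  /** Maximum heap the current JVM will attempt to use, in MB. */
  def maxHeapMb: Long = Runtime.getRuntime.maxMemory() / BytesPerMb

  /** Heap currently in use (allocated minus free), in MB. */
  def usedHeapMb: Long = {
    val rt = Runtime.getRuntime
    (rt.totalMemory() - rt.freeMemory()) / BytesPerMb
  }

  /** One-line summary suitable for writing to the job log. */
  def describe: String =
    s"JVM heap: used=$usedHeapMb MB, max=$maxHeapMb MB"

  def main(args: Array[String]): Unit = println(describe)
}
```

Calling `println(HeapReport.describe)` at the start of the job's `main` method would write the effective limit to the application log, which makes it easier to tell whether a configuration change reached the process that is running out of memory.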

How do I send an email from Spark on Azure? This code works fine locally. What do I need to do to get this working?

You may refer to the suggestions outlined in the SO thread that addresses a similar issue.
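A `NoClassDefFoundError` for a class that compiles and runs locally usually means the library's jar is not on the classpath of the cluster JVM. One common approach is to bundle the dependency into the submitted jar or to pass it with `spark-submit --jars`. As a rough sketch, assuming you want to verify this from inside the job before building the email, the check below tries to load the class by name. The object `ClasspathCheck` and its method are hypothetical helpers, not part of Spark or Commons Email.

```scala
// Hypothetical helper; names are illustrative only.
object ClasspathCheck {
  /** True if the named class can be loaded by the current class loader. */
  def isLoadable(className: String): Boolean =
    try {
      Class.forName(className)
      true
    } catch {
      // Class itself is missing, or one of its own dependencies is.
      case _: ClassNotFoundException | _: LinkageError => false
    }

  def main(args: Array[String]): Unit = {
    val name = "org.apache.commons.mail.DefaultAuthenticator"
    println(s"$name loadable: ${isLoadable(name)}")
  }
}
```

If the check reports `false` on the cluster but `true` locally, the dependency is being resolved by the IDE on the local machine and is simply not shipped with the job.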

CHEEKATLAPRADEEP