
I am trying to run a local jar file with spark-submit, which works perfectly fine. Here is the command:

spark-submit --class "SimpleApp" --master local myProject/target/scala-2.11/simple-project_2.11-1.0.jar

But when I try the same job with curl through Livy's /batches endpoint:

curl -X POST --data '{
  "file": "file:///home/user/myProject/target/scala-2.11/simple-project_2.11-1.0.jar",
  "className": "SimpleApp"
}' \
-H "Content-Type: application/json" \
http://server:8998/batches

it throws this error:

"requirement failed: Local path /home/user/myProject/target/scala-2.11/simple-project_2.11-1.0.jar cannot be added to user sessions."

Here is my livy.conf file, since some articles suggest changing a few settings:

# What host address to start the server on. By default, Livy will bind to all network interfaces.
livy.server.host = 0.0.0.0

# What port to start the server on.
livy.server.port = 8998

# What spark master Livy sessions should use.
livy.spark.master = local

# What spark deploy mode Livy sessions should use.
livy.spark.deploy-mode = client

# List of local directories from where files are allowed to be added to user sessions. By
# default it's empty, meaning users can only reference remote URIs when starting their
# sessions.
livy.file.local-dir-whitelist = /home/user/.livy-sessions/

Please help me out with this.

Thanks in advance.

Divya Arya

3 Answers


I finally found the solution for reading a local file through Apache Livy: I was building the wrong cURL request. I just replaced the URI scheme 'file://' with 'local:/' and that works for me. (With the local:/ scheme, Spark expects the jar to already exist at that path on the node(s) running the job.)

curl -X POST --data '{
  "file": "local:/home/user/myProject/target/scala-2.11/simple-project_2.11-1.0.jar",
  "className": "SimpleApp"
}' \
-H "Content-Type: application/json" \
http://server:8998/batches

That was quite a small mistake, but my jar file still cannot be accessed from HDFS.

Thank you all for helping out.

Divya Arya

The Apache Livy jar file is a mandatory requirement; this won't work without it.

My advice is this: just append the Livy jar to the classpath with java's -cp option:

java -cp /usr/local/livy.jar com.myclass.Main

or simply use SBT:

libraryDependencies += "org.apache.livy" % "livy-api" % "0.4.0-incubating"

or Maven:

<dependency>
    <groupId>org.apache.livy</groupId>
    <artifactId>livy-api</artifactId>
    <version>0.4.0-incubating</version>
</dependency>

Or your favorite build tool.

BTW, you can also upload the Livy jar file to HDFS and use it on your Hadoop cluster; it can significantly simplify your life.
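As a rough sketch (the HDFS directory and the application jar here are illustrative, not fixed paths):

# upload the Livy jar to a shared HDFS location:
hdfs dfs -mkdir -p /user/shared/libs
hdfs dfs -put /usr/local/livy.jar /user/shared/libs/

# reference it from spark-submit so every node can fetch it:
spark-submit --class com.myclass.Main \
  --jars hdfs:///user/shared/libs/livy.jar \
  /path/to/your-application.jar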

koiralo
  • I tried with HDFS as well but I still had issues with that. You can check my query here: https://stackoverflow.com/questions/50969333/apache-livy-curl-not-working-for-spark-submit-command – Divya Arya Jun 26 '18 at 09:36
  • You have to build a `fat jar` from your codebase plus the necessary jars (`sbt assembly` or a Maven plugin), upload that jar to `hdfs`, and run `spark-submit` with the jar placed on `hdfs`, or use `curl` as well (see the sketch after these comments). –  Jun 26 '18 at 09:46
  • If you don't want to make a fat jar file and upload it to HDFS, you can consider Python scripts; they can be submitted as plain text without any jar file. –  Jun 26 '18 at 09:47
  • @Divine I've provided a detailed answer to your other question, please check it. –  Jun 26 '18 at 10:02
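Following up on the fat-jar suggestion in the comments above, a minimal sketch of that workflow (the assembly jar name and HDFS directory are assumptions; sbt assembly requires the sbt-assembly plugin):

# build a single fat jar containing the code and its dependencies:
sbt assembly

# upload it to HDFS (directory is illustrative):
hdfs dfs -put target/scala-2.11/simple-project-assembly-1.0.jar /user/user/jars/

# submit it as a Livy batch using an hdfs:// URI:
curl -X POST --data '{
  "file": "hdfs:///user/user/jars/simple-project-assembly-1.0.jar",
  "className": "SimpleApp"
}' \
-H "Content-Type: application/json" \
http://server:8998/batches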

The answer below worked for me, as stated here: Apache Livy cURL not working for spark-submit command

To use local files for Livy batch jobs, you need to add the local folder to the livy.file.local-dir-whitelist property in livy.conf.

Description from livy.conf.template:

List of local directories from where files are allowed to be added to user sessions. By default it's empty, meaning users can only reference remote URIs when starting their sessions.
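Applied to the question above, that means whitelisting the directory that actually contains the jar rather than ~/.livy-sessions/. A sketch based on the paths from the question (restart the Livy server after editing livy.conf):

# livy.conf -- allow local files from the jar's directory:
livy.file.local-dir-whitelist = /home/user/myProject/target/scala-2.11/

# the original file:// request should then be accepted:
curl -X POST --data '{
  "file": "file:///home/user/myProject/target/scala-2.11/simple-project_2.11-1.0.jar",
  "className": "SimpleApp"
}' \
-H "Content-Type: application/json" \
http://server:8998/batches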