Questions tagged [livy]

From http://livy.incubator.apache.org.

What is Apache Livy?

Apache Livy is a service that enables easy interaction with a Spark cluster over a REST interface. It supports submitting Spark jobs or snippets of Spark code, retrieving results synchronously or asynchronously, and managing Spark Contexts, all via a simple REST interface or an RPC client library. Apache Livy also simplifies the interaction between Spark and application servers, thus enabling the use of Spark for interactive web/mobile applications. Additional features include:

  • Have long running Spark Contexts that can be used for multiple Spark jobs, by multiple clients
  • Share cached RDDs or DataFrames across multiple jobs and clients
  • Multiple Spark Contexts can be managed simultaneously, and the Spark Contexts run on the cluster (YARN/Mesos) instead of the Livy Server, for good fault tolerance and concurrency
  • Jobs can be submitted as precompiled jars, snippets of code, or via the Java/Scala client API
  • Ensure security via secure authenticated communication
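
As a rough illustration of the REST interface described above, the following Python sketch builds (without sending) the requests for a typical interactive flow: create a session, run a code snippet in it, and poll for the result. It is a sketch under assumptions, not an official client: the server address `localhost:8998` is taken from the questions listed on this page, the session id `0` and statement id `0` are hypothetical, and `livy_request` is a helper name made up for this example.

```python
import json
from urllib.request import Request

# Assumption: a Livy server listening locally on port 8998.
LIVY_URL = "http://localhost:8998"


def livy_request(method, path, payload=None):
    """Build (but do not send) an HTTP request for the Livy REST API.

    `livy_request` is a hypothetical helper for this sketch.
    """
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return Request(
        LIVY_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )


# 1. Start an interactive PySpark session.
create_session = livy_request("POST", "/sessions", {"kind": "pyspark"})

# 2. Run a code snippet in session 0 (hypothetical id; the real id comes
#    from the response to the request above).
run_statement = livy_request("POST", "/sessions/0/statements", {"code": "1 + 1"})

# 3. Poll statement 0 of session 0 (hypothetical ids) for its output.
get_result = livy_request("GET", "/sessions/0/statements/0")
```

Each request can then be sent with `urllib.request.urlopen(...)` against a running server; in practice the session must reach the idle state before statements are accepted, so a real client would poll the session before step 2.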

288 questions
19
votes
3 answers

How to set livy.server.session.timeout on EMR cluster bootstrap?

I am creating an EMR cluster, and using jupyter notebook to run some spark tasks. My tasks die after approximately 1 hour of execution, and the error is: An error was encountered: Invalid status code '400' from…
bill
  • 293
  • 2
  • 6
  • 17
19
votes
3 answers

Comparing Apache Livy with spark-jobserver

I know Apache Livy is a REST interface for interacting with Spark from anywhere. So what are the benefits of using Apache Livy instead of spark-jobserver? What are the drawbacks of spark-jobserver for which Livy is used as an alternative? And I…
user118
  • 191
  • 1
  • 6
18
votes
2 answers

Spark nodes keep printing GC (Allocation Failure) and no tasks run

I am running a Spark job using Scala, but it gets stuck and my worker nodes do not execute any tasks. Currently I am submitting this to Livy, which submits to our Spark cluster with 8 cores and 12GB of RAM with the following configuration: data={ …
Eric Meadows
  • 887
  • 1
  • 11
  • 19
13
votes
2 answers

why Livy or spark-jobserver instead of a simple web framework?

I'm building a RESTful API on top of Apache Spark. Serving the following Python script with spark-submit seems to work fine: import cherrypy from pyspark.sql import SparkSession spark = SparkSession.builder.appName('myApp').getOrCreate() sc =…
Parzival
  • 2,004
  • 4
  • 33
  • 47
10
votes
0 answers

Mock livy server for unit test

I am currently trying to mock a livy server from scala to run unit tests. Basically, I would like to test submitting jar to a livy client. I'm trying to adapt the code I found here (livy)HttpClientSpec.scala and I'm getting an error at…
celadari
  • 101
  • 2
9
votes
3 answers

Livy Server: return a dataframe as JSON?

I am executing a statement in Livy Server using an HTTP POST call to localhost:8998/sessions/0/statements, with the following body: { "code": "spark.sql(\"select * from test_table limit 10\")" } I would like an answer in the following…
matheusr
  • 567
  • 9
  • 29
8
votes
1 answer

Jupyter starting a kernel in a docker container?

I want to switch my notebook easily between different kernels. One use case is to quickly test a piece of code in tensorflow 2, 2.2, 2.3, and there are many similar use cases. However I prefer to define my environments as dockers these days, rather…
Roelant
  • 4,508
  • 1
  • 32
  • 62
8
votes
3 answers

Apache Livy doesn't work with local jar file

I am trying to run a local jar file with spark-submit, which works perfectly fine. Here is the command: spark-submit --class "SimpleApp" --master local myProject/target/scala-2.11/simple-project_2.11-1.0.jar But when I try with curl: curl -X…
Divya Arya
  • 439
  • 5
  • 22
6
votes
1 answer

Scala sbt assembly jar does not work (class implementation not found) but code works when run through IntelliJ

When launching my code with scala -cp assembly.jar class.A --config-path confFile I get java.lang.IllegalStateException: No LivyClientFactory implementation was found But when launching through IntelliJ it works just fine. Also I checked inside my…
Yassine
  • 123
  • 3
6
votes
1 answer

Livy session stuck on starting after successful spark context creation

I've been trying to create a new spark session with Livy 0.7 server that runs on Ubuntu 18.04. On that same machine I have a running spark cluster with 2 workers and I'm able to create a normal spark-session. My problem is that after running the…
Benda
  • 151
  • 1
  • 6
6
votes
1 answer

Spark Session returned an error : Apache NiFi

We are trying to run a Spark program using NiFi. This is the basic sample we tried to follow. We have configured an Apache Livy server at 127.0.0.1:8998. The ExecuteSparkInteractive processor is used to run sample Spark code. val gdpDF =…
Sachith Muhandiram
  • 2,819
  • 10
  • 45
  • 94
6
votes
0 answers

error: 'local path ______ cannot be added to user session' in Apache Livy

I am trying to submit a Python file to the REST API, but it always gives this error. I am using local mode, and the command I am running is given below: $curl -X POST --data…
neha
  • 1,858
  • 5
  • 21
  • 35
6
votes
4 answers

Submit a Spark job from C# and get results

As per the title, I would like to request a calculation from a Spark cluster (local/HDInsight in Azure) and get the results back in a C# application. I am aware of Livy, which I understand is a REST API application sitting on top of…
Stefano d'Antonio
  • 5,874
  • 3
  • 32
  • 45
5
votes
2 answers

Invalid status code '400' from .. error payload: "requirement failed: Session isn't active

I am running PySpark scripts to write a dataframe to a CSV in a Jupyter notebook as below: df.coalesce(1).write.csv('Data1.csv',header = 'true') After an hour of runtime I get the error below. Error: Invalid status code from…
user11568522
5
votes
2 answers

How to add functions from custom JARs to EMR cluster?

I created an EMR cluster on AWS with Spark and Livy. I submitted a custom JAR with some additional libraries (e.g. datasources for custom formats) as a custom JAR step. However, the stuff from the custom JAR is not available when I try to access it…
rabejens
  • 7,594
  • 11
  • 56
  • 104