Questions tagged [spark-ui]
The web interface of a running Spark application, used to monitor and inspect Spark job executions in a web browser.
76 questions
12 votes · 2 answers
How to open Spark UI when working on Google Colab?
How can I monitor the progress of a job through the Spark web UI? Running Spark locally, I can access the Spark UI on port 4040, at http://localhost:4040.

Salem Othman · 121
9 votes · 1 answer
How to view AWS Glue Spark UI
In my Glue job, I have enabled Spark UI and specified all the necessary details (S3-related, etc.) needed for the Spark UI to work.
How can I view the DAG/Spark UI of my Glue job?

Ankur Shrivastava · 223
6 votes · 0 answers
Spark UI: How to understand the min/med/max in DAG
I would like to fully understand the meaning of the min/med/max information.
For example:
scan time total (min, med, max)
34m (3.1s, 10.8s, 15.1s)
Does it mean that, of all cores, the min scan time is 3.1s and the max is 15.1s, and the total time accumulated is…

mingzhao.pro · 709
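Editor's aside on this question: the (min, med, max) triple is a per-task summary of one metric across all tasks of a stage, with the total being the sum over tasks. A minimal pure-Python sketch of that aggregation (the task times below are invented, not taken from the question's screenshot):

```python
# Illustration only: how a "total (min, med, max)" line summarizes one
# metric across the tasks of a stage. The values below are made up.
import statistics

scan_times_s = [3.1, 4.0, 9.5, 10.8, 12.2, 14.9, 15.1]  # one entry per task

total = sum(scan_times_s)
line = (f"scan time total (min, med, max): "
        f"{total:.1f}s ({min(scan_times_s)}s, "
        f"{statistics.median(scan_times_s)}s, {max(scan_times_s)}s)")
print(line)
# -> scan time total (min, med, max): 69.6s (3.1s, 10.8s, 15.1s)
```

In other words, the "34m" total in the question is accumulated over all tasks, while the parenthesized numbers describe the spread of individual task times.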
6 votes · 3 answers
Spark UI not showing tabs (Jobs, Stages, Storage, Environment, ...) when run in standalone mode
I'm running the Spark master with the following command:
./sbin/start-master.sh
After that I went to http://localhost:8080 and saw the master page.
I was expecting to see the tabs with Jobs, Environment, etc. Could someone…

Giuseppe Scopelliti · 132
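Editor's aside on this question: the page on :8080 is the standalone *master's* UI (workers and applications), while the Jobs/Stages/Storage/Environment tabs belong to a running *application's* UI on :4040. A command sketch with default hosts/ports (script names from recent Spark releases; older releases use start-slave.sh instead of start-worker.sh):

```shell
# Master UI (workers, applications list) -> http://localhost:8080
./sbin/start-master.sh
# Attach one worker to the master
./sbin/start-worker.sh spark://localhost:7077
# Start an application; while its driver is alive, the tabbed
# application UI is served at http://localhost:4040
./bin/spark-shell --master spark://localhost:7077
```

The application UI disappears when the driver exits; to browse it afterwards you need event logging plus the history server.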
5 votes · 2 answers
Spark SQL: Why am I seeing 3 jobs instead of a single job in the Spark UI?
As per my understanding, there will be one job for each action in Spark.
But often I see more than one job triggered for a single action.
I was trying to test this by doing a simple aggregation on a dataset to get the maximum from each…

Remis Haroon - رامز · 3,304
4 votes · 1 answer
How to fix Spark UI Executors java.io.FileNotFoundException
I've deployed a Spring Boot server with Apache Spark and everything works stably. But the Spark UI executors endpoint at http://X.X.X.X:4040/executors/ throws java.io.FileNotFoundException and cannot find /opt/x/x!/BOOT-INF/lib/spark-core_2.11-2.2.0.jar. I…

Alexey Malyshev · 43
4 votes · 1 answer
Apache Spark: Relationship between action and job, Spark UI
To the best of my understanding, in Spark a job is submitted whenever an action is called on a Dataset/DataFrame. The job may further be divided into stages and tasks, and I understand how to find out the number of stages and tasks.…

Vipul Rajan · 494
3 votes · 1 answer
What is Spark spill (both disk and memory)?
As per the documentation:
Shuffle spill (memory) is the size of the deserialized form of the shuffled data in memory.
Shuffle spill (disk) is the size of the serialized form of the data on disk.
My understanding of shuffle is this:
Every…

figs_and_nuts · 4,870
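Editor's aside on this question: the key to the two spill metrics is that the same records are counted once in their deserialized in-memory form and once in their serialized on-disk form, and the serialized form is usually much smaller. A rough pure-Python illustration of that size gap (plain pickle, not Spark's serializers):

```python
# Illustration only (not Spark internals): the deserialized, in-memory
# footprint of a batch of records vs. the size of the same records as a
# compact serialized byte stream, as they would land on disk.
import pickle
import sys

rows = [(i, f"value_{i}") for i in range(10_000)]

# Deserialized footprint: the list object plus each tuple object
# (string contents not even counted, so this is an underestimate).
in_memory = sys.getsizeof(rows) + sum(sys.getsizeof(r) for r in rows)

# Serialized footprint: one byte stream, as written during a spill.
on_disk = len(pickle.dumps(rows))

print(in_memory > on_disk)  # prints True: serialization is more compact
```

This is why "Shuffle spill (memory)" is typically a much larger number than "Shuffle spill (disk)" even though both describe the same spilled data.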
3 votes · 2 answers
What is shufflequerystage in the Spark DAG?
What is the shufflequerystage box that I see in Spark DAGs? How is it different from the exchange box in the Spark stages?

figs_and_nuts · 4,870
3 votes · 1 answer
Multiple jobs from a single action (Read, Transform, Write)
Currently using PySpark on Databricks Interactive Cluster (with Databricks-connect to submit jobs) and Snowflake as Input/Output data.
My Spark application is supposed to read data from Snowflake, apply some simple SQL transformations (mainly…

Gohmz · 1,256
3 votes · 0 answers
How to avoid showing secret values in the Spark UI
I am passing some secret keys in the spark-submit command.
I am using the following to redact the key:
--conf 'spark.redaction.regex=secret_key'
Though it is working, the secret_key is still visible in the Spark UI during job execution. The redaction takes place at the…

mukesh dewangan · 59
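Editor's aside on this question: spark.redaction.regex is matched against configuration *key names*, and the values of matching entries are replaced before being displayed. A plain-Python sketch of that matching behavior (the key names below are hypothetical, and the pattern mimics Spark's default of matching secret/password/token-like names):

```python
# Illustration only (plain re, not Spark internals): keys whose names
# match the redaction regex get their values masked in the displayed conf.
import re

redaction_regex = re.compile(r"(?i)secret|password|token")

conf = {
    "spark.executor.memory": "4g",
    "spark.myapp.secret_key": "s3cr3t-value",  # hypothetical key
    "spark.myapp.api_token": "abc123",         # hypothetical key
}

redacted = {k: ("*********(redacted)" if redaction_regex.search(k) else v)
            for k, v in conf.items()}
print(redacted["spark.myapp.secret_key"])  # -> *********(redacted)
```

Note the sketch masks by key name only; a value passed somewhere the redaction machinery does not inspect can still surface in the UI, which appears to be the situation the question describes.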
3 votes · 0 answers
Job and task duration relationship in Spark UI
I am trying to understand the Spark UI to monitor timings, but I am having difficulty understanding the relationship between job duration and task duration.
For the jobs below it says total run time 13 min, but when I open the stage (which has 1 stage and 1…

gkarya42 · 429
3 votes · 0 answers
Any API to get the data on the query DAG from the Spark UI SQL tab
The Spark UI has an SQL tab that can show the query detail as a DAG:
https://www.cloudera.com/documentation/enterprise/5-9-x/topics/operation_spark_applications.html
After the application finishes, the DAG also annotates its nodes with statistic…

Joe C · 2,757
3 votes · 1 answer
No start-history-server.sh when pyspark installed through conda
I have installed pyspark in a miniconda environment on Ubuntu through conda install pyspark. So far everything works fine: I can run jobs through spark-submit and I can inspect running jobs at localhost:4040. But I can't locate…

oulenz · 1,199
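Editor's aside on this question: pip/conda installs of pyspark typically bundle the sbin/ scripts inside the package directory rather than on the PATH. A hedged sketch of locating the script from Python (the helper name is made up, and whether sbin/ is bundled depends on the pyspark build):

```python
# Hypothetical helper: look for start-history-server.sh inside an
# installed pyspark package (assumption: the build ships a sbin/ dir).
import importlib.util
import os

def find_history_server_script():
    spec = importlib.util.find_spec("pyspark")
    if spec is None or not spec.submodule_search_locations:
        return None  # pyspark not installed in this environment
    pkg_dir = list(spec.submodule_search_locations)[0]
    candidate = os.path.join(pkg_dir, "sbin", "start-history-server.sh")
    return candidate if os.path.exists(candidate) else None

print(find_history_server_script())
```

If the script is present, running it (with spark.eventLog.enabled and a log directory configured) serves finished applications on port 18080 by default.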
3 votes · 1 answer
Can't access Spark UI through YARN
I'm building a Docker image to run Zeppelin or spark-shell locally against a production Hadoop cluster with YARN (edit: the environment was macOS).
I can execute jobs or a spark-shell fine, but when I try to access the Tracking URL on YARN while…

Pau Trepat · 697