
I use Spark 1.5.2 for a Spark Streaming application.

What is this Storage Memory in the Executors tab of the web UI? How did it reach 530 MB? How can I change that value?

[screenshot: Executors tab of the web UI showing Storage Memory]

Jacek Laskowski
AkhilaV

1 Answer


CAUTION: You are using the very, very old and currently unsupported Spark 1.5.2 (which I noticed only after I had posted the answer), and my answer is about Spark 1.6+.


The tooltip of Storage Memory may say it all:

Memory used / total available memory for storage of data like RDD partitions cached in memory.

Storage Memory in Executors tab in web UI

It is part of the Unified Memory Management feature that was introduced in SPARK-10000: Consolidate storage and execution memory management, which (quoting verbatim):

Memory management in Spark is currently broken down into two disjoint regions: one for execution and one for storage. The sizes of these regions are statically configured and fixed for the duration of the application.

There are several limitations to this approach. It requires user expertise to avoid unnecessary spilling, and there are no sensible defaults that will work for all workloads. As a Spark user, I want Spark to manage the memory more intelligently so I do not need to worry about how to statically partition the execution (shuffle) memory fraction and cache memory fraction. More importantly, applications that do not use caching use only a small fraction of the heap space, resulting in suboptimal performance.

Instead, we should unify these two regions and let one borrow from another if possible.

Spark Properties

You can control the storage memory using the spark.driver.memory and spark.executor.memory Spark properties, which set the entire memory space for a Spark application (the driver and the executors, respectively), with the split between the execution and storage regions controlled by spark.memory.fraction and spark.memory.storageFraction.
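For example, those properties could be set at submit time like this (the memory values are illustrative only, not recommendations, and `com.example.MyApp` / `my-app.jar` are placeholders):

```shell
# Illustrative spark-submit invocation; all values here are example figures.
spark-submit \
  --conf spark.driver.memory=1g \
  --conf spark.executor.memory=2g \
  --conf spark.memory.fraction=0.6 \
  --conf spark.memory.storageFraction=0.5 \
  --class com.example.MyApp \
  my-app.jar
```

The same properties can also go in conf/spark-defaults.conf or be set on SparkConf before the application starts; they cannot be changed for a running application.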


You should consider watching the slides Memory Management in Apache Spark and the video Deep Dive: Apache Spark Memory Management, both by Andrew Or, the author of the feature.


You may want to read how the Storage Memory values (in the web UI and internally) are calculated in How does web UI calculate Storage Memory (in Executors tab)?
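As a minimal sketch of that calculation, here is the Spark 2.x arithmetic under default settings (assuming spark.memory.fraction = 0.6 and the fixed 300 MB reserved for the system; Spark 1.6 defaulted to 0.75, so the exact total you see depends on your version and heap size):

```python
# Sketch of how the "Storage Memory" total in the Executors tab is derived in
# Spark 2.x (UnifiedMemoryManager): (heap - 300 MB reserved) * spark.memory.fraction.
# The figure shown is the whole unified region, which storage and execution share.

RESERVED_SYSTEM_MEMORY = 300 * 1024 * 1024  # 300 MB set aside for Spark internals

def storage_memory_total(heap_bytes: int, memory_fraction: float = 0.6) -> float:
    """Upper bound of the unified (storage + execution) region, in bytes."""
    return (heap_bytes - RESERVED_SYSTEM_MEMORY) * memory_fraction

# Example: a 1 GiB executor heap yields a 434.4 MB unified region.
print(round(storage_memory_total(1024**3) / 1024**2, 1))  # 434.4
```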

Jacek Laskowski
  • We are using a Spark application that uses Spark 1.5.1. Our application is not using storage memory at all. Is there any way we can reduce storage memory in 1.5.1? – chandu ram Sep 30 '18 at 09:09
  • Don't remember how it was in 1.5.1 and am truly shocked it's still in use. Please upgrade to 2.3.2 at your earliest convenience. – Jacek Laskowski Oct 08 '18 at 15:52
  • Isn't the tooltip description slightly incorrect? Shouldn't it be 'Memory used / total available memory for storage of RDDs and internal execution like shuffle and sort'? – nir May 01 '19 at 00:00
  • @nir Perhaps. Please report it to Spark's JIRA to bring it to the attention of Spark devs. – Jacek Laskowski May 01 '19 at 01:41
  • @nir Did you ever verify if what you said is actually the correct description? – lightbox142 Jul 19 '21 at 20:32
  • @lightbox142 I have not. I've been off data projects for a while. I hope the Spark community has fixed it if needed. – nir Aug 30 '21 at 20:05