
How can I change the java.io.tmpdir folder for my Hadoop 3 cluster running on YARN?

By default it gets set to something like /tmp/***, but my /tmp filesystem is too small for everything the YARN jobs will write there.

Is there a way to change it?

I have also set hadoop.tmp.dir in core-site.xml, but it looks like it is not really used.
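For reference, the core-site.xml entry looks roughly like this (the path is just a placeholder, not my actual directory):

<property>
      <name>hadoop.tmp.dir</name>
      <!-- placeholder path; the real value points to a larger filesystem -->
      <value>/data/hadoop-tmp</value>
</property>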

– qwertz1123

2 Answers


Perhaps this is a duplicate of What should be hadoop.tmp.dir?. Also, go through all the configuration files in /etc/hadoop/conf and search for tmp to see if anything is hardcoded. Also specify:

  • Whether you see any files getting created at the location you specified as hadoop.tmp.dir.
  • What pattern of files is being created under /tmp/** after your changes are applied.

I have also noticed Hive creating files in /tmp, so you may also want to have a look at hive-site.xml (a sketch follows below). The same applies to any other ecosystem product you are using.
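As a rough sketch of what that could look like (the property names are Hive's standard scratch-directory settings; the paths are only placeholders, so check the defaults your Hive version actually uses):

<property>
      <name>hive.exec.scratchdir</name>
      <!-- HDFS scratch space; Hive defaults this to a path under /tmp -->
      <value>/user/hive/scratch</value>
</property>
<property>
      <name>hive.exec.local.scratchdir</name>
      <!-- local scratch directory on the node; defaults under java.io.tmpdir -->
      <value>/data/hive/scratch</value>
</property>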

– sujit
  • I am interested in the files created by Spark jobs running on YARN. I can see this in the command line of the processes that YARN runs. The pattern looks like /tmp/hadoop-username/***. I have configured hadoop.tmp.dir in core-site.xml. The problem is that this has not helped. – qwertz1123 Mar 16 '18 at 13:37

I have configured the yarn.nodemanager.local-dirs property in yarn-site.xml and restarted the cluster. After that, Spark stopped using the /tmp filesystem and instead used the directories configured in yarn.nodemanager.local-dirs. The java.io.tmpdir property for the Spark executors was also set to the directories defined in yarn.nodemanager.local-dirs.

<!-- comma-separated list of local directories that the NodeManager and its containers use instead of /tmp -->
<property>
      <name>yarn.nodemanager.local-dirs</name>
      <value>/somepath1,/anotherpath2</value>
</property>
– qwertz1123