
I use Spark 1.1.0 on a standalone cluster with 3 nodes.

I want to see the detailed logs of Completed Applications, so I've set the following in my program:

set("spark.eventLog.enabled","true")
set("spark.eventLog.dir","file:/tmp/spark-events")

However, when I click on the application in the web UI, I get a page with the message:

Application history not found (app-20150126000651-0331) No event logs found for application xxx$ in file:/tmp/spark-events/xxx-1422227211500. Did you specify the correct logging directory?

despite the fact that the directory exists and contains 3 files:

APPLICATION_COMPLETE*, EVENT_LOG_1* and SPARK_VERSION_1.1.0*

Any suggestions to solve the problem?

Thanks.

Federico
Alex
  • I'm having the same issue, although the job history file does get created in the `spark.eventLog.dir`. In my case it is `/tmp/app-20151203103109-0013` – Ivan Balashov Dec 03 '15 at 10:35
  • @IvanBalashov did you find a solution for this? – meson10 Mar 30 '16 at 06:58
  • @meson10 I didn't manage to make links work from spark master UI, but, I started spark history server pointing it to `spark.eventLog.dir` in GCS, and watched stats from there. – Ivan Balashov Mar 30 '16 at 11:43
  • @IvanBalashov oh ok. Thanks. Btw, do you expose Spark Cluster via Public IPs? – meson10 Mar 30 '16 at 17:56
  • @meson10 No, private IPs only, as installed by `bdutil`. – Ivan Balashov Mar 30 '16 at 20:06
  • @IvanBalashov Hmm. So the application servers sit pretty much inside the same VPC/Subnets? – meson10 Mar 31 '16 at 02:14
  • @meson10 Yes, which is quite common. Are you having any issues with accessing the cluster externally? If so, I'm not sure if this is the best place to discuss it though :) – Ivan Balashov Mar 31 '16 at 07:27
  • @IvanBalashov haha. Not really a problem but was wondering how the world does it besides the Spark JobServer. – meson10 Mar 31 '16 at 16:27
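
As an aside, a rough sketch of the history-server workaround Ivan Balashov mentions above, assuming a local event-log directory rather than GCS (the paths and the Spark install location are assumptions):

    # in conf/spark-defaults.conf (or via SPARK_HISTORY_OPTS, depending on the Spark version):
    # point the history server at the same event-log directory
    spark.eventLog.enabled           true
    spark.eventLog.dir               file:/tmp/spark-events
    spark.history.fs.logDirectory    file:/tmp/spark-events

    # then start the history server (its UI listens on port 18080 by default)
    ./sbin/start-history-server.sh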

2 Answers

  1. Why is your application name xxx$ and then xxx in your error message? Is that really what Spark reports?
  2. Permissions problem: check that the directory in which you log is readable and executable by the user under which you run Spark (and that the files inside are readable as well).
  3. Check that you specify the master correctly, e.g. --master spark://<localhostname>:7077
  4. Dig into the EVENT_LOG_1* file. The last event (on the last line) of the file should be an "Application Complete" event. If it isn't, it's likely that your application did not call sc.stop(), though the logs should still show up nonetheless (see the sketch after this list).
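
Regarding the fourth point, a minimal sketch of the intended application lifecycle (Scala; the application name is a placeholder):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf().setAppName("my-app")  // placeholder name
    val sc = new SparkContext(conf)
    try {
      // ... the actual job ...
      sc.parallelize(1 to 100).count()
    } finally {
      // Without sc.stop(), the APPLICATION_COMPLETE marker may never be written,
      // and the master UI cannot reconstruct the application history.
      sc.stop()
    }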
Francois G
  • 1. No, I changed my application name. 2. I've made the directory and the files readable and executable for all users but it still doesn't work. 3. No problem with the master specification in my spark-submit. 4. The event log file ends with: {"Event":"SparkListenerJobEnd","Job ID":0,"Job Result":{"Result":"JobSucceeded"}} {"Event":"SparkListenerApplicationEnd","Timestamp":1422399566881} – Alex Jan 27 '15 at 23:28
  • The same for Spark 1.2. The server is pointed to the correct dir, the dir gets the history of finished apps, and yet the history server doesn't display it. – Capacytron Apr 17 '15 at 15:18
  • Any luck with that? I'm experiencing the same issue with DSE 4.6 and DSE 4.7 – Federico May 25 '15 at 07:03
  • The 4th point worked for me, thanks! I had forgotten sc.stop() – Joren Van Severen Jun 02 '15 at 15:53
  • Re 4th: It doesn't make any sense. What if the application crashed and stop context could not be called? No logs then? – Ivan Balashov Dec 03 '15 at 10:25

I had the same error, "Did you specify the correct logging directory?", and for me the fix was to add a '/' at the end of the path for 'spark.eventLog.dir', i.e. /root/ephemeral-hdfs/spark-events/

>> cat spark/conf/spark-defaults.conf
    spark.eventLog.dir /root/ephemeral-hdfs/spark-events/
    spark.executor.memory   5929m
Chenna V