
I am using the latest Hadoop version, 3.0.0, built from source. My timeline service is up and running, and I have configured Hadoop to use it for job history as well. But when I click on History in the ResourceManager UI, I get the error below:

HTTP ERROR 404

Problem accessing /jobhistory/job/job_1444395439959_0001. Reason:

    NOT_FOUND

Can someone please point out what I am missing here? Here is my yarn-site.xml:

<configuration>

<!-- Site specific YARN configuration properties -->
<property>
  <description>The hostname of the Timeline service web application.</description>
  <name>yarn.timeline-service.hostname</name>
  <value>0.0.0.0</value>
</property>
<property>
  <description>Address for the Timeline server to start the RPC server.</description>
  <name>yarn.timeline-service.address</name>
  <value>${yarn.timeline-service.hostname}:10200</value>
</property>

<property>
  <description>The http address of the Timeline service web application.</description>
  <name>yarn.timeline-service.webapp.address</name>
  <value>${yarn.timeline-service.hostname}:8188</value>
</property>

<property>
  <description>The https address of the Timeline service web application.</description>
  <name>yarn.timeline-service.webapp.https.address</name>
  <value>${yarn.timeline-service.hostname}:8190</value>
</property>

<property>
  <description>Handler thread count to serve the client RPC requests.</description>
  <name>yarn.timeline-service.handler-thread-count</name>
  <value>10</value>
</property>
<property>
  <description>Indicate to ResourceManager as well as clients whether
  history-service is enabled or not. If enabled, ResourceManager starts
  recording historical data that Timeline service can consume. Similarly,
  clients can redirect to the history service when applications
  finish if this is enabled.</description>
  <name>yarn.timeline-service.generic-application-history.enabled</name>
  <value>true</value>
</property>

<property>
  <description>Store class name for history store, defaulting to file system
  store</description>
  <name>yarn.timeline-service.generic-application-history.store-class</name>
  <value>org.apache.hadoop.yarn.server.applicationhistoryservice.FileSystemApplicationHistoryStore</value>
</property>
<property>
     <description>URI pointing to the location of the FileSystem path where the history will be persisted.</description>
     <name>yarn.timeline-service.generic-application-history.fs-history-store.uri</name>
     <value>/tmp/yarn/system/history</value>
</property>
<property>
     <description>T-file compression types used to compress history data.</description>
     <name>yarn.timeline-service.generic-application-history.fs-history-store.compression-type</name>
     <value>none</value>
</property>



 <property>
     <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>

and my mapred-site.xml:

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>localhost:10200</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>localhost:8188</value>
</property>
<property>
<name>mapreduce.job.emit-timeline-data</name>
<value>true</value>
</property>
</configuration>

JPS output:

6022 NameNode
27976 NodeManager
27859 ResourceManager
6139 DataNode
6310 SecondaryNameNode
28482 ApplicationHistoryServer
29230 Jps
penguin
Harinder

1 Answer


If you want to see the logs through the YARN RM web UI, you need to enable log aggregation. To do that, set the following parameters in yarn-site.xml:

  <property>
      <name>yarn.log-aggregation-enable</name>
      <value>true</value>
  </property>
  <property>
     <name>yarn.nodemanager.remote-app-log-dir</name>
     <value>/app-logs</value>
  </property>
  <property>
      <name>yarn.nodemanager.remote-app-log-dir-suffix</name>
      <value>logs</value>
  </property>

If you do not enable log aggregation, the NodeManagers store the logs locally. With the above settings, the logs are aggregated in HDFS at "/app-logs/{username}/logs/". Under this folder, you can find the logs for all the applications run so far. Log retention is determined by the configuration parameter "yarn.log-aggregation.retain-seconds" (how long to retain the aggregated logs).
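As a rough sketch, the retention parameter mentioned above could be set in yarn-site.xml like this. The value 604800 (seven days, in seconds) is only an illustrative assumption, not a recommendation; pick whatever suits your cluster.

```xml
<!-- Sketch only: 604800 seconds (7 days) is an assumed example value. -->
<property>
  <name>yarn.log-aggregation.retain-seconds</name>
  <value>604800</value>
</property>
```

The stock default for this property is -1, which disables deletion of aggregated logs.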

While a MapReduce application is running, you can access its logs from the YARN web UI. Once the application has completed, the logs are served through the Job History Server.
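For reference, here is a minimal sketch of a mapred-site.xml fragment that points MapReduce at a dedicated Job History Server instead of the timeline server ports. It assumes the server runs on localhost and listens on the stock default ports (10020 for IPC, 19888 for the web UI); both the hostname and the ports are assumptions to adjust for your setup.

```xml
<!-- Sketch only: hostname "localhost" is an assumption; 10020 and 19888
     are the stock Job History Server defaults, not the timeline server's
     10200 and 8188. -->
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>localhost:10020</value>
</property>
<property>
  <name>mapreduce.jobhistory.webapp.address</name>
  <value>localhost:19888</value>
</property>
```

These addresses only help if a JobHistoryServer process is actually running; the JPS output in the question does not list one.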

Also, set the following configuration parameter in yarn-site.xml:

<property>
  <name>yarn.log.server.url</name>
  <value>http://{job-history-hostname}:8188/jobhistory/logs</value>
</property>
Manjunath Ballur
  • Thanks for helping. But even after adding these, the error is the same. – Harinder Oct 09 '15 at 16:50
  • I am using an HDP installation. These settings work fine for me. Did you restart the services after changing the configuration? – Manjunath Ballur Oct 09 '15 at 16:56
  • Yes, I did. I have seen aggregation errors on some servers, but in that case the UI says aggregation is not enabled. This seems different. Perhaps it has something to do with the timeline server setting, because the history server is not in use now. – Harinder Oct 09 '15 at 16:58
  • One thing I noticed is that the logs are shown only as long as the job runs. I have set the yarn.log-aggregation.retain-seconds property as well, but they are still getting deleted. – Harinder Oct 09 '15 at 17:22
  • The logs from the directory were not getting deleted earlier. It started happening after adding the aggregation properties. – Harinder Oct 09 '15 at 17:33
  • Can you check from the command line? For example, execute the command "yarn logs -applicationId <application ID>", replacing <application ID> with the actual application ID. Also, please check the settings from this blog: http://hortonworks.com/blog/simplifying-user-logs-management-and-access-in-yarn/ – Manjunath Ballur Oct 09 '15 at 17:42
  • It worked with the history server. I guess the timeline server has not fully replaced the job history server yet; I saw this in a JIRA issue opened for Apache Hadoop. So I am back to using the history server. Thanks for your help. – Harinder Oct 10 '15 at 07:52