Before you read my question, please note that it's not a duplicate of the other similar questions on Stack Overflow, all of which I have already read!
I've developed (and built and compiled) a Hadoop program on Windows 10
(the development machine) using Eclipse. The program reads WARC files and rewrites them in JSON format. It uses these classes to read the WARC records as a custom Writable format:
WarcFileInputFormat.java
WarcFileRecordReader.java
WritableWarcRecord.java
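For context, here is roughly how the job is wired together. This is a minimal sketch using the old mapred API (the reader classes use mapred); the class name WarcToJson and the mapper JsonMapper are illustrative placeholders, not my exact code, and I'm assuming the reader emits LongWritable/WritableWarcRecord pairs, as the common WARC reader classes do:

import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.*;

// WarcFileInputFormat and WritableWarcRecord are the WARC reader classes listed above.
public class WarcToJson {

    // Placeholder mapper: the real record-to-JSON conversion is my own code
    // and is omitted here.
    public static class JsonMapper extends MapReduceBase
            implements Mapper<LongWritable, WritableWarcRecord, Text, Text> {
        public void map(LongWritable key, WritableWarcRecord value,
                        OutputCollector<Text, Text> out, Reporter reporter)
                throws IOException {
            // Build the JSON string from the WARC record here and emit one line per record.
            out.collect(new Text("json"), new Text("..."));
        }
    }

    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(WarcToJson.class);
        conf.setJobName("warc-to-json");

        conf.setInputFormat(WarcFileInputFormat.class);  // custom WARC input format
        conf.setMapperClass(JsonMapper.class);
        conf.setNumReduceTasks(0);                       // map-only rewrite, default TextOutputFormat
        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(Text.class);

        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
        JobClient.runJob(conf);
    }
}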
I added the hadoop-2.6.0 jar files to my project.
This is the Java version on the development machine:
$java -version
java version "1.8.0_151"
Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)
I tested my code on an Ubuntu 14 machine (the testing machine) and it worked perfectly. That machine has hadoop-2.6.0:
Hadoop 2.6.0
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1
Compiled by jenkins on 2014-11-13T21:10Z
Compiled with protoc 2.5.0
From source with checksum 18e43357c8f927c0695f1e9522859d6a
This command was run using /home/username/hadoop-2.6.0/share/hadoop/common/hadoop-common-2.6.0.jar
This is the Java version on the testing machine:
java version "1.8.0_91"
Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)
I then moved to a CentOS Oracle server and ran my program. This server has the same Java version as my development machine:
java version "1.8.0_151"
Java(TM) SE Runtime Environment (build 1.8.0_151-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.151-b12, mixed mode)
and it has this Hadoop version:
Hadoop 2.6.0-cdh5.13.1
Subversion http://github.com/cloudera/hadoop -r 0061e3eb8ab164e415630bca11d299a7c2ec74fd
Compiled by jenkins on 2017-11-09T16:37Z
Compiled with protoc 2.5.0
From source with checksum 16d5272b34af2d8a4b4b7ee8f7c4cbe
This command was run using /opt/cloudera/parcels/CDH-5.13.1-1.cdh5.13.1.p0.2/jars/hadoop-common-2.6.0-cdh5.13.1.jar
On this server I got the following error for each job:
18/02/23 11:59:45 INFO mapreduce.Job: Task Id : attempt_1517308433710_0012_m_000003_0, Status : FAILED
Error: data/warc/WarcFileInputFormat : Unsupported major.minor version 52.0
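Class file major version 52 corresponds to Java 8 (51 is Java 7, 50 is Java 6), so the error appears to mean that the JVM loading WarcFileInputFormat on the cluster is older than Java 8, even though java -version on the shell reports 1.8. To check which version a class was actually compiled for, the version bytes can be read straight from the .class file; here is a small throwaway checker I used for that (my own sketch, not part of the job):

import java.io.DataInputStream;
import java.io.FileInputStream;

public class ClassVersionCheck {
    public static void main(String[] args) throws Exception {
        try (DataInputStream in = new DataInputStream(new FileInputStream(args[0]))) {
            if (in.readInt() != 0xCAFEBABE) {   // every class file starts with this magic number
                System.err.println(args[0] + " is not a class file");
                return;
            }
            int minor = in.readUnsignedShort(); // minor version comes first in the class file
            int major = in.readUnsignedShort(); // 52 = Java 8, 51 = Java 7, 50 = Java 6
            System.out.println(args[0] + ": major.minor = " + major + "." + minor);
        }
    }
}

Running it against WarcFileInputFormat.class extracted from my job jar prints major.minor = 52.0, i.e. the class was compiled for Java 8.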
Here are the things I tried that didn't work:
- Adding the WARC reader classes to my project as source files rather than as a jar, so that they would be built with the correct Java version.
- Changing the Hadoop jar files from hadoop-2.6.0 to hadoop-2.6.0-cdh5.13.1.
- Using mapred instead of mapreduce, since the reader classes use mapred.
I'm not sure what exactly causes this issue, especially since it points to WarcFileInputFormat.java, a class that I didn't write but added to my project from the link above. What surprises me is that the program worked perfectly on the Ubuntu machine, which has a lower Java version, but failed on the CentOS server, which has the same Java version as the Windows development machine!
Any hints?