2

I want to start learning big data technology from scratch. Is it necessary to learn Java to work with Hadoop, given that I am already well versed in Python?

Luiggi Mendoza

3 Answers

2

No, you don't necessarily need Java knowledge, as you can write MapReduce jobs perfectly well in Pig or Hive (Hive's query language is similar to SQL). However, as with all layers of abstraction, at some point you may well need to know what is going on "behind the scenes", and being able to read, understand, and debug the underlying Java is a big advantage.

A lot of effort is currently going into providing a more complete SQL interface to Hadoop, with tools such as Impala (Cloudera), Presto (Facebook), Phoenix, and Hive (already mentioned).

davek
1

Not needed at all, though that's just my opinion. If you know Python well, you should be fine.

Check this out: writing a Hadoop MapReduce job in Python. There are a lot of ways to implement solutions with Hadoop. Just because a great deal of them are in Java doesn't mean Java is the only tool you can use. If you're working with legacy code that is written in Java, then knowing the basics may help, but to be honest I think you could just look things up as you come across them. There is no need to spend a week learning the intricacies of Java 7 and what's new in Java 8 for your current needs.
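To give a rough idea of what a Python MapReduce job can look like, here is a minimal word-count sketch in the style of Hadoop Streaming, where the mapper and reducer are plain programs that read lines from stdin and write tab-separated key/value lines to stdout. The function names and the `map`/`reduce` command-line argument are my own convention for illustration, not something Hadoop requires.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record per word in the input."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts for each word.

    Assumes the input is sorted by key, which Hadoop Streaming
    arranges between the map and reduce phases.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


# Hypothetical convention: run as "script.py map" or "script.py reduce".
if __name__ == "__main__" and len(sys.argv) > 1:
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

You can try it locally without a cluster by piping a text file through the map step, then `sort`, then the reduce step. On a cluster, the same script would be handed to the Hadoop Streaming jar through its `-mapper` and `-reducer` options.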

Frank Visaggio
1

Check out MRJob, a Python-based wrapper for running, logging, and monitoring Hadoop jobs.

Although pure Java solutions might be faster in some cases, you will hardly ever need to debug Java code.

Community
alko