Machine learning and data mining involve working with large amounts of data, which often means learning Hadoop. Hadoop is implemented in Java, and its native MapReduce API is in Java (correct me if I am wrong). Hadoop also provides a streaming API to support other languages (like Python). Most grad students and researchers I know solve ML problems in Python, yet job posts asking for the Hadoop and Java combination are very common.
From what I have observed, Java and Python are the most widely used languages in this domain.
My question is: what is the most popular language for working in this domain? What factors are involved in deciding which language or framework to choose?
I know both Java and Python, but I am always unsure:
- whether I should start programming in Java (because Hadoop is implemented in it)
- whether I should start programming in Python (because it's easier and quicker to write)
This is a very open-ended question, but I am sure the advice will help me and other people who have the same doubt.