How to access HDFS files directly in Python?

Asked Sep 10 '18 at 20:16

Active Sep 11 '18 at 06:49

Viewed 1,112 times

I am working with the Hadoop and Spark frameworks to cluster images, and I am using Python as my programming language. For the map-reduce part I use the mrjob package. My question is: how can I access HDFS files directly from Python? For example, if my file on HDFS is /a.txt, how do I open it in Python for further processing? I looked at many libraries but could not find a concrete answer. I saw snakebite, but it only supports Python 2.

edited Sep 11 '18 at 06:49

OneCricketeer


asked Sep 10 '18 at 20:16

Alay Majmudar


Why not read the file directly using PySpark? For example: `sc.textFile("hdfs:///your_path_to/a.txt")` – NicolasKittsteiner Sep 10 '18 at 20:32
https://stackoverflow.com/a/51548097/2308683 – OneCricketeer Sep 10 '18 at 21:52
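
To make the PySpark suggestion in the comment above a bit more concrete, here is a minimal sketch. It assumes PySpark is installed and that the default filesystem of the cluster is HDFS, so that `hdfs:///a.txt` resolves to the file from the question. The helper name `read_hdfs_lines`, the `limit` parameter, and the application name are made up for illustration; only `SparkContext`, `textFile`, `take`, and `stop` are PySpark calls.

```python
def read_hdfs_lines(sc, path, limit=5):
    """Return up to `limit` lines of the text file at `path`.

    `sc` is expected to be a pyspark.SparkContext; `path` can be an
    HDFS URI such as "hdfs:///a.txt".
    """
    # textFile builds an RDD of lines; take(limit) brings only the
    # first `limit` lines back to the driver instead of the whole file.
    return sc.textFile(path).take(limit)


def main():
    # Imported here so the helper above can be defined and inspected
    # on a machine without PySpark installed.
    from pyspark import SparkContext

    # "read-hdfs-example" is an arbitrary application name.
    sc = SparkContext(appName="read-hdfs-example")
    try:
        # /a.txt is the example path from the question.
        for line in read_hdfs_lines(sc, "hdfs:///a.txt"):
            print(line)
    finally:
        sc.stop()
```

Calling `main()` from a script submitted to the cluster should print the first few lines of the file. `take` is used rather than `collect` so that a large file is not pulled into driver memory all at once.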


0 Answers