
I'm new to Python and I'm trying to connect to a Hadoop HDFS system. I found the following reference code and tried to use it, but it raises an error when importing the package.

from pyarrow import HdfsClient

# Using libhdfs
hdfs = HdfsClient('192.168.0.119', '50070', 'cloudera', driver='libhdfs')

Error: ImportError: cannot import name 'HdfsClient'

I even tried to install it with pip, but got:

Could not find a version that satisfies the requirement HdfsClient (from versions: )
No matching distribution found for HdfsClient

Then I tried conda, but that failed as well:

Collecting package metadata: done
Solving environment: failed

PackagesNotFoundError: The following packages are not available from current channels:

  • hdfsclient

Current channels:

To search for alternate channels that may provide the conda package you're looking for, navigate to

https://anaconda.org

and use the search bar at the top of the page.

What I'm actually trying to do is connect to Hue using:

IP address -> 192.168.0.119

Port -> 50070

Username -> cloudera

Password -> cloudera

But it's not working. Can anyone suggest a better way to connect, or explain how to import "HdfsClient" in Python 3?

David

1 Answer


HdfsClient is deprecated. You might want to use pyarrow.hdfs.connect instead. Also run pip freeze to check whether the relevant library (pyarrow) is installed in your Python environment. For example:

from pyarrow import hdfs
hdfs.connect('192.168.0.119', 50070, 'cloudera', driver='libhdfs')
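
The installation check suggested above can also be done from inside Python instead of reading through the pip freeze output. This is only a minimal sketch using the standard library; the helper name installed_version is hypothetical and not part of pyarrow:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # The HDFS client lives inside the "pyarrow" distribution, so that is the
    # name to look for; "HdfsClient" is a class name, not an installable package.
    print("pyarrow:", installed_version("pyarrow"))
```

If this prints None for pyarrow, the package needs to be installed (for example with pip install pyarrow) before any of the HDFS imports can work.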
frb
lego king
  • Thanks @kapil, but it's still showing an error: `FileNotFoundError: [WinError 2] The system cannot find the file specified`. I also have a password for connecting to Hadoop, and there is no password parameter in the hdfs.connect() method. – David Apr 03 '19 at 09:01
  • Do you get this error while making the HDFS connection or while reading a file? Can you paste your code for more clarity? – lego king Apr 03 '19 at 18:01
  • No, I'm not reading any file. I just ran the "**pyarrow**" code from your answer as it is, since all the details inside hdfs.connect() are the same in my case. – David Apr 04 '19 at 03:52
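
The `FileNotFoundError` in these comments would be consistent with the libhdfs driver not finding a local Hadoop installation, although the thread shows no full traceback, so that cause is an assumption. The pyarrow documentation lists `HADOOP_HOME` and `JAVA_HOME` (and optionally `ARROW_LIBHDFS_DIR` and `CLASSPATH`) as environment variables the driver relies on. The sketch below checks for the first two before connecting. The helper names are hypothetical, and it uses `pyarrow.fs.HadoopFileSystem`, the newer API that replaced `pyarrow.hdfs.connect` in recent pyarrow releases. Port 8020 in the usage comment is also an assumption: it is a common NameNode RPC default, whereas 50070 is normally the NameNode web UI port.

```python
import os

# Environment variables the libhdfs driver needs in order to locate Hadoop and Java.
REQUIRED_VARS = ("HADOOP_HOME", "JAVA_HOME")


def missing_libhdfs_env(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def connect(host, port, user):
    """Connect to HDFS, failing early with a readable message if the setup is incomplete."""
    missing = missing_libhdfs_env()
    if missing:
        raise RuntimeError("Set these environment variables first: " + ", ".join(missing))
    # Imported lazily so the environment check works even without pyarrow installed.
    from pyarrow import fs
    return fs.HadoopFileSystem(host, port=port, user=user)


# Example call (values from the question; the RPC port is an assumption):
# hdfs = connect("192.168.0.119", 8020, "cloudera")
```

Neither `hdfs.connect` nor `HadoopFileSystem` takes a password argument; both accept `kerb_ticket` for Kerberos-secured clusters, while unsecured clusters identify the client by user name only.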