Use this tag for questions related to the Python package named HDFS.
Questions tagged [python-hdfs]
8 questions
21
votes
5 answers
What's the best module for interacting with HDFS with Python3?
I see there are hdfs3, snakebite, and some others. Which one is the best supported and most comprehensive?

Farhat
- 1,203
- 2
- 12
- 19
2
votes
1 answer
How do I set the path of libhdfs.so for pyarrow?
I'm trying to use pyarrow and I keep getting the following error.
ImportError: Can not find the shared library: libhdfs3.so
So I read some Stack Overflow answers, which say that I need to set the ARROW_LIBHDFS_DIR environment variable.
The path to…

Kush Singh
- 157
- 3
- 11
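For the pyarrow question above, a minimal sketch of setting ARROW_LIBHDFS_DIR from Python before pyarrow is used. The directory shown is a hypothetical placeholder (the excerpt truncates the real path), and the helper name is made up for illustration. Whether this resolves the specific libhdfs3.so error depends on the pyarrow version and driver in use, which the excerpt does not state.

```python
import os


def configure_libhdfs_dir(lib_dir):
    """Set ARROW_LIBHDFS_DIR so the HDFS shared library can be located.

    This only sets the environment variable for the current process; it
    should run before pyarrow attempts to load the library.
    """
    os.environ["ARROW_LIBHDFS_DIR"] = lib_dir
    return os.environ["ARROW_LIBHDFS_DIR"]


# Hypothetical location: replace with the directory that actually
# contains the shared library on your system.
configured = configure_libhdfs_dir("/opt/hadoop/lib/native")
```

Exporting the variable in the shell before starting Python has the same effect for that process.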
1
vote
0 answers
How to use the hdfscli Python library?
I have the following use case:
I want to connect to a remote Hadoop cluster. So I got all the Hadoop conf files (core-site.xml, hdfs-site.xml, and others) and stored them in one directory on the local file system. I got the correct keytab and krb5.conf file for…

Neil
- 11
- 2
1
vote
0 answers
Writing JSON content to HDFS location using Python
I am trying to write JSON content to an HDFS location using Python, but every key and value in my JSON content appears with a u prefix and quotes (u'').
Original JSON content
{
"id": 2344556,
"resource_type": "user",
"ext_uid": null,
"email":…

Rahul
- 467
- 1
- 8
- 24
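For the JSON question above, one likely cause (an assumption, since the excerpt is truncated) is that the Python object's string representation is written instead of serialized JSON. The sketch below contrasts the two using only the fields visible in the excerpt. The variable names are illustrative.

```python
import json

# Only the fields that are visible in the question excerpt.
record = {"id": 2344556, "resource_type": "user", "ext_uid": None}

# str() yields the Python representation: single quotes and None
# (and, on Python 2, u'' prefixes on text values). This is not JSON.
as_repr = str(record)

# json.dumps() yields valid JSON text: double quotes and null.
as_json = json.dumps(record)
```

Writing `as_json` (rather than `as_repr`) to the HDFS file keeps the content parseable by any JSON reader.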
0
votes
0 answers
Remove only the file given in an HDFS path, not the entire HDFS path
I am trying to delete the file 20221229_20230221-101756_Backtest_M.txt
in the following HDFS path:
hdfs_path = '/dev/flux_entrant/depot/backtesting/'
To do so, I am using:
fs =…

user8810618
- 115
- 11
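For the deletion question above, a sketch of building the full path of the single file so that a delete call targets the file rather than its directory. The client call itself is elided in the excerpt, so it is left out here; `file_target` is a hypothetical helper name.

```python
import posixpath

hdfs_path = '/dev/flux_entrant/depot/backtesting/'
file_name = '20221229_20230221-101756_Backtest_M.txt'


def file_target(directory, name):
    """Join directory and file name with forward slashes, as HDFS expects."""
    return posixpath.join(directory, name)


# Pass this full path, not hdfs_path, to whichever delete call the
# client library provides.
target = file_target(hdfs_path, file_name)
```

Using `posixpath` rather than `os.path` keeps the separator as `/` even when the script runs on Windows.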
0
votes
1 answer
How can I get past a connection error in pywebhdfs?
I have a locally hosted single-node Hadoop. My namenode and datanode are the same.
I'm trying to create a file using the Python library.
self.hdfs = PyWebHdfsClient(host='192.168.231.130', port='9870', user_name='kush',
…

Kush Singh
- 157
- 3
- 11
0
votes
1 answer
Connect to HDFS with keytab of a serviceID with Python3.6
I am trying the piece of code below to connect to HDFS and do some file-related operations. Please note that I am trying to connect to a Cloudera HDFS instance from a CentOS 7 environment with Python 3.6 installed on it.
import io
from csv import…

Shanit
- 109
- 1
- 2
- 7
0
votes
1 answer
In python-hdfs, is there a way to use a wildcard or regex in the list method?
On Linux, with hadoop fs -ls, I can use a wildcard (/sandbox/*), but the Python hdfs client's list method fails on this as an unknown path. Is there a different way to use wildcards in python-hdfs?

Ezer K
- 3,637
- 3
- 18
- 34
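For the wildcard question above, one possible workaround is to list the parent directory without a wildcard and filter the returned names on the client side. The sketch assumes the client's list call returns plain entry names; the sample names and the `match_entries` helper are hypothetical.

```python
import fnmatch


def match_entries(names, pattern):
    """Return the names that match a shell-style wildcard pattern."""
    # fnmatchcase is case-sensitive on every platform.
    return [name for name in names if fnmatch.fnmatchcase(name, pattern)]


# Stand-in for the result of listing a directory such as /sandbox.
names = ["part-0001.csv", "part-0002.csv", "_SUCCESS"]
matched = match_entries(names, "part-*.csv")
```

For regex matching instead of wildcards, the same filter can be written with `re.fullmatch` in place of `fnmatch.fnmatchcase`.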