I'm new to Python and am getting the following error.

I can see that testdata.json exists at this location using

hdfs dfs -ls /data/testdata.json

python process_sensor_file.py
Traceback (most recent call last):
  File "testdata.json'", line 6, in <module>
    with open('hdfs:///data/testdata.json') as data_file:
IOError: [Errno 2] No such file or directory: 'hdfs:///data/testdata.json'

#!/bin/python
import json
from pprint import pprint

with open('hdfs:///data/testdata.json', "r") as data_file:
    source_data = json.load(data_file)
print(source_data)

print(json.dumps(source_data, indent=2))

for item in source_data['CityData']:
    Longitude = item['Longitude']
    TimeStamp = item['TimeStamp']
    print(Longitude, TimeStamp)
S M
  • read this link, it may help https://stackoverflow.com/questions/42447912/how-to-read-the-file-from-hdfs – Haifeng Zhang Feb 27 '18 at 20:01
  • I am having an issue with the open statement for the JSON file. File "testdata.json'", line 6, in with open('hdfs:///data/testdata.json') as data_file: IOError: [Errno 2] No such file or directory – S M Feb 27 '18 at 20:05

1 Answer

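Python's built-in open() only reads from the local file system and does not understand hdfs:// URIs, which is why you get "No such file or directory". To read from HDFS you need an HDFS client library for Python, such as hdfs3.

From the docs: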

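from hdfs3 import HDFileSystem
# Connect to the NameNode (adjust host and port for your cluster)
hdfs = HDFileSystem(host='localhost', port=8020)
# Pass a plain HDFS path, without the hdfs:// scheme
with hdfs.open('/data/testdata.json') as f:
    data = f.read(1000000)  # read up to 1,000,000 bytes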
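To get from the raw bytes back to the loop in the question, the bytes still have to be decoded and parsed as JSON. Here is a minimal sketch of that step. The helper name load_json, the InMemoryFS class, and the sample values are all made up for illustration; InMemoryFS only stands in for HDFileSystem so the example runs without a cluster, and the file is assumed to be UTF-8 encoded.

```python
import io
import json


def load_json(fs, path):
    """Parse a JSON document read through a filesystem-like object.

    `fs` is any object with an open(path, mode) method that returns a
    binary file object, for example an hdfs3 HDFileSystem.
    """
    with fs.open(path, "rb") as f:
        # Assumption: the file is UTF-8 encoded and small enough to read whole.
        return json.loads(f.read().decode("utf-8"))


class InMemoryFS:
    """Hypothetical stand-in for HDFileSystem (not part of hdfs3)."""

    def __init__(self, files):
        self._files = files

    def open(self, path, mode="rb"):
        return io.BytesIO(self._files[path])


# Made-up sample data shaped like the file in the question.
sample = b'{"CityData": [{"Longitude": -71.06, "TimeStamp": "2018-02-27T20:00:00"}]}'
fs = InMemoryFS({"/data/testdata.json": sample})

source_data = load_json(fs, "/data/testdata.json")
for item in source_data["CityData"]:
    print(item["Longitude"], item["TimeStamp"])
```

Against a real cluster, you would pass the hdfs object from the snippet above instead of InMemoryFS, i.e. load_json(hdfs, '/data/testdata.json').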
stacksonstacks