I am defining a Bottle API that needs to accept a file from the client and then save that file to HDFS on the local system.

The code looks something like this.

@route('/upload', method='POST')
def do_upload():
    import pdb; pdb.set_trace()
    upload = request.files.upload
    name, ext = os.path.splitext(upload.filename)

    save_path = "/data/{user}/{filename}".format(user=USER, filename=name)

    hadoopy.writetb(save_path, upload.file.read())
    return "File successfully saved to '{0}'.".format(save_path)

The issue is that request.files.upload.file is an object of type cStringIO.StringO, which can be converted to a str with its .read() method. But hadoopy.writetb(path, content) expects the content in some other format, and the server hangs at that point. It raises no exception and returns no error or result; it just sits there as if it were in an infinite loop.

Does anyone know how to write an incoming file in a Bottle API to HDFS?

Keyur Golani

1 Answer

From the hadoopy documentation, it looks like the second parameter to writetb is supposed to be an iterable of pairs, but you're passing in bytes.

...the hadoopy.writetb command which takes an iterator of key/value pairs...

Have you tried passing in a pair? Instead of what you're doing,

hadoopy.writetb(save_path, upload.file.read())  # 2nd param is wrong

try this:

hadoopy.writetb(save_path, (path, upload.file.read()))

(I'm not familiar with Hadoop so it's not clear to me what the semantics of path are, but presumably it'll make sense to someone who knows HDFS.)
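For what "an iterator of key/value pairs" means in plain Python, here is a minimal sketch. It does not call hadoopy at all; `single_pair` is a hypothetical helper, `io.BytesIO` stands in for `upload.file`, and the final `writetb` call is only an assumption based on the documentation quote above. Note that a bare 2-tuple is iterable, but iterating it yields its two elements rather than one pair.

```python
import io


def single_pair(key, fileobj):
    """Yield exactly one (key, value) pair: the key and the file's full contents."""
    yield key, fileobj.read()


# Stand-in for upload.file, which in the question is a file-like object.
upload_file = io.BytesIO(b"file contents")

# A bare 2-tuple is iterable, but iterating it yields its two elements,
# not a (key, value) pair.
bare_tuple_items = list(("example.txt", b"file contents"))

# Wrapping the pair in a generator (or a list) yields one pair per item.
pairs = list(single_pair("example.txt", upload_file))

print(bare_tuple_items)
print(pairs)

# Assumed usage, based only on the documentation quote above (not run here):
# hadoopy.writetb(save_path, single_pair(name, upload.file))
```

Whether the key should be the filename, the path, or something else depends on how the data will be read back, so treat the key used here as a placeholder.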

ron rothman
  • Path would just be any filesystem path string – OneCricketeer Sep 10 '17 at 17:17
  • Yes. Path is just a string like `hdfs://localhost:8082/root/data/files/foo`, but the thing is that `upload.file.read()` returns a string, which `writetb` does not accept! – Keyur Golani Sep 10 '17 at 18:16
  • @KeyurGolani Did you read the API for `writetb`? The second param is supposed to be a tuple, not a string. I edited my answer to make it clearer. Let me know if it works. – ron rothman Sep 10 '17 at 18:36
  • Yes. I have read the documentation. It says that the second argument is an `iterator of (key, value)`, and I posted the question precisely because I don't know what an iterator of (key, value) is: all I have is a StringIO object and the ability to convert it into a string. I had already tried `hadoopy.writetb(save_path, (path, upload.file.read()))`, and I tried it again after your suggestion. It doesn't work. The server goes into some infinite loop and doesn't return anything. – Keyur Golani Sep 10 '17 at 19:55
  • Since `writetb` clearly expects an iterator of pairs as its second argument, I would never expect the code in your question to work. You may want to update your Q with code that isn't working for you but that does conform to hadoopy's api. Good luck! – ron rothman Sep 10 '17 at 20:07