I want to pass a JSON string as command line argument to my reducer.py file but I'm unable to do so.
Command I execute is:
hadoop jar contrib/streaming/hadoop-streaming.jar -file /home/hadoop/mapper.py -mapper 'mapper.py' -file /home/hadoop/reducer.py -reducer 'reducer.py {"abc":"123"}' -input /user/abc.txt -output /user/output/
When I print argv array in reducer.py, it shows output as:
['/mnt/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1423459215008_0057/container_1423459215008_0057_01_000004/./reducer.py', '{', 'abc', ':', '123', '}']
First argument is the path of reducer.py but my second argument gets split by double quotes.
I want to achieve second argument as a complete JSON string. For example like: ['/mnt/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1423459215008_0057/container_1423459215008_0057_01_000004/./reducer.py','{"abc":"123"}']
So that I can load that argument as Json in reducer.py
Any help is appreciated. Thanks !
EDIT: Tried escaping JSON using command:
hadoop jar contrib/streaming/hadoop-streaming.jar -file /home/hadoop/mapper.py -mapper 'mapper.py' -file /home/hadoop/reducer.py -reducer 'reducer.py "{\"abc\":\"123\"}"' -input /user/abc.txt -output /user/output/
Gives output as:
['/mnt/var/lib/hadoop/tmp/nm-local-dir/usercache/hadoop/appcache/application_1423459215008_0058/container_1423459215008_0058_01_000004/./redu.py', '{\\', 'abc\\', ':\\', '123\\', '}']