I am trying to wrap the Hive CLI (I know I could use HiveServer instead) in a Python subprocess and stream its output. When I run this query via the CLI:
hive -e "SELECT * from table"
Logging initialized using configuration in file:/etc/hive/conf/hive-log4j.properties
Hive history file=/tmp/user/hive_job_log_42a5d7e9-663d-4244-b3e8-0f8ec8af010a_1754850433.txt
Total MapReduce jobs = 1
.....
The CLI streams this output to the shell. When I wrap the same query in Python:
import os
import subprocess
import sys

def execute_subprocess(cmd):
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, cwd=os.path.dirname(os.path.abspath(__file__)), shell=True)
    # Echo each line as it arrives until the stream is empty and the process has exited
    while True:
        nextline = process.stdout.readline()
        if nextline == '' and process.poll() != None:
            break
        sys.stdout.write(nextline)
        sys.stdout.flush()
    out = process.communicate()
    output = out[0]
    exitCode = process.returncode
    if (exitCode == 0):
        return output
    else:
        raise Exception(output)
and execute it with:
execute_subprocess(['hive','-e','SELECT * FROM TABLE'])
I do not see the Hive CLI output at all. Why not?
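One detail that may be relevant: on POSIX, when `Popen` receives a list together with `shell=True`, only the first item is used as the shell command, and the remaining items are passed as arguments to the shell itself rather than to `hive`. For comparison, here is a minimal sketch of the line-by-line streaming being attempted, assuming the command is passed as a list without `shell=True`. The helper name `stream_subprocess` and the stand-in Python command are hypothetical and used only so the sketch runs without a Hive installation; whether this resolves the Hive case is an assumption, not something confirmed here.

```python
import subprocess
import sys


def stream_subprocess(cmd):
    """Run cmd (a list of arguments), echo its output line by line, and return it."""
    process = subprocess.Popen(
        cmd,                       # list form, so no shell=True
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout
        universal_newlines=True,   # read text lines instead of bytes
    )
    lines = []
    # readline() returns '' only once the stream has reached end of file
    for line in iter(process.stdout.readline, ''):
        sys.stdout.write(line)
        sys.stdout.flush()
        lines.append(line)         # keep a copy, since the pipe is drained here
    process.stdout.close()
    exit_code = process.wait()
    output = ''.join(lines)
    if exit_code != 0:
        raise Exception(output)
    return output


if __name__ == '__main__':
    # Stand-in for ['hive', '-e', 'SELECT * FROM TABLE']
    stream_subprocess([sys.executable, '-c', 'print("row 1"); print("row 2")'])
```

The sketch collects the lines while echoing them because, once the loop has drained the pipe, a later `communicate()` call has nothing left to return.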
Thanks!