I want to write a pandas DataFrame to HDFS, the same way I write it to a local file here:

import numpy as np
import pandas as pd
data = pd.DataFrame(np.arange(4).reshape(2,2),columns=['a','b'])
data.to_csv('/localpath/test.csv')
"""
the output file will be:
,a,b
0,0,1
1,2,3
"""

to_csv() writes to a local path. How can I write the pandas DataFrame to HDFS so that the file in HDFS is identical to the one written to the local path?
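
One possible direction, as a rough sketch only: when called without a path, to_csv() returns the CSV text as a string, and it also accepts a writable file-like object. So the same content that goes to the local file could in principle be handed to whatever HDFS client is available. The StringIO buffer below is only a stand-in for an HDFS file handle; no real HDFS client is used or assumed here.

```python
import io

import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(4).reshape(2, 2), columns=['a', 'b'])

# With no path argument, to_csv() returns the CSV text instead of writing a file.
csv_text = data.to_csv()

# to_csv() also accepts a writable text buffer. StringIO stands in here for
# a file handle opened by an HDFS client (hypothetical, not shown).
buffer = io.StringIO()
data.to_csv(buffer)

# Both routes produce the same content as data.to_csv('/localpath/test.csv').
same_content = buffer.getvalue() == csv_text
```

If this is the right idea, the remaining piece is which HDFS client to open the target file with.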

Thanks!

haochi zhang