I have a string in CSV format: the first row holds the column names and the remaining rows hold the data. How can I use PySpark to load this string into a DataFrame?

str = '''
        sale_id, cust_name, amount
        111, abc, 10000
        222, bcd, 15000
      '''
Vaebhav
huyuxiang

1 Answer

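I found an answer: read the string into a pandas DataFrame through io.StringIO, then convert it to a Spark DataFrame.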

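import io

import pandas as pd

# `str` is the CSV string from the question (the name shadows the built-in str).
data = io.StringIO(str)
# skipinitialspace=True drops the spaces after each comma, so the column
# names come out as cust_name and amount rather than " cust_name", " amount".
pd_df = pd.read_csv(data, sep=",", skipinitialspace=True)
# `spark` is the active SparkSession (predefined in most notebooks).
df = spark.createDataFrame(pd_df)
display(df)  # notebook helper (e.g. Databricks); use df.show() elsewhere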
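
If pandas is not available, a rough alternative is to parse the string with Python's built-in csv module and pass the rows to spark.createDataFrame directly. This is only a sketch: the helper names parse_csv_string and csv_string_to_spark_df are made up for illustration, `spark` is assumed to be an existing SparkSession, and every column ends up as a string.

```python
import csv
import io


def parse_csv_string(text):
    """Split a CSV-formatted string into (column_names, rows)."""
    # Drop blank or whitespace-only lines, such as the ones at the edges
    # of a triple-quoted string.
    lines = [line for line in text.splitlines() if line.strip()]
    # skipinitialspace=True removes the spaces that follow each comma.
    reader = csv.reader(io.StringIO("\n".join(lines)), skipinitialspace=True)
    # Strip each cell to remove indentation and trailing spaces.
    records = [tuple(cell.strip() for cell in record) for record in reader]
    # The first record is the header; the rest are data rows.
    return list(records[0]), records[1:]


def csv_string_to_spark_df(spark, text):
    """Build a Spark DataFrame from a CSV string (hypothetical helper).

    `spark` is assumed to be an existing SparkSession.
    All values are kept as strings.
    """
    columns, rows = parse_csv_string(text)
    return spark.createDataFrame(rows, schema=columns)
```

With the string from the question, this would be called as df = csv_string_to_spark_df(spark, str). Numeric columns can then be cast afterwards, for example df.withColumn("amount", df["amount"].cast("int")).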
huyuxiang