
I have around 300 variables and I am trying to pass a custom schema via a CSV file. Below is the sample code I am using. However, after loading the schema from the CSV file, the output does not contain the column list:

Output : StructType(List(StructField(StructType([,StringType,true)))

Contents of `schema.csv`:

schema = StructType([ \
            StructField("COl1",StringType(),True), \
            StructField("COL2",DecimalType(20,10),True), \
            StructField("COL3",DecimalType(20,10),True)
        ])

# reading schema
sch_df = spark.read.option("header", "true").csv("schema.csv").schema
# Passing schema
df = spark.read.schema(sch_df).option("header", "true").csv("/sample.csv")

Can you please suggest the right way to load the schema from a CSV file?
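A minimal sketch of one alternative (not from the original post): instead of storing Python `StructType` source code in the CSV, store plain `name,type` rows and build a DDL schema string from them, which `spark.read.schema(...)` accepts directly. The helper name `schema_csv_to_ddl` and the two-column file layout are assumptions for illustration.

```python
import csv
import io

def schema_csv_to_ddl(text):
    """Turn 'name,type' CSV rows into a Spark DDL schema string."""
    rows = csv.reader(io.StringIO(text))
    return ", ".join(f"{name} {dtype.upper()}" for name, dtype in rows)

# Hypothetical schema.csv contents: one column name and Spark type per row.
sample = 'COL1,string\nCOL2,"decimal(20,10)"\nCOL3,"decimal(20,10)"'
ddl = schema_csv_to_ddl(sample)
print(ddl)  # COL1 STRING, COL2 DECIMAL(20,10), COL3 DECIMAL(20,10)

# In a Spark session, the DDL string can be passed straight to the reader:
# df = spark.read.schema(ddl).option("header", "true").csv("/sample.csv")
```

Reading the original `schema.csv` with `spark.read.csv(...)` and taking `.schema` only gives the schema of the file itself (a handful of string columns), which is why the printed `StructType` looks empty; the file's contents are never parsed as a schema.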

  • What does `schema.csv` look like? Could you post an example? – werner Apr 26 '21 at 16:21
  • Does this look similar to yours https://stackoverflow.com/questions/67196155/uploading-custom-schema-from-a-csv-file-using-pyspark/67579764#67579764? – pltc May 24 '21 at 03:42
  • The output doesn't contain what you want because you're using the wrong attribute. You want the method `printSchema()` – OneCricketeer Aug 20 '21 at 01:31
