
Read a CSV file from an S3 location using dynamic frame options

csv_dynamicframe = glueContext.create_dynamic_frame.from_options(
    "s3",
    connection_options={
        "paths": [root_path]
    },
    format="csv",
    format_options={
        "withHeader": True,
        # "quoteChar": -1,
        "separator": ",",
        "encoding": "utf-8"
    },
    transformation_ctx="csv_dynamicframe",
    schema=dynamic_frame_catalog.schema()
)

I have tried this but it is not working.

Sajjad Ali
adk

1 Answer


I ran into the same problem. I checked the "CSV configuration reference" section of the official AWS Glue documentation [1].

It seems you CANNOT set the encoding when reading a CSV through a dynamic frame.

However, there is an alternative: read the file with "spark.read" instead.
See [2] for reference.

Here is the sample code:

# `spark` is the job's SparkSession (in a Glue script, typically glueContext.spark_session)
dataFrame = spark.read\
    .format("csv")\
    .option("header", "true")\
    .option("encoding", "UTF-8")\
    .load("s3://YOUR_CSV_S3_URI")
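
If the rest of your job still expects a dynamic frame, you can wrap the resulting DataFrame afterwards. Below is a minimal sketch of that idea, not a definitive implementation. The helper names `read_csv_with_encoding` and `to_dynamic_frame` are made up for illustration, and I am assuming the job already has a SparkSession and a GlueContext. The `awsglue` import is done inside the function because that package is normally only available inside the Glue job environment.

```python
def read_csv_with_encoding(spark, path, encoding="UTF-8"):
    """Read a CSV through the Spark DataFrame reader with an explicit encoding.

    `spark` is assumed to be the job's SparkSession.
    """
    return (
        spark.read.format("csv")
        .option("header", "true")
        .option("encoding", encoding)
        .load(path)
    )


def to_dynamic_frame(data_frame, glue_context, name="csv_dynamicframe"):
    """Wrap a Spark DataFrame in a Glue DynamicFrame (hypothetical helper)."""
    # Imported lazily: awsglue is assumed to exist only inside a Glue job.
    from awsglue.dynamicframe import DynamicFrame

    return DynamicFrame.fromDF(data_frame, glue_context, name)
```

Inside a Glue script this would be used roughly as `to_dynamic_frame(read_csv_with_encoding(glueContext.spark_session, root_path), glueContext)`, where `root_path` is the S3 path from the question.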

Hope this helps!!

[1] https://docs.aws.amazon.com/glue/latest/dg/aws-glue-programming-etl-format-csv-home.html
[2] How to parse CSV file with UTF-8 encoding?