Is it possible to read a fixed-length (fixed-width) file in AWS Glue using DynamicFrameReader from_options, without using crawlers? I found the solution below using Spark, but is there a way to do this in Glue directly? See: pyspark parse fixed width text file

Aji C S

1 Answer

I found the solution in the AWS documentation: we can use format="grokLog" and describe each fixed-width column as a named capture group in the logFormat pattern.

For example, take a file with the following structure:

abcdef1234

ghijkl4567

where the three columns have lengths 3, 3, and 4. We can then use the following code:

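# glueContext is assumed to be an existing GlueContext instance
dyf = glueContext.create_dynamic_frame.from_options(connection_type="s3", connection_options={"paths": ["s3://mybucket/object_a"]}, format="grokLog", format_options={"logFormat": "(?<c1>.{3})(?<c2>.{3})(?<c3:int>.{4})"})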
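
To check what the pattern captures before running a Glue job, the same 3/3/4 split can be tried locally with plain Python. This is only a sketch, not Glue itself: Python's re module writes named groups as (?P<name>...) rather than the (?<name>...) form used above, and the :int cast is imitated by hand. The names FIXED_WIDTH and parse_line are made up for this illustration.

```python
import re

# Local stand-in for the logFormat pattern: three columns of width 3, 3 and 4.
# Python's re uses (?P<name>...) for named groups, and has no ":int" suffix.
FIXED_WIDTH = re.compile(r"(?P<c1>.{3})(?P<c2>.{3})(?P<c3>.{4})")


def parse_line(line):
    """Split one fixed-width record into its three columns."""
    match = FIXED_WIDTH.fullmatch(line)
    if match is None:
        raise ValueError(f"line does not match the 3/3/4 layout: {line!r}")
    row = match.groupdict()
    row["c3"] = int(row["c3"])  # imitate the :int cast by hand
    return row


# The two sample records from the answer.
rows = [parse_line(line) for line in ["abcdef1234", "ghijkl4567"]]
print(rows)
# [{'c1': 'abc', 'c2': 'def', 'c3': 1234}, {'c1': 'ghi', 'c2': 'jkl', 'c3': 4567}]
```

If the local split looks right, the column widths in the logFormat string should line up with the file in the same way.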

Aji C S