Is it possible to read a fixed-length file in AWS Glue using DynamicFrameReader from_options, without using Crawlers?
I found the solution below using Spark ("pyspark parse fixed width text file"), but is there a way to do this directly in Glue?
I believe you should be able to. Are you facing any issues? – Prabhakar Reddy Oct 01 '21 at 09:58
1 Answer
I found the solution in the AWS documentation: we can use format="grokLog".
For example, take a file with the structure below:
abcdef1234
ghijkl4567
where the three columns have lengths 3, 3 and 4. We can then read it with the code below.
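# glueContext is assumed to be an existing awsglue.context.GlueContext
dyf = glueContext.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://mybucket/object_a"]},
    format="grokLog",
    format_options={"logFormat": "(?<c1>.{3})(?<c2>.{3})(?<c3:int>.{4})"},
)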
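
To check the column widths before running a Glue job, the same pattern can be tried locally against the sample lines. This is only a rough sketch using Python's standard `re` module, not Glue's Grok engine: Python writes named groups as `(?P<name>...)`, the `:int` cast is done by hand here, and `parse_fixed_width` is a hypothetical helper name. It also uses `fullmatch`, so it rejects lines that are not exactly 10 characters, which may be stricter than what Grok does.

```python
import re

# Python's `re` spells named groups (?P<name>...), unlike the (?<name>...) form in the Grok pattern.
FIXED_WIDTH = re.compile(r"(?P<c1>.{3})(?P<c2>.{3})(?P<c3>.{4})")


def parse_fixed_width(line):
    """Split one 10-character record into c1 (3 chars), c2 (3 chars) and c3 (4 chars, as int)."""
    match = FIXED_WIDTH.fullmatch(line)
    if match is None:
        raise ValueError(f"line does not match the 3/3/4 layout: {line!r}")
    row = match.groupdict()
    row["c3"] = int(row["c3"])  # manual stand-in for the :int cast in the Grok pattern
    return row


for line in ["abcdef1234", "ghijkl4567"]:
    print(parse_fixed_width(line))
```

If the printed columns look right for the sample lines, the widths in the `logFormat` string should line up the same way.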

Aji C S