I'm starting to tinker with gawk's dynamic extensions, and I want to implement a preprocessor for the files that awk will operate on. Specifically, I want to decompress gzipped files when they are passed as arguments to gawk. For example, the invocation would look like:

awk -f myscript.awk file1.gz file2.gz file3.gz

And myscript.awk would call a dynamic extension somehow to preprocess each input file and decompress it before feeding the contents into awk's pattern-action statements. Are dynamic extensions capable of such preprocessing? What would it look like?

Later on, I would like to create similar extensions that decode files, decrypt files, etc. before passing them to awk. For these tasks I would usually preprocess the files and then pipe the output into awk or similar, but that approach always has disadvantages. If dynamic extensions can preprocess files, I should be able to avoid them.

Rusty Lemur
  • Does this help you? [How to use awk for a compressed file](https://stackoverflow.com/questions/13137501/how-to-use-awk-for-a-compressed-file) – Corentin Limier Jan 02 '20 at 17:25
  • I'm not sure that you can do it inside a `.awk` file (read by awk with `-f` option). It might be easier to create a simple `.sh` bash script that would preprocess the files and call awk inside that script. – Corentin Limier Jan 02 '20 at 17:27
  • Thanks Corentin. I've used many similar tactics in projects before, but they have disadvantages that I'd like to overcome (e.g. loss of FILENAME usefulness, loss of FNR/NR usefulness, requiring multiple invocations of awk, etc). However, as I'm reading over the extension API documentation, I'm getting less optimistic that it will allow preprocessing a file in the way I want. – Rusty Lemur Jan 02 '20 at 17:38
  • Maybe try to switch to Python which has a lot of libraries to deal with those issues :) – Corentin Limier Jan 02 '20 at 18:00
  • I agree Python has the capability of dealing with these issues quite nicely, but I'm interested in awk solutions for various reasons. – Rusty Lemur Jan 02 '20 at 18:19
  • Also, I disagree with the decision to close this post. The question is unique, as I am looking for a way to preprocess files with awk's dynamic extensions, not merely dealing with compressed files or preprocessing outside of awk. It might not be possible for such preprocessing, but I would like further visibility and input before closing it. – Rusty Lemur Jan 02 '20 at 18:22
  • I voted to reopen it – Corentin Limier Jan 02 '20 at 19:07
  • Finally, one of the first posts about the new features of GNU awk! – kvantour Jan 03 '20 at 14:41

1 Answer

A question on the gawk extlib mailing list got a response pointing to this:

https://www.gnu.org/software/gawk/manual/html_node/Input-Parsers.html

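That page describes input parsers, a part of the extension API that lets a dynamic extension take over reading records from an input file. It looks like this should be able to do it!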
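As described in that manual section, an input parser supplies a `can_take_file` hook that decides whether to claim a file, and a `take_control_of` hook that then installs its own record reader. The part specific to this question is the decision step. Below is a minimal standalone sketch of it, assuming the check is done on the gzip magic bytes (`0x1f 0x8b`) rather than on the `.gz` suffix. The helper name `looks_like_gzip` is hypothetical and not part of the gawk API, and the registration and decompression code is deliberately left out.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the gawk API): returns 1 if the file
 * behind fd begins with the gzip magic bytes 0x1f 0x8b, otherwise 0.
 * Assumes fd is positioned at the start of a seekable file.
 */
static int looks_like_gzip(int fd)
{
    unsigned char magic[2];
    ssize_t n = read(fd, magic, sizeof magic);

    /* Rewind so whoever reads next still sees the file from byte 0. */
    if (lseek(fd, 0, SEEK_SET) == (off_t) -1)
        return 0;   /* not seekable (e.g. a pipe): decline the file */

    return n == 2 && magic[0] == 0x1f && magic[1] == 0x8b;
}
```

In a real extension, a check like this would presumably be called from `can_take_file` with the descriptor gawk provides for the input file. Files that fail the check would then be left to gawk's normal reader, so `FILENAME`, `FNR`, and `NR` keep working for mixed compressed and uncompressed arguments.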

Rusty Lemur