
Given the following commands, I would expect the first to split the string (which is itself a regex) into two array elements, and the second (match) to print [[:blank:]].

echo "new[[:blank:]]+File\(" | awk '{ split($0, a, "[[:blank:]]"); print a[1]}'

This prints the whole string, because it has not been split.

echo "new[[:blank:]]+File\(" | awk '{ match($0, /[[:blank:]]/, m)}END{print m[0]}'

This prints nothing.

What am I missing here?

UPDATE

I'm calling an awk script with the following command:

awk -v regex1=new[[:blank:]]+File\( -f parameterisedRegexAwkScript.awk "$file" >> "output.txt"

Then, in my script, I attempt to split on the string literal with the following command:

len = split(regex1, regex, /[[:blank:]]/, seps)

But when I print len, its value is 1, when I would have expected it to be 2.

Michael Heneghan
  • Thanks for showing your efforts. Your string `echo "new[[:blank:]]+File\("` is not being taken as a regexp; it's taken as a literal string in echo. Could you please show clearer samples of input and expected output so that we can better understand your question? Thank you (not my downvote, btw). – RavinderSingh13 Apr 15 '21 at 14:09
  • Hi @RavinderSingh13, thanks for your response. I've updated the question with a more detailed input and output. Hope this helps, thanks – Michael Heneghan Apr 15 '21 at 14:20

1 Answer
echo "new[[:blank:]]+File\(" | awk '{ split($0, a, "[[:blank:]]"); print a[1]}'

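The 3rd argument of split is treated as a regular expression, just like FS set in BEGIN, so here "[[:blank:]]" tells awk to split at any single space or tab, and the string contains neither. To split on the literal text [[:blank:]], you need to escape the brackets. Let the content of file.txt be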

new[[:blank:]]+File\(

then

awk '{split($0, a, "\\[\\[:blank:\\]\\]"); print a[1]}' file.txt

output

new

(tested in gawk 4.2.1)
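The same idea also covers the match call from the question. Here is a minimal sketch, assuming the input is the exact string from the question and using the regex-literal form with only the opening brackets escaped. The shell variable names (input, count, first, matched) are just for illustration. It uses RSTART and RLENGTH instead of the three-argument match, since the latter is a gawk extension.

```shell
# The same string the question pipes into awk.
input='new[[:blank:]]+File\('

# Split on the literal text "[[:blank:]]": only the opening brackets need escaping.
count=$(printf '%s\n' "$input" | awk '{ print split($0, a, /\[\[:blank:]]/) }')
first=$(printf '%s\n' "$input" | awk '{ split($0, a, /\[\[:blank:]]/); print a[1] }')

# match() sets RSTART and RLENGTH in every POSIX awk, so no third argument is needed.
matched=$(printf '%s\n' "$input" | awk 'match($0, /\[\[:blank:]]/) { print substr($0, RSTART, RLENGTH) }')

printf '%s\n' "$count" "$first" "$matched"
```

This should print 2, new and [[:blank:]] on separate lines. The unescaped /[[:blank:]]/ from the question is a bracket expression that looks for a space or tab, which is why split returned a single element and match found nothing.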

Daweo
  • The 3rd arg to split() is a regexp, not a string, so use regexp `/` rather than string `"` delimiters. You also don't need to escape a `]` as it's only a regexp metachar if preceded by an opening `[`. Your `split()` call should then be `split($0, a, /\[\[:blank:]]/)` instead of `split($0, a, "\\[\\[:blank:\\]\\]")`. – Ed Morton Apr 15 '21 at 14:36
  • @EdMorton this worked thanks. I do find that the value of a[2] is 'File(' instead of 'File\\('. Even when the command is updated to pass regex1='new[[:blank:]]+File\\\(' I get awk: warning: escape sequence `\(' treated as plain `('. But when I try regex1="new[[:blank:]]+File\\\\\\\(" or regex1='new[[:blank:]]+File\\\\(' it sets a[2] to 'File\\\(' – Michael Heneghan Apr 15 '21 at 15:06
  • You're welcome. Of course but that's a different question so please ask a new question about that if you'd like help with it. – Ed Morton Apr 15 '21 at 15:08
  • @EdMorton - https://stackoverflow.com/questions/67111174/attempting-to-pass-an-escape-char-to-awk-as-a-variable I hope it makes more sense now – Michael Heneghan Apr 15 '21 at 15:16
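On the escape-sequence warning mentioned in the comments: awk processes backslash escapes in -v assignments, so a literal backslash has to be doubled there, whereas values read through ENVIRON arrive unmodified. A rough sketch of both options, assuming the goal is for the second piece to keep its backslash (the shell variable names via_v and via_env are made up for this example):

```shell
# Option 1: -v processes escape sequences, so write each literal backslash twice.
via_v=$(awk -v regex1='new[[:blank:]]+File\\(' \
  'BEGIN { n = split(regex1, parts, /\[\[:blank:]]/); print n ":" parts[2] }')

# Option 2: environment variables reach awk verbatim through ENVIRON.
via_env=$(regex1='new[[:blank:]]+File\(' awk \
  'BEGIN { n = split(ENVIRON["regex1"], parts, /\[\[:blank:]]/); print n ":" parts[2] }')

printf '%s\n' "$via_v" "$via_env"
```

Both lines should read 2:+File\( with the backslash intact. Single-quoting the value in the shell matters as well, since an unquoted \( is rewritten by the shell before awk ever sees it.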