6

Have seen many posts asking similar question. Can't get it working.

Input looks like:

<field one with spaces>|<field two with spaces>

Trying to parse with awk.

Have tried many variants from excellent posts:

FS = "^[\x00- ]*|[\x00- ]*[|][\x00- ]*|[\x00- ]*$";
FS = "^[\x00- ]*|[\x00- ]*\|[\x00- ]*|[\x00- ]*$";
FS = "^[\x00- ]*|[\x00- ]*\\|[\x00- ]*|[\x00- ]*$";

Still can't get the pipe delimiter to work.

Using CentOS.

Any help?

codeforester
  • 39,467
  • 16
  • 112
  • 140
scorpdaddy
  • 71
  • 1
  • 1
  • 3

1 Answers1

14
 echo "field one has spaces | field two has spaces" \
 | awk '
   BEGIN {
      FS="|" 
 }
 {
   print $2
   print $1
   # or what ever you want
 }'

 #output

  field two has spaces
  field one has spaces

You can also reduce this to

awk -F'|' {
    print $2
    print $1
}'

Edit Also, not all awks can take a multi-character regex for the FS value.

Edit2 Somehow I missed this originally, but I see you are trying to include \x00 in the char classes pre and post of the | char. I assume you mean for \x00 == null char? I don't think you're going to be able to have awk parse a file with null chars embedded. You could prep-rocess your input like

 tr '\x00'   ' ' < file.txt > spacesForNulls.txt 

OR delete them altogether with

tr -d '\x00' < file.txt > deletedNulls.txt

and eliminate that part of your regex. But as above, some awk don't support regex for the FS value. And, I don't use the tr trick very much, you may find that it requires a slightly different notation for the null char, depending on your version of tr.

I hope this helps.

shellter
  • 36,525
  • 7
  • 83
  • 90
  • Great point with `\x00`. Or the op should use a more specialized tool like `perl` or `ruby`. ++ – sjsam Jan 31 '18 at 14:46
  • `I don't think you're going to be able to have awk parse a file with null chars embedded` Or a second thought? `awk '{gsub("\x00","")}1` is possible. – sjsam Jan 31 '18 at 15:14