I am trying to write a ksh script that processes a file of name-value pairs, with several pairs on each line.

The format is:

NAME1 VALUE1,NAME2 VALUE2,NAME3 VALUE3, etc

Suppose I write:

read l
IFS=","
set -A nvls $l
echo "${nvls[1]}"   # ksh arrays are zero-based, so index 1 is the second pair

This gives me the second name-value pair, nice and easy. Now suppose the task is extended so that values can include commas, which must be escaped like this:

NAME1 VALUE1,NAME2 VALUE2_1\,VALUE2_2,NAME3 VALUE3, etc

Obviously, my code no longer works: "read" strips the backslashes, so the line is split at every comma and the second element of the array is just "NAME2 VALUE2_1".
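
The effect described above can be seen in a small sketch. The sample line and the variable names `plain` and `raw` are made up for illustration; `printf` is used instead of `echo` so that the backslash is printed unchanged in every shell.

```shell
# Hypothetical sample: one pair whose value contains an escaped comma.
line='NAME2 VALUE2_1\,VALUE2_2'

# Plain "read" treats the backslash as an escape character and drops it.
plain=$(printf '%s\n' "$line" | { read l; printf '%s\n' "$l"; })

# "read -r" keeps the backslash, so the escaped comma is still recognisable.
raw=$(printf '%s\n' "$line" | { read -r l; printf '%s\n' "$l"; })

printf 'plain: %s\n' "$plain"
printf 'raw:   %s\n' "$raw"
```

Once the backslash is gone, splitting on "," can no longer tell an escaped comma from a separator.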

I'm stuck with an older ksh that does not have "read -A array". I tried various tricks with "read -r" and "eval set -A ....", to no avail. I can't use "read nvl1 nvl2 nvl3" to do the unescaping and splitting inside read, since I don't know beforehand how many name-value pairs are on each line.

Does anyone have a useful trick up their sleeve for me?

PS I know that I could do this in no time in Perl, Python, or even awk. However, I have to do it in ksh (... or die trying ;)

ADEpt

2 Answers

As often happens, I devised an answer minutes after asking the question in a public forum :(

I worked around the quoting/unquoting issue by piping the input file through the following sed script:

sed -e 's/\([^\]\),/\1\
/g;s/$/\
/'

It converted the input into:

NAME1.1 VALUE1.1
NAME1.2 VALUE1.2_1\,VALUE1.2_2
NAME1.3 VALUE1.3
<empty line>
NAME2.1 VALUE2.1
<second record continues>

Now, I can parse this input like this:

while read name value ; do
  echo "$name => $value"
done

The value has its commas unescaped by "read", and I can store "name" and "value" in an associative array if I like.
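
Putting the two steps together, a minimal end-to-end sketch might look like this. The sample record and the variable names `input` and `result` are assumptions for illustration, and the second sed expression (the blank-line record separator) is left out because only one record is processed.

```shell
# Hypothetical sample record; commas inside values are escaped as "\,".
input='NAME1 VALUE1,NAME2 VALUE2_1\,VALUE2_2,NAME3 VALUE3'

# Step 1: sed turns every comma not preceded by a backslash into a newline.
# Step 2: "read" without -r removes the backslash from each escaped comma.
result=$(
  printf '%s\n' "$input" |
  sed -e 's/\([^\]\),/\1\
/g' |
  while read name value; do
    echo "$name => $value"
  done
)

echo "$result"
```

The loop sees one pair per line, so the number of pairs per record does not need to be known in advance.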

PS Since I can't accept my own answer, should I delete the question, or ...?

ADEpt
  • Does using sed count? You could also use awk or perl or ... to do the munging. The sed regex surprises me slightly; I would have used two backslashes inside the square brackets, but I guess that is not actually necessary. – Jonathan Leffler Oct 11 '08 at 04:22
  • As to deleting the question - I don't know what the recommended procedure is, but I doubt that destroying your words of wisdom is really what they want. If the worst comes to the worst, I could copy your answer for you and let you select that - but it is a complete cheat. – Jonathan Leffler Oct 11 '08 at 04:24
  • Oh. I just stumbled upon http://stackoverflow.com/questions/209329/stackoverflow-should-i-answer-my-own-question-or-not. Seems like it's better to leave it as it is. Maybe someone will find this useful and upvote it :) – ADEpt Oct 16 '08 at 22:14

You can also change the \, pattern to something else that is known not to appear in any of your strings, and then change it back after you've split the input into an array. The ksh built-in pattern-substitution syntax can do this, so you don't need sed, awk, or anything else.

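read -r l                  # -r keeps the backslashes so \, can still be matched
l=${l//\\,/!!}             # hide every escaped comma behind a placeholder
IFS=","
set -A nvls $l
unset IFS
echo "${nvls[1]//!!/,}"    # second pair (zero-based), all placeholders restored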
Bill Karwin
  • The only caveat here is that older KSH (as still found on SunOS, for example) does not have that nifty substitution function. – ADEpt Jan 04 '09 at 09:18
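
For a shell without `${var//pattern/replacement}`, the same placeholder idea can be sketched with sed doing the substitutions and the positional parameters standing in for the array. This is only an illustration under stated assumptions: the sample line and the names `line`, `masked`, `oldIFS` and `second` are made up, and "!!" is assumed never to occur in the data.

```shell
# Hypothetical sample line; "!!" is assumed not to appear in any value.
line='NAME1 VALUE1,NAME2 VALUE2_1\,VALUE2_2,NAME3 VALUE3'

# Replace each escaped comma with the placeholder (sed instead of ${l//\\,/!!}).
masked=$(printf '%s\n' "$line" | sed 's/\\,/!!/g')

# Split on the remaining commas into $1, $2, ... (instead of set -A).
oldIFS=$IFS
IFS=','
set -f              # disable filename expansion for the unquoted expansion
set -- $masked
set +f
IFS=$oldIFS

# Put the commas back in the second pair.
second=$(printf '%s\n' "$2" | sed 's/!!/,/g')
echo "$second"
```

After the split, `$#` holds the number of pairs, so a loop over `"$@"` handles lines of any length.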