Parsing a comma-separated "key: value" string

Question

I have a text file containing several multi-line blocks that may look like this:

{ key1: value, key2: value,
  key3: value,
  key4: value, key5: value }

{ key1: value, key2: value, key3: value,
  key4: value, key5: value }

Given a key, how can I get all the corresponding values? Note that neither the key names nor the values have a fixed length, blocks start and finish with braces and pairs are separated by commas.

My first try was with grep and cut, but I couldn't get all the keys. I guess that this should be easy with sed or awk, but their syntax confuses me a lot.

Are you attempting to parse [JSON](http://en.wikipedia.org/wiki/JSON)? — devnull, Aug 28 '13 at 09:35
looks like `json`. ruby/python/perl/... has packages for parsing it. http://stackoverflow.com/questions/1955505/parsing-json-with-sed-and-awk this might be what you're looking for, though I wouldn't use bash (or extra utils) for this... — Karoly Horvath, Aug 28 '13 at 09:36
No, it is a yaml file. I didn't find a handy tool to parse a large yaml file (actually I got two of them, but they could not read the entire file). Since I only need those fields, I thought bash was a good choice. — ChronoTrigger, Aug 28 '13 at 09:41
What characters compose the keys and values? Are they placed around doublequotes? Is it possible for keys or values to have commas or colons? — konsolebox, Aug 28 '13 at 09:46
@konsolebox, keys are not around double quotes and values are just numbers in scientific notation (e.g. 5.22e+1). — ChronoTrigger, Aug 28 '13 at 09:47

konsolebox · Accepted Answer · 2013-08-28T10:31:02.997

4

First solution with grep:

grep -o 'key5: [^, }]*' file

Shows something like:

key5: value
key5: value

To remove the keys:

grep -o 'key5: [^, }]*' file | sed 's/^.*: //'

value
value
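
A minimal way to try the pipeline above, assuming the values are numbers in scientific notation as described in the comments. The sample data, the temporary file, and the `get_values` function name are illustrative and not part of the original answer.

```shell
#!/bin/sh
# Hypothetical sample data: two blocks in the format from the question.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
{ key1: 1.0e+0, key2: 2.5e-3,
  key3: 3.1e+2,
  key4: 4.0e+1, key5: 5.22e+1 }

{ key1: 1.5e+0, key2: 2.0e-3, key3: 3.3e+2,
  key4: 4.4e+1, key5: 6.01e+1 }
EOF

# get_values KEY FILE: print every value stored under KEY, one per line.
# The bracket expression stops the match at a comma, space, or closing brace;
# sed then strips everything up to and including ": ".
get_values() {
    grep -o "$1: [^, }]*" "$2" | sed 's/^.*: //'
}

get_values key5 "$sample"
# 5.22e+1
# 6.01e+1
```

Because the pattern is not anchored, a key such as `key5` would also match inside a longer name like `subkey5`; that is harmless for the sample above but worth checking against real data.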

edited Aug 28 '13 at 10:31

answered Aug 28 '13 at 09:58


I realized that when entering `key5`, I get `value }`. How to avoid that brace? – ChronoTrigger Aug 28 '13 at 10:22
@ChronoTrigger Your values could only contain numbers right? I updated it as well to exclude bracket `}` and space. – konsolebox Aug 28 '13 at 10:31

score 1 · Answer 2 · answered Aug 28 '13 at 09:53

This only works if each key and its value are on the same line, the key does not appear as a value, and neither keys nor values contain spaces, commas, or colons:

awk -F'[, :]+' '{for (i=1;i<NF;i++) if ($i=="key3") print $(i+1)}' file

or, if you want to take the key from a variable:

awk -F'[, :]+' -v key="$key" '{for (i=1;i<NF;i++) if ($i==key) print $(i+1)}' file
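
As a rough sketch of how the variable form might be wrapped for reuse, assuming the same restrictions listed above. The `lookup` function name, the temporary file, and the sample values are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample data in the format from the question.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
{ key1: 1.0e+0, key2: 2.5e-3,
  key3: 3.1e+2,
  key4: 4.0e+1, key5: 5.22e+1 }

{ key1: 1.5e+0, key2: 2.0e-3, key3: 3.3e+2,
  key4: 4.4e+1, key5: 6.01e+1 }
EOF

# lookup KEY FILE: split each line on runs of commas, spaces, and colons,
# then print the field that immediately follows any field equal to KEY.
lookup() {
    awk -F'[, :]+' -v key="$1" \
        '{ for (i = 1; i < NF; i++) if ($i == key) print $(i + 1) }' "$2"
}

lookup key3 "$sample"
# 3.1e+2
# 3.3e+2
```

Since awk compares whole fields here, a lookup for `key3` does not pick up a longer name that merely contains it.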

score 1 · Answer 3 · answered Aug 28 '13 at 09:58

Using sed and grep:

sed 's|[{},]|\n|g' your-file.txt | grep -Po '(?<=key1:).*$'

sed reformats the file so that each line holds a single key-value pair; grep with a lookbehind then extracts only the values corresponding to the specified key.
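
The `\n` in the sed replacement and grep's `-P` option are not available in every sed and grep. The sketch below applies the same split-then-filter idea but swaps in `tr` and `sed -n`, and also trims the spaces around the value. It is an alternative under those assumptions, not the answer's own command; the `values_for` name, the temporary file, and the sample values are hypothetical.

```shell
#!/bin/sh
# Hypothetical sample data in the format from the question.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
cat > "$sample" <<'EOF'
{ key1: 1.0e+0, key2: 2.5e-3,
  key3: 3.1e+2,
  key4: 4.0e+1, key5: 5.22e+1 }

{ key1: 1.5e+0, key2: 2.0e-3, key3: 3.3e+2,
  key4: 4.4e+1, key5: 6.01e+1 }
EOF

# values_for KEY FILE:
#   1. tr turns every brace and comma into a newline, one pair per line.
#   2. sed prints only lines of the form "KEY: value", keeping the value
#      and dropping the spaces around it.
values_for() {
    tr '{},' '\n\n\n' < "$2" |
        sed -n "s/^ *$1: *\([^ ]*\) *\$/\1/p"
}

values_for key1 "$sample"
# 1.0e+0
# 1.5e+0
```

Anchoring the pattern at the start of the line means the key must match in full, so `key1` does not match a line that starts with a longer name.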

answered Aug 28 '13 at 09:58

Bentoy13

