I have a text file made up of several blocks, each spanning several lines, that may look like this:

{ key1: value, key2: value,
  key3: value,
  key4: value, key5: value }

{ key1: value, key2: value, key3: value,
  key4: value, key5: value }

Given a key, how can I get all the corresponding values? Note that neither the key names nor the values have a fixed length. Blocks start and end with braces, and pairs are separated by commas.

My first try was with grep and cut, but I couldn't get all the keys. I guess that this should be easy with sed or awk, but their syntax confuses me a lot.

ChronoTrigger
  • Are you attempting to parse [JSON](http://en.wikipedia.org/wiki/JSON)? – devnull Aug 28 '13 at 09:35
  • Looks like `json`. Ruby/Python/Perl/... have packages for parsing it. http://stackoverflow.com/questions/1955505/parsing-json-with-sed-and-awk might be what you're looking for, though I wouldn't use bash (or extra utils) for this... – Karoly Horvath Aug 28 '13 at 09:36
  • No, it is a yaml file. I didn't find a handy tool to parse a large yaml file (actually I got two of them, but they could not read the entire file). Since I only need those fields, I thought bash was a good choice. – ChronoTrigger Aug 28 '13 at 09:41
  • What characters compose the keys and values? Are they enclosed in double quotes? Can keys or values contain commas or colons? – konsolebox Aug 28 '13 at 09:46
  • @konsolebox, keys are not enclosed in double quotes, and values are just numbers in scientific notation (e.g. 5.22e+1). – ChronoTrigger Aug 28 '13 at 09:47
  • @ChronoTrigger Can they contain spaces? – konsolebox Aug 28 '13 at 09:52

3 Answers

First solution with grep:

grep -o 'key5: [^, }]*' file

This prints something like:

key5: value
key5: value

To remove the keys:

grep -o 'key5: [^, }]*' file | sed 's/^.*: //'

value
value
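
As a rough sketch, the two steps above could be wrapped in a small shell function so the key becomes a parameter. The name `get_values` and the sample numbers are made up for illustration. The sketch assumes the key contains no regex metacharacters and that values contain no spaces, commas, or braces (for example, numbers such as 5.22e+1).

```shell
# get_values is a hypothetical helper name, not part of the answer above.
# Usage: get_values KEY FILE
# Assumes KEY has no regex metacharacters and values have no spaces,
# commas, or braces.
get_values() {
    # grep -o prints only the matching "KEY: value" part of each line;
    # sed then strips everything up to and including ": ".
    grep -o "$1: [^, }]*" "$2" | sed 's/^.*: //'
}

# Sample input in the question's layout (values invented for the demo).
sample=$(mktemp)
cat > "$sample" <<'EOF'
{ key1: 1.0e+0, key2: 2.0e+0,
  key3: 3.0e+0,
  key4: 4.0e+0, key5: 5.22e+1 }

{ key1: 1.1e+0, key2: 2.1e+0, key3: 3.1e+0,
  key4: 4.1e+0, key5: 6.5e-2 }
EOF

get_values key5 "$sample"   # prints 5.22e+1 and 6.5e-2, one per line
rm -f "$sample"
```

Because the pattern is not anchored, a key such as `key1` would also match inside a longer name like `subkey1`, so this is only safe when no key is a suffix of another.
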
konsolebox

This only works if each key and its value are on the same line, if the key does not also appear as a value, and if keys and values contain no spaces, commas, or colons:

awk -F'[, :]+' '{for (i=1;i<NF;i++) if ($i=="key3") print $(i+1)}' file

Or, if you want to take the key from a shell variable:

awk -F'[, :]+' -v key="$key" '{for (i=1;i<NF;i++) if ($i==key) print $(i+1)}' file
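
If a key and its value might end up on different lines, one possible variation is to make awk read a whole block as a single record by using `}` as the record separator and adding the newline and `{` to the field separators. This is only a sketch: `values_for_key` is a hypothetical name, and it assumes every block ends with `}` and that keys and values contain no spaces, commas, colons, or braces.

```shell
# values_for_key is a hypothetical helper name.
# Usage: values_for_key KEY FILE
# RS='}' makes each block one record, so line breaks inside a block
# no longer matter; the field separator also swallows '{' and newlines.
values_for_key() {
    awk -F'[{, :\n]+' -v RS='}' -v key="$1" '
        {
            # Scan every field; the field after a matching key is its value.
            for (i = 1; i < NF; i++)
                if ($i == key) print $(i + 1)
        }
    ' "$2"
}
```

It would be called as `values_for_key key3 file`. As with the one-liners above, a value that happens to equal the key name would still produce a false match.
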
user000001

Using sed and grep:

sed 's|[{},]|\n|g' your-file.txt | grep -Po '(?<=key1: )\S+'

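sed reformats the file so that each line holds a single key-value pair; grep then uses a lookbehind to extract only the values corresponding to the specified key. Note that `\n` in the sed replacement and the `-P` option of grep both rely on the GNU versions of these tools.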
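
Where GNU sed and GNU grep are not available, the same idea can be sketched with `tr` doing the splitting and a plain `sed` substitution doing the extraction. The function name `values_portable` is made up for illustration, and the sketch assumes each key sits on the same line as its value and that the key contains no regex metacharacters.

```shell
# values_portable is a hypothetical helper name.
# Usage: values_portable KEY FILE
values_portable() {
    # tr turns every '{', '}' and ',' into a newline: one pair per line.
    # sed -n ... p prints only lines that start with "KEY:", keeping just
    # the first whitespace-free token after the colon.
    tr '{},' '\n\n\n' < "$2" |
        sed -n "s/^[[:space:]]*$1:[[:space:]]*\([^[:space:]]*\).*/\1/p"
}
```

Anchoring the pattern at the start of the line means that `key1` does not match a longer key such as `subkey1`.
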

Bentoy13