6

I have the following records:

31 Stockholm
42 Talin
34 Helsinki
24 Moscow
15 Tokyo

I want to convert them to JSON with AWK. This code:

#!/usr/bin/awk
BEGIN {
    print "{";
    FS=" ";
    ORS=",\n";
    OFS=":";
};

{    
    if ( !a[city]++ && NR > 1 ) {
        key = $2;
        value = $1;
        print "\"" key "\"", value;
    }
};

END {
    ORS="\n";
    OFS=" ";
    print "\b\b}";
};

Gives me this:

{
"Stockholm":31,
"Talin":42,
"Helsinki":34,
"Moscow":24,
"Tokyo":15, <--- I don't want this comma
}

The problem is the trailing comma on the last data line, which makes the JSON output invalid. How can I get this output instead?

{
"Stockholm":31,
"Talin":42,
"Helsinki":34,
"Moscow":24,
"Tokyo":15
}
Chris Seymour
AlexStack
  • I'm glad you asked the question while learning to use awk, since @ed-morton and his brilliant hack of the record separator handling made the lightbulb go off for my somewhat related problem. – Rob Fagen Aug 09 '17 at 17:26

4 Answers

10

Mind some feedback on your posted script?

#!/usr/bin/awk        # Just be aware that on Solaris this will be old, broken awk which you must never use
BEGIN {
    print "{";        # On this and every other line, the trailing semi-colon is a pointless null-statement, remove all of these.
    FS=" ";           # This is setting FS to the value it already has so remove it.
    ORS=",\n";
    OFS=":";
};

{
    if ( !a[city]++ && NR > 1 ) {      # awk consists of <condition>{<action>} segments so move this condition out to the condition part
                                       # also, you never populate a variable named "city" so `!a[city]++` won't behave sensibly.
        key = $2;
        value = $1;
        print "\"" key "\"", value;
    }
};

END {
    ORS="\n";                          # no need to set ORS and OFS when the script will no longer use them.
    OFS=" ";
    print "\b\b}";                     # why would you want to print a backspace???
};

So, leaving its logic unchanged, your original script should have been written as:

#!/usr/bin/awk -f
BEGIN {
    print "{"
    ORS=",\n"
    OFS=":"
}

!a[city]++ && (NR > 1) {    
    key = $2
    value = $1
    print "\"" key "\"", value
}

END {
    print "}"
}

Here's how I'd really write a script to convert your posted input to your posted output though:

$ cat file
31 Stockholm
42 Talin
34 Helsinki
24 Moscow
15 Tokyo
$
$ awk 'BEGIN{print "{"} {printf "%s\"%s\":%s",sep,$2,$1; sep=",\n"} END{print "\n}"}' file
{
"Stockholm":31,
"Talin":42,
"Helsinki":34,
"Moscow":24,
"Tokyo":15
}
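If the `!a[city]++` test in the question was meant to skip repeated cities (an assumption, since the question never says so), the same separator trick can be combined with an array keyed on the city field `$2`. This is only a minimal sketch: the function name `to_json_unique` and the array name `seen` are made up for illustration.

```shell
# Hypothetical helper: reads "<value> <key>" records on stdin and keeps
# only the first record seen for each key.
to_json_unique() {
    awk '
        BEGIN { print "{" }
        # seen[$2]++ is 0 (false) the first time a key appears, so the
        # negated test is true only once per key.
        !seen[$2]++ {
            # sep is empty for the first printed record, ",\n" afterwards
            printf "%s\"%s\":%s", sep, $2, $1
            sep = ",\n"
        }
        END { print "\n}" }
    '
}

# The repeated Stockholm record is skipped.
printf '31 Stockholm\n42 Talin\n99 Stockholm\n' | to_json_unique
```

Because the separator is printed before each record rather than after it, no special handling of the last record is needed.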
Ed Morton
2

You have a couple of choices. An easy one would be to add the comma of the previous line as you are about to write out a new line:

  • Set a variable first = 1 in your BEGIN block.

  • When about to print a record, check the variable first. If it is 1, just set it to 0; otherwise print a comma and a newline:

    if (first) { first = 0; } else { print ","; }
    

    The point of this is to avoid putting an extra comma at the start of the list.

  • Use printf("%s", ...) instead of print ... so that you can avoid the newline when printing a record.

  • Add an extra newline before the close brace, as in: print "\n}";
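Put together, the steps above might look roughly like the following. It is a sketch rather than a definitive version: the function name `to_json_first_flag` is hypothetical, and the input is assumed to be "<value> <key>" records as in the question.

```shell
# Hypothetical helper: reads "<value> <key>" records on stdin.
to_json_first_flag() {
    awk '
        BEGIN { first = 1; print "{" }
        {
            # Finish the previous record with a comma and newline,
            # except before the very first record.
            if (first) { first = 0 } else { print "," }
            # printf emits no newline, so the next record (or END)
            # decides how this line is terminated.
            printf("\"%s\":%s", $2, $1)
        }
        END { print "\n}" }
    '
}

printf '31 Stockholm\n42 Talin\n15 Tokyo\n' | to_json_first_flag
```

The comma is written only when another record actually follows, so the last data line ends without one.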

Also, note that if you don't care about the aesthetics, JSON doesn't really require newlines between items, etc. You could just output one big line for the whole enchilada.
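As a rough illustration of the single-line variant, the separator can shrink to a bare comma. The function name `to_json_oneline` is made up for this sketch, and it uses a separator variable instead of the first flag purely for brevity.

```shell
# Hypothetical helper: emits the whole object on one line.
to_json_oneline() {
    awk '
        BEGIN { printf "{" }
        # sep is empty before the first record and "," afterwards
        { printf "%s\"%s\":%s", sep, $2, $1; sep = "," }
        END { print "}" }
    '
}

printf '31 Stockholm\n42 Talin\n' | to_json_oneline
```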

danfuzz
1

You should really use a JSON parser, but here is how to do it with awk:

BEGIN {
    print "{"    
}
NR==1{
    s= "\""$2"\":"$1
    next
}
{
    s=s",\n\""$2"\":"$1
}
END {
    printf "%s\n}\n", s    # end the output with a newline after the closing brace
}

Outputs:

{
"Stockholm":31,
"Talin":42,
"Helsinki":34,
"Moscow":24,
"Tokyo":15
}
Community
Chris Seymour
0

Why not use a JSON parser? Don't force awk to do something it wasn't designed to do. Here is a solution using Python:

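import json

d = {}
with open("file") as f:
    for line in f:
        val, key = line.split()  # each record is "<value> <key>"
        d[key] = int(val)

# Python 3 syntax; sort_keys=True makes the key order deterministic
print(json.dumps(d, indent=0, sort_keys=True))

This outputs:

{
"Helsinki": 34,
"Moscow": 24,
"Stockholm": 31,
"Talin": 42,
"Tokyo": 15
}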
Chris Seymour
  • I appreciate the help, but I wanted to solve this issue with AWK in order to learn the tool. I have a Nodejs script that does the job properly. – AlexStack Mar 26 '13 at 05:09