I have data in a text file named data.txt, like

1. John (1994)  92      
2. Granny (1972)    82  

I want to convert this data to JSON format using Awk. Expected result:

[{
  "ID" : "1",
  "Name" : "John",
  "Birth" : "1994",
  "Marks" : "92"
}]

I tried it using jq

jq -R '[ split("\n")[] | select(length > 0) | split(" ") | {ID: .[0], Name: .[1], Birth: .[2], Marks: .[3]}]' data.txt
Micha Wiedenmann
Henry Oberoy

3 Answers

awk ' BEGIN { print "[" ; }  { print " {\n" "   \"ID\" : \""   $1  "\",\n"  "   \"Name\" : \""  $2 "\",\n"  "   \"Birth\" : \""  $3  "\",\n"  "   \"Marks\" : \""  $4  "\"\n" " }" }   END { print "]" } ' data.txt

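Or you can spread it over several lines for readability. In sh or bash, newlines inside the single-quoted program need no trailing backslashes (with them, awk joins the lines and cannot parse the consecutive print statements):

awk ' BEGIN { print "[" }
      { print  " {"
        print  "   \"ID\" : \""     $1  "\","
        print  "   \"Name\" : \""   $2  "\","
        print  "   \"Birth\" : \""  $3  "\","
        print  "   \"Marks\" : \""  $4  "\""
        print  " }"
      }
      END { print "]" } '  data.txt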

This prints the following output (note that the objects are not comma-separated, and ID and Birth keep their punctuation):

[
 {
   "ID" : "1.",
   "Name" : "John",
   "Birth" : "(1994)",
   "Marks" : "92"
 }
 {
   "ID" : "2.",
   "Name" : "Granny",
   "Birth" : "(1972)",
   "Marks" : "82"
 }
]
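
To get valid JSON closer to the expected result, the objects need a comma between them and the "." and "()" have to be stripped. A minimal sketch of one way to do that, assuming single-word names and no quotes or backslashes in the data (nothing is escaped); the helper name to_json is made up for illustration:

```shell
# to_json is a hypothetical helper name, not part of the answer above.
to_json() {
  awk '
    BEGIN { print "[" }
    NF >= 4 {
      id = $1;    sub(/\.$/, "", id)        # "1."     -> "1"
      birth = $3; gsub(/[()]/, "", birth)   # "(1994)" -> "1994"
      if (n++) print " },"                  # close the previous object with a comma
      print " {"
      print "   \"ID\" : \"" id "\","
      print "   \"Name\" : \"" $2 "\","
      print "   \"Birth\" : \"" birth "\","
      print "   \"Marks\" : \"" $4 "\""
    }
    END { if (n) print " }"; print "]" }    # close the last object, then the array
  ' "$@"
}

# Sample input taken from the question; pass a file name instead to read data.txt.
printf '1. John (1994)  92\n2. Granny (1972)    82\n' | to_json
```

The closing brace of each object is delayed until the next record (or END) is seen, which is what lets the script decide whether a comma is needed.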
Dr.K
Thanks for your response. What if the name consists of a space-separated string? I mean, what if the full name is "John Carlo"? – Henry Oberoy Sep 13 '18 at 03:00
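
For names that contain spaces, one possible awk approach is to count fields from both ends: the ID is the first field, Marks the last, Birth the one before it, and everything in between is the name. This is only a sketch under that layout assumption (it also assumes no quotes or backslashes in the data); to_json_names is a made-up helper name and the "John Carlo" line is invented sample data:

```shell
# to_json_names is a hypothetical helper name used for illustration only.
to_json_names() {
  awk '
    BEGIN { print "[" }
    NF >= 4 {
      id = $1;         sub(/\.$/, "", id)        # strip the trailing dot
      birth = $(NF-1); gsub(/[()]/, "", birth)   # strip the parentheses
      name = $2                                  # join fields 2 .. NF-2
      for (i = 3; i <= NF - 2; i++) name = name " " $i
      printf "%s {\"ID\": \"%s\", \"Name\": \"%s\", \"Birth\": \"%s\", \"Marks\": \"%s\"}", (n++ ? ",\n" : ""), id, name, birth, $NF
    }
    END { print (n ? "\n]" : "]") }
  ' "$@"
}

# Invented sample input with a two-word name.
printf '1. John (1994)  92\n3. John Carlo (1990)  75\n' | to_json_names
```

Runs of several spaces inside a name collapse to one, because awk's default field splitting discards the original spacing.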

If you are curious about how to do it with jq, here is one way.

parse.jq

split("\n") | 
map(match("(\\d+)\\. +([\\w ]+) +\\((\\d+)\\) +(\\d+)")) | .[] |
{ 
  "ID"    : (.captures[0].string),
  "Name"  : (.captures[1].string),
  "Birth" : (.captures[2].string),
  "Marks" : (.captures[3].string)
}

Run it like this:

jq -R -f parse.jq data.txt

Output:

{
  "ID": "1",
  "Name": "John",
  "Birth": "1994",
  "Marks": "92"
}
{
  "ID": "2",
  "Name": "Granny",
  "Birth": "1972",
  "Marks": "82"
}
Thor

For the record, the following jq one-liner produces (what seems to be) the desired result:

jq -R '[capture("(?<ID>[0-9]+)\\. *(?<Name>[^(]*) \\((?<Birth>[^)]*)\\) *(?<Marks>[0-9]*)")]' data.txt

namely:

[
  {
    "ID": "1",
    "Name": "John",
    "Birth": "1994",
    "Marks": "92"
  }
]
[
  {
    "ID": "2",
    "Name": "Granny",
    "Birth": "1972",
    "Marks": "82"
  }
]

If one wants to capture the objects in a single array, one could use inputs, e.g.:

jq -nR '[inputs|capture("(?<ID>[0-9]+)\\. *(?<Name>[^(]*) \\((?<Birth>[^)]*)\\) *(?<Marks>[0-9]*)")]' data.txt

The OP also asked:

if the name consist of space separated string

The above regex allows spaces within the name.

peak