1

I have a CSV file that I want to convert to a JSON file with the quotes from the CSV removed using JQ in a shell script.

Here is the CSV named input.csv:

1,"SC1","Leeds"
2,"SC2","Barnsley"

Here is the JQ extract:

jq --slurp --raw-input --raw-output \
   'split("\n") | .[1:] | map(split(",")) |
    map({
         "ListElementCode": .[0],
         "ListElement": "\(.[1]) \(.[2])
      })' \
  input.csv > output.json

this writes to output.json:

[
  {
    "ListElementCode": "1",
    "ListElement": "\"SC1\" \"Leeds\""
  },
  {
    "ListElementCode": "2",
    "ListElement": "\"SC2\" \"Barnsley\""
  }
]

Any idea how I can remove the quotes around the 2 text values that get put into the ListElement part?

maloney
  • 1,633
  • 3
  • 26
  • 49
  • 2
    This is one of those places where it's unfortunate that jq only has native CSV _output_ support and not native CSV _input_ support. This is feasible to implement, of course, if one uses the regex functions; but it's not available for free, as it would if you were, say, using Python with its csv module. – Charles Duffy Jul 06 '20 at 13:06
  • `this writes to output.json` is a false assumption, it's a syntax error – Gilles Quénot Jul 06 '20 at 13:07
  • @GillesQuenot why is it a false assumption? When I run the script, it creates that file and i've pasted the output for you to see. What am I not understanding? – maloney Jul 06 '20 at 13:11
  • Seems a work for perl, python ruby and a template engine (Template::Toolkit for perl, Jinja2 for Python) – Gilles Quénot Jul 06 '20 at 13:27
  • 1
    [csv to json using jq](https://stackoverflow.com/questions/29663187/csv-to-json-using-jq) is worth reviewing, but I'm not convinced that any of the answers I've seen there yet are truly compliant with the CSV format. – Charles Duffy Jul 06 '20 at 13:46

4 Answers4

1

To solve only the most immediate problem, one could write a function that strips quotes if-and-when they exist:

jq -n --raw-input --raw-output '
    def stripQuotes: capture("^\"(?<content>.*)\"$").content // .;

    [inputs | split(",") | map(stripQuotes) |
     {
         "ListElementCode": .[0],
         "ListElement": "\(.[1]) \(.[2])"
     }]
' <in.csv >out.json

That said, to really handle CSV correctly, you can't just split(","), but need to split only on commas that aren't inside quotes (and need to recognize doubled-up quotes as the escaped form of a single quote). Really, I'd use Python instead of jq for this job -- and of this writing, the jq cookbook agrees that native jq code is only suited for "trivially simple" CSV files.

Charles Duffy
  • 280,126
  • 43
  • 390
  • 441
1

As mentioned, a Ruby answer:

ruby -rjson -rcsv -e '
  data = CSV.foreach(ARGV.shift)
            .map do |row|
              {
                ListElementCode: row.first,
                ListElement: row.drop(1).join(" ")
              }
            end
  puts JSON.pretty_generate(data)
' input.csv
[
  {
    "ListElementCode": "1",
    "ListElement": "SC1 Leeds"
  },
  {
    "ListElementCode": "2",
    "ListElement": "SC2 Barnsley"
  }
]
glenn jackman
  • 238,783
  • 38
  • 220
  • 352
1

Using a proper CSV/JSON parser in :

#!/usr/bin/env perl

use strict; use warnings;

use JSON::XS;
use Text::CSV qw/csv/;

# input.csv:
#1,"SC1","Leeds"
#2,"SC2","Barnsley"
my $vars = [csv in => 'input.csv'];
#use Data::Dumper;
#print Dumper $vars; # display the data structure

my $o = [ ];
foreach my $a (@{ $vars->[0] }) {
   push @{ $o }, {
       ListElementCode => $a->[0],
       ListElement     => $a->[1] . " " . $a->[2]
   };
}

my $coder = JSON::XS->new->ascii->pretty->allow_nonref;
print $coder->encode($o);

Output

[
   {
      "ListElement" : "SC1 Leeds",
      "ListElementCode" : "1"
   },
   {
      "ListElement" : "SC2 Barnsley",
      "ListElementCode" : "2"
   }
]
Gilles Quénot
  • 173,512
  • 41
  • 224
  • 223
1

Here's an uncomplicated and efficient way to solve this particular problem:

jq -n --raw-input --raw-output '
  [inputs
   | split(",")
   | { "ListElementCode": .[0],
       "ListElement": "\(.[1]|fromjson) \(.[2]|fromjson)"
     } ]' input.csv 

Incidentally, there are many robust command-line CSV-to-JSON tools, amongst which I would include:

peak
  • 105,803
  • 17
  • 152
  • 177