
I am looking for a program that joins strings with a delimiter character, prepends one character, and appends another. I may have used the wrong keywords in my search, but I could not find a Unix tool that does exactly this.

Suppose I have a file (note the starting empty lines):

file in.txt:



{
"some": "json",
"with_different": "intendation",
  "which": [],
  "has": 2
}

{
 "json":"objects"
}

and want to generate out.txt:

[
{
"some": "json",
"with_different": "intendation",
  "which": [],
  "has": 2
}
,
{
 "json":"objects"
}
]

Basically, I want a JSON array from that, meaning:

  1. get rid of the leading empty lines (uniq | tail --lines=+2),
  2. replace the remaining empty lines with a comma (sed -e 's/^$/,/g'), and
  3. wrap the result in [ and ] (awk 'BEGIN {print "["} {print $1} END {print "]"}').

uniq <in.txt | tail --lines=+2 | sed -e 's/^$/,/g' | awk 'BEGIN {print "["} {print $1} END {print "]"}' gives me what I want, but I don't think it is elegant.

I have found paste, xargs, and join, but they do not help me here. I also know about the OFS variable in awk, which might replace the sed part, but I don't know how to make awk treat each block of non-empty lines as $1 (probably using IFS, but IFS='^$' certainly does not work). And that would still leave the other boilerplate around it.

I am hoping that someone can point me to a magic program like magic -d"," -s"[" -e"]" <in (provided I have removed the leading empty lines as above, or the objects are one-liners), converting

file in:

{"some":"json",  "which":[],  "has": 2}

{ "json":"objects"}

to file out:

[
{"some":"json",  "which":[],  "has": 2}
,
{ "json":"objects"}
]

Another example would be echo "a b c" | magic -d',' -s'[' -e']' returning [a,b,c].

Or, to give a non-JSON example: echo "my new component" | magic -d'-' -s'<' -e'>' returning <my-new-component>.

Notes:

Joel
  • `jq -s . file.txt`? JQ is not a standard utility though – oguz ismail Aug 12 '20 at 07:40
  • That works for this scenario, since it is JSON. I still wonder whether Unix tools can be used more cleverly to solve this (and the more general) problem by specifying a delimiter, prefix, and suffix. – Joel Aug 12 '20 at 07:46
  • [Here](https://pubs.opengroup.org/onlinepubs/9699919799/idx/utilities.html) is a list of standard utilities, see if there is one. – oguz ismail Aug 12 '20 at 07:50
  • You should watch out for the `uniq` at the start of your current solution. You might (a) have adjacent non-empty lines that you want to preserve (e.g. opening or closing curly brackets, if their indentation matches), and (b) have adjacent whitespace-only lines that you want to collapse but that `uniq` leaves alone because their whitespace differs. – borrible Aug 12 '20 at 08:56
  • Are you looking for a tool to convert what you show under `in.txt` into the text under `in` or something else? If so, then naming your expected output `out` would be clearer than naming it `in`, if not then idk what it is you're looking for. – Ed Morton Aug 12 '20 at 18:57
  • @EdMorton I added out-examples; hopefully that makes it clearer. @borrible Good point, and true for the general case. In this case, as those in-files are generated, it is granted that `uniq – Joel Aug 12 '20 at 21:43

2 Answers

$ awk -v RS= 'BEGIN{sep="[\n"} {printf "%s%s", sep, $0; sep="\n,\n"} END{print "\n]"}' in.txt
[
{
"some": "json",
"with_different": "intendation",
  "which": [],
  "has": 2
}
,
{
 "json":"objects"
}
]

.

$ awk -v RS= 'BEGIN{sep="[\n"} {printf "%s%s", sep, $0; sep="\n,\n"} END{print "\n]"}' in
[
{"some":"json",  "which":[],  "has": 2}
,
{ "json":"objects"}
]
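
The same approach can be parameterized. This is only a minimal sketch: the function name `wrap_join` and its three positional arguments (delimiter, prefix, suffix) are hypothetical, not an existing tool, and it assumes records are separated by blank lines (awk paragraph mode, `RS=`).

```shell
# Hypothetical helper, not a standard utility.
# usage: wrap_join DELIM PREFIX SUFFIX < file
# Joins blank-line-separated records read from stdin.
wrap_join() {
    awk -v RS= -v d="$1" -v p="$2" -v s="$3" '
        # Prefix goes before the first record, delimiter before every later one.
        { printf "%s%s", (NR == 1 ? p : d), $0 }
        # Emit the prefix for empty input too, then always close with the suffix.
        END { if (NR == 0) printf "%s", p; print s }
    '
}

# Leading blank lines are skipped by paragraph mode.
printf '\n\n{"a":1}\n\n{"b":2}\n' | wrap_join ',' '[' ']'
# prints [{"a":1},{"b":2}]
```

Because awk expands escape sequences in `-v` assignments, `wrap_join '\n,\n' '[\n' '\n]' < in.txt` should give the same layout as the commands above.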
Ed Morton
  • Thanks, using `RS` is indeed better than my approach. However, I am looking for a shorter command than that long `awk`. If there are no other suggestions in a couple of days, I'll accept this. – Joel Aug 13 '20 at 02:04
  • Why do you care how long the command is? Aren't clarity and simplicity more important than brevity? If you want to make it shorter then you could rename the variable `sep` to `s` and save yourself 6 characters and init it on the command line instead of in a BEGIN section to save a few more and get rid of white space to save yet more, e.g. `awk -v RS= -v s='[\n' '{printf "%s%s",s,$0;s="\n,\n"}END{print "\n]"}'` but IMHO that's all just pointlessly obfuscating the code. – Ed Morton Aug 13 '20 at 13:40
  • Maybe I should have written “simpler” instead of “short”, that’s actually what I’m after. Since I have that problem rather often, I thought that there was a _dedicated_ program for that, which I hoped someone could point me to. Sure, I could put that program as a function or alias in my .zshrc, but that would not tell me whether I could also use some other, very simple command. That’s why I asked in the first place. And btw, I also don’t get why there are downvotes (esp. without constructive comments)... – Joel Aug 13 '20 at 20:05
  • OK - no, there is no dedicated program for that as it's not a common problem and your input file isn't in a format defined by any standard. idk why you're getting downvotes, I didn't downvote, but constructive comments are fairly frequently met with negative, personal attacks in response (just happened to me in a different thread) so it's not unreasonable to just downvote without leaving a comment. – Ed Morton Aug 13 '20 at 20:13

Using GNU sed:

sed '/./,$!d; s/^/[\n/; :a; n; s/^$/,/; $s/$/\n]/; ba' in.txt

with the assumption that there is no trailing blank line in the input (leading ones are discarded).

Alternatively:

sed -n '/./{s/^/[\n/; :a; p; n; s/^$/,/; $s/$/\n]/; ba}' in.txt
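
For readers less familiar with sed, the second command can be spread over several lines with comments. This is a sketch of the same script, not a different method; the wrapper name `json_array_sed` is hypothetical, and it relies on GNU sed (for `\n` in the replacement and for `n` quitting when input runs out).

```shell
# Hypothetical wrapper around the second sed command; reads stdin.
# Requires GNU sed.
json_array_sed() {
    sed -n '
    # Leading blank lines never match /./ and, with -n, are never printed.
    /./{
        # First non-blank line: put "[" on its own line in front of it.
        s/^/[\n/
        :a
        # Print the pattern space, then load the next input line.
        # GNU sed quits at this point when no input is left.
        p
        n
        # A blank line between objects becomes a comma.
        s/^$/,/
        # On the last input line, add "]" on its own line.
        $s/$/\n]/
        ba
    }'
}
```

It would be called as `json_array_sed < in.txt`. As with the one-liners, the input must not end with a blank line.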
M. Nejat Aydin
  • How does the first command match leading blank lines? Maybe you can also add more info on the intermediate commands. Looking at `man sed`, these are my assumptions: `s/^/[\n/;` matches the start and replaces it with `[`; `:a;` is a label to jump to after each line; `n;` appends the current line; `s/^$/,/;` replaces empty lines with `,`; `$s/$/\n]/;` replaces the final line with `]`; `ba` branches back to label `a`. – Joel Aug 14 '20 at 02:31
  • @Joel `/./` matches a non-blank line. `/./,$` matches lines between the first non-blank line and the last line, inclusively. The `!` negates matching line range, thus remaining lines are leading blank lines, which are deleted by the `d` command. – M. Nejat Aydin Aug 14 '20 at 02:40
  • @Joel `n` prints the pattern space if the `-n` flag isn't specified as a command line argument, and reads the next line. If there is no next line, the `sed` quits. – M. Nejat Aydin Aug 14 '20 at 02:44
  • @Joel You may want to read [this document](https://www.gnu.org/software/sed/manual/sed.html) for detailed information about GNU sed. – M. Nejat Aydin Aug 14 '20 at 02:52