
I've searched all over the internet without finding a solution.

I'm trying to build a package generator (in shell/bash), and each package contains one or more JSON files. When the user browses a JSON file and wants to delete one of the steps, I have to handle the case where the step to delete is the last one. In that case the step before it is left with a trailing comma, which the JSON format does not permit.

{
  "operation_machinetype": "Ford",
  "operation_steps": [
    "steps/Step_1/01_paint_the_car.json",
    "steps/Step_2/01_drive_the_car.json",
    "steps/Step_2/02_park_the_car.json"
  ]
}

For example, if I delete '"steps/Step_2/02_park_the_car.json"', then '"steps/Step_2/01_drive_the_car.json",' becomes the last step, and its trailing comma then causes an error.

Thank you in advance everyone.

  • While being human-readable, `json` (like any other format, e.g. `xml`, `ini`, etc.) requires some knowledge of how to edit it. You can easily mess up `[ ]` or `{ }`. That `,` is just another item on the long *must know what to do* list. Verdict: do nothing. – Sinatr Jan 19 '16 at 15:53
  • Thank you for taking time to write something. I understand how I should delete the line ' sed "/$2/d" <$1.temp >$1 ' but the comma is still there :) – digital_critic Jan 19 '16 at 15:59

5 Answers

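$ cat tst.awk
# Skip every line that contains the target string.
index($0,tgt) {
    # No trailing comma means this was the last array element,
    # so strip the comma from the previously held line.
    if (!/,[[:space:]]*$/) {
        sub(/,[[:space:]]*$/,"",prev)
    }
    next
}
# Print one line behind so the held line can still be fixed up.
NR>1 { print prev }
{ prev = $0 }
END { print prev }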

$ awk -v tgt='"steps/Step_2/02_park_the_car.json"' -f tst.awk file
{
  "operation_machinetype": "Ford",
  "operation_steps": [
    "steps/Step_1/01_paint_the_car.json",
    "steps/Step_2/01_drive_the_car.json"
  ]
}

$ awk -v tgt='"steps/Step_2/01_drive_the_car.json"' -f tst.awk file
{
  "operation_machinetype": "Ford",
  "operation_steps": [
    "steps/Step_1/01_paint_the_car.json",
    "steps/Step_2/02_park_the_car.json"
  ]
}

Ed Morton
  • I am not familiar with 'awk', is there a possibility you might have time for a very small description? :) – digital_critic Jan 19 '16 at 16:52
  • But you tagged the question with `awk`? Anyway, it's the general-purpose text manipulation tool that comes as standard on all UNIX installations. It is full of constructs designed to make your life easier when manipulating text (e.g. an implicit while-read loop, built-in condition/action blocks, splitting of lines into fields, tracking the number of lines read and the number of fields in a line, the ability to work on whole paragraphs instead of lines, etc.). It's a scripting language but executes as fast as C for this task. Get the book Effective Awk Programming, 4th Edition, by Arnold Robbins. – Ed Morton Jan 19 '16 at 17:24
  • Thank you for the description and for the advice! I will verify this and return with an update. – digital_critic Jan 19 '16 at 17:26
  • You're welcome. Note that the awk script treats the string you want to delete as exactly that, a **string**. sed cannot operate on strings, only on regular expressions, so if you want sed to treat a regular expression as if it were a string you need to disable all of the regexp metacharacters by escaping them. See http://stackoverflow.com/q/29613304/1745001 for a discussion of the subject and the solution. sed is the right tool for simple substitutions on individual lines, but for anything more interesting, like this, you should use awk for clarity, portability, robustness, etc. – Ed Morton Jan 19 '16 at 17:33
  • It works, quite good I might say. The problem is, it only prints the final output. Can I redirect this output to a file with '>'? like `cp $1 $1.temp && awk -v tgt='"steps/Step_2/01_drive_the_car.json"' -f tst.awk $1.temp > $1 && rm -r $1.temp` ? – digital_critic Jan 20 '16 at 09:25
  • Also, in the sed example, I could delete an entire row when it contained only "park_the_car". However in this awk example, I must provide the full row. Can I change that? – digital_critic Jan 20 '16 at 11:15
  • Yes, you can redirect the output of any command. The synopsis is `cmd file > tmp && mv tmp file` though, not what you wrote, which would change the timestamps etc. on the original file even if the command failed, and which uses the recursive `-r` option of `rm` needlessly. The awk script will work as-is for a partial match; you should have just tried it. – Ed Morton Jan 20 '16 at 13:22
  • I did try it :) I don't know if it was human error or some other thing but at first it didn't delete. Marking as final answer. Thank you for your help! – digital_critic Jan 20 '16 at 13:40
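
For readers who, like the asker, are not familiar with awk, here is a rough Python sketch of the same idea as tst.awk: drop every line that contains the target, and if a dropped line had no trailing comma (so it was the last array element), strip the comma from the line kept just before it. The function name `delete_step` and the `SAMPLE` text are made up for illustration; this is a line-based sketch that assumes one array element per line, not a general JSON editor.

```python
import json


def delete_step(lines, target):
    """Return `lines` without those containing `target`, fixing the comma
    on the preceding line when the deleted line was the last element."""
    out = []
    for line in lines:
        if target in line:
            # No trailing comma on the deleted line: it was the last
            # element, so the previously kept line must lose its comma.
            if not line.rstrip().endswith(",") and out:
                prev = out[-1].rstrip()
                if prev.endswith(","):
                    out[-1] = prev[:-1]
            continue
        out.append(line)
    return out


# Hypothetical sample mirroring the file from the question.
SAMPLE = """{
  "operation_machinetype": "Ford",
  "operation_steps": [
    "steps/Step_1/01_paint_the_car.json",
    "steps/Step_2/01_drive_the_car.json",
    "steps/Step_2/02_park_the_car.json"
  ]
}"""

result = "\n".join(delete_step(SAMPLE.splitlines(), "park_the_car"))
print(result)
json.loads(result)  # raises ValueError if the result is not valid JSON
```

As with the awk script, a partial string such as `park_the_car` is enough, because the match is a plain substring test.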

You can find the last comma before a closing bracket using a multiline regular expression, such as:

/,\s*]/g
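
As a minimal sketch of how that expression could be applied, here it is in Python, where `\s` also matches newlines so no extra flag is needed. The helper name `strip_trailing_commas` is hypothetical. One assumption to be aware of: the pattern is applied to the raw text, so a string value that happens to contain `,]` would be altered as well.

```python
import json
import re

# A comma followed only by whitespace (newlines included) and then "]".
# The whitespace and bracket are captured so they can be put back.
TRAILING_COMMA = re.compile(r",(\s*\])")


def strip_trailing_commas(text):
    """Remove the comma left dangling before a closing bracket."""
    return TRAILING_COMMA.sub(r"\1", text)


# Hypothetical input: the file after the last step was deleted naively.
broken = '{"operation_steps": [\n    "a.json",\n    "b.json",\n  ]\n}'
fixed = strip_trailing_commas(broken)
print(fixed)
print(json.loads(fixed))
```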
greg-449
Amnon
  • While this is a valid option, I wanted to be as independent of the brackets as possible; I would like to work only with the "steps" lines. I've also seen some answers using 'tac', reversing the file and then searching for the first occurrence (if it contains a comma as the last character), but had no luck rewriting it for this situation. Thanks again! – digital_critic Jan 19 '16 at 16:09
  • Use `yaml` then and bid adieu to brackets :) – Amnon Jan 19 '16 at 16:12
  • Unfortunately I am blocked on using this format :) Should I upvote your answer? I'm new here. – digital_critic Jan 19 '16 at 16:15
  • What exactly would you like to achieve? You can try parsing the json file and notify the user if there is a problem. – Amnon Jan 19 '16 at 16:17
  • I have hundreds of such files, each packed in a tgz archive. Instead of making thousands of changes manually, I need to fully automate this. – digital_critic Jan 19 '16 at 16:22
  • Do you mean you want to fix your users' possible syntax errors? Or do you want to automate the removal of the last line in the array? – Amnon Jan 19 '16 at 16:33
  • Not necessarily the last line. That's the problem: it can be any line :), at the user's will. I want him to select via a menu what should be deleted/edited/replaced/etc. using the options that I provide. But in this situation I want a clean delete, meaning that if the last step is deleted, the program will automatically (after I implement it) remove the comma from the new last step. – digital_critic Jan 19 '16 at 16:51
  • I see. You should do the delete operations on a logical model and serialize it as json. – Amnon Jan 19 '16 at 17:09

This might work for you (GNU sed):

sed 'N;s#,\s*\n\s*"steps/Step_2/02_park_the_car\.json"##;P;D' file

This reads two lines at a time and if a match is found removes the required strings. It prints then deletes the first line and then appends the next line.

N.B. If the pattern space is empty it reads another line then appends the next.

potong
  • Will check this also and be right back with an update! Thanks. – digital_critic Jan 19 '16 at 17:27
  • You should escape the `.` and watch out for any other RE metacharacters (or the `#`, since you're using that as the delimiter) that can show up in the file name. – Ed Morton Jan 19 '16 at 17:30
  • Unfortunately this only shows the entire content of the file as it is and does not change it in any way. Still, thank you for your time. – digital_critic Jan 20 '16 at 09:07
  • @digital_critic perhaps there is white space at the end of the first line containing the `,`. I have amended the solution to cater for this and also quoted the `.` as mentioned by @Ed Morton. – potong Jan 20 '16 at 15:41
  • I have tried again and again and again but still doesn't work. :( – digital_critic Jan 21 '16 at 11:24

I know the category is bash, but you should probably use something that's JSON-aware in general and not just edit the file like it's line based. Then you don't have to worry about proper json encoding, but can focus on manipulating the data. I'd be inclined to use Python, due to the built-in json support and relative ubiquity (and because this is pretty simple in Python). Perhaps something like this simple script (over half of which is error checking verbosity):

#!/usr/bin/env python
import json
import sys

# Expect exactly two arguments: the step to remove and the JSON file.
if len(sys.argv) != 3:
  sys.stderr.write('Usage: ' + sys.argv[0] + ' pattern file\n')
  sys.exit(1)

patt = sys.argv[1]
filename = sys.argv[2]

with open(filename, 'r') as data_file:
  data = json.load(data_file)

# list.remove() needs an exact match and raises ValueError otherwise.
try:
  data["operation_steps"].remove(patt)
except ValueError:
  sys.stderr.write('Pattern "' + patt + '" not found in '
                   + filename + '; leaving file unchanged\n')
  sys.exit(2)

# Rewrite the file; json.dump places the commas correctly.
sys.stdout.write('updating ' + filename + '\n')
with open(filename, 'w') as data_file:
  json.dump(data, data_file, ensure_ascii=False, indent=2)

If you take the "indent=2" off of the json.dump call, it'll compact the json as well - which might be handy in reducing wasted space from a package. :)
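
To illustrate the size difference, here is a small sketch comparing the output forms. Note that without `indent` the `json` module still puts a space after each `,` and `:`; passing `separators=(",", ":")` removes those too. The sample data is a shortened, hypothetical version of the file from the question.

```python
import json

# Hypothetical data shaped like the file from the question.
data = {
    "operation_machinetype": "Ford",
    "operation_steps": [
        "steps/Step_1/01_paint_the_car.json",
        "steps/Step_2/01_drive_the_car.json",
    ],
}

pretty = json.dumps(data, indent=2)                 # one item per line
default = json.dumps(data)                          # single line, ", " and ": "
compact = json.dumps(data, separators=(",", ":"))   # no optional whitespace

print(len(pretty), len(default), len(compact))
```

All three strings parse back to the same data, so the choice only affects file size and readability.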

There are several command-line json manipulators out there, and I've been fiddling with writing a C-based loadable module for ksh93, but this is how I'd actually solve the problem if I was going to do so today.

dannysauer
  • I actually do want to migrate this in the near future to Python, maybe a nice graphic interface using WX, but until then I still need it in sed/awk. Maybe I can bother you with more questions when I do the plunge, if of course, you wouldn't mind. Thanks!! – digital_critic Jan 20 '16 at 09:43
  • I'm by no means a Python expert, but I'm always happy to provide what guidance I can. BTW, if it's a matter of needing the code inline vs a separate script, consider http://stackoverflow.com/a/2356490/65589. – dannysauer Jan 20 '16 at 13:05
  • Hello dannysauer. I decided to migrate the whole thing to Python, as awk was too (awk)ward to provide a quick solution for something so complex. :) I have used your example but unfortunately, when I use a string as the pattern, it does not find the respective step. – digital_critic Jan 22 '16 at 09:30
  • list.remove() takes the object to be removed as its argument, so with a string it has to be an *exact* match. I did test the example code to verify that it works, so your task is apparently to find out how your string differs. :) – dannysauer Jan 22 '16 at 14:30
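
If a partial match is what is wanted (as with the earlier sed and awk attempts), one possible variation is to filter the list by substring instead of calling `list.remove()`. This is only a sketch: the function name `remove_matching_steps` and its `key` parameter are made up for illustration, and it removes every step containing the fragment, not just the first.

```python
import json


def remove_matching_steps(data, fragment, key="operation_steps"):
    """Drop every entry of data[key] that contains `fragment`.

    Returns the number of entries removed so the caller can decide
    whether the file needs to be rewritten.
    """
    kept = [step for step in data[key] if fragment not in step]
    removed = len(data[key]) - len(kept)
    data[key] = kept
    return removed


# Hypothetical in-memory document mirroring the question's file.
doc = json.loads("""{
  "operation_machinetype": "Ford",
  "operation_steps": [
    "steps/Step_1/01_paint_the_car.json",
    "steps/Step_2/01_drive_the_car.json",
    "steps/Step_2/02_park_the_car.json"
  ]
}""")

count = remove_matching_steps(doc, "park_the_car")
print(count)
print(json.dumps(doc, indent=2))
```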

You might consider using YAML:

---
  operation_machinetype: "Ford"
  operation_steps: 
    - "steps/Step_1/01_paint_the_car.json"
    - "steps/Step_2/01_drive_the_car.json"
    - "steps/Step_2/02_park_the_car.json"
Amnon