
I want to write a line of code which will take the results of:

du -sh -c --time /00-httpdocs/*

and output it in JSON format. The goal is to get three pieces of information for each project file in a site: directory path, date last modified, and disk space usage in human readable format. This command will output that data in tab-delimited format with each entry on a new line in the terminal:

4.6G    2014-08-22 12:26    /00-httpdocs/00
1.1G    2014-08-22 13:32    /00-httpdocs/01
711M    2014-02-14 23:39    /00-httpdocs/02

The goal is to get it to export to a JSON file so it would need to be formatted something like this:

{"httpdocs": [
  {
    "size": "4.6G",
    "modified": "2014-08-22 12:26",
    "path": "/00-httpdocs/00-PREVIEW"}
  {
    "size": "1.1G",
    "modified": "2014-08-22 13:32",
    "path": "/00-httpdocs/8oclock"}
  {
    "size": "711M",
    "modified": "2014-02-14 23:39",
    "path": "/00-httpdocs/8oclock.new"}
]}

(I know that's not quite proper JSON, I just wrote it as an example. Apologies to the pedantic among us.)

I need size to return as an integer (so maybe remove '-sh' and handle conversion later?).

I've tried using awk and sed but I'm a total novice and can't quite get the formatting right.

I've made it about this far:

du -sh -c --time /00-httpdocs/* | awk ' BEGIN {print "\"httpdocs:\": [";} {print "{"$0"},\n";} END {print "]";}'

The goal is to have this trigger twice a day so that we can get the data and use it inside of a JavaScript application.

dotZak

1 Answer

sed '1 i\
{"httpdocs": [
s/\([^[:space:]]*\)[[:space:]]*\([^[:space:]]*\)[[:space:]]*\([^[:space:]]*\)/  {\
    "size": "\1",\
    "modified": "\2",\
    "path": "\3"}/
$ a\
]}' YourFile

Quick and dirty (POSIX version, so use --posix with GNU sed).

Take the three fields and place them, via the substitution (s/../../), into a "template" using capture groups (\( ... \) and \1). Insert the header before the first line (i\ ...) and append the footer after the last line (a\ ...). [:space:] could be replaced by [:blank:].
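
To make the capture-group idea concrete, here is a minimal sketch that applies the same kind of template to a single line. It assumes the fields are separated by exactly one tab (as in the du output shown in the question), so the date and time stay together in the second group. The function name `to_json_entry` is made up for illustration, and nothing here escapes quotes or backslashes inside a path.

```shell
#!/bin/sh
# Hypothetical helper: rewrite each "size<TAB>modified<TAB>path" line on stdin
# as a one-line JSON object. Assumes single tabs between fields and no
# characters in the path that would need JSON escaping.
to_json_entry() {
  tab=$(printf '\t')   # literal tab, since \t inside a sed pattern is not portable
  # Group 1: size, group 2: date and time, group 3: the rest of the line (path).
  sed "s/^\([^${tab}]*\)${tab}\([^${tab}]*\)${tab}\(.*\)\$/{\"size\": \"\1\", \"modified\": \"\2\", \"path\": \"\3\"}/"
}

# Demo on one line of the sample data from the question.
printf '4.6G\t2014-08-22 12:26\t/00-httpdocs/00\n' | to_json_entry
```

Matching on "anything that is not a tab" instead of "anything that is not whitespace" is what keeps the space inside the timestamp from splitting the second field.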

NeronLeVelu
  • Hey, that mostly did the trick. I had to adjust it a little to account for tabs and the date format containing a space. Here's the final result: `s/\([^ \t]*\) *\t\([^ ]*\s[^ \t]*\) *\t\([^\n]*\)/ {\ ` …and I had to add a comma to the end of each line to account for commas in JSON, ie ` "size": "\1",\ `. – dotZak Aug 26 '14 at 15:29
  • The only thing missing now is that the last entry has a comma after the closing brace. I'm not sure how to get rid of that. Though, it always ends in `"total"},` so maybe I can just do another replace afterwards? That seems a bit ridiculous. Anyway, thanks so much for the answer. It was really helpful. – dotZak Aug 26 '14 at 15:32
  • I adapted the post for tabs, but I don't understand the "last entry has a comma" part. Your sample didn't show it (nor "total..."). – NeronLeVelu Aug 27 '14 at 05:26
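
One possible way around the trailing comma and the extra "total" row from `du -c`, sketched with awk rather than sed: write the comma before every entry except the first, and skip the row whose third field is `total`. This assumes tab-delimited input as above. The function name `du_to_json` and the output file name in the usage line are made up for illustration, and paths containing quotes or backslashes are not escaped.

```shell
#!/bin/sh
# Hypothetical helper: read "size<TAB>modified<TAB>path" lines on stdin and
# write one JSON document to stdout. Assumes tab-delimited input and paths
# that need no JSON escaping.
du_to_json() {
  awk -F '\t' '
    BEGIN { print "{\"httpdocs\": [" }
    # Skip the grand-total row that du -c appends.
    $3 == "total" { next }
    {
      # Emit the separator before each entry except the first,
      # so the last entry is never followed by a comma.
      printf "%s  {\"size\": \"%s\", \"modified\": \"%s\", \"path\": \"%s\"}", (n++ ? ",\n" : ""), $1, $2, $3
    }
    END {
      if (n) print ""   # finish the line of the last entry
      print "]}"
    }
  '
}

# Demo on the sample rows from the question plus a total row.
printf '4.6G\t2014-08-22 12:26\t/00-httpdocs/00\n711M\t2014-02-14 23:39\t/00-httpdocs/02\n5.7G\t2014-08-22 13:32\ttotal\n' | du_to_json
```

With the real command this would be used as `du -sh -c --time /00-httpdocs/* | du_to_json > httpdocs.json`. Dropping `-c` avoids the total row in the first place, and replacing `-h` with `-k` should give integer sizes in kibibytes if the conversion is to be handled later.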