
I am a beginner at shell scripting.

Today I want to create a shell script for checking disk usage, and I am using du -sh *|grep [MG]|sort -r to log the result like this:

space=$(du -sh *|grep [MG]|sort -r)
for file in $space
do 
    echo $file
done

====== result:
10G
fileA
50M
fileB

But I want to get the result as an object like this:

{
"fileA": "10G",
"fileB": "50M"
}

How can I use awk or another command to reorganize the result?

anubhava
Yuk_dev
  • Please check this url https://stackoverflow.com/questions/35211716/store-output-diskspace-df-h-json – amit bhosale Jun 23 '20 at 10:21
  • The [useless use of `echo`](http://www.iki.fi/era/unix/award.html#echo) coupled with the [quoting errors](https://stackoverflow.com/questions/10067266/when-to-wrap-quotes-around-a-shell-variable) beg the question, why do you think you need a loop over the output in the first place? – tripleee Jun 23 '20 at 10:56
  • @Yuk_dev : What do you mean by _result as an object_? Shell has no objects, only strings. – user1934428 Jun 23 '20 at 15:00

3 Answers


From du to json

Given du's output format ...

14M     someFile
6.6M    anotherFile
576K    yetAnotherFile
0       MyEmptyFile

... you can use sed to convert it into JSON:
Here we assume that the file names contain no characters that need escaping in JSON, for instance ". You can escape them by inserting sed commands like s/"/\\"/g. If you also have to deal with line breaks inside file names, check out du -0 and sed -z.

... | sed -E '1i {
s/(.*)\t(.*)/"\2": "\1",/
$s/,$//
$a }'
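
As a rough sketch of the escaping step mentioned above: the two extra s commands below escape backslashes first and then double quotes, before the line is wrapped into a JSON entry. The function name to_json_escaped and the sample file names are made up for illustration, and the one-line 1i / $a forms and \t in the pattern assume GNU sed.

```shell
# to_json_escaped: hypothetical helper; reads "size<TAB>name" lines on stdin.
# Assumes GNU sed (one-line "1i"/"$a" text and "\t" in the regex).
to_json_escaped() {
    sed -E '1i {
s/\\/\\\\/g
s/"/\\"/g
s/(.*)\t(.*)/"\2": "\1",/
$s/,$//
$a }'
}

# Made-up sample input: one name with a double quote, one with a backslash.
printf '14M\tsome"File\n6.6M\tback\\slash\n' | to_json_escaped
```

The backslash substitution has to run before the quote substitution; otherwise the backslashes it just added in front of the quotes would be doubled as well.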

Filtering du's output

Please note that du -sh *| grep [MG] | sort -r may not do what you expect. With the example files from above you would get

6.6M    anotherFile
14M     someFile
0       MyEmptyFile
  • I assume you want to show only files with sizes > 1MB and < 1TB. However, grep [MG] also selects files which contain M or G in their name. If the current directory contains a file named just M or G you might even end up with just grep M or grep G as the unquoted [MG] is a glob (like *) that can be expanded by bash.
    Use grep '^[0-9.]*[MG]' to safely select files with sizes specified in MB and GB.
  • With sort -r you probably want to sort by file size. This does not work, because sort -r sorts alphabetically, not numerically (e.g. 9 > 11). Even with a numerical sort (sort -n) you would end up with the wrong order, as the suffixes M and G are not interpreted (e.g. 2M > 1G).
    Use sort -hr to sort by file sizes.
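
To try the two fixes without touching real files, here is a small sketch that feeds du-style lines through the filter and the sort. The helper names sample and filter_and_sort and the extra bigFile entry are invented for the example; sort -h is assumed to be available (it is in GNU coreutils), and LC_ALL=C is only there so the decimal point is parsed the same way everywhere.

```shell
# sample: made-up du-style lines, "size<TAB>name".
sample() {
    printf '%s\t%s\n' 14M someFile 6.6M anotherFile 576K yetAnotherFile 0 MyEmptyFile 1.2G bigFile
}

# filter_and_sort: keep only lines whose size field ends in M or G,
# then sort by human-readable size, largest first.
filter_and_sort() {
    grep '^[0-9.]*[MG]' | LC_ALL=C sort -hr
}

sample | filter_and_sort
```

With this input, MyEmptyFile and yetAnotherFile are dropped even though MyEmptyFile contains an M, and the remaining lines come out in the order bigFile, someFile, anotherFile.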

Putting everything together

du -sh * | grep '^[0-9.]*[MG]' | sort -hr | sed -E '1i {
s/(.*)\t(.*)/"\2": "\1",/
$s/,$//
$a }'
Socowi
  • Thanks for your solution using sed to convert the result. I will study more about when I should use "sed" and when "awk". – Yuk_dev Jun 23 '20 at 12:12
  • My rule of thumb is: Use `sed` if you want to modify **strings** in short or rather unstructured lines and (this is the most important part!) the result for each line is **independent from the other lines**. Use `awk` **for everything else** (e.g. for numbers, or tables with many columns, or data for which you have to consider more than one line at a time). – Socowi Jun 23 '20 at 12:23
  • In my [`awk` solution](https://stackoverflow.com/a/62532875/548225), I made sure that I produce a correct JSON string. This `sed` solution doesn't produce valid JSON, as its lines don't end with a comma. – anubhava Jun 23 '20 at 13:40
  • @anubhava Oh right, that's one thing I forgot. Thank you for the information. I guess, OP should accept your answer instead. – Socowi Jun 23 '20 at 14:01
  • I updated my answer. Commas are now inserted between the entries (but not after the last entry, as required by json). – Socowi Jun 23 '20 at 14:33

You may use this awk:

du -sh * |
awk -F '\t' 'BEGIN{ print "{" }
$1 ~ /[GM]$/ {printf "%s\"%s\": \"%s\"", (++n>1?",\n":""), $2, $1}
END{ print "\n}" }'

This assumes you don't have tab characters in your file names.
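
If that assumption does not hold, one possible variation (a sketch, not part of the answer above) is to split each line at the first tab only and to escape the characters that JSON does not allow raw inside a string. The names du_to_json and json_escape are made up here; the escaping covers only backslash, double quote and tab, and the loop avoids gsub so it should not depend on a particular awk implementation.

```shell
# du_to_json: hypothetical helper; reads "size<TAB>name" lines on stdin
# and prints a JSON object of the entries whose size ends in G or M.
du_to_json() {
    awk '
    # Escape backslash, double quote and tab so the name is a valid JSON string.
    function json_escape(s,    out, i, c) {
        out = ""
        for (i = 1; i <= length(s); i++) {
            c = substr(s, i, 1)
            if (c == "\\" || c == "\"") out = out "\\" c
            else if (c == "\t")         out = out "\\t"
            else                        out = out c
        }
        return out
    }
    BEGIN { print "{" }
    {
        tab = index($0, "\t")           # position of the first tab only
        if (tab == 0) next              # skip lines without a separator
        size = substr($0, 1, tab - 1)
        name = substr($0, tab + 1)      # may itself contain tabs
    }
    size ~ /[GM]$/ {
        printf "%s\"%s\": \"%s\"", (n++ ? ",\n" : ""), json_escape(name), size
    }
    END { print "\n}" }'
}

# Made-up sample input: a name containing a tab and a name containing quotes.
printf '14M\tsome\tFile\n576K\tsmall\n2.0G\tsay "hi"\n' | du_to_json
```

Line breaks inside file names are still not handled; for those, the du -0 route mentioned in the sed answer would be the place to start.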

anubhava

This is with GNU awk

du -sh * |
grep -E '^[0-9.]+[MG]' |

# sort -h for human numeric sort that understands M and G suffixes
sort -hr |

awk -F'\t' '
BEGIN { print "{" }

# print trailing comma for previous line
NR > 1 { print "," }

# pass the fields as arguments so a % in a file name is not read as a format specifier
{ printf "\"%s\": \"%s\"", $2, $1 }

END { print "\n}" }
'
Zartaj Majeed