Shell script split du -sh * result

Question

I am a beginner at shell scripting.

Today I want to create a shell script for checking disk usage, and I am using du -sh *|grep [MG]|sort -r to log the result like this:

space=$(du -sh *|grep [MG]|sort -r)
for file in $space
do 
    echo $file
done

====== result:
10G
fileA
50M
fileB

But I want to get the result as an object like:

{
"fileA": "10G",
"fileB": "50M"
}

How can I use awk or another command to reorganize the result?

Please check this url https://stackoverflow.com/questions/35211716/store-output-diskspace-df-h-json — amit bhosale, Jun 23 '20 at 10:21
The [useless use of `echo`](http://www.iki.fi/era/unix/award.html#echo) coupled with the [quoting errors](https://stackoverflow.com/questions/10067266/when-to-wrap-quotes-around-a-shell-variable) beg the question, why do you think you need a loop over the output in the first place? — tripleee, Jun 23 '20 at 10:56
@Yuk_dev : What do you mean by _result as an object_? Shell has no objects, only strings. — user1934428, Jun 23 '20 at 15:00

Socowi · Accepted Answer · 2020-06-23T14:32:58.320

3

From `du` to json

Given du's output format ...

14M     someFile
6.6M    anotherFile
576K    yetAnotherFile
0       MyEmptyFile

... you can use sed to convert it into json.
Note: this assumes that you don't have to quote special symbols in the file names, for instance ". You can quote them by inserting sed commands like s/"/\\"/g. If you also have to deal with linebreaks inside filenames, check out du -0 and sed -z.

... | sed -E '1i {
s/(.*)\t(.*)/"\2": "\1",/
$s/,$//
$a }'
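
The quoting note above can be sketched as follows. This is only an illustration, not a complete JSON encoder: the function name du_to_json is made up for this example, it assumes GNU sed (for \t and the one-line forms of i and a), and it escapes only backslashes and double quotes. Empty input produces no output at all, and names containing tabs or newlines are not handled.

```shell
# Hypothetical helper: reads du-style "SIZE<TAB>NAME" lines on stdin
# and prints a JSON object. Assumes GNU sed.
du_to_json() {
  sed -E '1i {
s/\\/\\\\/g
s/"/\\"/g
s/^([^\t]*)\t(.*)$/"\2": "\1",/
$s/,$//
$a }'
}

# Sample input instead of a real du call, so the result is reproducible.
printf '14M\tsome"File\n6.6M\tanotherFile\n' | du_to_json
```

The two extra s commands run before the main substitution, so the quotes added around the name and size are not escaped themselves.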

Filtering `du`'s output

Please note that du -sh *| grep [MG] | sort -r may not do what you expect. With the example files from above you would get

6.6M    anotherFile
14M     someFile
0       MyEmptyFile

I assume you want to show only files with sizes > 1MB and < 1TB. However, grep [MG] also selects files which contain M or G in their name. If the current directory contains a file named just M or G you might even end up with just grep M or grep G as the unquoted [MG] is a glob (like *) that can be expanded by bash.
Use grep '^[0-9.]*[MG]' to safely select files with sizes specified in MB and GB.
With sort -r you probably want to sort by file size. But this does not work because sort -r sorts alphabetically, not numerically (i.e. 9 > 11). But even with numerical sort you would end up with the wrong order, as the suffixes M and G are not interpreted (i.e. 2M > 1G).
Use sort -hr to sort by file sizes.
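
A small sketch of both points, using made-up sample lines instead of a real du call. The function name keep_mg_sorted is hypothetical, and sort -h assumes GNU sort.

```shell
# Made-up du-style sample: "SIZE<TAB>NAME" per line.
sample=$(printf '2M\tbig\n1G\tbigger\n576K\tMovie\n9M\ta\n11M\tb\n')

# Unanchored pattern: also keeps the 576K line, because "Movie" contains an M.
printf '%s\n' "$sample" | grep '[MG]'

# Hypothetical helper: keep only sizes in M or G, largest first.
keep_mg_sorted() {
  grep '^[0-9.]*[MG]' | sort -hr
}

# Order of the result: 1G, 11M, 9M, 2M
printf '%s\n' "$sample" | keep_mg_sorted
```

With plain sort -r the same lines would come out as 9M, 2M, 1G, 11M, which is the alphabetical ordering problem described above.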

Putting everything together

du -sh * | grep '^[0-9.]*[MG]' | sort -hr | sed -E '1i {
s/(.*)\t(.*)/"\2": "\1",/
$s/,$//
$a }'

edited Jun 23 '20 at 14:32

answered Jun 23 '20 at 10:57

Socowi


Thanks for your solution using sed to convert the result. I will study more about when I should use "sed" and when "awk". – Yuk_dev Jun 23 '20 at 12:12
My rule of thumb is: Use `sed` if you want to modify **strings** in short or rather unstructured lines and (this is the most important part!) the result for each line is **independent of the other lines**. Use `awk` **for everything else** (e.g. for numbers, or tables with many columns, or data for which you have to consider more than one line at a time). – Socowi Jun 23 '20 at 12:23
In my [`awk` solution](https://stackoverflow.com/a/62532875/548225), I made sure that I produce a correct json string. This `sed` solution doesn't produce valid json as it doesn't have a trailing comma. – anubhava Jun 23 '20 at 13:40
1

@anubhava Oh right, that's one thing I forgot. Thank you for the information. I guess, OP should accept your answer instead. – Socowi Jun 23 '20 at 14:01
I updated my answer. Commas are now inserted between the entries (but not after the last entry, as required by json). – Socowi Jun 23 '20 at 14:33

anubhava · Answer 2 · 2020-06-23T10:47:53.133

2

You may use this awk:

du -sh * |
awk -F '\t' 'BEGIN{ print "{" }
$1 ~ /[GM]$/ {printf "%s\"%s\": \"%s\"", (++n>1?",\n":""), $2, $1}
END{ print "\n}" }'

This assumes you don't have tab character in your filenames.

edited Jun 23 '20 at 10:47

answered Jun 23 '20 at 10:40

anubhava


Thanks for the answer, you also provided a great solution for me! – Yuk_dev Jun 23 '20 at 12:14

Zartaj Majeed · Answer 3 · 2020-06-23T11:35:26.153

0

This is with GNU awk

du -sh * |
grep -E '^[0-9.]+[MG]' |

# sort -h for human numeric sort that understands M and G suffixes
sort -hr |

awk -F'\t' '
BEGIN { print "{" }

# print trailing comma for previous line
NR > 1 { print "," }

# pass name and size as arguments so a % in a file name is not treated as a format specifier
{ printf "\"%s\": \"%s\"", $2, $1 }

END { print "\n}" }
'

edited Jun 23 '20 at 11:35

answered Jun 23 '20 at 11:23

Zartaj Majeed

