Store output diskspace df -h JSON

Question

I am attempting to gather basic disk space information from a server using a bash script, and store the output in JSON format. I am looking to record the available & used disk space.

An example output of df -h:

Filesystem                      Size  Used Avail Use% Mounted on
udev                            2.0G  4.0K  2.0G   1% /dev
tmpfs                           394M  288K  394M   1% /run
/dev/mapper/nodequery--vg-root   45G  1.4G   41G   4% /
none                            4.0K     0  4.0K   0% /sys/fs/cgroup
none                            5.0M     0  5.0M   0% /run/lock
none                            2.0G     0  2.0G   0% /run/shm
none                            100M     0  100M   0% /run/user
/dev/sda2                       237M   47M  178M  21% /boot
/dev/sda1                       511M  3.4M  508M   1% /boot/efi

As an example, this is how I would like the final output to look:

{
  "diskarray": [{
    "mount": "/dev/disk1",
    "spacetotal": "35GB",
    "spaceavail": "1GB"
  },
  {
    "mount": "/dev/disk2",
    "spacetotal": "35GB",
    "spaceavail": "4GB"
  }]
}

So far I've tried using awk:

df -P -B 1 | grep '^/' | awk '{ print $1" "$2" "$3";" }'

with the following output:

/dev/mapper/nodequery--vg-root 47710605312 1439592448;
/dev/sda2 247772160 48645120;
/dev/sda1 535805952 3538944;

But I'm not sure how to take that data and store it in JSON format.

I've got to this: df -P -B 1 | grep '^/' | awk '{ print $1" "$2" "$3";" }' — westcoastdev, Feb 04 '16 at 21:14
You should edit your question to include your attempted solution, and tell us what problems you're having with it. You will get better reaction to "here's what I tried, and here's how it doesn't work" rather than "I want to do this, tell me how." — miken32, Feb 04 '16 at 21:17
Sorry i edited, fairly new to this site & bash in general, going off other examples I've found doing similar things. — westcoastdev, Feb 04 '16 at 21:19
Have you looked at `jq` (a tool built for this kind of use case)? — Charles Duffy, Feb 04 '16 at 21:21
@CharlesDuffy , I will take a look at that although I am trying to avoid having package requirements for the script to run — westcoastdev, Feb 04 '16 at 21:28
Unfortunately, without requiring external packages, you won't be able to guarantee that the output generated is valid JSON (complying with quoting rules &c). — Charles Duffy, Feb 04 '16 at 21:31
...though there's a JSON generator built into the Python runtime, if you don't mind embedding a tiny bit of Python in your shell script. — Charles Duffy, Feb 04 '16 at 21:31
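To illustrate the quoting concern raised in the comments above: hand-concatenated JSON breaks as soon as a value contains a double quote or backslash, while a JSON encoder escapes it. This is only a minimal sketch in Python. The mount name and the helpers `naive_json`, `safe_json`, and `is_valid_json` are made up for illustration and do not come from the thread.

```python
import json

def naive_json(mount):
    # Hand-built JSON: quotes and backslashes in the value are not escaped.
    return '{"mount": "' + mount + '"}'

def safe_json(mount):
    # json.dumps applies the JSON string-escaping rules.
    return json.dumps({"mount": mount})

def is_valid_json(text):
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False

tricky = '/mnt/my "backup" disk'  # hypothetical mount point name

print(is_valid_json(naive_json(tricky)))  # False
print(is_valid_json(safe_json(tricky)))   # True
```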

Charles Duffy · Accepted Answer · 2016-02-04T23:25:02.580

10

The following does what you want, with the only requirement external to bash being a Python interpreter:

python_script=$(cat <<'EOF'
import sys, json

data = {'diskarray': []}
for line in sys.stdin:
    # awk emits three whitespace-separated fields: filesystem, size, available
    mount, total, avail = line.split()
    data['diskarray'].append(dict(mount=mount, spacetotal=total, spaceavail=avail))
sys.stdout.write(json.dumps(data))
EOF
)

# df -P columns: $1 filesystem, $2 size, $3 used, $4 available
df -Ph | awk '/^\// { print $1" "$2" "$4 }' | python -c "$python_script"
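The same filtering and column selection can also be done entirely in Python, without the awk stage. The following is a sketch rather than a drop-in replacement. The function name `df_to_json` is hypothetical, the sample text is taken from the question, and the code assumes the POSIX column order of `df -P` and filesystem names without whitespace.

```python
import json

SAMPLE = """\
Filesystem                      Size  Used Avail Use% Mounted on
udev                            2.0G  4.0K  2.0G   1% /dev
/dev/mapper/nodequery--vg-root   45G  1.4G   41G   4% /
/dev/sda2                       237M   47M  178M  21% /boot
"""

def df_to_json(df_output):
    """Convert `df -P`-style text into the JSON layout from the question."""
    disks = []
    for line in df_output.splitlines():
        if not line.startswith("/"):
            continue  # skip the header and pseudo-filesystems such as udev
        fields = line.split()
        # Columns: filesystem, size, used, available, capacity, mounted on
        disks.append({
            "mount": fields[0],
            "spacetotal": fields[1],
            "spaceavail": fields[3],
        })
    return json.dumps({"diskarray": disks}, indent=2)

print(df_to_json(SAMPLE))
```

To use live data instead of the sample, the text could be read from standard input and the script fed with `df -Ph`.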

An alternate implementation using jq might look like this:

df -Ph | \
  jq -R -s '
    [
      split("\n") |
      .[] |
      if test("^/") then
        gsub(" +"; " ") | split(" ") | {mount: .[0], spacetotal: .[1], spaceavail: .[3]}
      else
        empty
      end
    ]'

edited Feb 04 '16 at 23:25

answered Feb 04 '16 at 21:40

Charles Duffy


This is exactly what I was looking to do. Thank you so much! – westcoastdev Feb 04 '16 at 21:52
Glad to help! I've also added a `jq` implementation (that also does the work of `awk` internally). – Charles Duffy Feb 04 '16 at 23:27
Thanks for the `jq` solution. It is perfect for my docker requirement. – chris loughnane Mar 29 '21 at 16:17

score 6 · Answer 2 · edited Apr 02 '19 at 13:50

Alternative one-liner

$ df -hP | awk 'BEGIN {printf "{\"discarray\":["} $1=="Filesystem" {next} {if(a++)printf ","; printf "{\"mount\":\"%s\",\"size\":\"%s\",\"used\":\"%s\",\"avail\":\"%s\",\"use%%\":\"%s\"}",$6,$2,$3,$4,$5} END {print "]}"}'

Output (pretty-printed here for readability; the command emits a single line):

{
   "discarray":[
      {
         "mount":"/",
         "size":"3.9G",
         "used":"2.2G",
         "avail":"1.5G",
         "use%":"56%"
      },
      {
         "mount":"/dev",
         "size":"24G",
         "used":"0",
         "avail":"24G",
         "use%":"0%"
      }
   ]
}
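One caveat with splitting on whitespace: a mount point that contains spaces is cut short, because awk's `$6` only holds the text up to the first space. A possible workaround, sketched in Python, is to limit the number of splits so the last column stays intact. The function `parse_df_line` and the sample line are made up for illustration, and the sketch assumes the filesystem name itself contains no whitespace.

```python
import json

def parse_df_line(line):
    # At most 5 splits, so everything after the fifth column
    # (the mount point) stays in one piece, spaces included.
    device, size, used, avail, use_pct, mount = line.rstrip("\n").split(None, 5)
    return {"mount": mount, "size": size, "used": used,
            "avail": avail, "use%": use_pct}

# Hypothetical `df -hP` line with a space in the mount point
line = "/dev/sdb1  1.8T  1.2T  600G  67% /media/My Passport\n"
print(json.dumps(parse_df_line(line)))
```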

Reino · Answer 3 · 2021-11-02T16:32:16.383

The JSON parser xidel can do what you want:

$ df -h | xidel -se '
  {
    "diskarray":array{
      for $disk in x:lines($raw)[starts-with(.,"/dev")]
      let $item:=tokenize($disk,"\s+")
      return {
        "mount":$item[1],
        "spacetotal":$item[2],
        "spaceavail":$item[4]
      }
    }
  }
'
{
  "diskarray": [
    {
      "mount": "/dev/mapper/nodequery--vg-root",
      "spacetotal": "45G",
      "spaceavail": "41G"
    },
    {
      "mount": "/dev/sda2",
      "spacetotal": "237M",
      "spaceavail": "178M"
    },
    {
      "mount": "/dev/sda1",
      "spacetotal": "511M",
      "spaceavail": "508M"
    }
  ]
}

x:lines($raw) is shorthand for tokenize($raw,"\r\n?|\n") and turns the input into a sequence in which every line is a separate item. In this case, only the lines that start with "/dev" are selected.
tokenize($disk,"\s+") turns a single line into a sequence, using runs of whitespace as the separator.
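For readers who don't know XQuery, the two tokenize steps correspond roughly to the following Python, shown only as an analogy using the same regular expressions. The sample text is adapted from the question, with one `\r\n` line ending added to exercise the pattern. Note that XPath sequences are 1-based, so `$item[1]` corresponds to index 0 here.

```python
import re

raw = (
    "Filesystem Size Used Avail Use% Mounted on\n"
    "/dev/sda2   237M   47M  178M  21% /boot\r\n"
    "/dev/sda1   511M  3.4M  508M   1% /boot/efi\n"
)

# Counterpart of x:lines($raw): split on \r\n, \r, or \n.
lines = re.split(r"\r\n?|\n", raw)

# Counterpart of [starts-with(., "/dev")]
disks = [line for line in lines if line.startswith("/dev")]

# Counterpart of tokenize($disk, "\s+")
items = [re.split(r"\s+", disk) for disk in disks]

# $item[1], $item[2], $item[4] in XPath are indexes 0, 1, 3 in Python
print(items[0][0], items[0][1], items[0][3])
```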

dawg · Answer 4 · 2016-02-04T22:12:13.420

You can do:

$ df -Ph | awk '/^\// {print $1"\t"$2"\t"$4}' | python -c 'import json, fileinput; print(json.dumps({"diskarray":[dict(zip(("mount", "spacetotal", "spaceavail"), l.split())) for l in fileinput.input()]}, indent=2))'
{
  "diskarray": [
    {
      "mount": "/dev/disk1",
      "spacetotal": "931Gi",
      "spaceavail": "623Gi"
    },
    {
      "mount": "/dev/disk2s2",
      "spacetotal": "1.8Ti",
      "spaceavail": "360Gi"
    }
  ]
}

score 1 · Answer 5 · answered Nov 01 '21 at 15:23

Instead of df you can use various system metrics gathering tools.

For example facter:

$ facter --json mountpoints
{
  "mountpoints": {
    "/": {
      "available": "14.33 GiB",
      "available_bytes": 15385493504,
      "capacity": "39.12%",
      "device": "/dev/vda1",
      "filesystem": "ext4",
...

Another example is prometheus-node-exporter, which runs as an HTTP service. Its output is not JSON, but it is easy to parse:

$ curl -sS 0:9100/metrics | grep -E '^node_filesystem_.+_bytes'
node_filesystem_avail_bytes{device="/dev/vda1",fstype="ext4",mountpoint="/"} 1.551777792e+10
node_filesystem_free_bytes{device="/dev/vda1",fstype="ext4",mountpoint="/"} 1.6629563392e+10
node_filesystem_size_bytes{device="/dev/vda1",fstype="ext4",mountpoint="/"} 2.638553088e+10
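As a rough sketch of what that parsing could look like, the snippet below groups the three sample lines above by mount point and emits JSON. The function name `parse_metrics` and the regular expressions are assumptions made for illustration. They cover only the simple label syntax shown here (no escaped quotes inside label values), not the full exposition format.

```python
import json
import re

SAMPLE = """\
node_filesystem_avail_bytes{device="/dev/vda1",fstype="ext4",mountpoint="/"} 1.551777792e+10
node_filesystem_free_bytes{device="/dev/vda1",fstype="ext4",mountpoint="/"} 1.6629563392e+10
node_filesystem_size_bytes{device="/dev/vda1",fstype="ext4",mountpoint="/"} 2.638553088e+10
"""

METRIC_RE = re.compile(r'^node_filesystem_(\w+)_bytes\{(.*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')

def parse_metrics(text):
    """Group node_filesystem_*_bytes samples by mount point."""
    disks = {}
    for line in text.splitlines():
        match = METRIC_RE.match(line)
        if not match:
            continue  # not a filesystem byte metric
        kind, labels, value = match.groups()
        mountpoint = dict(LABEL_RE.findall(labels))["mountpoint"]
        # Values arrive in scientific notation, e.g. 1.551777792e+10
        disks.setdefault(mountpoint, {})[kind + "_bytes"] = int(float(value))
    return disks

print(json.dumps(parse_metrics(SAMPLE), indent=2))
```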
