3

jq is an amazing tool and it does a lot. as input I have

[
  {
    "backup": [
      {
        "timestamp": { "start": 1642144383, "stop": 1642144386 },
        "info": {  "size": 1200934840},
        "type": "full"
      },
      {
        "timestamp": {"start": 1642144388, "stop":  1642144392 },
        "info": { "size": 1168586300
        },
        "type": "incr"
      },
      {
        "timestamp": {"start": 1642145388, "stop":  1642145392 },
        "info": { "size": 1168586330
        },
        "type": "incr"
      }
    ],
    "name": "dbname1"
  },
  {
    "backup": [
      {
        "timestamp": { "start": 1642144383, "stop": 1642144386 },
        "info": {  "size": 1200934840},
        "type": "full"
      },
      {
        "timestamp": {"start": 1642144388, "stop":  1642144392 },
        "info": { "size": 1168586300
        },
        "type": "incr"
      }
    ],
    "name": "dbname2"
  }
]

and using

jq 'map([.backup[] + {name}] | max_by(.timestamp.stop))'

I get the latest timestamp.stop for a name. How should I change this to get the latest timestamp.stop for a name and group? in SQL this would be something like max(.timestamp.stop) group by .name,.type Hoping for output like:

[
  {
    "timestamp": {
      "start": 1642144383,
      "stop": 1642144386
    },
    "info": {
      "size": 1200934840
    },
    "type": "full",
    "name": "dbname1"
  },
  {
    "timestamp": {
      "start": 1642145388,
      "stop": 1642145392
    },
    "info": {
      "size": 1168586330
    },
    "type": "incr",
    "name": "dbname1"
  },
  {
    "timestamp": {
      "start": 1642144383,
      "stop": 1642144386
    },
    "info": {
      "size": 1200934840
    },
    "type": "full",
    "name": "dbname2"
  },
  {
    "timestamp": {
      "start": 1642144388,
      "stop": 1642144392
    },
    "info": {
      "size": 1168586300
    },
    "type": "incr",
    "name": "dbname2"
  }
]
  • yes, that is correct one contains the max .timestamp.stop for the type full, the other contains the max .timestamp.stop for the type incr. There can even be more types. –  Jan 18 '22 at 08:27

2 Answers2

1

Remove the inner brackets to flatten the array, then group_by both criteria (which makes your criteria an array), and map your max_by onto the result array:

jq 'map(.backup[] + {name}) | group_by([.name, .type]) | map(max_by(.timestamp.stop))'
[
  {
    "timestamp": {
      "start": 1642144383,
      "stop": 1642144386
    },
    "info": {
      "size": 1200934840
    },
    "type": "full",
    "name": "dbname1"
  },
  {
    "timestamp": {
      "start": 1642145388,
      "stop": 1642145392
    },
    "info": {
      "size": 1168586330
    },
    "type": "incr",
    "name": "dbname1"
  },
  {
    "timestamp": {
      "start": 1642144383,
      "stop": 1642144386
    },
    "info": {
      "size": 1200934840
    },
    "type": "full",
    "name": "dbname2"
  },
  {
    "timestamp": {
      "start": 1642144388,
      "stop": 1642144392
    },
    "info": {
      "size": 1168586300
    },
    "type": "incr",
    "name": "dbname2"
  }
]

Demo

pmf
  • 24,478
  • 2
  • 22
  • 31
  • Thanks again! Amazing tool jq. There is still a lot to learn before I am on par with my SQL knowledge. –  Jan 18 '22 at 08:45
0

This seems to produce the desired expected output. You need an additional grouping by the .type record, before doing the max_by

map( .backup[] + {name} ) | group_by(.name)[] | 
  group_by(.type) | map(max_by(.timestamp.stop))

jqplay demo

Inian
  • 80,270
  • 14
  • 142
  • 161