
I'm fairly new to JSON, and I have about 130 JSON files that I received by downloading my Facebook message data. I'm trying to use jq to concatenate them all into a single JSON file while keeping the existing message order intact, but I run into errors when I enter commands on Windows. I have tried following the suggestions for Windows in the jq FAQ and am still running into issues.

All files have the same layout:

{
  "participants": [
    {
      "name": "Participant One"
    },
    {
      "name": "Participant Two"
    }
  ],
  "messages": [
    {
      "sender_name": "Participant One",
      "timestamp_ms": 99999999999,
      "content": "message content",
      "type": "Generic"
    },
    {
      "sender_name": "Participant Two",
      "timestamp_ms": 9999999999,
      "content": "message content",
      "type": "Generic"
    }
  ],
  "title": "chat title",
  "is_still_participant": true,
  "thread_type": "Regular",
  "thread_path": "thread path"
}

If possible with JQ, I'd like to combine only the "messages" into a single JSON file while keeping them in the existing order. The messages are the only data in these files that I care about for my purposes.

--Edit-- Very sorry for the lack of details in the original post. I've attempted several commands that I've found both on here and elsewhere:

jq -s "[.[][]]" a-*.json > output.json

jq -s "{ attributes: map(.attributes[0]) }" file*.json > output.json

jq -s "*" *.json > output.json

I've also attempted to put this code into a text file and run it using jq -f filename, but I receive errors that way as well.

The errors I've been getting are either:

jq: error: Could not open file *.json: Invalid argument

or

jq: error: syntax error, unexpected INVALID_CHARACTER, expecting $end (Windows cmd shell quoting issues?) at <top-level>

As for more specifics on the output I'm trying to achieve: I am using jq 1.5. I don't care about the order in which the JSON files get added to one another, as long as the messages within each file remain in the correct order so that I get clearly defined message/response pairs. I also don't mind if everything gets combined and I have to separate out the messages in a different way. Something along the lines of this would be my ideal output:

"messages": [
    {
      "sender_name": "Participant One",
      "timestamp_ms": 99999999999,
      "content": "message content",
      "type": "Generic"
    },
    {
      "sender_name": "Participant Two",
      "timestamp_ms": 9999999999,
      "content": "message content",
      "type": "Generic"
    },
    {
      "sender_name": "Participant Three",
      "timestamp_ms": 9999999999,
      "content": "message content",
      "type": "Generic"
    },
    {
      "sender_name": "Participant Two",
      "timestamp_ms": 9999999999,
      "content": "message content",
      "type": "Generic"
    }
  ]

Here Participant Two represents my responses to messages, and the other participants are the people I am having conversations with (which is how it is in the original JSON output from Facebook).

Z. Kettell
  • Please follow the [mcve] guidelines as much as possible. Without knowing more about what the desired combination would look like, it's hard to offer specific advice. – peak Sep 28 '18 at 02:25
  • Out of curiosity, will this data be dynamic or do you just need to combine the files ahead of time? I've had a similar need and just ended up using powershell to create the JSON file for use later. – CodeSpent Sep 28 '18 at 02:34
  • We're not a code writing service. We're glad to try to help, but you've not included what you do when you *attempt to enter commands in Windows*, and you've not included the error messages you're getting. How can we help you figure out why they're not working if you don't show us what you're doing? – Ken White Sep 28 '18 at 02:37
  • Added more specifics to my original response, I'm new to posting my own questions on this site so I'm sorry about the lack of details that I had in my initial post. – Z. Kettell Sep 28 '18 at 03:13
  • @CodeSpent This data will not be dynamic, I need it combined ahead of time so I can convert it into a data source for a machine learning project I'm working on. – Z. Kettell Sep 28 '18 at 05:12
  • Think this is what you want: ```jq -s 'reduce .[] as $item ({}; . * $item)' *.json```. See https://stackoverflow.com/a/58621547/3160967 – mwag Feb 13 '20 at 00:44

1 Answer

You could start with the following command-line invocation:

jq -n "[inputs | .messages] | add" *.json 

This makes a number of assumptions. If your Windows shell does not support wildcard expansion, see "Passing multiple wildcard filenames to a command in Windows" and/or https://superuser.com/questions/460598/is-there-any-way-to-get-the-windows-cmd-shell-to-expand-wildcard-paths/460648#460648

Other assumptions are that you are using jq 1.5 or later, and of course that all the *.json files in the current directory are relevant and that *.json lists them in the desired order.
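Where the shell will not expand *.json at all, the same concatenation can be done by a script that expands the pattern itself. The following is only a rough Python sketch of what the jq filter above does, not a replacement for it. The function name combine_messages is hypothetical, and the sketch assumes that every matching file is a JSON object with a "messages" array, as in the layout shown in the question, and that plain alphabetical file order is acceptable.

```python
import glob
import json


def combine_messages(pattern):
    """Concatenate the "messages" arrays of every file matching pattern."""
    combined = []
    # glob.glob expands the wildcard inside Python, so the shell never has to.
    # sorted() makes the file order explicit (plain lexicographic order).
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as handle:
            data = json.load(handle)
        # extend() keeps each file's messages in their existing order.
        combined.extend(data.get("messages", []))
    return combined
```

Passing the result of combine_messages("*.json") to json.dump would then write the combined array to a file. If that output file is written into the same directory with a .json extension, a second run would pick it up as input, so a different directory or pattern is safer.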

If the ordering of the JSON files is best determined by explicitly listing them, then you will probably want to create a batch file. If their ordering is determined by their contents, then you could use jq to order them for you, but the details of how to do so will depend on the sorting criteria.
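As one illustration of ordering by contents, the sketch below (in Python rather than jq) orders already-parsed files by the earliest "timestamp_ms" each contains and then concatenates their messages without reordering anything inside a file. Using the earliest timestamp as the sorting criterion is purely an assumption made for this example, and order_threads and the sample data are hypothetical.

```python
def order_threads(threads):
    """Order parsed files by the earliest "timestamp_ms" each one contains."""
    def earliest(thread):
        # default=0 places a file with no messages first instead of raising.
        return min((m["timestamp_ms"] for m in thread.get("messages", [])), default=0)
    # sorted() is stable, so files with equal keys keep their original order.
    return sorted(threads, key=earliest)


# Hypothetical sample data in the same shape as the question's files.
threads = [
    {"title": "later chat", "messages": [
        {"timestamp_ms": 2000, "content": "c"},
        {"timestamp_ms": 1500, "content": "d"},
    ]},
    {"title": "earlier chat", "messages": [
        {"timestamp_ms": 1000, "content": "a"},
        {"timestamp_ms": 500, "content": "b"},
    ]},
]

# Concatenate in the new file order; messages inside each file are untouched.
combined = [m for thread in order_threads(threads) for m in thread["messages"]]
print([m["content"] for m in combined])  # prints ['a', 'b', 'c', 'd']
```

A different criterion, such as the chat title or the latest timestamp, would only change the key function.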

peak
  • My command line is pointed at the directory containing all my JSON files but I'm getting this error: "jq: error: Could not open file *.json: Invalid argument" I've received this error when attempting several commands previously. – Z. Kettell Sep 28 '18 at 03:16
  • I installed Ubuntu on Windows so I had access to a linux shell and this response worked perfectly. I used "jq -n '[inputs | .messages] | add' *.json > output.json" and it compiled all of the messages into a single new file. Thanks for the help. – Z. Kettell Sep 28 '18 at 05:13