
I have a zip archive with a lot of JSON files. Each of these JSON files is an array of JSON objects that I would like to import to a MongoDB collection. My idea was to use the pipe option of unzip and send the content of these files directly to mongoimport:

unzip -p archive.zip '*.json' | mongoimport -d db_name -c collection_name --jsonArray

I expected behavior similar to piping a find result, where each file is processed correctly, as in this command:

find . -type f -name "*.json" | zip archive.zip -@

But it doesn't work that way. Since the contents of all the files are written to standard output one after another, mongoimport fails: right after the closing bracket of one file's array it sees the opening bracket of the next file's array, with nothing in between (apart from a newline, I guess), so it stops.
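The failure is easy to reproduce without the archive. A minimal sketch (the file names are made up for the demo):

```shell
# Two stand-in files, mimicking two JSON members of the archive:
printf '[{"a": 1}]\n' > one.json
printf '[{"b": 2}]\n' > two.json

# This is roughly what `unzip -p` hands to mongoimport: two complete
# arrays glued together, which is not one valid JSON document.
cat one.json two.json
# prints:
# [{"a": 1}]
# [{"b": 2}]
```

A strict JSON parser accepts the first array and then chokes on the "extra data" that follows it, which matches the behavior described above.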

Is there any other way to achieve my goal?

george007

1 Answer


Since I haven't found a one-liner that solves my problem (although I still believe there is some potential in sed or awk), I went with a while loop:

#!/bin/bash

ARCHIVE_FILE="archive.zip"
# List the archive, skip unzip -l's three header lines, and take the
# file-name column (note: this breaks on names containing spaces)
unzip -l "$ARCHIVE_FILE" | awk 'NR>3 {print $4}' | while IFS= read -r file; do
  # Skip the empty fields produced by unzip -l's trailer lines
  [ -n "$file" ] || continue
  unzip -p "$ARCHIVE_FILE" "$file" | mongoimport -d db_name -c collection_name --jsonArray
done

I assume here that the output of unzip -l has a universal format, which may not be the case, as suggested in the comments on the answer to this post. So, in the future, I may need to add some grep or sed on top of that.

george007