
I have a large json file (~10GB) like this, with one object per line:

{"id":1, "attributes":{"a": 1}}
{"id":2, "attributes":{"a": 4, "b": 5, "d": 6}}
{"id":2, "attributes":{"a": 4, "b": 5, "c": 6, "d": 5, "e": 1}}
{"id":2, "attributes":{"a": 4, "b": 5, "c": 6, "d": 5, "e": 1, h: "l"}}

I need to split this file into multiple files, each with a size within a certain range (300-350MB).

I tried using the split command:

split -l 5000000 test.json

or

split -b 300MB test.json

Neither approach works as I expected, because the lines of the file differ in size. If I split by line count, the resulting files can be larger or smaller than the range I want. If I split by size, the last or first line of each resulting file may be cut off.

Dung Pham
  • *"I have a large json file (~10GB)"* It's time to use an RDBMS. And your JSON is invalid, by the way – Cid Sep 23 '22 at 10:42
  • [You can use the split command and ensure it splits on a line break with -C while still telling it the size.](https://stackoverflow.com/a/2016918/3585500) – ourmandave Sep 23 '22 at 10:45
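
The -C suggestion in the comment above could be sketched roughly as follows. This is a minimal sketch, assuming GNU coreutils split (where -C, also spelled --line-bytes, writes as many complete lines as fit within the given size) and assuming the input holds one JSON object per line. The function name split_json_lines and its three parameters are made up for illustration.

```shell
# Hypothetical helper: split a line-delimited JSON file into pieces of at most
# max_size bytes each, breaking only at line boundaries.
# Assumes GNU coreutils split; -C (--line-bytes) and -d are GNU options.
split_json_lines() {
    input=$1      # path to the file, one JSON object per line
    max_size=$2   # upper bound per piece, e.g. 350MB (1000*1000 bytes per MB)
    prefix=$3     # output prefix; pieces are named <prefix>00, <prefix>01, ...
    split -C "$max_size" -d -- "$input" "$prefix"
}
```

For the file in the question this might be called as split_json_lines test.json 350MB part_. Each piece is filled with complete lines up to the limit, so as long as individual lines are much smaller than 50MB, every piece except possibly the last should land in the 300-350MB range. The last piece holds whatever remains and can be smaller, and a single line longer than the limit would still be broken across pieces.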
