I'm interested in splitting rather large files into chunks of at most 5 GB. My goal is for EVERY partition to be under 5 GB, with as few partitions as possible.
Normally I WOULD use split with a size limit, but I need lines to remain intact (which I cannot get splitting by size to do).
I have been contemplating using the file size and line count to determine how many lines to put in each file,
e.g.
File size = 11Gig
File line count = 900
File limit = 5Gig
ceiling(11/5) = 3
900/3 = 300
# Split the file by line count, 300 lines per part.
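
To make the arithmetic above concrete, here is a minimal Python sketch of the same calculation. The function name `lines_per_part` and the `GIB` constant are mine, purely for illustration, and sizes are assumed to be in bytes:

```python
def lines_per_part(file_size, line_count, limit):
    """Return (parts, lines per part) for an even split by line count."""
    # Integer ceiling division avoids float rounding on very large sizes.
    parts = -(-file_size // limit)       # fewest parts that could fit the limit
    per_part = -(-line_count // parts)   # round up so no lines are left over
    return parts, per_part

GIB = 1024 ** 3  # assumed unit: 1 "Gig" = 2**30 bytes

parts, per_part = lines_per_part(11 * GIB, 900, 5 * GIB)
print(parts, per_part)  # 3 300
```

The resulting line count is what would be passed to a split-by-lines step.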
While this would usually work, line lengths vary, so a part COULD still end up above 5 GB if one segment of the file contains an extremely large line.
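
A small sketch of that failure case, using made-up line sizes in arbitrary units (the data and the helper name `part_sizes` are hypothetical, only there to show the problem):

```python
def part_sizes(line_sizes, per_part):
    """Total size of each part when cutting every `per_part` lines."""
    return [sum(line_sizes[i:i + per_part])
            for i in range(0, len(line_sizes), per_part)]

# Hypothetical file: six lines totalling 11 units, with a limit of 5 units.
line_sizes = [1, 1, 1, 1, 6, 1]
limit = 5

# ceiling(11/5) = 3 parts, 6/3 = 2 lines per part
sizes = part_sizes(line_sizes, per_part=2)
oversized = [s for s in sizes if s > limit]
print(sizes, oversized)  # [2, 2, 7] [7]
```

The last part goes over the limit even though the average works out fine. In this made-up data the 6-unit line is itself larger than the limit, so no split that keeps lines intact could fit it.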
I'm contemplating using Python (it handles numbers much better and seems less hackish), but then I would lose bash's file manipulation speed.
Does anyone know of a better alternative in bash?
Thank you in advance!