I have a huge dataset (around 30 GB), and I need to split the CSV into smaller CSV files. The traditional approach of using the `skiprows` argument seems to take a lot of time, because every iteration has to re-read and skip all the rows that were already processed. I think the process could be much faster if, after reading the initial chunk of rows (say 1000), we deleted those rows from the CSV file, so that subsequent iterations would not have to skip (i.e. re-read) them.
Is there any way to implement this?
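For reference, here is roughly what my current (slow) approach looks like — a minimal sketch assuming pandas, since the exact library isn't shown; the file name and chunk size are placeholders:

```python
import pandas as pd

INPUT_FILE = "huge_dataset.csv"  # placeholder path for the ~30 GB file
CHUNK_SIZE = 1000                # rows per output file

i = 0
while True:
    # Each iteration re-reads and skips every previously processed row,
    # which is what makes this approach slower and slower over time.
    chunk = pd.read_csv(
        INPUT_FILE,
        skiprows=range(1, i * CHUNK_SIZE + 1),  # skip processed rows, keep the header
        nrows=CHUNK_SIZE,
    )
    if chunk.empty:
        break
    chunk.to_csv(f"part_{i}.csv", index=False)
    i += 1
```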