I like the command-line utility GoCSV, by aotimme. It follows the Unix philosophy of having a number of small tools, each of which does one thing very well, and the tools can be pipelined. It also has pre-built binaries for Linux and Windows.
I mocked up this sample input based on the info in your question:
_ID__,Confirmed
00001,1
00002,0
00003,0
00004,1
00005,1
...
09996,1
09997,0
09998,0
09999,0
10000,1
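If you want to try this yourself, something like the following could generate a comparable mock file. It is only a sketch: the make_mock_csv function name and the random 0/1 values are my own assumptions, so the number of confirmed rows (and therefore the size of the last output file) will differ from what I show below.

```shell
# Hypothetical helper: writes a header plus N rows of zero-padded IDs,
# each with a random 0 or 1 in the Confirmed column.
make_mock_csv() {
  awk -v n="${1:-10000}" -v seed="${2:-1}" 'BEGIN {
    srand(seed)                      # fixed seed so reruns give the same file
    print "_ID__,Confirmed"          # header copied from the sample above
    for (i = 1; i <= n; i++)
      printf "%05d,%d\n", i, (rand() < 0.5) ? 1 : 0
  }'
}

make_mock_csv 10000 > input.csv
```

The seed argument is optional; changing it gives a different mix of 0s and 1s.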
GoCSV's filter and split subcommands can be piped together: first filter out every row whose Confirmed value is not 1, then break up the remaining "1" rows into files of 1000 lines each (header included):
gocsv filter -c Confirmed -eq 1 input.csv | gocsv split --max-rows 999
The filter subcommand takes -c Confirmed to pick which column to test, then -eq 1 so that only rows with a 1 in the Confirmed column are output.
GoCSV always treats the first row as the header (a number of its subcommands only make sense if they interpret the first row as a header), so I subtracted 1 for --max-rows.
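For comparison, here is roughly what that pipeline does, written as a plain awk sketch. This is not how GoCSV is implemented, and it is not a replacement for it: it assumes Confirmed is the second column and that no field contains quoted commas or embedded newlines. The filter_and_split name is hypothetical; the out-N.csv names just mimic the output shown below.

```shell
# Rough awk stand-in for "filter, then split", for simple CSVs only.
# $1 = input file, $2 = maximum data rows per output file (header not counted).
filter_and_split() {
  awk -F, -v max="$2" '
    NR == 1 { header = $0; next }      # remember the header row
    $2 == "1" {                        # keep only rows where Confirmed is 1
      if (count % max == 0) {          # current file is full (or none open yet)
        if (file != "") close(file)
        file = "out-" (++n) ".csv"
        print header > file            # every output file gets its own header
      }
      print > file
      count++
    }
  ' "$1"
}

# Example call (assumes input.csv exists in the current directory):
#   filter_and_split input.csv 999
```

Repeating the header in every file is why the data-row limit is 999 if you want 1000 lines per file.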
For my mock input.csv, that yielded 5 output CSVs:
for CSV in out*.csv; do
  echo "--$CSV--"
  gocsv dims "$CSV"
done
--out-1.csv--
Dimensions:
Rows: 999
Columns: 2
--out-2.csv--
Dimensions:
Rows: 999
Columns: 2
--out-3.csv--
Dimensions:
Rows: 999
Columns: 2
--out-4.csv--
Dimensions:
Rows: 999
Columns: 2
--out-5.csv--
Dimensions:
Rows: 979
Columns: 2
Again, GoCSV doesn't count the header as a row, so the Rows count is only 999 for the complete files.
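As a quick sanity check on those numbers, the Rows counts above add up to the number of confirmed rows in my mock input, and the file count is what a ceiling division by 999 gives. A small sketch of that arithmetic (the variable names are mine, not anything from GoCSV):

```shell
max_rows=999                                      # value passed to --max-rows
total=$(( 4 * max_rows + 979 ))                   # four full files plus the partial last one
files=$(( (total + max_rows - 1) / max_rows ))    # ceiling division
echo "confirmed rows: $total, files: $files"
# prints: confirmed rows: 4975, files: 5
```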