Assume file1.csv
looks like this:
File1,Line1
File1,Line2
File1,Line3
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18
1,2,3,4,5,6
And file2.csv
looks like this:
File2,Line1
File2,Line2
File2,Line3
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18
1,2,3,4,5,6
Then you can make awk
print all your CSV files whilst skipping the first line (where FNR
, i.e. the file line number, is 1) like this:
awk -F, 'FNR==1{next} 1' *csv
Output
File1,Line2
File1,Line3
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18
1,2,3,4,5,6
File2,Line2
File2,Line3
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18
1,2,3,4,5,6
Then you can check the number of fields present on each line, and if it is not 23, add as many commas as necessary to make it 23:
awk -F, 'FNR==1{next} NF!=23{$0=$0 substr(",,,,,,,,,,,,,,,,,,,,,,",1,23-NF)}1' *csv
Output
File1,Line2,,,,,,,,,,,,,,,,,,,,,
File1,Line3,,,,,,,,,,,,,,,,,,,,,
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,,,,,
1,2,3,4,5,6,,,,,,,,,,,,,,,,,
File2,Line2,,,,,,,,,,,,,,,,,,,,,
File2,Line3,,,,,,,,,,,,,,,,,,,,,
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,,,,,
1,2,3,4,5,6,,,,,,,,,,,,,,,,,
So, as an actual complete solution, you could do this:
awk -F, 'FNR==1{next} NF!=23{$0=$0 substr(",,,,,,,,,,,,,,,,,,,,,,",1,23-NF)}1' *csv > NewBigFile.csv
and then import NewBigFile.csv
into sqlite
.