
I have a CSV file with more than 100K records (file size is about 50 MB). When I import the data from the CSV into an SQLite3 database, I get errors saying "expected 43 columns of data but found 78".

The reason I found was that the CSV file contains some invalid characters (see the screenshot).

Please help me remove the invalid characters from the CSV file using the Linux sed command.

https://prnt.sc/odr7hb

I need to modify this command so that sed removes the invalid characters before the import.

sqlite3 /path_to_db/date_sqlite.db <<EOS
.separator '|'
.import /path_to_data_file/data_log_filr tbl_sqlite_data
EOS

Null bytes are the characters I need to remove.

Sharon.B

2 Answers


To remove null bytes from a text file:

sed 's/\x0//g' /path/to/input.csv > /path/to/output.csv

To use it from PHP exec:

 $regx = 's/\x0//g';  // sed expression that deletes NUL bytes

 $command = sprintf("sed '%s' %s > %s", $regx, $pathToInput, $pathToOutput);

 exec($command);  // run sed on the input file and write the cleaned output
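
Putting the two steps together on the shell (a minimal sketch using the paths and table name from the question; the cleaned file name data_log_clean is just a placeholder):

# strip NUL bytes from the raw export before handing it to SQLite
sed 's/\x0//g' /path_to_data_file/data_log_filr > /path_to_data_file/data_log_clean

# import the cleaned, pipe-separated file into the target table
sqlite3 /path_to_db/date_sqlite.db <<EOS
.separator '|'
.import /path_to_data_file/data_log_clean tbl_sqlite_data
EOS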

Hope this helps.

    After removing the null bytes and importing the data into the SQLite database, it took longer than I expected because of the 40/50 MB file size. Ref: https://medium.com/@JasonWyatt/squeezing-performance-from-sqlite-indexes-indexes-c4e175f3c346 and https://stackoverflow.com/questions/1711631/improve-insert-per-second-performance-of-sqlite?rq=1. Conclusion: to improve the performance of SQLite bulk insert/update/delete, drop the table's indexes before the bulk operation and recreate them once it is done; this drastically improves performance. Hope this will help someone. – Sharon.B Jul 15 '19 at 06:00
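
A rough sketch of the drop/recreate-index approach described in that comment, in the same heredoc style (the index name idx_data_ts and column log_ts are hypothetical and must be replaced with the real ones; wrapping the work in a single transaction, as the linked answers suggest, also helps):

sqlite3 /path_to_db/date_sqlite.db <<EOS
-- drop the existing index before the bulk load (hypothetical name)
DROP INDEX IF EXISTS idx_data_ts;
.separator '|'
.import /path_to_data_file/data_log_clean tbl_sqlite_data
-- recreate the index once the import has finished (hypothetical column)
CREATE INDEX idx_data_ts ON tbl_sqlite_data(log_ts);
EOS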

Assuming that your input file is not in a Unicode encoding such as UTF-16 (where NUL bytes occur legitimately), and you really want to remove all NUL bytes, you can do this on the shell with

tr -d \\000 <input.csv >output.csv
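
To check that the cleanup worked (assuming the same file names), you can count the NUL bytes left in the output; this should print 0:

# keep only NUL bytes, then count them; 0 means the file is clean
tr -cd '\000' <output.csv | wc -c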
user1934428