I have a very large tab-delimited file: 4.4 million rows, 51 columns of data, approximately 1.5 GB.

When I try to use the SQLite `.import` command, it always reports that the last row has only 24 columns. I assume the transaction rolls back, because no data ends up in the table.

If I import just the last 10 rows on their own, they are inserted without any error, so the column count can't actually be wrong.

SQLite performance for large files doesn't seem to be a problem.

Is there a limit to the size of the import?
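Before assuming an import size limit, it can help to scan the file and count the tab-separated fields on every line; the row that `.import` complains about (or an earlier row with an embedded quote or newline that swallows the rows after it) usually shows up immediately. A minimal sketch in Python, where the file name is a hypothetical stand-in:

```python
import csv

EXPECTED_COLS = 51     # column count the table expects
DELIMITER = "\t"

def find_bad_rows(path, expected, delimiter=DELIMITER):
    """Yield (line_number, field_count) for rows whose field count differs from `expected`."""
    with open(path, newline="") as f:
        for lineno, row in enumerate(csv.reader(f, delimiter=delimiter), start=1):
            if len(row) != expected:
                yield lineno, len(row)

# Usage (hypothetical file name):
#   for lineno, count in find_bad_rows("data.tsv", EXPECTED_COLS):
#       print(f"line {lineno}: {count} fields (expected {EXPECTED_COLS})")
```

Note that `csv.reader` honours double-quote quoting, so a field containing an unbalanced quote can merge several physical lines into one logical row and make a later line appear short, which matches the symptom described above.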

Dave Anderson
  • Are you inserting all the rows at once? If so, doing the inserts within a single transaction will speed up the process dramatically; see http://www.sqlite.org/faq.html#q19. SQLite can handle very large databases; see http://www.sqlite.org/limits.html – iVenky Sep 23 '13 at 06:53
  • @iVenky this is with the command-line tool; I've just added a link. My alternative method (in Android) is a line-by-line import with table indexes off, `PRAGMA synchronous = OFF` and `PRAGMA journal_mode = MEMORY`, and I reckon that will take about 7 hours. This has the benefit that I can pick and choose the columns I need, as I only want about 15 of the original 51. – Dave Anderson Sep 23 '13 at 07:01
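The line-by-line import described in the comment above can be sketched with Python's built-in `sqlite3` module. The table name, column indexes, and PRAGMAs here mirror the comment's description, but the specific names and indexes are hypothetical stand-ins; wrapping all inserts in one transaction is what keeps it fast:

```python
import csv
import sqlite3

WANTED = [0, 2, 5]   # hypothetical indexes of the wanted columns (~15 of 51 in the question)

def import_tsv(db_path, tsv_path, wanted=WANTED):
    """Import only the wanted columns of a tab-delimited file, one transaction."""
    con = sqlite3.connect(db_path)
    con.execute("PRAGMA synchronous = OFF")      # skip fsync for bulk load
    con.execute("PRAGMA journal_mode = MEMORY")  # keep the rollback journal in RAM
    cols = ", ".join(f"c{i} TEXT" for i in wanted)
    con.execute(f"CREATE TABLE IF NOT EXISTS data ({cols})")
    placeholders = ", ".join("?" for _ in wanted)
    with open(tsv_path, newline="") as f:
        rows = ([row[i] for i in wanted]
                for row in csv.reader(f, delimiter="\t"))
        with con:  # one transaction for all inserts
            con.executemany(f"INSERT INTO data VALUES ({placeholders})", rows)
    con.close()
```

Creating any indexes only after the load, as the comment suggests, avoids rebuilding them on every insert.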
  • How exactly do you create the file with the last 10 rows? – CL. Sep 23 '13 at 07:31
  • Can't confirm anything, but when I tried to perform one big multiline insert (the number of rows wasn't actually big, several thousand; Postgres ate the same data in half a second) I got a very similar error. I haven't had time to dig into it yet, and I had the option of using Postgres instead; still interested in an answer though. – keltar Sep 23 '13 at 08:44
  • Just checked the sqlite code: it generates the insert query, prepares it, and executes it many times, all in one transaction. The error you're describing happens when the CSV reader actually can't find the extra columns. Double-check your data. In any case, as the code stands, with the wrong number of columns sqlite will either fill missing fields with NULL or ignore extra columns if there are too many. Are you sure the separator and quoting are correct? – keltar Sep 23 '13 at 09:06
  • @CL I'm using UltraEdit to edit the file; I just cut and paste the last ten rows into a new file and call `.import` with that file instead. I think I will try splitting the file in half and running both halves to see if they both have the problem, and keep splitting until I get to a size that works. – Dave Anderson Sep 23 '13 at 23:57
  • @keltar I call `.separator \t`, which works with the same row of data that it reports the error on, so I think that is OK. There were some fields that had `--` in them, and I thought that might be interpreted as a comment and omit the rest of the row, so I stripped those out and still got the column-count error. I will now split the file in half and keep splitting, or adding halves back, until I find the size that causes the problem. – Dave Anderson Sep 24 '13 at 00:00
  • @DaveAnderson Try cutting/pasting all rows into a new file. – CL. Sep 24 '13 at 06:31
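The repeated-splitting plan from the comments above is a binary search over prefixes of the file, and it can be automated rather than done by hand in an editor. A minimal sketch, where `ok` is a hypothetical callback that attempts the import of a prefix (e.g. by writing it to a temporary file and running `.import`) and returns True on success:

```python
def first_bad_line(lines, ok):
    """Binary-search for the first failing prefix of `lines`.

    Assumes a single offending row: every prefix before it imports
    cleanly (ok returns True) and every prefix including it fails.
    Returns the 1-based line number of the offending row.
    """
    lo, hi = 0, len(lines)  # invariant: ok(lines[:lo]) holds, ok(lines[:hi]) fails
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if ok(lines[:mid]):
            lo = mid
        else:
            hi = mid
    return hi
```

This needs only about log2(4,400,000) ≈ 23 trial imports instead of manually halving the file again and again.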

0 Answers