0

I'm having trouble casting data types in TEXT columns.

I HAVE to upload all data from a custom CSV file. There is no guarantee that I'll even get all the columns I expect let alone the right data types so I start with a table where all columns are of type TEXT and put empty strings in the blank spots.

Ex. Table looks like this

CREATE TABLE tbl1 ( col1 TEXT, col2 TEXT, col3 TEXT);

After loading tbl1 from a file I run this.

SELECT * FROM tbl1;

Select results:

'1','String1', '2.0'

'2','String2', '3.14'

'3','String3', '6.77776'

'h','Stringh', 'h.h'

'','String', ''

Now I want to take the data from tbl1 and use it to populate this table.

CREATE TABLE tbl2 ( col1 INTEGER, col2 TEXT, col3 REAL);

And I try it like this.

INSERT INTO tbl2 SELECT CAST(tbl1.col1 as INTEGER), tbl1.col2, CAST(tbl1.col3 AS REAL) FROM tbl1;

After that I run this,

SELECT * FROM tbl2;

Select results:

1,'String1', 2.0

2,'String2', 3.14

3,'String3', 6.77776

0,'Stringh', 0

0,'String', 0

What I really want is to grab what I would consider 'Good' casts and insert them into tbl2 And then take all the values I would consider 'Bad' casts and put them in a 'tbl3' that would look like this.

CREATE TABLE tbl3 (col1 TEXT, col2 TEXT, col3 TEXT, REASON_FOR_REJECTION TEXT);

tbl3 would be for reporting and possibly trouble shooting bad data.

Do I need to preprocess this data in C++ before I insert it into tbl2 or does SQLite support some kind of querying function that allows me to catch 'Bad' casts?

UPDATE:

By adding CL.'s query to the end of this I could distinguish which records had "Bad" Casts and clean up tbl2 and add bad data rows to tbl3.

Dan
  • 2,625
  • 7
  • 39
  • 52

1 Answers1

1

Read the SQLite documentation about type affinity.

When you declare a column as INTEGER or REAL, SQLite will automatically try to convert values. Any values that cannot be converted will retain the original type.

So just import your data directly into tbl2, then find all records with errors with a query like this:

INSERT INTO tbl3
SELECT col1, col2, col3,
       trim(CASE typeof(col1)
              WHEN 'integer' THEN ''
              ELSE                'col1:' || typeof(col1)
            END ||
            ' ' ||
            CASE typeof(col3)
              WHEN 'real' THEN ''
              ELSE             'col3:' || typeof(col3)
            END)
FROM tbl2
WHERE typeof(col1) != 'integer'
   OR typeof(col3) != 'real'
CL.
  • 173,858
  • 17
  • 217
  • 259
  • Quick question this seems to work to populate `tbl3` BUT if I `SELECT * tbl2;` rows `h,stingh,h.h` and `,string,` still show as `0,stingh,0` and `0,string,0`. How would I be able to exclude the records if they are different? – Dan Mar 05 '13 at 20:26
  • Never mind. Even though the data was cast to INTEGER and FLOAT it still matches behind the scenes. Confusing but Awesome! so a `select tbl2.*, tbl3.* from tbl2, tbl3 where tbl2.col1 = tbl3.col1 AND tbl2.col2 = tbl3.col2 AND tbl2.col3 = tbl3.col3;` does find something. – Dan Mar 05 '13 at 20:28