I'm reading and parsing CSV files into a SQL Server 2008 database. This process uses a generic CSV parser for all files.
The CSV parser places the parsed fields into a generic field import table (F001 VARCHAR(MAX) NULL, F002 VARCHAR(MAX) NULL, Fnnn ...), which another process then moves into the real tables using SQL code that knows which parsed field (Fnnn) maps to which field in the destination table. So once the data is in the import table, only the fields that are being copied are referenced. Some of the files can get quite large (a million rows).
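For context, the import and move steps look roughly like this (the 50-field table is just one example, and the destination table/column names and the mapping here are made up for illustration; the real mapping SQL differs per file type):

    -- Generic field import table (hypothetical 50-field version)
    CREATE TABLE dbo.FieldImport050 (
        F001 VARCHAR(MAX) NULL,
        F002 VARCHAR(MAX) NULL,
        -- ... F003 through F049, all VARCHAR(MAX) NULL ...
        F050 VARCHAR(MAX) NULL
    );

    -- Move step: only the mapped fields are ever referenced
    INSERT INTO dbo.Customer (FirstName, LastName, Email)
    SELECT F001, F002, F007
    FROM dbo.FieldImport050;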
The question is: does the number of fields in a table significantly affect performance or memory usage, even if most of the fields are never referenced? The only operations performed on the field import tables are an INSERT and then a SELECT to move the data into another table; there are no JOINs or WHERE clauses on the field data.
Currently, I have three field import tables: one with 20 fields, one with 50 fields, and one with 100 fields (this being the maximum number of fields I've encountered so far). There is currently logic to use the smallest table that will hold each file.
I'd like to make this process more generic and have a single table of 1000 fields (I'm aware of the 1,024-column limit). And yes, some of the planned files to be processed (from third parties) will be in the 900-1000 field range.
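The planned wide table would just extend the same pattern. Something like this is what I have in mind (generated rather than hand-typed; the table name and the zero-padding to four digits are placeholders):

    -- Build the CREATE TABLE for a 1000-column import table dynamically
    DECLARE @sql NVARCHAR(MAX) = N'CREATE TABLE dbo.FieldImport1000 (';
    DECLARE @i INT = 1;
    WHILE @i <= 1000
    BEGIN
        SET @sql += N'F' + RIGHT(N'0000' + CAST(@i AS NVARCHAR(4)), 4) + N' VARCHAR(MAX) NULL,';
        SET @i += 1;
    END;
    SET @sql = LEFT(@sql, LEN(@sql) - 1) + N');';
    EXEC sys.sp_executesql @sql;   -- creates F0001 ... F1000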
For most files, there will be fewer than 50 fields.
At this point, dealing with the existing three field import tables (plus planned tables for higher field counts: 200, 500, 1000?) is becoming a logistical nightmare in the code, and dealing with a single table would resolve a lot of issues, provided I don't give up much performance.