I'm currently attempting to come up with a solution to the following problem:
I have been tasked with parsing large (+-3500 Lines 300kb) pipe delimited text files and comparing them line by line to corresponding codes within our database. An example of a file would be:
File name: 015_A.txt
File content (example shows only 4 lines):
015|6999|Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old.|1|1|0|0|2016/01/01
015|3715|It has roots in a piece of classical Latin literature from 45 BC|1|1|213.5|213.5|2016/01/01
015|3724|Making it over 2000 years old.|1|1|617.4|617.4|2016/01/01
015|4028|Words will go here.|1|1|74|74|2016/01/01
I will be providing a web interface which I have already built to allow a file to be selected from the browser and then uploaded to the server.
Using the above example pipe file I will only be using these:
Code (using above line 1 as an example: 6999)
Price (using above line 1 as an example: 0)
I would then (to my mind not sure if this is best method) need to run a query (our DB is MSSQL) for each line, example:
SELECT t.Price
FROM table t
WHERE t.code = '6999'
If t.Price === 0 then line 1 has passed. As it is equal to the source file.
This is where I believe I just needed to ask some advice as I am sure there are many ways to tackle this problem, I would just like to, if possible be pointed in the direction of doing this in an efficient manner. (Example best method of parsing the file? Do I run a query per code or rather do a SQL statement using an IN clause and then compare every code and price? Should I scrap this idea and use some form of pure SQL tool bearing in mind I have pipe file to deal with / import.)
Any advice would be greatly appreciated.