4

So i have this project i'm working on

Basically i have an excel files which has only employer "registration number"(primary key) that i have to upload to the database BUT sometimes some employer employer registration numbers are duplicated by mistakes which cause problems when someone tries to upload this file to the database

What i need to do upload the excel file (make a new table i guess) after i click on upload it should give me, how many employers registration number are in the excel file(calcule lines) and how many employer registration number are duplicated (comparing the original table with the new table and finding if there is any duplicates),and how many non-duplicates

the problem i have no idea how to do it

what i'm im using: c#/asp.net - entity framework - visual studio 2012 - sql server 2012

Eriasu
  • 59
  • 3
  • As written, your question is too broad. Please ask one *focused* question and show us what you've tried that isn't working. https://stackoverflow.com/help/how-to-ask – MikeH Oct 25 '18 at 18:51
  • Are you using XLS or XLSX files? If you are using XLSX files, you can use Nuget Package EPpLus to open and read the xls file. If you are using XLS files, you can use Nuget Package NPOI (I don't like this one). https://github.com/JanKallman/EPPlus https://archive.codeplex.com/?p=npoi – DanB Oct 25 '18 at 19:45
  • Why not create an Excel spreadsheet to show you this? – NetMage Oct 25 '18 at 19:57
  • i see, i didn't try anything yet still thinking how to do it, i'm using XLSX – Eriasu Oct 25 '18 at 20:53
  • I think the easiest way is to install VS Data tools and use SSIS ( Sql Server Intgration Services) for that . – El.Hum Oct 26 '18 at 08:02

2 Answers2

4

I really suggest you to make a flow for that. how do we make a data flow?
with SSIS.
If you usually have to import files to database, use SSIS so you can handle all errors.

In SSIS you can have an Excel Source and an OLEDB Destinaion to import data. then you can handle errors and import them to another table.
So there is no need to join tables and doing all those things to compare them.
SSIS is not hard to use at all

El.Hum
  • 1,479
  • 3
  • 14
  • 23
1

I think at first you should read excel file by any library that you want (ex look at this samples) and change it to a data table or a list in RAM before sending to the database.

then you must find duplicate values that user entry wrongly with Linq or other ways. for example:

var duplicateKeys = list.GroupBy(x => x)
                    .Where(group => group.Count() > 1)
                    .Select(group => group.Key);

at the end you could send your uniqe data to database .

Mohammad
  • 1,197
  • 2
  • 13
  • 30