
I am developing an application in C# that analyses multiple .csv files using queries or functions and displays the results.

So far, I have managed to create an application that opens a .csv file through Excel, but I have just found that Microsoft Excel won't read more than 104,000 records. In my case I have 705,000 records. So currently I am converting those .csv files into a Microsoft Access database, and from there I use queries to populate the results in my C# application.

However, this process is slow: I have to convert all the files into Access before I can analyse the data. Is there any other way to read multiple .csv files directly and filter for what I am looking for?

Any suggestions or help will be appreciated.

Thanks,

Harsh Panchal

  • Possible duplicate of [Reading CSV files using C#](https://stackoverflow.com/questions/3507498/reading-csv-files-using-c-sharp) – Jeroen Mostert May 24 '17 at 14:05
  • Open the CSV file as a plain text file and read it line by line. Then split each line on the separator to get its fields. – Mustafa May 24 '17 at 14:06
  • You can probably use LINQ to access the data in your CSV file: https://stackoverflow.com/questions/5116604/read-csv-using-linq – Mihail Kuznesov May 24 '17 at 15:55
  • Or you can read the CSV file into a dataset and use all the dataset-related functions to access the data: https://stackoverflow.com/questions/1050112/how-to-read-a-csv-file-into-a-net-datatable – Mihail Kuznesov May 24 '17 at 15:59
  • @MichaelKuznetsov but is it possible to read two files and analyse the dataset from them? – Harsh Panchal May 24 '17 at 17:26
  • Do they have the same columns? If so, no problem: simply append the data from the second file to the end of the dataset. – Mihail Kuznesov May 24 '17 at 17:28
  • @MichaelKuznetsov no, the fields are different, and I also need a different output from the second dataset. Thanks. – Harsh Panchal May 24 '17 at 21:34
  • You can create two datasets, load the first CSV into the first and the second CSV into the second, and produce a different output from each. Or you can create two lists, load one CSV into each, and run LINQ over the first list and then over the second. Or you can read each CSV into its own array, loop over the first array with foreach and an if statement to find the lines you need, then do the same for the second array. In C# you can create as many variables/objects as you need; you are not limited to one. You can take data from hundreds of CSV files. – Mihail Kuznesov May 25 '17 at 12:28

2 Answers


Depending on what your analysis needs are, you might find R with the ff package to be a useful tool.

If you really must use C# for this task, is there some reason you cannot simply read the files one line at a time, analyzing as you go?
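As a rough illustration of the line-at-a-time idea, here is a minimal sketch. It assumes plain comma-separated fields with no quoted commas and a header row on the first line; the class name `CsvStream`, the method names, and the column layout are hypothetical and not from the question. `File.ReadLines` streams the file lazily, so the whole file is never held in memory at once.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class CsvStream
{
    // Lazily yields the fields of each data row. File.ReadLines streams the
    // file instead of loading every line into memory first.
    // Assumption: simple separator-delimited fields, no quoted separators.
    public static IEnumerable<string[]> ReadRows(string path, char separator = ',')
    {
        return File.ReadLines(path)
                   .Skip(1)                          // skip the header row
                   .Where(line => line.Length > 0)   // ignore blank lines
                   .Select(line => line.Split(separator));
    }

    // Hypothetical filter: count the rows whose given column equals a value.
    public static int CountWhere(string path, int column, string value)
    {
        return ReadRows(path).Count(f => f.Length > column && f[column] == value);
    }
}
```

For example, `CsvStream.CountWhere(path, 0, "alice")` would count the rows whose first column is `alice`. Real-world CSV with quoted fields would need a proper parser rather than `Split`.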

Zane
  • Hi @Zane, thanks for the comment. Yes, I have researched this and found that it is possible to do in R, but I have no idea how it works. I could possibly use R, as there is no requirement to use C# specifically. Could you guide me on where to start? Thanks. – Harsh Panchal May 24 '17 at 14:24
  • Just tried with R, but the program crashes due to the very high volume of data. – Harsh Panchal May 24 '17 at 22:28

You can try this. Untested.

using System.IO;

// Read all lines into an array at once.
// The @ prefix makes this a verbatim string, so "\Y" is not treated as an escape sequence.
var tmparray = File.ReadAllLines(@"C:\YourFile.csv");

for (var i = 0; i < tmparray.Length; i++)
{
    // Assuming ";" is your separator (delimiter).
    var subArray = tmparray[i].Split(';');

    // Do anything with your subArray.
}
Mustafa
  • Thanks @Mustafa. But this means I can only read one file `("C:\YourFile.csv")`, whereas I have to get data from both csv files. Basically it is an insider threat analysis application. The datasets are downloaded from an open-source website and contain information about users, such as name, email and time, computer IDs, websites they visited, any external media they have connected, etc. Thanks once again. – Harsh Panchal May 24 '17 at 14:28
  • Then you can add another loop outside, over the files, and add your subArrays to a dataset or database. After that you can query them with SQL or LINQ. @HarshPanchal – Mustafa May 25 '17 at 04:49
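One way the "load each file separately, then query with LINQ" suggestion could look is sketched below. It assumes `;` as the separator (as in the answer above), no quoted fields, a header row, and that the user name sits in the first column of both files. The names `MultiCsv`, `Load`, `UsersInBoth`, `logonPath` and `devicePath` are hypothetical; the real column layouts of the datasets are not given in the thread.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class MultiCsv
{
    // Reads one CSV into memory as a list of field arrays (header skipped).
    // Assumption: ';' separator and no quoted fields.
    public static List<string[]> Load(string path, char separator = ';')
    {
        return File.ReadLines(path)
                   .Skip(1)
                   .Where(line => line.Length > 0)
                   .Select(line => line.Split(separator))
                   .ToList();
    }

    // Hypothetical cross-file query: users (column 0 of the first file)
    // that also appear in column 0 of the second file. The two files may
    // otherwise have completely different columns.
    public static List<string> UsersInBoth(string logonPath, string devicePath)
    {
        var deviceUsers = new HashSet<string>(Load(devicePath).Select(f => f[0]));
        return Load(logonPath)
            .Select(f => f[0])
            .Where(user => deviceUsers.Contains(user))
            .Distinct()
            .ToList();
    }
}
```

Each file is loaded into its own list, so each can be filtered or reported on independently, and a `HashSet` lookup keeps the cross-file check cheap even for several hundred thousand rows. To process many files, `Load` could be called inside a `foreach` over the file paths.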