I am writing a program in C#. It first reads a large file into a dictionary, which is then used as a lookup table to process other data (stored in another file). I would like to run multiple instances of the program to process many files simultaneously. The problem is that I don't have enough memory to run many instances at once, since each instance has to load the large file first.

I was wondering if there is a way to share the dictionary variable across all the instances. The program would only need to read the large file into memory once, and then all instances of the same program could access the shared variable. That would save a lot of time by not reading the file repeatedly, and a lot of memory, since only one shared dictionary would hold the data.

Using a database is an option, but requires database installation on the target system.

The dictionaries I want to share are:

    Dictionary<string, (char, char)> searchTable = new();
    Dictionary<string, Person> searchPersonTable = new();

So, is there an easy way to share the dictionary variable, and if so, how do I implement it?

MORE: The files that need to be processed do not arrive all at once (only several at a time), so I cannot use multiple threads to process all of them in one run.

DiabloRex
  • You can use multiple threads, no need to use multiple programs – Tim Schmelter Sep 30 '22 at 09:42
  • Is the file processing I/O-bound or compute-bound? If it's I/O-bound then multithreading it won't speed it up significantly. – Matthew Watson Sep 30 '22 at 09:44
  • @MatthewWatson: maybe you can use an approach like this: https://stackoverflow.com/a/20929333/284240 – Tim Schmelter Sep 30 '22 at 09:46
  • So, in a nutshell, you're asking if this is possible. The answer is, yes, it is. Would you like to edit your question so that you're asking something more than a yes/no question? – Enigmativity Sep 30 '22 at 09:47
  • The files that need to be processed do not arrive all at once, so I have to run the program once per file as they arrive; the program is part of a workflow. – DiabloRex Sep 30 '22 at 09:48
  • @DiabloRex: I would recommend using a database with proper indexes and a view that represents that dictionary. That should improve performance, and it is also the most scalable solution, one that will not cause an OutOfMemoryException sooner or later. It will also simplify your code significantly. I guess it will also replace your text files, so move them all into the database. – Tim Schmelter Sep 30 '22 at 09:52
  • @TimSchmelter The program is deployed to different systems, and the target systems may not have a database installed. – DiabloRex Sep 30 '22 at 09:58
  • @Enigmativity I have edited my question. So, if you know how to implement it, please let me know. Thanks! – DiabloRex Sep 30 '22 at 10:00
  • @DiabloRex: you can use [Microsoft.EntityFrameworkCore.Migrations](https://learn.microsoft.com/en-us/aspnet/core/data/ef-mvc/migrations?view=aspnetcore-6.0) to create the initial setup for your local sql-server db. If you don't use .net core there are similar ways in .net framework. – Tim Schmelter Sep 30 '22 at 10:05
  • @TimSchmelter Is there a way not to use a database? – DiabloRex Sep 30 '22 at 10:08
  • @DiabloRex - You really should share more detail on the data that you're loading and the processing you're doing on that file. It's trivial to run multiple `Task.Run(() => ...)` calls that share a dictionary, but without knowing what you're actually doing it's hard to advise the right solution. – Enigmativity Sep 30 '22 at 10:08
  • There are about three ways to solve this off the top of my head, but the problem is that you haven't done enough research or gone down a path yet. You could: 1. have an API that holds the dictionary and several apps that read from it; 2. ask why it needs to be separate at all, since one app could do it all; 3. query from another app, which is possible but gets involved (I don't know the specifics offhand, so research it; Google is great). If you are just starting out, option 2 is best: the most performant and maintainable. The main thread reads the large file, and other threads do the rest. – Seabizkit Sep 30 '22 at 10:16
  • @Enigmativity - The files that need to be processed do not arrive all at once. Without knowing the number of files, I cannot use multiple threads to process them all in one run, so I have to run the program once per file as the files arrive; it's part of a workflow. – DiabloRex Sep 30 '22 at 10:18
  • @DiabloRex - You run a program and watch for the files to be created and then launch a task. – Enigmativity Sep 30 '22 at 10:23
  • @DiabloRex Assuming you can't change the way that the program is called: You could write a "service" executable that is launched the first time that your program is run, which keeps running in the background. The service executable would handle RPC calls from your other program, which just passes the filename to be processed to the service executable. The service executable would have the single instance of the large dictionary and would use multiple threads to process the requests. RPC is a big subject though - but StreamJsonRpc is relatively simple to use for it. – Matthew Watson Sep 30 '22 at 10:25
  • This is better done with a database. You can download SQL Server for free, which will solve the issue. – jdweng Sep 30 '22 at 10:30
  • If you don't want to use a database you can surely roll your own thing. So more or less replicating what a database does but then call it differently. But are you willing to put in the time and work where something already exists you can simply use? – Ralf Sep 30 '22 at 10:36
  • "I cannot use multi-threads to process the files all at once" so use reactive programming. Simply have an "event" or TPL Data Source or Rx IObservable or IAsyncEnumerable, as the entry point. Have tasks spawn as files come in. Bob's your uncle, reactive multithreaded application. – Aron Sep 30 '22 at 10:40
  • OP, I really think you are over complicating things. IPC, shared memory etc is a very hard topic. You should start from scratch with your architecture. – Aron Sep 30 '22 at 10:42
  • @Aron well I think I'd better use a database – DiabloRex Sep 30 '22 at 10:52
  • @DiabloRex - I think you better detail the data in your dictionary, and what's coming in the files, and then what processing you have to do. Then we can suggest an architecture. Right now it seems rather vague to me. – Enigmativity Sep 30 '22 at 10:57
  • Probably better to continue in chat https://chat.stackoverflow.com/rooms/248468/dotnet-c-share-variables-across-multiple-instances-of-the-same-program – Aron Sep 30 '22 at 11:02
  • I am with Tim Schmelter, that database can be a solution here. With the suggestion to use SQLite. So you do not need a server. Basically, [that.](https://www.sqlite.org/appfileformat.html) – Fabian Sep 30 '22 at 12:17
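
The single-process idea raised in the comments (load the table once, watch for incoming files, start a task per file) could be sketched roughly as below. This is only an illustration under stated assumptions, not a tested solution for the asker's workload: the class name `SharedTableWorker`, the tab-separated table format, and the "count matching lines" processing are all hypothetical placeholders, since the question does not describe the real file formats or the real processing. It relies on the documented behaviour that a `Dictionary<TKey, TValue>` supports multiple concurrent readers as long as nothing modifies it.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class SharedTableWorker
{
    // ASSUMPTION: one "key<TAB>c1<TAB>c2" entry per line, where c1 and c2 are
    // single characters. The real file format is not given in the question.
    public static Dictionary<string, (char, char)> LoadTable(string path)
    {
        var table = new Dictionary<string, (char, char)>();
        foreach (var line in File.ReadLines(path))
        {
            var parts = line.Split('\t');
            // Lines that do not match the assumed format are skipped.
            if (parts.Length == 3 && parts[1].Length == 1 && parts[2].Length == 1)
                table[parts[0]] = (parts[1][0], parts[2][0]);
        }
        return table;
    }

    // PLACEHOLDER processing: counts the lines of the input file that are keys
    // in the table. It only reads the table, so many tasks can call it at once.
    public static int ProcessFile(
        string path, IReadOnlyDictionary<string, (char, char)> table)
    {
        return File.ReadLines(path).Count(line => table.ContainsKey(line));
    }

    // Starts one task for each file created in `directory`. Every task reads
    // the same in-memory table, so the large file is loaded only once.
    // Note: Created can fire before the writer has finished the file; a real
    // version would need to wait or retry on IOException.
    public static FileSystemWatcher Watch(
        string directory,
        IReadOnlyDictionary<string, (char, char)> table,
        Action<string, int> onDone)
    {
        var watcher = new FileSystemWatcher(directory);
        watcher.Created += (_, e) =>
            Task.Run(() => onDone(e.FullPath, ProcessFile(e.FullPath, table)));
        watcher.EnableRaisingEvents = true;
        return watcher;
    }
}
```

A caller would load the table once with `SharedTableWorker.LoadTable(...)`, pass it to `SharedTableWorker.Watch(...)` together with a callback, and keep the process (and the returned watcher) alive for as long as files may arrive. If the workflow must still launch a separate executable per file, that executable would have to forward the file name to this long-running process, which is the RPC or database route discussed in the comments.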

0 Answers