
I am programming a server daemon in C from which users can query data. Clients can also modify the data.

I thought about keeping the data in memory.

For every new connection I do a fork().

My first concern is that this will create a copy of the database every time a connection takes place, which is a waste of memory.

My second problem is that I don't know how a child can modify the database held by the parent process.

What concepts are there to solve these problems?

Mat
Zulakis
  • 2
    They probably shouldn't be part of the same application. You can always make your database a separate process and use IPC to communicate with it, or use something simple like SQLite to make it easier on yourself. – Jesus Ramos Jun 16 '12 at 13:36
  • Okay, I will have a look at IPC (do you know a good tutorial explaining it?). How do daemons like bind9 do it? I am pretty sure they keep the database in memory, but there is no separate database process. – Zulakis Jun 16 '12 at 13:41
  • 2
    Why don't you use threads instead of forks? That solves all the problems you mention. They are also generally faster. – usr Jun 16 '12 at 13:41

2 Answers


Shared memory and multi-threading are two ways of sharing memory between multiple execution units. Check out POSIX Threads for multi-threading, and don't forget to use mutexes and/or semaphores so that nobody writes to a memory area while someone else is reading it.
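As a rough illustration of the threaded approach, here is a minimal sketch of an in-memory table guarded by a lock. Everything in it is an assumption made for the example: the names `db`, `db_get`, `db_add`, `client_thread` and `run_threads_demo` are hypothetical, the "database" is just a small array, and it uses a `pthread_rwlock_t` (many readers or one writer) where a plain mutex would also work.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define DB_SIZE     16      /* hypothetical number of records */
#define NUM_WRITERS 4       /* threads standing in for client connections */
#define INCREMENTS  10000   /* updates performed by each thread */

/* Hypothetical in-memory "database": a fixed array guarded by one lock. */
static int db[DB_SIZE];
static pthread_rwlock_t db_lock = PTHREAD_RWLOCK_INITIALIZER;

/* Query: several readers may hold the read lock at the same time. */
int db_get(size_t key)
{
    assert(key < DB_SIZE);
    pthread_rwlock_rdlock(&db_lock);
    int value = db[key];
    pthread_rwlock_unlock(&db_lock);
    return value;
}

/* Update: the write lock excludes all readers and all other writers. */
void db_add(size_t key, int delta)
{
    assert(key < DB_SIZE);
    pthread_rwlock_wrlock(&db_lock);
    db[key] += delta;
    pthread_rwlock_unlock(&db_lock);
}

/* Stand-in for a per-connection handler: repeatedly updates record 0. */
static void *client_thread(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++)
        db_add(0, 1);
    return NULL;
}

/*
 * Resets record 0, runs NUM_WRITERS threads that update it concurrently,
 * waits for them, and returns the final value of record 0 (-1 on error).
 */
int run_threads_demo(void)
{
    pthread_t threads[NUM_WRITERS];
    int started = 0;

    pthread_rwlock_wrlock(&db_lock);
    db[0] = 0;
    pthread_rwlock_unlock(&db_lock);

    while (started < NUM_WRITERS) {
        if (pthread_create(&threads[started], NULL, client_thread, NULL) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    return started == NUM_WRITERS ? db_get(0) : -1;
}
```

Calling `run_threads_demo()` from `main` (and linking with `-pthread`) should leave record 0 equal to `NUM_WRITERS * INCREMENTS`, because every update happens under the write lock. Without the lock, the unsynchronised increments would race and some could be lost.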

All this is part of the bigger problem of concurrency. There are multiple books and entire university courses about the problems of concurrency so maybe you need to sit down and study it a bit if you find yourself lost. It's very easy to introduce deadlocks and race conditions into concurrent C programs if you are not careful.

Emil Vikström

What concepts are there to solve these problems?

Just a few observations:

  1. fork() clones only the memory of the calling process as it exists at the moment of the call. If you haven't opened or loaded your database at that stage, it won't be cloned into the child processes.
  2. Shared memory, that is, memory mapped with mmap() and MAP_SHARED, will be shared between processes and will not be duplicated.
  3. The general term for communicating between processes is interprocess communication (IPC), of which there are several types and varieties, depending on your needs.
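To make the shared-memory point concrete, here is a minimal sketch in which a forked child writes into a `MAP_SHARED` mapping and the parent sees the change. The `struct shared_db` layout and the function name `run_shared_memory_demo` are hypothetical, and the sketch assumes a system that provides `MAP_ANONYMOUS` (Linux does). A file-backed or `shm_open()` mapping would work along the same lines.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical record layout for the shared "database". */
struct shared_db {
    int counter;
};

/*
 * Maps one shared_db, forks a child that stores `value` in it, waits for
 * the child, and returns what the parent then reads from the mapping.
 * Returns -1 on any failure, so `value` must be non-negative.
 */
int run_shared_memory_demo(int value)
{
    assert(value >= 0);

    /* The mapping is created before fork(), so the child inherits it. */
    struct shared_db *db = mmap(NULL, sizeof *db, PROT_READ | PROT_WRITE,
                                MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (db == MAP_FAILED)
        return -1;
    db->counter = 0;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(db, sizeof *db);
        return -1;
    }
    if (pid == 0) {
        db->counter = value;   /* the write lands in the shared page */
        _exit(0);
    }

    int status;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0)
        result = db->counter;  /* the parent sees the child's write */

    munmap(db, sizeof *db);
    return result;
}
```

Here `waitpid()` is the only synchronisation, which is enough for a single child that has already exited. A server with several live children would still need something like a POSIX semaphore or a mutex created with the `PTHREAD_PROCESS_SHARED` attribute placed inside the mapping.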

Aside: On modern Linux systems, fork() implements copy-on-write copying of process memory. You won't actually end up with two copies of the process in memory; you'll end up with one copy that both processes believe is their own. Only when one of them writes to a page of that memory is that page copied. This saves memory because most processes alter only a small fraction of their memory as they run, so even if you went for the copy-the-whole-database approach, you might find the memory usage is less than you expect. Of course, that wouldn't fix your synchronisation problems!
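The visible consequence of this for the question is that a child's writes to ordinary (private) memory never reach the parent. A minimal sketch, with the hypothetical names `private_counter` and `run_private_memory_demo`:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Ordinary process memory: logically private to each process after fork(). */
static int private_counter;

/*
 * Sets private_counter to 1, forks a child that overwrites it with
 * `child_value`, and returns the value the parent sees once the child
 * has exited. Returns -1 on failure.
 */
int run_private_memory_demo(int child_value)
{
    /* The demo only means something if the child writes a different value. */
    assert(child_value != 1);
    private_counter = 1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* This write affects only the child's own copy of the page. */
        private_counter = child_value;
        _exit(private_counter == child_value ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return private_counter;   /* the parent's value is unchanged */
}
```

The parent still reads 1 whatever the child wrote, which is why updates have to travel through shared memory or some other form of IPC if the parent is meant to see them.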

Community