When I create a Linux thread, what exactly happens? For example, consider this C code that creates a new thread:
iret1 = pthread_create( &thread1, NULL, print_message_function, (void*) message1);
As you can see, a function must be passed. So the thread executes the code of this function while sharing memory with the main process. In Linux, a thread is what is called a 'lightweight process': it's a process in the kernel, but with memory shared with the parent process.
The thing I don't understand is: what are the contents of this thread? I mean, are the function's instructions passed to the thread process? I don't think so. I think that only the arguments are passed to the thread, and the parent process's instructions are executed on these arguments.
I'm asking this because I'm studying Python threading/multiprocessing. As I understand it, threading in Python is just a user-space 'emulation' of context switches; it isn't parallel at the kernel level, so it suffers from performance problems because it can't run on multiple cores.
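To make the threading case concrete, here is a minimal sketch of what I mean by threads sharing memory. The names `run_threads` and `worker` are made up for illustration. It only shows that every thread lives in the same process and sees the same list; it says nothing about whether the threads actually run in parallel.

```python
import os
import threading


def run_threads(n_threads=4):
    """Start a few threads that all append to the same list."""
    shared = []              # one list, visible to every thread
    lock = threading.Lock()  # protects the list while threads append to it

    def worker(i):
        # Record which thread ran and the PID it ran in.
        with lock:
            shared.append((i, os.getpid()))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    return os.getpid(), shared


if __name__ == "__main__":
    main_pid, shared = run_threads()
    print("main PID:", main_pid)
    print("entries written by threads:", shared)
```

Every entry carries the PID of the main process, and the main thread can read what the workers wrote without any copying.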
On the other hand, as I understand it, Python multiprocessing creates Python processes (not threads). So if the python3.5 binary, for example, is 3 MB, each process created with multiprocessing will take at least 3 MB (Python binary + script instructions + variables). Also, the function's instructions are copied to each process. Am I right?
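Here is a small sketch of the behaviour I'm assuming for multiprocessing. I'm assuming Linux and the `fork` start method, and `run_processes` and `worker` are hypothetical names. Each worker gets its own PID and its own copy of `data`, so a change made in a worker never shows up in the parent.

```python
import multiprocessing as mp
import os


def worker(data, results):
    # This modifies the worker's private copy of the list, not the parent's.
    data.append("changed in child")
    results.put((os.getpid(), len(data)))


def run_processes(n_workers=3):
    ctx = mp.get_context("fork")  # assumption: Linux, fork start method
    data = ["original"]
    results = ctx.Queue()

    procs = [ctx.Process(target=worker, args=(data, results)) for _ in range(n_workers)]
    for p in procs:
        p.start()
    # Drain the queue before joining so no worker blocks on a full pipe.
    reports = [results.get(timeout=30) for _ in procs]
    for p in procs:
        p.join()

    return os.getpid(), data, reports


if __name__ == "__main__":
    parent_pid, data, reports = run_processes()
    print("parent PID:", parent_pid)
    print("parent's list after the workers ran:", data)
    print("(worker PID, length of worker's list):", reports)
```

This shows the isolation, but not how much memory each worker really costs, which is the part I'm unsure about.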
I'm asking this because I'm thinking about running a Python chatbot on a server, and I need multiple workers for this chatbot. Threading would be too slow, so each worker must be a Python process. (If I were using C, however, threading would be the perfect solution, because a thread is the same as a process with shared memory.)
I know that Stack Overflow doesn't like posts with multiple questions, but to ask what I want, I need to know whether my assumptions are correct. If they are, then here's what I need to know:
Even if function instructions are copied to each process in Python's multiprocessing module, the only disadvantage compared to the standard shared-memory threading technique is that I'll use extra memory for each process, right? Even though the instructions are copied to each process, would it be the same (in terms of CPU, not memory) as having multiple threads with no extra copies of the instructions?
The main problem is that I'm not passing just a function; I'm passing a method of an object, so I think the entire object gets copied into each process. So if wasted memory is a problem on my server, is there a way in Python to make the workers use the function's instructions from the same place, while still taking advantage of kernel context switching just like Linux threads do?
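For reference, this is roughly the pattern I have in mind, reduced to a sketch. `ChatBot`, `handle` and `run_bot_workers` are hypothetical stand-ins for my real code, and I'm again assuming Linux with the `fork` start method. The target of each process is a bound method, and each worker appears to operate on its own copy of the object.

```python
import multiprocessing as mp
import os


class ChatBot:
    """Hypothetical stand-in for my real chatbot object."""

    def __init__(self, name):
        self.name = name
        self.handled = 0

    def handle(self, message, results):
        # Runs inside a worker process, on that worker's copy of the object.
        self.handled += 1
        results.put((os.getpid(), message, self.handled))


def run_bot_workers(messages):
    ctx = mp.get_context("fork")  # assumption: Linux, fork start method
    bot = ChatBot("demo")
    results = ctx.Queue()

    # The target is a bound method, so the worker needs the whole object.
    procs = [ctx.Process(target=bot.handle, args=(m, results)) for m in messages]
    for p in procs:
        p.start()
    replies = [results.get(timeout=30) for _ in procs]
    for p in procs:
        p.join()

    # The parent's counter was never touched by the workers.
    return bot.handled, replies


if __name__ == "__main__":
    handled_in_parent, replies = run_bot_workers(["hi", "hello"])
    print("handled in parent:", handled_in_parent)
    print("(worker PID, message, handled in worker):", replies)
```

Each worker reports a count of 1 while the parent's count stays at 0, which is why I believe the whole object is duplicated per worker.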