As I currently understand it, a process is a running program: its instructions together with all the resources it uses while running, such as memory, open file handles, and input/output. In other words, it encompasses everything required to execute a program.
# this script, while running as a whole, is considered a process
print('hello world')
with open('something.txt', 'a') as file_handle:
    for i in range(500):
        file_handle.write('blablabla')
print('job done!')
To use my computer's processing power more efficiently, I can spawn additional processes or threads. Which one should I choose? How do they compare to the simple Python script process analogy? Is spawning another process similar to running the entire script again with a different filename?
# changed filename (is this "another process?")
print('hello world')
with open('something_else.txt', 'a') as file_handle:
    for i in range(500):
        file_handle.write('blablabla')
print('job done!')
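To make the "run the script again" idea concrete, here is a minimal sketch of what I imagine it looks like, assuming the standard `subprocess` module is a fair way to start a second Python interpreter. `CHILD_SOURCE` and `run_in_new_process` are names I made up for illustration, and the file is written to a temporary directory so nothing is left behind.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-in for the script above, with the filename passed as an argument.
CHILD_SOURCE = """
import os, sys
with open(sys.argv[1], 'a') as file_handle:
    for i in range(500):
        file_handle.write('blablabla')
print(os.getpid())
"""


def run_in_new_process(filename):
    """Run CHILD_SOURCE in a fresh Python interpreter and return the child's PID."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE, filename],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "something_else.txt")
    child_pid = run_in_new_process(target)
    size = os.path.getsize(target)

print("parent pid:", os.getpid())
print("child pid:", child_pid)
print("bytes written by child:", size)
```

If I read this correctly, the child gets its own process ID and its own interpreter, which seems to match the "whole script run a second time" picture.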
I also have a vague idea that a single process can contain multiple threads. Would that just be the equivalent of running several more "conceptual" for loops at once?
# would this be a "thread": a bare-bones "subset" of an entire program?
with open('something.txt', 'a') as file_handle:
    for i in range(500):
        file_handle.write('blablabla')
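For comparison, this is my rough sketch of the same loop run as threads, assuming the standard `threading` module. The function name `write_filler` and the file names are my own placeholders. Here the "thread" is just a function running inside the one script, not a separate program.

```python
import os
import tempfile
import threading


def write_filler(path, count=500):
    """The loop from the snippet above, wrapped in a function a thread can run."""
    with open(path, 'a') as file_handle:
        for _ in range(count):
            file_handle.write('blablabla')


with tempfile.TemporaryDirectory() as tmp:
    paths = [os.path.join(tmp, f"something_{n}.txt") for n in range(2)]
    # Two threads inside the same process, each running the same function.
    threads = [threading.Thread(target=write_filler, args=(p,)) for p in paths]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for both loops to finish
    sizes = [os.path.getsize(p) for p in paths]

print("bytes written per file:", sizes)
```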
What are the key differences between processes and threads? Online sources suggest that processes are more autonomous and resource-intensive, while threads are more lightweight and able to share memory with one another. But what does this mean in practice? Why can't processes also share memory? If threads are able to share memory, why can't I access variables from different threads that are spawned from the same script (e.g. from thread_a import var_data)?
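My tentative reading of "threads share memory" is sketched below. I am assuming that threads are not modules, so `import` does not apply to them, and that they share data simply by referring to the same object in the process. A separate process would need an explicit channel instead, here the child's standard output. The names `shared_results` and `worker` are hypothetical.

```python
import subprocess
import sys
import threading

shared_results = []       # one list object, visible to every thread in this process
lock = threading.Lock()   # guards the list against simultaneous modification


def worker(label):
    # No import needed: the thread sees the same 'shared_results' object as the main thread.
    with lock:
        shared_results.append(label)


threads = [threading.Thread(target=worker, args=(f"thread_{n}",)) for n in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# A separate process has its own memory, so its result has to be sent back explicitly.
child = subprocess.run(
    [sys.executable, "-c", "print(21 * 2)"],
    capture_output=True,
    text=True,
    check=True,
)
value_from_child = int(child.stdout)

print("collected from threads:", sorted(shared_results))
print("received from child process:", value_from_child)
```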
Lastly, what computes what exactly? Does a CPU compute threads or processes, and is "CPU" a broader term that covers multiple cores? Do cores compute processes or threads?
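One way I tried to poke at this myself is the sketch below. It assumes `threading.get_native_id()` (Python 3.8+) reports the identifier the operating system uses for each thread. The `Barrier` only keeps all threads alive at the same time so their IDs cannot be reused. The names `report` and `records` are mine.

```python
import os
import threading

NUM_THREADS = 3
barrier = threading.Barrier(NUM_THREADS)
lock = threading.Lock()
records = []  # (process id, OS-level thread id) pairs


def report():
    with lock:
        records.append((os.getpid(), threading.get_native_id()))
    barrier.wait()  # hold every thread until all have recorded their IDs


threads = [threading.Thread(target=report) for _ in range(NUM_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

pids = {pid for pid, _ in records}
native_ids = {tid for _, tid in records}

print("logical CPUs reported by the OS:", os.cpu_count())
print("distinct process IDs:", len(pids))
print("distinct OS thread IDs:", len(native_ids))
```

Every thread reports the same process ID but a different OS thread ID, which suggests to me that the thread is the unit the operating system schedules onto a core, though I would like that confirmed.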
Summary:
Using a simple Python script as an example of a process, what would the equivalent of spawning another process/thread be? (e.g. duplicate script/subset of a script/some section of code only)
How are processes fundamentally different from threads? What is an example of something processes can do that threads cannot?
Why is memory/data often described as "harder to share" between processes than between threads? And how do threads share data anyway?
Do CPUs compute threads or processes? Do cores compute threads or processes?
Can you provide general guidelines and examples for when to use each? Is there a rule of thumb for threads vs processes in Python?
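The rule of thumb I keep running into for the standard CPython build is "threads for I/O-bound work, processes for CPU-bound work", on the grounds that the global interpreter lock lets only one thread execute Python bytecode at a time but is released while a thread waits. Below is a minimal sketch of the I/O-bound half, assuming `time.sleep` is a fair stand-in for waiting on a disk or network. `fake_io_task` is a hypothetical name.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(task_id, delay=0.2):
    """Hypothetical stand-in for an I/O-bound job such as a download or disk read."""
    time.sleep(delay)  # sleeping releases the GIL, much like waiting on real I/O
    return task_id


start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    # Four waits overlap instead of running one after another.
    results = list(pool.map(fake_io_task, range(4)))
elapsed = time.perf_counter() - start

print("results:", results)
print(f"elapsed: {elapsed:.2f}s (four sequential waits would take about 0.8s)")
```

As far as I can tell, `concurrent.futures.ProcessPoolExecutor` offers the same `map` interface for the CPU-bound case, but I am unsure when the extra cost of starting processes pays off.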