I have one question regarding the HDFS read/write process:
Assume we have a client (for the sake of the example, let's say the client is a Hadoop map process) that wants to read a file from HDFS and/or write a file to HDFS. Which process actually performs the read/write from/to HDFS?
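To make the scenario concrete, here is roughly what I mean by "the client reads/writes a file", using the standard FileSystem API (the paths are just placeholders):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.IOException;

public class HdfsClientSketch {
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Read: open() returns a stream and the bytes arrive in this JVM.
        try (FSDataInputStream in = fs.open(new Path("/user/example/input.txt"))) {
            byte[] buffer = new byte[4096];
            int read = in.read(buffer);
            System.out.println("read " + read + " bytes");
        }

        // Write: this JVM pushes the bytes out through the returned stream.
        try (FSDataOutputStream out = fs.create(new Path("/user/example/output.txt"))) {
            out.writeUTF("hello hdfs");
        }
    }
}
```

My question is about what happens behind these calls, i.e. which process moves the actual data.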
I know that there is a process for the NameNode and a process for each DataNode, and I understand their general responsibilities in the system, but I am confused about this particular scenario.
Is it the client's own process, or is there another process inside HDFS, created and dedicated to this specific client, that accesses and reads/writes from/to HDFS?
Finally, if the latter is the case, can this process be suspended for a while?
I have done some research, and the most relevant solutions I found were Oozie and the JobControl class from the Hadoop API.
But because I am not sure about the workflow above, I am also not sure which process I would be suspending and resuming with these tools.
Is it the client's process, or a process running inside HDFS that serves the client's request?
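For reference, this is the kind of JobControl usage I have been experimenting with (a minimal sketch against the newer mapreduce API; the job and group names are placeholders, and the job setup is omitted). The suspend() and resume() calls are the ones I am asking about:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.jobcontrol.ControlledJob;
import org.apache.hadoop.mapreduce.lib.jobcontrol.JobControl;

public class JobControlSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();

        // Placeholder MapReduce job; mapper, input/output paths etc. omitted.
        Job job = Job.getInstance(conf, "example-job");
        ControlledJob controlledJob = new ControlledJob(job, null);

        JobControl control = new JobControl("example-group");
        control.addJob(controlledJob);

        // JobControl runs in its own thread and submits jobs as they become ready.
        Thread controlThread = new Thread(control);
        controlThread.start();

        // These are the calls I mean when I ask what exactly gets suspended/resumed.
        control.suspend();
        control.resume();

        while (!control.allFinished()) {
            Thread.sleep(1000);
        }
        control.stop();
    }
}
```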