It is my understanding that Spark uses parallel I/O to read files; I drew that conclusion from other Stack Overflow answers.
My question is: does Spark read data using an independent approach or a collective approach? In other words, does each worker read its own fixed chunk of data, or do the workers communicate and coordinate with each other to read the data efficiently?
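To make the distinction concrete, here is a minimal sketch of what the "independent" approach could look like. It is plain Python, not Spark code, and it makes no claim about what Spark does internally. The helper names (`byte_ranges`, `read_chunk`, `independent_read`) are hypothetical, and threads stand in for workers. Each worker is handed a byte range up front and reads only that range, with no communication between workers.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def byte_ranges(file_size, num_workers):
    """Split [0, file_size) into num_workers contiguous (offset, length) chunks."""
    base, extra = divmod(file_size, num_workers)
    ranges = []
    offset = 0
    for i in range(num_workers):
        # The first `extra` workers take one additional byte each.
        length = base + (1 if i < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges


def read_chunk(path, offset, length):
    """One worker reads only its own byte range; it never talks to the others."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def independent_read(path, num_workers):
    """Read a file as independent chunks and return them in file order."""
    size = os.path.getsize(path)
    ranges = byte_ranges(size, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        futures = [pool.submit(read_chunk, path, off, ln) for off, ln in ranges]
        return [f.result() for f in futures]


if __name__ == "__main__":
    # Illustration only: write a small temporary file and read it with 3 workers.
    fd, demo_path = tempfile.mkstemp()
    os.write(fd, b"abcdefghij")
    os.close(fd)
    print(independent_read(demo_path, 3))
    os.remove(demo_path)
```

A collective approach, by contrast, would have the workers exchange information about who needs which bytes before or during the read, for example so that a few of them issue large contiguous reads and redistribute the data afterwards. That is the behaviour I am asking whether Spark has.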