My primary goal is to write to a single file (e.g. file.txt) from several parallel flows, each starting at its own predefined offset in the file.

Example:
script 1 - writes 10 chars from position 0
script 2 - writes 10 chars from position 10
script 3 - writes 10 chars from position 20

I didn't even get to the parallel part because I got stuck on writing at different offsets of a file. I wrote a simple script to check my idea:

file = open("sample_file.txt", "w")
file.seek(100)
file.write("new line")

file.close()

OK, so the file was created, the offset was moved to 100, and the sentence 'new line' was written. Success.

But then I wanted to open the same file and add something at offset 10:

file = open("sample_file.txt", "w")
file.seek(100)
file.write("new line")
file.close()

file = open("sample_file.txt", "a")
file.seek(10)
file.write("second line")
file.close()

The sentence 'second line' is added, but at the end of the file rather than at offset 10. I'm sure it is possible to write characters somewhere in the middle of a file. Can anyone help with this simple one?

Or maybe someone has an idea how to do it in parallel?

Pawel

Tomerikoo
psmith

1 Answer

As this post suggests, opening a file in 'a' mode will:

Open for writing. The file is created if it does not exist. The stream is positioned at the end of the file. Subsequent writes to the file will always end up at the then current end of file, irrespective of any intervening fseek(3) or similar.

On the other hand, the mode 'r+' will let you:

Open for reading and writing. The stream is positioned at the beginning of the file.

Though it is not mentioned explicitly, this mode lets you seek within the file and write at different positions. Note that 'r+' does not create the file, so it must already exist.
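As a minimal sketch of the difference, the snippet below repeats the experiment from the question with 'r+' in place of 'a'. It assumes binary mode ("wb" / "r+b") so that offsets are plain byte positions, and it writes to a temporary directory; the path is only illustrative.

```python
import os
import tempfile

# Hypothetical scratch location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "sample_file.txt")

# Create the file first: 'r+' raises FileNotFoundError if it is missing.
with open(path, "wb") as f:
    f.seek(100)
    f.write(b"new line")

# Unlike 'a', 'r+' honours seek(): the write lands at offset 10
# and overwrites the bytes there instead of being appended.
with open(path, "r+b") as f:
    f.seek(10)
    f.write(b"second line")

with open(path, "rb") as f:
    data = f.read()

print(len(data), data[10:21], data[100:])
```

The file keeps its original length, because the second write replaces existing bytes rather than inserting new ones.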


Anyway, if you are going to do this in parallel, you will have to control access to the resource. You don't want two processes writing to the file at the same time. Regarding that issue, see this SO question.

Tomerikoo
  • Great, r+ does the job. I can seek over the file and insert wherever I need. Thanks! – psmith Feb 04 '20 at 20:40
  • Yes - but - if I know the sizes of the strings I'm going to insert into the file (as I said at the beginning: 3 scripts, 10 chars each), I can create a file with 30 chars, e.g. seek(30), and then open this file with r+ in parallel. Since the chars replace null values, three scripts can write to the same file without overwriting each other, because each starts at a different offset. This is my hypothesis, which I'm checking at the moment. – psmith Feb 04 '20 at 20:46
  • First test went fine: 4 scripts writing in parallel to 4 different offsets. Obviously it needs more testing, but it's promising. – psmith Feb 04 '20 at 21:23
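The hypothesis in the comments above could be sketched roughly as follows. This is only an illustration under several assumptions: threads stand in for the separate scripts, each writer opens its own handle in "r+b" mode, the file is pre-sized with truncate() (seek() on its own does not extend a file), and `write_chunk`, `CHUNK` and the path are hypothetical names. It does not prove that the approach is safe on every filesystem or platform.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK = 10  # bytes per writer, as in the question's example


def write_chunk(path, index, payload):
    """Open an independent handle and write payload at index * CHUNK."""
    if len(payload) != CHUNK:
        raise ValueError("payload must be exactly CHUNK bytes")
    with open(path, "r+b") as f:
        f.seek(index * CHUNK)
        f.write(payload)


# Hypothetical scratch location, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "file.txt")
payloads = [b"a" * CHUNK, b"b" * CHUNK, b"c" * CHUNK]

# Pre-size the file so every writer has a region to overwrite.
with open(path, "wb") as f:
    f.truncate(CHUNK * len(payloads))

# Each worker touches only its own, non-overlapping byte range.
with ThreadPoolExecutor(max_workers=len(payloads)) as pool:
    futures = [pool.submit(write_chunk, path, i, p)
               for i, p in enumerate(payloads)]
    for fut in futures:
        fut.result()  # re-raise any exception from a worker

with open(path, "rb") as f:
    result = f.read()

print(result)
```

Separate processes would follow the same pattern, with each script calling something like `write_chunk` with its own index. If the regions could ever overlap, some form of locking would still be needed, as the answer points out.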