13

I have two threads: one writes to a file, and the other periodically moves the file to a different location. The writer always calls open before writing a message and calls close after writing it. The mover uses shutil.move to do the move.

I see that after the first move is done, the writer cannot write to the file anymore, i.e. the size of the file is always 0 after the first move. Am I doing something wrong?
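Here is a minimal, single-threaded sketch of the pattern I mean (the names and messages are just illustrative):

```python
import os
import shutil
import tempfile

tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "log.txt")
dst = os.path.join(tmp, "moved.txt")

def write_message(text):
    # the writer opens, appends one message, and closes each time
    with open(src, "a") as f:
        f.write(text + "\n")

write_message("one")
write_message("two")
shutil.move(src, dst)   # the mover relocates the file
write_message("three")  # "a" mode silently recreates src

print(open(dst).read())  # "one\ntwo\n" -- the moved file kept the old messages
print(open(src).read())  # "three\n" -- this write went to a freshly created file
```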

That1Guy
  • 7,075
  • 4
  • 47
  • 59
Schitti
  • 25,489
  • 8
  • 24
  • 21

3 Answers

29

Locking is a possible solution, but I prefer the general architecture of having each external resource (including a file) dealt with by a single, separate thread. Other threads send work requests to the dedicated thread on a Queue.Queue instance (and, if they need results back, provide a separate queue of their own as part of the work request's parameters). The dedicated thread spends most of its time waiting on a .get on that queue; whenever it gets a request, it goes on and executes it (and returns results on the passed-in queue if needed).

I've provided detailed examples of this approach e.g. in "Python in a Nutshell". Python's Queue is intrinsically thread-safe and simplifies your life enormously.

Among the advantages of this architecture is that it translates smoothly to multiprocessing if and when you decide to switch some work to a separate process instead of a separate thread (e.g. to take advantage of multiple cores) -- multiprocessing provides its own workalike Queue type to make such a transition smooth as silk ;-).
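A minimal sketch of this architecture (names like log_queue and file_writer are mine, and this uses Python 3's queue module rather than Python 2's Queue):

```python
import os
import queue
import tempfile
import threading

log_queue = queue.Queue()
SENTINEL = None  # posted on the queue to tell the worker to shut down

def file_writer(path):
    # the only thread that ever touches the file
    with open(path, "a") as f:
        while True:
            item = log_queue.get()
            if item is SENTINEL:
                break
            f.write(item + "\n")
            f.flush()
            log_queue.task_done()

path = os.path.join(tempfile.mkdtemp(), "log.txt")
worker = threading.Thread(target=file_writer, args=(path,))
worker.start()

# any other thread just posts work requests onto the queue
log_queue.put("first message")
log_queue.put("second message")

log_queue.put(SENTINEL)
worker.join()
```

A thread that needed the file moved would post that as a request too, so opens, writes, and moves all happen in one thread and can never race.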

Alex Martelli
  • 854,459
  • 170
  • 1,222
  • 1,395
  • If I'm writing a library that should be thread-safe but is not always used in threads. Should I take this approach or look at a lock? I'm not sure whether my library should be spawning new threads if it is only being used one thread. What is the best solution for a variable number of threads that could be one? – Luke Taylor Jun 23 '16 at 12:07
8

When two threads access the same resources, weird things happen. To avoid that, always lock the resource. Python has the convenient threading.Lock for that, as well as some other tools (see documentation of the threading module).
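For the original question, that would look something like this sketch (the names file_lock, write_message, and move_file are illustrative):

```python
import os
import shutil
import tempfile
import threading

file_lock = threading.Lock()

def write_message(path, text):
    # the writer takes the lock around its open/write/close cycle
    with file_lock:
        with open(path, "a") as f:
            f.write(text + "\n")

def move_file(src, dst):
    # the mover takes the same lock, so it never interleaves with a write
    with file_lock:
        shutil.move(src, dst)

tmp = tempfile.mkdtemp()
src, dst = os.path.join(tmp, "log.txt"), os.path.join(tmp, "archived.txt")
write_message(src, "hello")
move_file(src, dst)
write_message(src, "world")  # recreates src; the lock only serializes access
```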

Eli Bendersky
  • 263,248
  • 89
  • 350
  • 412
4

Check out http://www.evanfosmark.com/2009/01/cross-platform-file-locking-support-in-python/

You can use a simple file lock with this code, written by Evan Fosmark in an older StackOverflow question:

from filelock import FileLock

with FileLock("myfile.txt"):
    # work with the file as it is now locked
    print("Lock acquired.")

One of the most elegant libraries I've ever seen.

Xorlev
  • 8,561
  • 3
  • 34
  • 36
  • are you sure this can be actually combined with moving files around? – Eli Bendersky Feb 20 '10 at 07:48
  • 3
    Evan Fosmark's code applies to synchronizing multiple *processes*, not *threads*. As per Eli's suggestion, I would use `threading.Lock` or `threading.RLock`. – Vinay Sajip Feb 20 '10 at 08:31
  • As a followup, the link in this post is no longer working. Per [Evan's answer to a related question here](http://stackoverflow.com/a/498505/760905) you can also find the code at https://github.com/dmfrey/FileLock – MartyMacGyver Jul 01 '16 at 07:36