What is so bad with threadlocals

Question

Everybody in the Django world seems to hate threadlocals (http://code.djangoproject.com/ticket/4280, http://code.djangoproject.com/wiki/CookBookThreadlocalsAndUser). I read Armin's essay on this (http://lucumr.pocoo.org/2006/7/10/why-i-cant-stand-threadlocal-and-others), but most of it hinges on the claim that threadlocals are bad because they are inelegant.

I have a scenario where threadlocals will make things significantly easier. (I have an app where people will have subdomains, so all the models need access to the current subdomain. Passing it down from the request is not worth it if the only problem with threadlocals is that they are inelegant or make for brittle code.)

Also a lot of Java frameworks seem to be using threadlocals a lot, so how is their case different from Python/Django's?

Having tried to implement subdomain multi-tenancy without threadlocals, I can totally sympathize. After some serious frustration, threadlocals really ended up being the only way to go. I read the arguments against them, and they weren't strong enough. I think the refusal to utilize threadlocals is one of the primary reasons that the Sites framework is so useless for some scenarios. Will be interesting if they ever figure out how to legitimately solve https://code.djangoproject.com/ticket/15089 in a way that can be adaptable to the type of multi-tenancy you and I are using, without them. — B Robster, Mar 12 '13 at 15:21
Django Cookbook link is broken. The essay link is broken as well ([possible replacement](http://www.memonic.com/user/pneff/folder/python/id/1Wg)). — André Caron, Jul 07 '13 at 23:08

score 22 · Answer 1 · answered Dec 17 '09 at 20:53

I avoid this sort of usage of threadlocals, because it introduces an implicit non-local coupling. I frequently use models in all kinds of non-HTTP-oriented ways (local management commands, data import/export, etc). If I access some threadlocals data in models.py, now I have to find some way to ensure that it is always populated whenever I use my models, and this could get quite ugly.

In my opinion, more explicit code is cleaner and more maintainable. If a model method requires a subdomain in order to operate, that fact should be made obvious by having the method accept that subdomain as a parameter.

If I absolutely could find no way around storing request data in threadlocals, I would at least implement wrapper methods in a separate module that access threadlocals and call the model methods with the needed data. This way the models.py remains self-contained and models can be used without the threadlocals coupling.
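The separation described above could look roughly like the following. This is a minimal sketch, not Django code: the in-memory `ARTICLES` list stands in for a model, and the names `articles_for`, `set_current_subdomain`, and `current_articles` are hypothetical, invented for illustration.

```python
import threading

# --- models.py stand-in: knows nothing about thread-locals -----------------
# ARTICLES is a hypothetical in-memory substitute for a database table.
ARTICLES = [
    {"title": "Welcome", "subdomain": "acme"},
    {"title": "Pricing", "subdomain": "acme"},
    {"title": "Hello", "subdomain": "globex"},
]


def articles_for(subdomain):
    """Explicit dependency: the caller must say which subdomain it means."""
    return [a["title"] for a in ARTICLES if a["subdomain"] == subdomain]


# --- request_context.py stand-in: the only module touching thread-locals ---
_state = threading.local()


def set_current_subdomain(subdomain):
    """Called once per request, for example from middleware."""
    _state.subdomain = subdomain


def current_articles():
    """Wrapper: read the thread-local, then delegate to the explicit function."""
    subdomain = getattr(_state, "subdomain", None)
    if subdomain is None:
        raise RuntimeError("no current subdomain set for this thread")
    return articles_for(subdomain)
```

A management command or import script can call `articles_for("acme")` directly and never needs the thread-local to be populated; only request-handling code goes through `current_articles()`.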

Carl Meyer

Carl: I agree this breaks locality, but if I have to pass 'subdomain' data in 100% of my `.filter` calls, doesn't that break DRY in a worse way? This is why I think this is an acceptable tradeoff in this case. – agiliq Dec 18 '09 at 07:37
The tradeoff might be acceptable in this case; only you can make that call. You asked "what is so bad with threadlocals," so I answered :-) I also described how I might mitigate the damage if I did feel the tradeoff was worth it. – Carl Meyer Dec 18 '09 at 14:48
The approach in Django would be to create a request middleware that attaches thread-local data to the request object. This follows Django's design for the session framework, by which the middleware class ensures that enough data is present that your application doesn't blow up when you want to access some per-request data. This also means that such touchy coupling is accessible only through views, which I don't believe has limited anyone in designing their application. – Filip Dupanović Jun 27 '11 at 12:48
Filip: attaching things to the request in middleware is fine. The problem is that Django's model layer doesn't know about the request. The most common reason for threadlocals in Django is to get the request in the model layer. There are applications where you'd otherwise need to pass the request along to special model layer methods all of the time. – Reinout van Rees Nov 24 '11 at 12:29

kibitzer · Accepted Answer · 2009-12-17T12:24:42.137

18

I don't think there is anything wrong with threadlocals - yes, it is a global variable, but besides that it's a normal tool. We use it just for this purpose (storing subdomain model in the context global to the current request from middleware) and it works perfectly.

So I say, use the right tool for the job. In this case threadlocals make your app much more elegant than passing the subdomain model around in all the model methods. Passing it around is not even always possible: when you are overriding Django manager methods to limit queries by subdomain, you have no way to pass anything extra to get_query_set, for example, so threadlocals is the natural and only answer.
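A rough sketch of that arrangement follows, written without importing Django so it runs on its own. `SubdomainMiddleware`, `TenantManager`, and `get_current_subdomain` are hypothetical names; the manager is a plain-Python stand-in that filters a list rather than a real `models.Manager`, and the subdomain parsing assumes host names of the form `acme.example.com`. The method is spelled `get_queryset` here, the name Django has used since 1.6 in place of `get_query_set`.

```python
import threading

_request_state = threading.local()


def get_current_subdomain():
    """Return the subdomain bound to this thread, or None outside a request."""
    return getattr(_request_state, "subdomain", None)


class SubdomainMiddleware:
    """Django-style middleware: a callable that wraps get_response."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # "acme.example.com:8000" -> "acme" (assumes a subdomain is present).
        host = request.get_host().split(":")[0]
        _request_state.subdomain = host.split(".")[0]
        try:
            return self.get_response(request)
        finally:
            # Clear the value so it cannot leak into the next request
            # served by this (reused) worker thread.
            del _request_state.subdomain


class TenantManager:
    """Stand-in for a Django manager: get_queryset takes no extra arguments."""

    def __init__(self, rows):
        self._rows = rows

    def get_queryset(self):
        # The subdomain cannot be passed in, so it is read from the thread.
        subdomain = get_current_subdomain()
        return [row for row in self._rows if row["subdomain"] == subdomain]
```

Outside a request `get_current_subdomain()` returns `None`, so this stand-in manager returns no rows; whether that or raising an error is the right behaviour is a design decision for the application.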

edited Dec 17 '09 at 12:24

answered Dec 17 '09 at 11:39

kibitzer

From someone who's implemented the same thing (we store the request, not the model, but to the same end) I whole-heartedly agree. People will rail against threadlocals until they actually need to implement dynamic multi-tenancy with django, at which point, they'll realize that this is definitely one of those "practicality beats purity" moments. – B Robster Mar 12 '13 at 15:25
Making an app more elegant by hiding dependencies? First time I read that. I understand that Django might not let you do it in any other way. That's a Django issue. But elegant, let's agree to disagree on that. – graffic Mar 23 '14 at 06:55

score 3 · Answer 3 · answered Dec 17 '09 at 11:05

Also a lot of Java frameworks seem to be using threadlocals a lot, so how is their case different from Python/Django's?

CPython's interpreter has a Global Interpreter Lock (GIL) which means that only one Python thread can be executed by the interpreter at any given time. It isn't clear to me that a Python interpreter implementation would necessarily need to use more than one operating system thread to achieve this, although in practice CPython does.

Java's main locking mechanism is via objects' monitor locks. This is a decentralized approach that allows the use of multiple concurrent threads on multi-core and/or multi-processor machines, but also produces much more complicated synchronization issues for the programmer to deal with.

These synchronization issues only arise with "shared-mutable state". If the state isn't mutable, or as in the case of a ThreadLocal it isn't shared, then that is one less complicated problem for the Java programmer to solve.
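The distinction between shared and thread-local state can be illustrated in Python with `threading.local`, the standard-library counterpart of Java's `ThreadLocal`. This is only a small demonstration; the names `shared`, `local`, `seen_local`, and `seen_shared` are made up for the example.

```python
import threading

shared = {"value": None}        # one dict visible to every thread
local = threading.local()       # a separate attribute namespace per thread
barrier = threading.Barrier(3)  # lets every thread write before any thread reads
seen_local = {}
seen_shared = {}


def worker(name):
    shared["value"] = name      # later writers overwrite earlier ones
    local.value = name          # visible only to the thread that set it
    barrier.wait()              # all three writes have happened by now
    seen_local[name] = local.value
    seen_shared[name] = shared["value"]


threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After the threads finish, each entry in `seen_local` holds the name its own thread wrote, while every entry in `seen_shared` holds the same value: whichever thread happened to write last.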

A CPython programmer still has to deal with the possibility of race conditions, but some of the more esoteric Java problems (such as publication) are presumably solved by the interpreter.

A CPython programmer also has the option to write performance-critical code in Python-callable C or C++, which can release the GIL while it runs. Technically a Java programmer has a similar option via JNI, but this is rightly or wrongly considered less acceptable in Java than in Python.

score 2 · Answer 4 · answered Dec 17 '09 at 09:24

You want to use threadlocals when you're working with multiple threads and want to localize some objects to a specific thread, e.g. having one database connection for each thread. In your case, you want to use it more as a global context (if I understand you correctly), which is probably a bad idea. It will make your app a bit slower, more coupled and harder to test.
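The one-connection-per-thread case mentioned above might be sketched like this, using the standard-library `sqlite3` module purely as a convenient example database. `get_connection` is a hypothetical helper, not part of any framework.

```python
import sqlite3
import threading

_local = threading.local()


def get_connection():
    """Return this thread's connection, opening it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        # Each thread that calls this gets its own connection object.
        conn = sqlite3.connect(":memory:")
        _local.conn = conn
    return conn
```

Repeated calls from one thread return the same connection, and a call from a different thread opens a new one, so no connection is ever used by two threads.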

Why is passing it from request not worth it? Why don't you store it in session or user profile?

The difference with Java is that web development there is much more stateful than in the Python/Perl/PHP/Ruby world, so people are used to all kinds of contexts and stuff like that. I don't think that is an advantage, but it does seem like it at the beginning.

`Why don't you store it in session or user profile?` Because I need to access it from models. — agiliq, Dec 17 '09 at 09:29
`In your case, you want to use it more as a global context` Not really, I want to set it in middleware and have it accessible in models.py, without views sending it to models.py explicitly each time. — agiliq, Dec 17 '09 at 09:30

score 0 · Answer 5 · answered Jan 22 '12 at 11:44

I have found using ThreadLocal is an excellent way to implement Dependency Injection in an HTTP request/response environment (i.e. any webapp). You just set up a servlet filter to 'inject' the object you need into the thread on receiving the request and 'uninject' it on returning the response.

It's a smart man's DI: no XML ugliness, no megabytes of Spring JARs (not to mention their learning curve), and no cryptic, repetitive @annotation nonsense. Because it doesn't individually inject dependencies into many object instances, it's probably a heck of a lot faster and uses less memory.

It worked so well that we open-sourced our exPOJO Filter, which can inject a Hibernate session or a JDO PersistenceManager using ThreadLocal:

http://www.expojo.com
