0

Sometimes I used to do things like this:

@Component class MyBean {
  private Map<TypeKey, Processor> processors;

  @Autowired void setProcessors(List<Processor> processors) {
    this.processors = processors.stream().....toMap(Processor::getTypeKey, x -> x);
  }

  //some more methods reading this.processors
}

But, strictly speaking, it's buggy code, isn't it?

1) this.processors is not final, nor is its creation synchronized on the same monitor as every access to it. Thus, every thread - and this singleton can be called from arbitrary thread processing user request - may be observing its own value of this.processors which might be null.

2) Even though no writes happen after the Map is initially populated, Javadoc offers no guarantees on which implementation will be used for the Map, so it might be an implementation not ensuring thread safety when the Map structure changes, or even if anything is modified, or at all. And initial population is changes, so it may break thread safety for who knows how long. Collectors even offer the specialized toConcurrentMap() method, to address that problem - so, at a bare minimum, I should have been using it instead.

But even if I use toConcurrentMap() in #2, I will not be able to make my field final, because then I'll not be able to initialize it in a setter. So here are my choices:

a) Initialize and populate the Map in an autowired constructor, which frankly I prefer. But so few teams do that, so what if we abstain from that solution? What other choices exist?

b) Initialize the Map to an empty final ConcurrentHashMap, then populate it in a setter. This is possible, but we'll have to list.forEach() then map.put(). This looks like it's still Java 6; or we could definitely do map.addAll(list....toMap()) but its useless duplication of the Map, even if temporary.

c) Use volatile on the field. Slightly degrades performance without any need, because after some point the field never gets changed.

d) Use synchronized to access the field and read its values. Clearly even worse than (c).

Also, any of those methods will make the reader think that the code actually wants some multithreading reads/writes to the Map, while actually, it's just multithreaded reading.

So, what does a reasonable guru do when they want something like that?

At this point, the best solution seems to be the one with a volatile field, assigned in a setter by using toConcurrentMap. Is there anything better? Or maybe I am just making up problems no one ever actually encountered?

skomisa
  • 16,436
  • 7
  • 61
  • 102
Maksim Gumerov
  • 642
  • 8
  • 23
  • `So, what does reasonable guru do when they want something like that?` -- and your answer is : `synchronized`. After a while and **if there's time**, someone will probably change that to some sort of efficient locking mechanism like `StampedLock` or even manual, sequential, unsynchronized population of a `ConcurrentHashMap` - which is classical and proven ... although maybe not as "beautiful" compared to stream population. Actual software engineers dont care about beauty all the time - readability, **maintainability** and stability are much more important – specializt Mar 27 '19 at 07:12
  • 3
    AFAIR (but I can't find any reference), Spring makes sure there is a happens-before, i.e. the injection of the beans happens before their usage. There should be nothing to worry about. And I don't know of any map (and certainly not the HashMap used by Collectors.toMap()), that would by itself decide to change its internal structure when you only read from it. That said, I would indeed favor constructor injection to setter injection. – JB Nizet Mar 27 '19 at 07:14
  • i completely missed the fact that the question is about setter injection and a spring singleton. Well in that case : Yes, it doesnt actually matter because injection happens only **once per singleton bean** – specializt Mar 27 '19 at 07:17
  • "only once" - it's irrelevant, I don't care how many times it happens, I care about whether the results of that are observed by everyone. Even if something happened eons ago, it might be not yet observed by some thread without proper happens-before. JB Nizet's answer says Spring makes sure of that - I heard that as well and I'll try to find specifics on that. – Maksim Gumerov Mar 27 '19 at 07:43
  • " by itself decide to change its internal structure when you only read from it." - no of course not that, I was trying to explain that while it's being populated it might and will change its structure, and since it's not thread safe, no one can be sure further reading from it will work well. Thread safety does not only mean "in concurrent accesses" but more general - accesses from concurrent threads. Thus *volatile*. – Maksim Gumerov Mar 27 '19 at 07:46
  • read this : https://spring.io/blog/2007/07/11/setter-injection-versus-constructor-injection-and-the-use-of-required/. Apparently, you can achieve your goal with `@Required` - that way `MyBean` simply cannot exist without its IoC dependency - which makes it impossible to reference a uninitialized version of `processors`, your setter will always be called first – specializt Mar 27 '19 at 08:20
  • If (a) is correct I would use it. The fact that it is not the fashion is irrelevant. There are a lot of bad practices going on in software engineering based on some bad argument. The usual case against constructor initialization is that it is harder to test and debug. The case for them is your code is usually more robust requiring less testing and debugging. – rghome Mar 27 '19 at 08:25
  • Yes, also : `he usual case against constructor initialization is that it is harder to test and debug` is entirely wrong. In fact it is **easier** to create mocks and pass them as constructor parameter to your class, some frameworks even do it **semi-automatically** (like Mockito) – specializt Mar 27 '19 at 08:38
  • "your setter will always be called first " - again, my concern is not setter being called too late, my concern is other threads not being able to read assigned and populated field even after the setter has been called. Due to lack of happens-before, or for other reasons (who knows why exactly - in what way - HashMap is not thread-safe on some operations, they do not explicitly say why - and that's normal). – Maksim Gumerov Mar 27 '19 at 11:07
  • what you're allegedly concerned about is technologically (and logically) impossible. I recommend reading up on spring CDI / IoC and Java basics, really ... in fact : programming languages and CPUs in general ... just a suggestion, i think you might need it ... direly. And remember this : your field will be populated on every access. And every thread will successfully read it. Maybe your misconception stems from a lack of understanding threads and parallel processing, in general - it could also be worthwile to read up on that. – specializt Mar 27 '19 at 17:07
  • I suggest you to provide a proof of any sort, and to stop insulting people. My question stems from reading too much on threads and parallel processing, in fact. But probably still not enough :) – Maksim Gumerov Mar 27 '19 at 18:12
  • proof has already been provided, insults were never made – specializt Mar 27 '19 at 19:24

2 Answers2

0

Or maybe I am just making up problems no one ever actually encountered?

I think may be conflating your assignment with the problems historically seen from double-checked locking:

private Foo foo;  // this is an instance variable

public Foo getFoo() {
    if (foo != null) {
        synchronized (this) {
            if (foo != null) {
                foo = new Foo();
            }
        }
    }
    return foo;
}

This code appears to be thread-safe: you do an initial, presumed quick, check to verify that the value has not been initialized yet, and if it hasn't you initialize in a synchronized block. The problem is that the new operation is distinct from the constructor call, and some implementations were assigning the reference returned by new to the variable before the constructor ran. The result was that another thread could see that value before the constructor completed.

In your case, however, you are assigning the variable based on the result of a function call. The Map created by the function call is not assigned to the variable until the function returns. The compiler (including Hotspot) is not permitted to re-order this operation because such a change would be visible to the thread that's executing the function, and would not therefore be sequentially consistent per JLS 17.4.3.

That out of the way, here are some additional comments:

Initialize and populate the Map in an autowired constructor, which frankly I prefer

As do the creators of the Guice dependency injection framework. One reason to prefer constructor injection is that you know that you'll never see the bean in an inconsistent state.

The reason that Spring encourages (or at least does not discourage) setter injection is because it makes circular references possible. You can decide for yourself whether circular references are a good idea.

Initialize the Map to an empty final ConcurrentHashMap, then populate it in a setter

This is a bad idea because it's likely that other threads will see a partially constructed map. It's far better to see either null or a fully-constructed map, because you can compensate for the first case.

Use volatile on the field. Slightly degrades performance without any need

Use synchronized to access the field and read its values. Clearly even worse than (c).

Don't let perceived performance impacts keep you from writing correct code. The only time that synchronization will significantly impact your performance is when concurrent threads access a synchronized variable/method within a tight loop. If you're not in a loop, then the memory barrier adds an irrelevant amount of time to your call (and even in a loop it's minimal unless you need to wait for a value to arrive in your core's cache).

In this case it doesn't matter, but I would guess that getProcessors() takes a tiny percentage of your total execution time, and that a far larger amount of time is taken by running the processor(s).

Community
  • 1
  • 1
guest
  • 617
  • 4
  • 4
  • "Don't let perceived performance impacts keep you from writing correct code." - sure, I just would like to avoid that it there is a cleaner solution (or if for some reason there is no problem). And thanks for interesting insights regarding new+assignment. But I don't see actual answer: what do you propose me to do? "In your case, however" implies that you don't think there is a problem. – Maksim Gumerov Mar 29 '19 at 03:47
  • Also, the code you cited at least contains synchronization. The initial code I shown does not. So, it's probably still buggy like I said. So, there is a problem and it needs a solution. What is the one you actually say is an *answer*? – Maksim Gumerov Mar 29 '19 at 03:51
  • just use toConcurrentMap and get it over with. It is **impossible** for the map to be in an inconsistent state after that - you're trying to solve problems which do not exist and your time surely can be invested in a better way ... by reading up on spring CDI and `@Required`, for instance – specializt Mar 29 '19 at 07:45
  • @MaksimGumerov - the example code I wrote is a case where multi-thread visibility _is_ a problem. I wrote it to compare to the code that you wrote, in which multi-thread visibility _is not_ a problem, due to the explanation and JLS reference that I gave you. – guest Mar 29 '19 at 13:48
  • Sure thing, toConcurrentMap solves the map consistency problem. Where did I say it may become inconsistent after that? toConcurrentMap by definition creates a thread-safe map :) You are not adding anything to my initial question with 4 options. But yes, I promised to read up on how Spring guarantees happens-before on injections. Didn't do that yet. – Maksim Gumerov Apr 01 '19 at 17:16
  • Dear Guest, you say that it's not a problem just because it does not suffer from operation reordering. But that's not the only problem the code can suffer from, right? So, not having to worry about that particular problem does not mean I can feel safe and *not* add final/volatile/toConcurrentMap/something. So the question still stands, what solution is both correct and elegant. – Maksim Gumerov Apr 01 '19 at 17:20
  • @MaksimGumerov - at this point I really don't know what question you're asking. In your original question you seemed to be concerned that consumers could see a partially-constructed map. That won't happen, and I explained why. There's nothing that `volatile` or `ConcurrentHashMap` will do to affect the outcome. If you have some other _clearly identified_ concern, please edit your question. – guest Apr 08 '19 at 19:02
  • If you're concerned that one bean will see `null` during the Spring injection phase, then yes, that's a possibility depending on dependency ordering, but it's a possibility _with any setter injection mechanism_. If you want to avoid it, use constructor injection. If you feel that constructor injection is somehow "not Spring" then don't rely on injected values until the injection phase is complete. – guest Apr 08 '19 at 19:03
  • "and I explained why" - you did not :) you only explained (with that reference) that re-ordering is not a problem. But what if, for instance, some thread maintains a cached copy of some part of a map? It works the same way as with a "processors" variable: a thread may be observing outdated copy of a map's state. Your reference to 17.4.3 does not explain why this cannot happen because that's not intra-thread matter, does it? The injecting thread will be seeing most recent value, but some other thread, not necessarily. – Maksim Gumerov Apr 12 '19 at 17:35
  • Basically, if seeing outdated internals of HashMap from other threads would not be possible, then seeing any outdated fields at all would not be possible, and then no one would need "volatile" modifier (after all, for really simultaneous reads and writes JMM is able to handle it without producing errors). – Maksim Gumerov Apr 12 '19 at 17:57
0

Thanks to hints of commenters here, after googling a bit I was able to find not a reference to Spring manual but at least a matter-of-fact guarantee, see Should I mark object attributes as volatile if I init them in @PostConstruct in Spring Framework? - the updated part of accepted answer. In essence it says that every lookup of a particular bean in a context, and hence, roughly, every injection of that bean, is preceded by locking on some monitor, and bean initialization also happens while locked on that same monitor, establishing happens-before relationship between bean initialization and bean injection. More specifially, everything done during bean initialization (like, assignment of processors in MyBean initialization) happens-before subsequent injections of that bean - and the bean is only used after it has been injected. So the author says no volatile is necessary, unless we are going to change the field after that.

That would be my accepted answer (in combination with toConcurrentMap) if not for 2 "buts".

1) This does not mean injection of non-initialized beans offers that same happens-before. And injection of non-initialized beans happens more often that some think. In case of circular dependencied, which are better kept rare but look valid sometimes. In case of lazy initialized beans. Some libraries (AFAIK even some of Spring projects) introduce circular dependencies, I saw that myself. And sometimes you introduce circular deps by accident, that is not treated as error by default. Of course, reasonable code does not use non-initialized beans, but since MyBean could be injected to some bean X before it's initialized, there will be no happens-before after it's initalized, annihilating our guarantees.

2) This is not even a documented feature. Still! But recently it has at least been put on a backlog. See https://github.com/spring-projects/spring-framework/issues/8986 - still, until they documented it we cannot assume it's not subject to changes. Bah, even when they do, it still may be changed in some next version, but at least that will be reflected in some changelist or whatnot.

So, taking those 2 notes into consideration, especially the 1st, I am inclined to say that volatile+toConcurrentMap is the way to go. Right?

Maksim Gumerov
  • 642
  • 8
  • 23