-1

The current setup of our legacy project is Java 11, Maven 3.6.3, and Spring Boot 2.1.6. Development is on ordinary Windows 10 Pro machines, nothing special. When the project is built and run in IntelliJ IDEA, there have NEVER been circular dependencies between Java classes. We first noticed the problem a while ago at runtime, after deploying to the test environment, which is just a Windows Server 2019 Datacenter machine. It makes no difference whether the JAR file is run as a Windows service or with java -jar from the command prompt. Each time, we fixed the code based on the error and carried on until the next problem. Sometimes the problem occurs even on a development machine if we build from the command prompt with 'mvn clean install'.

What puzzles me is why it never happens in IDEA, and how to set up IDEA so that it happens there as well. I know there is an option to detect cyclic dependencies, but the architecture is all wrong and we don't have time to start rewriting the app now. Eventually we will, I agree.

Today something even weirder happened. There was no problem on the development machine (IDEA or command line), and no problem when the code was built on the test machine from the command line. But when the code was built the way it should be, with Azure DevOps, a circular dependency occurred. I fixed it, but I still don't understand why the code built OK on Azure DevOps and then failed with a runtime error when run on the test machine.

The Java version is the same everywhere, 11.0.8 (Oracle), except on Azure DevOps, which uses 11.0.10 (AdoptOpenJDK). I tried installing AdoptOpenJDK on one dev machine, but I still can't reproduce the latest error, which happens ONLY when we run a JAR built on Azure on the Windows test machine.

UPDATE: In the Azure DevOps pipeline I chose 'windows-latest' instead of 'ubuntu-latest', and now it works (although now I have a problem with the front-end / ReactJS build, but that is another topic).

CLARIFICATION: The circular dependency occurs between two, sometimes three, classes. As we use Spring constructor-based DI, we resolve the issue by simply replacing it with field-based DI. So if a constructor has ten parameters, each representing a class, we remove the "problematic" ones from the constructor and inject them as fields instead.
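To illustrate why that substitution makes the error go away, here is a minimal plain-Java sketch. It does not use Spring at all, and the class names (CtorA, CtorB, FieldA, FieldB) are hypothetical stand-ins rather than the real services. It only shows the underlying ordering problem: with constructor wiring each object needs the other to exist first, while with field wiring both objects can be created and then connected.

```java
import java.util.Objects;

class InjectionCycleSketch {

    // Constructor-style wiring: each object demands its partner up front.
    static class CtorA {
        final CtorB b;
        CtorA(CtorB b) {
            this.b = Objects.requireNonNull(b, "CtorB must exist first");
        }
    }

    static class CtorB {
        final CtorA a;
        CtorB(CtorA a) {
            this.a = Objects.requireNonNull(a, "CtorA must exist first");
        }
    }

    // Field-style wiring: objects can be created empty and connected later.
    static class FieldA {
        FieldB b;
    }

    static class FieldB {
        FieldA a;
    }

    /** Tries to start the constructor cycle; there is no partner to pass yet. */
    static boolean constructorWiringPossible() {
        try {
            new CtorA(null); // no CtorB can exist before some CtorA does
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    /** Creates both objects first, then closes the cycle through fields. */
    static FieldA wireWithFields() {
        FieldA a = new FieldA();
        FieldB b = new FieldB();
        a.b = b;
        b.a = a;
        return a;
    }

    public static void main(String[] args) {
        System.out.println("constructor wiring possible: " + constructorWiringPossible());
        FieldA a = wireWithFields();
        System.out.println("field wiring closed the cycle: " + (a.b.a == a));
    }
}
```

Note that the cycle is still there after the switch to fields; it is just no longer fatal at creation time.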

The real problem is that it happens for no obvious reason. For example, today we had a new build with just minor code changes: we changed one IF statement. So NOTHING related to adding or removing injected classes. And yet this problem occurred. We had had at least 15 deployments with zero problems, and then this happened.

UPDATE 2: All builds yield no compile-time error, and every JAR built on a physical machine (development and test/production) also runs without error; only the JAR built on Azure DevOps fails at runtime (java -jar filename.jar). So I suspected that something might be wrong with the cloud build. I've removed this line:

mavenOptions: '-Xmx3072m'

from azure-pipelines.yml

  - task: Maven@3
    inputs:
      mavenPomFile: 'pom.xml'
      mavenOptions: '-Xmx3072m'
      javaHomeOption: 'JDKVersion'
      jdkVersionOption: '1.11'
      jdkArchitectureOption: 'x64'
      publishJUnitResults: true
      testResultsFiles: '**/surefire-reports/TEST-*.xml'
      goals: 'package'

and now, when we execute the JAR file, there is no error at runtime.

The error we had was (I have to retype it, as I can't copy and paste from the prod server):

***************************
APPLICATION FAILED TO START
***************************

Description:

The dependencies of some of the beans in the application context form a cycle:

   assessmentRuleEngineService defined in URL [jar:file:C:/Project/somewebapp-SNAPSHOT.jar!/BOOT-INF/classes!/some/company/ruleengine/assessment/AssessmentRuleEngineService.class]
┌─────┐
|  assessmentServiceImpl defined in URL [jar:file:C:/Project/somewebapp-SNAPSHOT.jar!/BOOT-INF/classes!/some/company/services/AssessmentServiceImpl.class]
↑     ↓
|  userPermissionService
↑     ↓
|  organizationServiceImpl
└─────┘

So to sum up:

  1. Why is there never an error when the app is run in IDEA, and how can we make the error happen there as well, so we notice it before deploying?
  2. Why are there sometimes absolutely no cyclic dependency errors on development machines, while there are on test/prod machines?
  3. Why does a JAR built on Azure DevOps sometimes have errors when there are none on either development or test/prod machines?
  4. Why is there a difference between the Ubuntu 18.04 build and the Windows 2019 build?
Nenad Bulatović
  • Wow, this is interesting. As Java usually does not involve one-pass compilers, the cyclic dependency should NEVER be a problem. Even weirder when you get it as runtime errors. Do any of your libraries contain C/C++-compiled binaries, or have some C-compilation done at runtime? – JayC667 Jan 29 '21 at 01:09
  • @JayC667 nope. Just pure Java 11 with Spring Boot, Hibernate and Maven. Also, the build command is the same everywhere: mvn clean package – Nenad Bulatović Jan 29 '21 at 10:50
  • @JayC667 you won't believe this - in Azure DevOps instead of Ubuntu I chose Windows as build OS, and now it works! How crazy can it be? – Nenad Bulatović Jan 29 '21 at 15:53
  • @JayC667 it is easy to get circular dependency using spring, it's not that special – eis Feb 25 '21 at 17:56
  • without code, how would we be able to know the reasons of your cyclic dependency? this would need a lot more details to be answerable. – eis Feb 25 '21 at 17:56
  • Java itself does not have a problem with cyclic dependencies because of late binding. I assume your problem is with Maven. Do you have any error message? I had many problems with Maven, and in such cases it's always good to start by deleting the .m2 repository and any build files and try to reproduce the problem from fresh state. – jurez Feb 25 '21 at 18:01
  • @jurez it is Azure Cloud build. It always recreates .m2 and actually whole VM if I am not mistaken? – Nenad Bulatović Feb 25 '21 at 18:52
  • @eis it is big application and service methods are not anything special. Just usual Spring DI - constructor based. If there would be any problem with the code, then dev build in IntelliJ would be first one to break. If not, then maven build from command line. On our developers machines (just typical Windows 10 machines with zillion utility software) there are no problems. – Nenad Bulatović Feb 25 '21 at 18:55
  • @NenadBulatovic at the very least add the error you're getting into the question – eis Feb 25 '21 at 19:27
  • based on this question it's not even clear if you are talking about buildtime error, runtime error or both – eis Feb 25 '21 at 19:30
  • @eis it is explained in first paragraph. When it happens it happens during runtime. Rarely it happens during build time. If it would happen during build, it wouldn't be a problem, we could fix it before deploying jar to the server. As for code - it is usual code, which runs without problems on normal desktop machine. Only way to eliminate problem is to substitute constructor based DI with field based dependency injection. I will add that into original post. – Nenad Bulatović Feb 25 '21 at 22:21
  • "When it happens it happens during runtime. Rarely it happens during build time" - you literally can't get the same error both runtime and build time. This is one of the things you need to clarify, which is it, and what is the exact error message. – eis Feb 25 '21 at 22:22
  • @eis I added update and error we had. – Nenad Bulatović Feb 28 '21 at 18:10
  • Without an [MCVE](https://stackoverflow.com/help/mcve), this question cannot be properly answered. It only attracts speculation. – kriegaex Mar 03 '21 at 06:56

2 Answers

2

Obviously without, you know, actually digging through your code, isolating & proving the problem exists, making a change & fixing it -- no-one is going to be able to give you a perfect/precise answer here.

That said, I've been doing this for a long time & have run into this type of thing several times before.

In my experience (and at risk of stating the obvious), this is occurring because your classes are sometimes getting loaded in a different order.

That is - you have a problematic initialization cycle (somewhere) - where one class (probably in a static initializer) reads from another class that ultimately calls back into the first one. See something like Static Circular Dependency in Java for more on this.

Probably the ordering of these calls -- within the respective source files -- is such that it just so happens to be OK when one class is initialized first, but not if the other is initialized first.
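Here is a minimal sketch of that kind of static cycle, just to show the order dependence. The class names Left and Right are made up for illustration and have nothing to do with your code; each static field is initialized from the other class's field, so the values you end up with depend on which class gets initialized first.

```java
class StaticCycleDemo {

    static class Left {
        // Initializing Left triggers Right's initialization first.
        static int VALUE = Right.VALUE + 1;
    }

    static class Right {
        // If Left is already mid-initialization on this thread, Left.VALUE
        // is still its default (0) when it is read here.
        static int VALUE = Left.VALUE + 1;
    }

    /** Touches Left before Right and reports what each field ended up as. */
    static String touchLeftFirst() {
        int left = Left.VALUE;
        int right = Right.VALUE;
        return "Left=" + left + ", Right=" + right;
    }

    public static void main(String[] args) {
        // Prints "Left=2, Right=1" because Left was touched first.
        System.out.println(touchLeftFirst());
    }
}
```

If Right were touched first instead, the two values would swap. Nothing in the source changed, only the initialization order.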

How could that happen, you might ask?

While in theory it could be the first workload hitting your app -- probably (especially given your 'app failed to start' error) this is ultimately caused by multiple threads doing the initialization. And it's OK for multiple threads to do this - it just means that, depending upon how those threads get scheduled (the other load on the machine, etc, etc) - sometimes things will occur in a different order.

Given your observed difference of Windows/Linux - and I've seen this exact issue often -- what is happening is that the two operating systems schedule work differently. For example, Linux might begin execution of a new thread more quickly than letting the thread that started that new thread continue. Whereas Windows might let the original thread continue for a little longer before the new thread actually begins running. Or perhaps the production machines have additional cores and they can do more work in parallel. Solaris used to be super different in its new thread behavior compared to Windows/Linux - this type of thing was a barrel of laughs back in the day.

If you're following my conjecture above, you will note that just (for example) switching production environments (to Windows, fewer cores, etc) will not eliminate this problem, just reduce the likelihood of it occurring. And, actually, if I was on your team, I'd argue that it also makes it significantly more difficult to diagnose/fix :) -- because it is now so much harder to reproduce.

So - what to do?

This is a little dependent upon your code/architecture, I'm afraid -- basically a matter of finding the approach that is the least work :)

Some ideas:

  • Find all the places new threads get started and add some very small sleeps, either in the thread being started or from the calling/starting thread. Please don't commit that code! But use it to make the problem occur deterministically. Then you can fix the root cause. Maybe you leave some debug variables around "THREAD_INIT_SLEEP=0" to make it easy to turn back on later.

  • Find all the static initializations that call across sub-systems / modules. In theory with dependency injection you won't have problems like this, so ... look for places where someone has cheated :)

  • Log -verbose:class for success/failure launches, and look at the different ordering of the class loads between the runs. This could help you narrow down which classes are involved ... although I don't envy you trying to dig through those logs. Maybe a little awk/grep will help :)
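For the last idea, a small helper along these lines could save some of the awk/grep work. It is only a sketch: it assumes the JDK 9+ unified-logging line shape for class loading (something like `[0.012s][info][class,load] some.company.Foo source: ...`), and the class name ClassLoadOrderDiff, its methods, and the two log file arguments are all made up for illustration. It keeps only classes under a given package prefix and reports the first position where the load order of two runs differs.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ClassLoadOrderDiff {

    // Assumed log line shape (JDK 9+ unified logging):
    // [0.012s][info][class,load] some.company.Foo source: file:/...
    private static final Pattern LOAD_LINE =
            Pattern.compile("\\[class,load\\s*\\]\\s+(\\S+)\\s+source:");

    /** Extracts loaded class names, in order, keeping only the given package prefix. */
    static List<String> loadedClasses(List<String> logLines, String packagePrefix) {
        List<String> names = new ArrayList<>();
        for (String line : logLines) {
            Matcher m = LOAD_LINE.matcher(line);
            if (m.find() && m.group(1).startsWith(packagePrefix)) {
                names.add(m.group(1));
            }
        }
        return names;
    }

    /** Returns the first index where the two load orders differ, or -1 if identical. */
    static int firstDivergence(List<String> good, List<String> bad) {
        int common = Math.min(good.size(), bad.size());
        for (int i = 0; i < common; i++) {
            if (!good.get(i).equals(bad.get(i))) {
                return i;
            }
        }
        return good.size() == bad.size() ? -1 : common;
    }

    // Usage: java ClassLoadOrderDiff.java good.log bad.log some.company.
    public static void main(String[] args) throws IOException {
        List<String> good = loadedClasses(Files.readAllLines(Path.of(args[0])), args[2]);
        List<String> bad = loadedClasses(Files.readAllLines(Path.of(args[1])), args[2]);
        int i = firstDivergence(good, bad);
        if (i < 0) {
            System.out.println("Same load order for " + args[2]);
        } else {
            System.out.println("First difference at position " + i
                    + ": good=" + (i < good.size() ? good.get(i) : "<none>")
                    + ", bad=" + (i < bad.size() ? bad.get(i) : "<none>"));
        }
    }
}
```

Filtering by your own package prefix matters here, because JDK and library classes can load in a slightly different order between perfectly healthy runs.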

Dharman
PaulM
  • Thanks for being only one trying to actually help me with this instead of asking me to post legacy code which belong to the financial institution. – Nenad Bulatović Mar 05 '21 at 06:47
1

I also faced the same problem today. We are using Spring Boot 3 and Java 17. On my local machine and in the dev environment the application was running fine. When I opened my PC today, I saw a message from my lead that one service in production was throwing a BeanCurrentlyInCreationException. We have different services, such as admin-service, payment-service and others. Only admin-service was throwing this exception in production. First I tried running admin-service locally, and it ran fine. I had just joined this company and was not familiar with the whole code base.

Here is what I did next. I noticed that we have several configuration classes in admin-service. The main class is AdminServiceApplication.java, which contains the main method and is the Spring Boot main class. We also have another configuration class, AmazonS3Configuration.java.

I wrote a simple test.

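import org.junit.jupiter.api.Test;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.ActiveProfiles;

// AmazonS3Configuration is deliberately listed before AdminServiceApplication.
@SpringBootTest(classes = {AmazonS3Configuration.class, AdminServiceApplication.class})
@ActiveProfiles("local")
public class FakeTest {

    @Test
    void myTest() {
        // Empty on purpose: the test fails if the application context cannot start.
        System.out.println();
    }
}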

Notice that I am loading AmazonS3Configuration.class first and AdminServiceApplication.class second. With that order, the issue reproduced on my local machine. If I swap the positions, as in @SpringBootTest(classes = {AdminServiceApplication.class, AmazonS3Configuration.class}), there is no issue. So the problem appears when the AmazonS3Configuration class is loaded before the AdminServiceApplication class. I then simply used @Order:

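import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.core.Ordered;
import org.springframework.core.annotation.Order;

@SpringBootApplication
@Order(Ordered.HIGHEST_PRECEDENCE) // ask for this class to be processed first
public class AdminServiceApplication {
    public static void main(String[] args) {
        SpringApplication.run(AdminServiceApplication.class, args);
    }
}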

This resolved the issue. Hopefully it will help some others too.

Thanks

Basit