I am using a parallel stream to process a large dataset, but it gives inconsistent results. The database is Postgres. The data is hierarchical, with defined levels.
For example, the hierarchy has 5 levels. I process the lowest-level nodes (level 5 here) first and persist them to the DB. Then, while processing the level above (level 4 here), I have to fetch the level 5 data that was already saved, process it, and save the level 4 data to the DB.
I use parallelStream() to process each level. Once level 5 processing has completed and we try to fetch its data while processing the level 4 nodes, the saved data is not visible.
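To make the flow concrete, here is a minimal sketch of the pattern described above. It is not my real code: the class, the method names, the node IDs and the "value = 1 + sum of children" rule are all hypothetical, and a ConcurrentHashMap stands in for the Postgres table, so each save is visible to other threads immediately. It only illustrates the intended ordering: one level is processed in parallel and finishes completely before the level above it starts.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class LevelProcessingSketch {

    // Hypothetical node: its id and the ids of its children one level below.
    public static final class Node {
        final int id;
        final List<Integer> childIds;

        Node(int id, List<Integer> childIds) {
            this.id = id;
            this.childIds = childIds;
        }
    }

    // Assumption: in-memory stand-in for the database table (node id -> saved value).
    public static final Map<Integer, Long> STORE = new ConcurrentHashMap<>();

    // Hypothetical id scheme so parents can refer to their children.
    public static int nodeId(int level, int index) {
        return level * 1000 + index;
    }

    // Reads the children's saved values, derives this node's value, and "persists" it.
    static void processAndSave(Node node) {
        long value = 1;
        for (int childId : node.childIds) {
            Long childValue = STORE.get(childId);
            if (childValue == null) {
                // This is the symptom from the question: lower-level data is not visible.
                throw new IllegalStateException("Child " + childId + " has not been saved");
            }
            value += childValue;
        }
        STORE.put(node.id, value);
    }

    // Processes levels bottom-up. forEach is a terminal operation, so the call
    // returns only after every node of the current level has been handled.
    public static void processAllLevels(Map<Integer, List<Node>> nodesByLevel, int lowestLevel) {
        for (int level = lowestLevel; level >= 1; level--) {
            nodesByLevel.getOrDefault(level, List.of())
                    .parallelStream()
                    .forEach(LevelProcessingSketch::processAndSave);
        }
    }

    // Builds a small binary hierarchy: level L holds 2^(L-1) nodes.
    public static Map<Integer, List<Node>> buildBinaryHierarchy(int depth) {
        Map<Integer, List<Node>> byLevel = new HashMap<>();
        for (int level = 1; level <= depth; level++) {
            List<Node> nodes = new ArrayList<>();
            int count = 1 << (level - 1);
            for (int i = 0; i < count; i++) {
                List<Integer> children = new ArrayList<>();
                if (level < depth) {
                    children.add(nodeId(level + 1, 2 * i));
                    children.add(nodeId(level + 1, 2 * i + 1));
                }
                nodes.add(new Node(nodeId(level, i), children));
            }
            byLevel.put(level, nodes);
        }
        return byLevel;
    }

    public static void main(String[] args) {
        processAllLevels(buildBinaryHierarchy(5), 5);
        System.out.println("Root value: " + STORE.get(nodeId(1, 0)));
    }
}
```

In this sketch the map write is the whole "save". In my real code the save goes to Postgres instead, and that is where the level 5 data is missing when level 4 reads it.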
When I remove parallelStream() from the code below, everything works fine, but processing takes too long.