
Suppose I have an EJB, CrudService, with methods that perform CRUD operations on a single entity (not on collections). CrudService has an EntityManager injected into it, and the class cannot be modified. It looks something like this:

import javax.ejb.Stateless;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;

@Stateless
public class CrudService {

  @PersistenceContext(name = "TxTestPU")
  private EntityManager em;

  // Persists the post (via merge) and returns its generated id.
  public Integer createPost(Post post) {
      post = em.merge(post);
      return post.getId();
  }

  public void updatePost(Post post) {
      em.merge(post);
  }

  public Post readPost(Integer id) {
      return em.find(Post.class, id);
  }

  public void deletePost(Post post) {
      em.remove(post);
  }
}

I would like to create/update a collection of Post entities in parallel, in a single transaction. The following approach does not work, because the container creates a new transaction for each thread in the pool:

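import java.util.Collection;
import javax.ejb.Stateless;
import javax.inject.Inject;

@Stateless
public class BusinessBean {

  @Inject
  private CrudService crudService;

  public void savePosts(Collection<Post> posts) {
      // Each pool thread ends up in its own container-managed transaction.
      posts.parallelStream().forEach(post ->
          crudService.createPost(post));
  }
}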

Is there a way to do this? The code runs on WildFly, with a Hibernate persistence unit and a PostgreSQL database.

  • This will probably not work, as the pool threads in which the subtasks run will not have the right context. Also, the streams library was designed for _data parallelism_, not _IO parallelism_. So you are also unlikely to get the parallelism you think you'll get. – Brian Goetz Jan 14 '18 at 14:30
  • Thanks for your input! Indeed, streams are just the way I intended to parallelize the work. The main goal here was to perform the inserts in parallel, not necessarily using streams, and in a single transaction. Changed the title, too. – cristian3181763 Jan 14 '18 at 14:34
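
The point about pool threads not having the right context can be illustrated with a small, container-free sketch. It assumes that the transaction context behaves like a value bound to the calling thread. The class name `ThreadBoundContextDemo` and the `CURRENT_TX` ThreadLocal are hypothetical stand-ins, not container APIs, and a plain ExecutorService stands in for the pool threads.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadBoundContextDemo {

    // Hypothetical stand-in for a thread-bound transaction context.
    static final ThreadLocal<String> CURRENT_TX = new ThreadLocal<>();

    // Runs the given number of tasks on pool threads and records what each one sees.
    static List<String> observeFromWorkers(int tasks) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                // The worker reads its own copy of the ThreadLocal, which was never set.
                futures.add(pool.submit(() -> String.valueOf(CURRENT_TX.get())));
            }
            List<String> seen = new ArrayList<>();
            for (Future<String> f : futures) {
                seen.add(f.get());
            }
            return seen;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        CURRENT_TX.set("tx-1"); // "begin" a transaction on the calling thread
        System.out.println("caller sees:  " + CURRENT_TX.get());
        System.out.println("workers see: " + observeFromWorkers(3));
    }
}
```

The calling thread sees `tx-1`, while every worker sees `null`: work handed to another thread does not carry the caller's context along with it.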

1 Answer


The straight "here is an answer" answer.

Not generically. Have a look at the answers to this question: Is it discouraged using Java 8 parallel streams inside a Java EE container?


The annoying "XY problem" answer.

How would you expect this to work? Most databases don't support multiple parallel transactions on a single database connection, and I don't believe PostgreSQL does either: https://stackoverflow.com/a/289057/924597

So something (the JEE container, the JDBC driver, etc.) would have to open multiple DB connections to achieve this, which I think is what you're saying is happening. If you do this across many different business actions, it would likely exhaust your connection pool pretty quickly.

In the spirit of this being an "XY problem" answer - what problem are you trying to solve?

If it's just a raw throughput problem - consider batching your inserts.
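
As a rough sketch of what batching could look like at the JDBC level: the code below sends rows in groups through a single PreparedStatement, using the standard `addBatch()` and `executeBatch()` calls. It bypasses CrudService and JPA entirely. The class name, the `post` table, and its `title` column are assumptions made for illustration, since the Post entity's mapping is not shown. The caller is assumed to supply a Connection that already takes part in the surrounding transaction, so the sketch does not commit or change auto-commit itself.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class JdbcBatchInsertSketch {

    // Hypothetical table and column; adjust to the real schema.
    private static final String INSERT_SQL = "INSERT INTO post (title) VALUES (?)";

    // Inserts all titles on one connection, flushing every batchSize rows.
    // Returns the number of batches sent to the database.
    static int insertTitles(Connection conn, List<String> titles, int batchSize)
            throws SQLException {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        try (PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            int pending = 0;
            for (String title : titles) {
                ps.setString(1, title);
                ps.addBatch();              // queue the row locally
                if (++pending == batchSize) {
                    ps.executeBatch();      // send the queued rows together
                    batches++;
                    pending = 0;
                }
            }
            if (pending > 0) {              // send the final, partial batch
                ps.executeBatch();
                batches++;
            }
        }
        return batches;
    }
}
```

Because everything runs on one connection, all rows belong to the same transaction, and a varying collection size only changes the number of batches.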

If it's a bulk insert problem - consider making an end-run around your container and using a different tool. JEE containers aren't usually meant for, or good at, this kind of thing.

Shorn
  • The problem would be throughput, indeed. I also thought about insert batching, but the collection size varies and I don't have direct access to the PU definition. That is why I was searching for a solution from "outside" the CrudService. And yes, currently each thread from the fork-join pool that gets some work processing the collection of entities has its own DB transaction. – cristian3181763 Jan 15 '18 at 08:36
  • @cristian3181763 Regarding "I don't have direct access to the PU definition": if you really have a throughput problem that must be dealt with, forget the PU and go straight for the JDBC DataSource or Connection. Consider asking a new question based around your underlying problem. Parallelising your solution might be the right approach, but parallelising transactions by using streams is probably barking up the wrong tree. – Shorn Jan 17 '18 at 03:27
  • Thanks for the input! Streams were just a quick way of splitting the work. I might have used some other approach. The question is not related to streams, but rather to persisting a large collection of entities, in parallel, in the same transaction. – cristian3181763 Jan 18 '18 at 17:36