I need to implement an AWS Lambda function in Java that consumes a Kinesis stream and reads/writes data to a MySQL database. Since I already have the model entities defined in another application, I would like to reuse them rather than work with plain SQL/JDBC. So my goal is to implement the Lambda using JPA/Hibernate. Is this possible in general? If yes, are there any real examples or best practices? I have previously worked on Spring Boot applications, where similar functionality is readily available and easily configurable, and now I don't even know where to start.
-
Spring Boot or WildFly Swarm work on AWS Lambda just fine. They are a bit "heavier" than other solutions but still work. Give it a shot! – stdunbar Jan 16 '18 at 16:16
-
Thanks for the suggestion. I'm afraid Spring Boot might be too heavy for Lambda, so I would like to use it only as a last resort. – Archie Jan 16 '18 at 16:18
-
JPA will require a container like Spring Boot, Wildfly Swarm, or TomEE. It sounds like you're just guessing that Spring Boot won't work - it isn't too hard to try it and see if it works for what you want. Otherwise you need to go a different route - pure JDBC or Hibernate in a J2SE environment. – stdunbar Jan 16 '18 at 16:30
-
You don't need any container. You can use JPA without an app server or Spring Boot https://stackoverflow.com/questions/12688162/java-standalone-app-with-jpa-hibernate-or-similar-and-apache-derby-embedded – Simon Martinelli Jan 16 '18 at 18:26
-
You can use the vast majority of java APIs without an app server or "container"! JPA, JDO, JDBC, JNDI, JTA, and so on. – Jan 16 '18 at 18:31
2 Answers
I disagree strongly with your assessment that JPA isn't feasible on Lambda, and have several functions running currently which demonstrate the point, with a caveat: It's only really feasible to use JPA in a synchronous lambda environment, or an infrequently used one. (More on this in a moment)
`Persistence.createEntityManagerFactory` should only be executed once per application, and the result placed in a static or singleton context.
You handle closing down JPA resources using a runtime shutdown hook (`Runtime.getRuntime().addShutdownHook()`).
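The two points above (create once, close in a shutdown hook) can be sketched with a small helper. This is a minimal sketch, not code from any of the functions mentioned: the class name `LazySingleton` is hypothetical, and it uses only the JDK so that it stays independent of the JPA provider you bundle. The comment at the bottom shows how it might be wired to JPA, assuming a persistence unit named `my-unit` (also hypothetical). Whether the hook actually runs depends on how the runtime environment is torn down.

```java
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Creates an expensive object once per JVM and closes it when the JVM shuts down. */
final class LazySingleton<T> {
    private final Supplier<T> creator;
    private final Consumer<T> closer;
    private T instance; // guarded by "this"

    LazySingleton(Supplier<T> creator, Consumer<T> closer) {
        this.creator = creator;
        this.closer = closer;
    }

    /** Returns the shared instance, creating it on the first call only. */
    synchronized T get() {
        if (instance == null) {
            T created = creator.get();
            // Register the hook right after creation so the resource is released on JVM exit.
            Runtime.getRuntime().addShutdownHook(new Thread(() -> closer.accept(created)));
            instance = created;
        }
        return instance;
    }
}

// Possible wiring in a handler class (assumes JPA is on the classpath; "my-unit" is hypothetical):
//
//   private static final LazySingleton<EntityManagerFactory> EMF = new LazySingleton<>(
//       () -> Persistence.createEntityManagerFactory("my-unit"),
//       EntityManagerFactory::close);
```

Keeping the holder in a static field means every invocation that lands on a warm instance skips the slow factory creation and only pays for `get()`.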
What happens in practice is that the Lambda function is spun up and kept around for a while even when no data is being processed. (This is true for both synchronous and asynchronous functions.) Under this condition, when the next invocation comes, your function is executed in the same runtime environment, and it can re-use the `EntityManagerFactory` to create new `EntityManager` instances. The major difference between synchronous and asynchronous invocations is the amount of concurrency. Synchronous functions have very limited concurrency, whereas the concurrency of asynchronous functions can vary widely, so with asynchronous functions you bear a larger startup cost because it happens more often as the pool of function instances scales up and down.
Yes, the function's runtime environment will be periodically killed, whether synchronous or asynchronous, especially when no traffic is seen, and you then have the slow startup time, but subsequent invocations will re-use the same runtime environment(s) in my experience, especially when using synchronous functions, like those hooked into a Kinesis Stream.
Incidentally, I highly recommend limiting yourself to synchronous writers, as this limits the number of Lambda instances, and hence how many JDBC pools are connecting to your database. (This can become a serious issue on the DB side, which generally only accepts a few dozen or a few hundred connections, while each JDBC pool will hold roughly 10 connections.) For Kinesis Streams, the number of Lambda instances is equal to the number of shards, and if your records are smallish, you can greatly increase the shard capacity using KPL and record aggregation. The Lambda function itself will only receive less than 6 MB of data from the stream per invocation.
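The connection arithmetic above can be made concrete with a small sketch. The figures used here (8 shards, pools of 10, a database limit of 100 connections) are hypothetical, the class and method names are purely illustrative, and it assumes one Lambda instance per shard as described above.

```java
/** Back-of-the-envelope check of how many DB connections a set of Lambda instances may open. */
final class ConnectionBudget {
    private ConnectionBudget() {}

    /** Worst-case connections opened by all warm instances together. */
    static int totalConnections(int lambdaInstances, int poolSizePerInstance) {
        return Math.multiplyExact(lambdaInstances, poolSizePerInstance);
    }

    /** Largest per-instance pool that keeps the total within the database limit. */
    static int maxPoolSize(int dbConnectionLimit, int lambdaInstances) {
        if (lambdaInstances <= 0) {
            throw new IllegalArgumentException("lambdaInstances must be positive");
        }
        return dbConnectionLimit / lambdaInstances;
    }

    public static void main(String[] args) {
        int shards = 8; // hypothetical: one Lambda instance per Kinesis shard
        System.out.println(totalConnections(shards, 10)); // prints 80
        System.out.println(maxPoolSize(100, shards));     // prints 12
    }
}
```

Since a single Lambda instance handles one invocation at a time, a pool much smaller than 10 is usually enough per instance, which leaves more headroom on the database side.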

-
Thanks for your answer. If I call `Persistence.createEntityManagerFactory` once and store it, let's say, in a static variable, then when should I close it? As far as I know, Lambda doesn't provide a shutdown hook or anything similar. – Archie Feb 23 '18 at 13:08
-
Use Runtime.getRuntime().addShutdownHook() to close the EntityManagerFactory. You shouldn’t use this approach in a servlet container, but it works well with Lambda and J2SE applications. – SplinterReality Feb 25 '18 at 01:09
-
So... which JPA implementation is available in an AWS lambda? And which JARs do we need as dependencies in order to use it in our lambda? – peter.petrov Mar 23 '20 at 11:18
-
@peter.petrov It's very much bring your own provider, in the same way you need to when using Tomcat or a J2SE application instead of a J2EE server. In all truth, you should just think of Lambdas as a J2SE application with a very specific entry point. If you'd like some demo code, I can probably cut some out of my application as an example, but it's nothing special. Just initialize JPA in the same way you would in a J2SE application using Persistence.createEntityManagerFactory, then immediately add a Runtime.getRuntime().addShutdownHook() and you're basically golden. – SplinterReality Mar 24 '20 at 12:52
-
@SplinterReality It's OK, thanks very much for the help. I think I got it all working today. Small question: should I close only the entity manager factory or 1) the entity manager and then 2) the entity manager factory? Or... maybe closing the parent EMF closes the child EM too? – peter.petrov Mar 24 '20 at 15:25
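On the question of what to close and when, one common arrangement is to open an `EntityManager` per invocation and close it before the handler returns, leaving the shared `EntityManagerFactory` to the shutdown hook. A minimal sketch of that per-invocation lifecycle follows. The helper name `PerInvocation` is hypothetical, and it is written against plain JDK functional interfaces so it does not depend on a particular JPA version. The trailing comment shows how it might be called with JPA, where `emf`, `Order`, and `id` are assumed names.

```java
import java.util.function.Consumer;
import java.util.function.Function;
import java.util.function.Supplier;

final class PerInvocation {
    private PerInvocation() {}

    /** Opens a short-lived resource, runs the work, and always closes the resource afterwards. */
    static <R, T> T with(Supplier<R> open, Consumer<R> close, Function<R, T> work) {
        R resource = open.get();
        try {
            return work.apply(resource);
        } finally {
            // Runs on success and on exceptions, so no resource outlives the invocation.
            close.accept(resource);
        }
    }
}

// Possible use inside a handler (assumes JPA on the classpath; emf, Order and id are hypothetical):
//
//   Order order = PerInvocation.with(
//       emf::createEntityManager,
//       EntityManager::close,
//       em -> em.find(Order.class, id));
```

The factory is deliberately not closed here: it is meant to survive across invocations on a warm instance.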
I'll answer my own question to save time for others who want to try AWS Lambda with JPA/Hibernate. While it IS possible to use JPA without an app server or container, in the case of AWS Lambda there are a lot of limitations. See this for example. This makes it impractical to use JPA with Lambda. So I decided to interact with the database through JDBC directly (to be more precise, using Spring's JdbcTemplate).

-
Sorry, the link is indeed broken. It referred to a problem when the entity manager factory was created in the Lambda function. When running multiple times, it could take seconds to create the factory. – Archie Nov 05 '18 at 10:55
-
@Archie since SplinterReality explained what you can do to prevent creating multiple EntityManagerFactory, that should be marked as the recommended answer. – user2910265 Feb 20 '19 at 19:03
-
@user2910265 I am still convinced that it's too much pain in the ass to tickle and configure JPA for a stateless and multithreaded / multiprocessor environment like Lambda. And I never tried his suggestion to see if it works or not. Neither could I find any example on the internet that shows it works. Hence I would keep this as recommended answer, as I believe plain JDBC better matches the nature of lambda. – Archie Feb 21 '19 at 09:35
-
@Archie that's fair if you haven't tried it. I'm trying it and I'll report back on how it went. – user2910265 Feb 25 '19 at 15:55
-
@user2910265 Awesome. Let me know about your findings. Also feel free to add it as an answer to this question. – Archie Feb 26 '19 at 13:32