I'm having some strange problems running Spark with sparklyr.
I'm currently on an R production server, connecting to my Spark cluster in client mode via spark://<my server>:7077,
and then pulling data from an MS SQL Server.
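For reference, this is roughly how I connect and pull the data (the server names, credentials, and table names below are placeholders, not my exact values):

```r
library(sparklyr)

# Connect to the standalone cluster in client mode
sc <- spark_connect(
  master  = "spark://<my server>:7077",
  version = "2.2.1"
)

# Pull a table from MS SQL Server over JDBC
# (the SQL Server JDBC driver jar also has to be on the Spark classpath)
dat <- spark_read_jdbc(
  sc,
  name = "my_table",
  options = list(
    url      = "jdbc:sqlserver://<sql server>:1433;databaseName=<db>",
    dbtable  = "<schema>.<table>",
    user     = "<user>",
    password = "<password>",
    driver   = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
  )
)
```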
I was able to do this with no issues until recently, but I was just given a bigger cluster and am now having memory issues.
At first I was getting inexplicable 'out of memory' errors during my processing. This happened a few times, and then I started getting 'out of memory: unable to create new thread' errors. I checked the number of threads I was using against the per-user maximum on both the R production server and the Spark server, and I was nowhere near the max.
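This is roughly how I compared thread usage to the per-user limit on both machines (run from R through the shell; `<my user>` is a placeholder and the count is only approximate):

```r
system("ulimit -u")                            # max user processes/threads
system("ps -eLf | grep '^<my user>' | wc -l")  # rough current thread count for that user
```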
I restarted my master node and am now getting:
# There is insufficient memory for the Java Runtime Environment to continue.
# Cannot create GC thread. Out of system resources.
What the heck is going on??
Here are my specs:
- Spark Standalone running via the root user.
- Spark version 2.2.1
- Sparklyr version 0.6.2
- Red Hat Linux