I am working on a Dataproc cluster setup to test all of its features. I created a cluster template and have been running the creation command almost every day, but this week it stopped working. The error printed is:
ERROR: (gcloud.dataproc.clusters.create) Operation [projects/cluster_name/regions/us-central1/operations/number] failed: Failed to initialize node cluster_name-m: Component ranger failed to activate post-hdfs See output in: gs://cluster_bucket/google-cloud-dataproc-metainfo/number/cluster_name-m/dataproc-post-hdfs-startup-script_output.
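I pulled that output file with gsutil:

gsutil cat gs://cluster_bucket/google-cloud-dataproc-metainfo/number/cluster_name-m/dataproc-post-hdfs-startup-script_output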
The error I find in that output is:
<13>Mar 3 12:22:59 google-dataproc-startup[15744]: <13>Mar 3 12:22:59 post-hdfs-activate-component-ranger[15786]: ERROR: Error CREATEing SolrCore 'ranger_audits': Unable to create core [ranger_audits] Caused by: Java heap space
<13>Mar 3 12:22:59 google-dataproc-startup[15744]: <13>Mar 3 12:22:59 post-hdfs-activate-component-ranger[15786]:
<13>Mar 3 12:22:59 google-dataproc-startup[15744]: <13>Mar 3 12:22:59 post-hdfs-activate-component-ranger[15786]: + exit_code=1
<13>Mar 3 12:22:59 google-dataproc-startup[15744]: <13>Mar 3 12:22:59 post-hdfs-activate-component-ranger[15786]: + [[ 1 -ne 0 ]]
<13>Mar 3 12:22:59 google-dataproc-startup[15744]: <13>Mar 3 12:22:59 post-hdfs-activate-component-ranger[15786]: + echo 1
<13>Mar 3 12:22:59 google-dataproc-startup[15744]: <13>Mar 3 12:22:59 post-hdfs-activate-component-ranger[15786]: + log_and_fail ranger 'Component ranger failed to activate post-hdfs'
The creation command that I ran is:
gcloud dataproc clusters create cluster_name \
--bucket cluster_bucket \
--region us-central1 \
--subnet subnet_dataproc \
--zone us-central1-c \
--master-machine-type n1-standard-8 \
--master-boot-disk-size 500 \
--num-workers 2 \
--worker-machine-type n1-standard-8 \
--worker-boot-disk-size 1000 \
--image-version 2.0.29-debian10 \
--optional-components ZEPPELIN,RANGER,SOLR \
--autoscaling-policy=autoscale_rule \
--properties="dataproc:ranger.kms.key.uri=projects/gcp-project/locations/global/keyRings/kbs-dataproc-keyring/cryptoKeys/kbs-dataproc-key,dataproc:ranger.admin.password.uri=gs://cluster_bucket/kerberos-root-principal-password.encrypted,hive:hive.metastore.warehouse.dir=gs://cluster_bucket/user/hive/warehouse,dataproc:solr.gcs.path=gs://cluster_bucket/solr2,dataproc:ranger.cloud-sql.instance.connection.name=gcp-project:us-central1:ranger-metadata,dataproc:ranger.cloud-sql.root.password.uri=gs://cluster_bucket/ranger-root-mysql-password.encrypted" \
--kerberos-root-principal-password-uri=gs://cluster_bucket/kerberos-root-principal-password.encrypted \
--kerberos-kms-key=projects/gcp-project/locations/global/keyRings/kbs-dataproc-keyring/cryptoKeys/kbs-dataproc-key \
--project gcp-project \
--enable-component-gateway \
--initialization-actions gs://goog-dataproc-initialization-actions-us-central1/cloud-sql-proxy/cloud-sql-proxy.sh,gs://cluster_bucket/hue.sh,gs://goog-dataproc-initialization-actions-us-central1/livy/livy.sh,gs://goog-dataproc-initialization-actions-us-central1/sqoop/sqoop.sh \
--metadata livy-timeout-session='4h' \
--metadata "hive-metastore-instance=gcp-project:us-central1:hive-metastore" \
--metadata "kms-key-uri=projects/gcp-project/locations/global/keyRings/kbs-dataproc-keyring/cryptoKeys/kbs-dataproc-key" \
--metadata "db-admin-password-uri=gs://cluster_bucket/hive-root-mysql-password.encrypted" \
--metadata "db-hive-password-uri=gs://cluster_bucket/hive-mysql-password.encrypted" \
--scopes=default,sql-admin
I know this is related to the Ranger/Solr setup, but I don't know how to increase the Solr heap size without writing an extra initialization script or building a custom machine image. If you have any idea how to solve this, or need more information about my setup, please let me know.
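For context, the kind of workaround I'm trying to avoid would be an extra initialization action along these lines. This is only a sketch: I'm assuming the Solr component reads SOLR_HEAP from /etc/default/solr, that the service is called solr, and the 4g value is just an example; none of that is confirmed for this image version.

#!/bin/bash
# Hypothetical init action: raise Solr's heap on the master node so the
# ranger_audits core creation doesn't run out of memory.
# Assumptions: SOLR_HEAP is read from /etc/default/solr and the service is named "solr".
set -euo pipefail

ROLE="$(/usr/share/google/get_metadata_value attributes/dataproc-role)"
if [[ "${ROLE}" == "Master" ]]; then
  echo 'SOLR_HEAP="4g"' >> /etc/default/solr
  systemctl restart solr
fi

I'd much rather set this through a cluster property or flag on the create command, if such an option exists.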