
We have an HDP cluster with 528 DataNode machines.

In the Ambari HDFS configs we defined 3 config groups, because the machines differ in memory:

  1. 212 DataNode machines have 32G

  2. 119 DataNode machines have 64G

  3. 197 DataNode machines have 128G

So in Ambari we have one config group for each memory size.

Now we need to configure the parameter "DataNode maximum Java heap size" (dtnode_heapsize) according to each machine's memory.

So we want to set the following:

On the 212 DataNode machines with 32G, DataNode maximum Java heap size will be set to 10G

On the 119 DataNode machines with 64G, DataNode maximum Java heap size will be set to 15G

On the 197 DataNode machines with 128G, DataNode maximum Java heap size will be set to 20G
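The plan above can be written down as a small lookup table, which is handy once the values have to be pushed by a script. This is only a sketch: the 10000 value for 10G comes from the configs.py call in the question, while 15000 and 20000 are assumed to follow the same "1G = 1000M" convention, and the name `heap_value_for` is made up for illustration.

```python
# Planned DataNode heap sizes in MB, keyed by node RAM in GB.
# 10000 is the value used in the question; 15000 and 20000 are ASSUMED
# to follow the same "1G = 1000M" convention.
HEAP_MB_BY_RAM_GB = {32: 10000, 64: 15000, 128: 20000}


def heap_value_for(ram_gb: int) -> str:
    """Return the dtnode_heapsize value (MB, as a string) for a node's RAM size."""
    try:
        return str(HEAP_MB_BY_RAM_GB[ram_gb])
    except KeyError:
        raise ValueError(f"no heap size planned for {ram_gb}G nodes") from None


if __name__ == "__main__":
    for ram_gb in sorted(HEAP_MB_BY_RAM_GB):
        print(f"{ram_gb}G nodes -> dtnode_heapsize={heap_value_for(ram_gb)}")
```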

To configure DataNode maximum Java heap size on each config group, we tried the following tool, configs.py:

/var/lib/ambari-server/resources/scripts/configs.py --user=admin --password=admin --port=8080 --action=set --host=ambari_server_node --cluster=hdp_cluster7 --config-type=hadoop-env -k "dtnode_heapsize" -v "10000"

The above command sets dtnode_heapsize to 10G (10000M).

When we ran the command, dtnode_heapsize was updated, but not in the config groups!

What was updated is the parameter in the default group, "Default".

So how can we set dtnode_heapsize for the relevant config group?

We are not sure that configs.py supports config groups; if it does not, we may need another approach.
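One possible other approach is to update the config group directly through the Ambari REST API. The sketch below only builds the request; it does not send it. It rests on assumptions that should be checked against a request captured from your own Ambari version: that config groups live under `/api/v1/clusters/<cluster>/config_groups/<id>`, that a PUT body with a `ConfigGroup` object and a `desired_configs` list (the shape the Ambari web UI appears to send) is accepted, and that a fresh `version<timestamp>` tag is needed for each change. The function names are made up for illustration.

```python
import base64
import json
import time
import urllib.request


def build_group_update(cluster: str, group: dict, config_type: str,
                       key: str, value: str) -> dict:
    """Build a PUT body overriding one property in an existing config group.

    `group` is the "ConfigGroup" object from a GET on that group.
    ASSUMPTION: this body shape mirrors what the Ambari web UI sends.
    """
    return {
        "ConfigGroup": {
            "cluster_name": cluster,
            "group_name": group["group_name"],
            "tag": group["tag"],  # the service the group belongs to, e.g. "HDFS"
            "description": group.get("description", ""),
            "hosts": [{"host_name": h["host_name"]} for h in group.get("hosts", [])],
            "desired_configs": [{
                "type": config_type,
                # ASSUMPTION: each new config version needs a unique tag
                "tag": f"version{int(time.time() * 1000)}",
                "properties": {key: value},
            }],
        }
    }


def build_put_request(base_url: str, cluster: str, group_id: int, body: dict,
                      user: str, password: str) -> urllib.request.Request:
    """Wrap the body in an authenticated PUT request (nothing is sent here)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        f"{base_url}/api/v1/clusters/{cluster}/config_groups/{group_id}",
        data=json.dumps(body).encode(),
        method="PUT",
        headers={
            "Authorization": f"Basic {token}",
            "X-Requested-By": "ambari",  # Ambari rejects modifying calls without it
        },
    )
```

Sending would then be `urllib.request.urlopen(req)` once per group, with the group id and value for that memory size. Note that the body carries only the one property, so if a group already overrides other hadoop-env properties they would likely need to be merged into `properties` first.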

Another related post that may help: ambari + API syntax in order to change the parameters of the ambari services

Note: the goal is to automate the settings in Ambari through the REST API or scripts, so changing them manually isn't acceptable.

Judy
  • I'd suggest opening the network tab in your browser, then click through `Manage Config Groups` and add hosts to those groups and update configs as needed, and then look what requests are being made and replicate them in your own application... Also, max heap size [doesn't need to be much larger than 8GB for datanodes](https://stackoverflow.com/a/53654899/2308683), otherwise, GC pauses will take longer. You should reserve the RAM for YARN containers instead – OneCricketeer Aug 19 '22 at 16:13
  • About "and then look what requests are being made and replicate them in your own application": how can we do it? – Judy Aug 20 '22 at 19:36
  • https://developer.chrome.com/docs/devtools/network/ From there, you can copy network requests as curl commands – OneCricketeer Aug 21 '22 at 14:06
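Once a request has been copied from the network tab as a curl command, it can be rebuilt in a script. As a hedged illustration, the sketch below prepares a GET that lists the cluster's config groups and extracts their ids, which any later per-group update would need. The URL path, the `fields=*` query, and the `items` / `ConfigGroup` response layout are assumptions about the Ambari API to be compared with what the browser actually shows; the function names are hypothetical.

```python
import base64
import urllib.request


def list_groups_request(base_url: str, cluster: str,
                        user: str, password: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET for all config groups of a cluster."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        # ASSUMPTION: endpoint and query as seen in the Ambari web UI traffic
        f"{base_url}/api/v1/clusters/{cluster}/config_groups?fields=*",
        headers={"Authorization": f"Basic {token}", "X-Requested-By": "ambari"},
    )


def group_ids_by_name(response: dict, service: str) -> dict:
    """Map group_name -> id for the groups tagged with the given service.

    ASSUMPTION: the response looks like
    {"items": [{"ConfigGroup": {"id": ..., "group_name": ..., "tag": ...}}]}.
    """
    return {
        item["ConfigGroup"]["group_name"]: item["ConfigGroup"]["id"]
        for item in response.get("items", [])
        if item["ConfigGroup"].get("tag") == service
    }
```

The response of `urllib.request.urlopen(req)` can be parsed with `json.load` and passed to `group_ids_by_name(parsed, "HDFS")`.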

0 Answers