Will we have a difference in performance between a normal Hadoop cluster and a secure Hadoop cluster configured with Kerberos and SSL?

Assuming the machine configuration is the same for both types of cluster, does the time taken to complete a job differ? If so, are there any known metrics for the time difference? For example:

  • Normal cluster - 1.5 hours
  • Secure cluster - 2.5 hours
Dinesh Kumar P

1 Answer

Yes. There is an overhead in all API calls due to Kerberos and SSL.

Job completion time will differ, but there is no way of knowing by how much without knowing how many API calls the job makes. In general the impact will be minimal. However, because you are introducing another network component into your workflow (the KDC), you could see significant degradation depending on the size of your cluster, any network issues between the KDC and the cluster, and how the KDC is configured. See the following for more information:

https://community.hortonworks.com/questions/31205/performance-impact-of-security-ssl-tde-ranger-kerb.html

https://www.cloudera.com/documentation/enterprise/5-8-x/topics/impala_scalability.html#kerberos_overhead_cluster_size

HTTP vs HTTPS performance
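
To make the "it depends on how many API calls the job makes" point concrete, here is a rough back-of-envelope sketch. It is not a benchmark: the function name, the linear cost model, and every number in it are assumptions chosen only to show the arithmetic, so substitute values measured on your own cluster.

```python
def estimate_secure_runtime(baseline_s, api_calls, per_call_overhead_s,
                            kdc_requests=0, kdc_round_trip_s=0.0):
    """Hypothetical linear model: baseline runtime, plus a fixed extra cost
    for each secured API call, plus one round trip per request to the KDC."""
    call_overhead = api_calls * per_call_overhead_s
    kdc_overhead = kdc_requests * kdc_round_trip_s
    return baseline_s + call_overhead + kdc_overhead


# All inputs below are made-up placeholders, not measurements.
baseline = 3600.0  # job time on the non-secure cluster, in seconds
secure = estimate_secure_runtime(
    baseline,
    api_calls=100_000,          # assumed number of RPC/HTTP calls in the job
    per_call_overhead_s=0.002,  # assumed extra cost per call (Kerberos + SSL)
    kdc_requests=500,           # assumed number of ticket requests
    kdc_round_trip_s=0.05,      # assumed latency to the KDC
)
print(f"estimated extra time: {secure - baseline:.0f} s")
```

With a model like this, a slow or distant KDC shows up as a larger `kdc_round_trip_s`, which is why cluster size and KDC placement matter more than the encryption cost itself.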

tk421