We are currently doing a feasibility study on migrating from CDH (Cloudera's Distribution including Apache Hadoop) to CDP (Cloudera Data Platform) with respect to Spark (we are currently on version 1.6).

When I checked the documentation, I understood that Spark 1.6 is not supported on CDP and has to be refactored to 2.4. The manual steps for doing so are given here:

https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/upgrade-cdh/topics/cdp-one-workload-migra...

However, we are planning to migrate to Spark 3.x on CDP. One of the Cloudera blogs covers the same upgrade (link below):

https://blog.cloudera.com/upgrade-journey-the-path-from-cdh-to-cdp-private-cloud/

As part of the pre-upgrade step, the blog says that we need to convert Spark 1.x jobs to 2.4.5:

Phase 2: Pre-upgrade
  • Backup existing cluster using the backup steps list here
  • Confirm if all the prerequisites are addressed.
  • Ensure all outstanding dependencies are met.
  • Convert Spark 1.x jobs to Spark 2.4.5.
  • Test and validate the jobs to ensure all the required code changes are performed and tested.

My question is:

If we go from Spark 1.x to 3.x while moving from CDH to CDP, is it mandatory to have an intermediate step, converting Spark 1.x to 2.x first and then 2.x to 3.x? If yes, is the 1.x to 2.x refactoring automated, or does it have to be done manually following the steps in the Cloudera documentation?

https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/upgrade-cdh/topics/cdp-one-workload-migration-spark16-to-spark24.html

If not, can we refactor directly from Spark 1.x to 3.x when moving from CDH to CDP? Kindly help.

Thanks in advance.

I tried looking for a solution in the existing Cloudera documentation but could not find anything. For migrating Spark workloads to CDP, the documentation covers only the following refactoring paths:

Spark 1.6 to Spark 2.4 Refactoring: Because Spark 1.6 is not supported on CDP, you need to refactor Spark workloads from Spark 1.6 on CDH or HDP to Spark 2.4 on CDP.

Spark 2.3 to Spark 2.4 Refactoring: Because Spark 2.3 is not supported on CDP, you need to refactor Spark workloads from Spark 2.3 on CDH or HDP to Spark 2.4 on CDP.

Spark 2.4 to 3.x

But if we are on Spark 1.6, moving first to 2.4 and then to 3.x would be double the effort.
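To get a rough idea of how much manual refactoring is involved on either path, one option is to inventory the Spark 1.6-era APIs that the jobs use. The following is only a minimal sketch: the pattern list is illustrative and far from exhaustive (it is not taken from the Cloudera documentation), and `scan_source` and `LEGACY_PATTERNS` are hypothetical names made up for this example. It flags the `SQLContext`/`HiveContext` entry points, which `SparkSession` replaced in Spark 2.0, and `registerTempTable`, which was deprecated in 2.0 in favor of `createOrReplaceTempView`.

```python
import re

# Illustrative, non-exhaustive patterns (an assumption for this sketch, not an
# official list) for APIs that were replaced or deprecated starting in Spark 2.0.
LEGACY_PATTERNS = {
    "SQLContext/HiveContext entry point (SparkSession since 2.0)":
        re.compile(r"\b(SQLContext|HiveContext)\b"),
    "registerTempTable (createOrReplaceTempView since 2.0)":
        re.compile(r"\bregisterTempTable\b"),
}


def scan_source(text):
    """Return a (line_number, label, line) tuple for each legacy API hit."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in LEGACY_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label, line.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical Spark 1.6-style PySpark snippet used only as scanner input.
    sample = """\
from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
df = sqlContext.read.json("events.json")
df.registerTempTable("events")
"""
    for lineno, label, line in scan_source(sample):
        print(f"{lineno}: {label}: {line}")
```

A scan like this only finds textual matches. It does not detect behavioral changes between versions, so each job would still need to be tested on the target Spark version.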

Adigkar