We'd like to deploy our Spark jobs with no downtime in data processing (currently each deployment costs us a 2-3 minute window). The simplest approach I can think of is a "blue/green deployment": spin up the new version of the Spark job, let it warm up, then shut down the old job. However, with Structured Streaming and checkpointing we can't do this, because the new Spark job sees that the latest checkpoint file already exists (written by the old job). I've attached a sample error below. Does anyone have any thoughts on a potential workaround?
I thought about copying the existing checkpoint directory to a new checkpoint directory for the newly created job. That should work as a workaround (some data might get reprocessed, but our DB would deduplicate it), but it feels very hacky and is something I'd rather not pursue.
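For concreteness, here's roughly what that copy workaround would look like. The paths, the Kafka source, and the parquet sink are all made up for illustration; the only point is that the new ("green") job resumes from a copy of the old ("blue") job's checkpoint directory instead of writing into the same one:

```scala
import org.apache.hadoop.fs.{FileSystem, FileUtil, Path}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

object BlueGreenDeploySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("job-green").getOrCreate()

    // Copy the blue (old) job's checkpoint directory to a fresh location for the
    // green (new) job. Paths are illustrative, not our real layout.
    val hadoopConf = spark.sparkContext.hadoopConfiguration
    val fs = FileSystem.get(hadoopConf)
    val blueCheckpoint  = new Path("/user/checkpoint/job")
    val greenCheckpoint = new Path("/user/checkpoint/job-green")
    FileUtil.copy(fs, blueCheckpoint, fs, greenCheckpoint,
      false /* deleteSource */, hadoopConf)

    // The green job reads the same source but checkpoints to the copied directory,
    // so it resumes from the blue job's committed offsets instead of colliding
    // with its checkpoint files (a batch or two may be reprocessed).
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "events")
      .load()

    val query = events.writeStream
      .format("parquet")
      .option("path", "/user/output/job")
      .option("checkpointLocation", greenCheckpoint.toString)
      .trigger(Trigger.ProcessingTime("30 seconds"))
      .start()

    query.awaitTermination()
  }
}
```

And for reference, here's the error the new job hits when it reuses the old checkpoint directory directly: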
Caused by: org.apache.hadoop.fs.FileAlreadyExistsException: rename destination /user/checkpoint/job/offsets/3472939 already exists
at org.apache.hadoop.hdfs.server.namenode.FSDirRenameOp.validateOverwrite(FSDirRenameOp.java:520)
at org.apache.hadoop.hdfs.server.namenode.FSDirRenameOp.unprotectedRenameTo(FSDirRenameOp.java:364)
at org.apache.hadoop.hdfs.server.namenode.FSDirRenameOp.renameTo(FSDirRenameOp.java:282)
at org.apache.hadoop.hdfs.server.namenode.FSDirRenameOp.renameToInt(FSDirRenameOp.java:247)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.renameTo(FSNamesystem.java:3677)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rename2(NameNodeRpcServer.java:914)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.rename2(ClientNamenodeProtocolServerSideTranslatorPB.java:587)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2045)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at org.apache.hadoop.hdfs.DFSClient.rename(DFSClient.java:1991)
at org.apache.hadoop.fs.Hdfs.renameInternal(Hdfs.java:335)
at org.apache.hadoop.fs.AbstractFileSystem.rename(AbstractFileSystem.java:678)
at org.apache.hadoop.fs.FileContext.rename(FileContext.java:958)
at org.apache.spark.sql.execution.streaming.HDFSMetadataLog$FileContextManager.rename(HDFSMetadataLog.scala:356)
at org.apache.spark.sql.execution.streaming.HDFSMetadataLog.org$apache$spark$sql$execution$streaming$HDFSMetadataLog$$writeBatch(HDFSMetadataLog.scala:160)
... 20 more
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.fs.FileAlreadyExistsException): rename destination /user/checkpoint/job/offsets/3472939 already exists