I have a Glue script using PySpark, and I need to create a unique surrogate key. I've been using `row_number` and `monotonically_increasing_id`, and it works on the first job, but every time I upload new files and run the job again the numbering starts back at 1. Any guidance on how I can keep the sequence continuing when new files are added? For more info, I'm loading the data into an Oracle database.
Current table
Column A | Column B |
---|---|
Jon | 1 |
doe | 2 |
Table after an upload
Column A | Column B |
---|---|
Jon | 1 |
doe | 2 |
Jean | 1 |
What I want after an upload
Column A | Column B |
---|---|
Jon | 1 |
doe | 2 |
Jean | 3 |
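
To make the goal concrete, here is the numbering I'm after, sketched in plain Python rather than PySpark. `assign_surrogate_keys` and `existing_max_key` are names I made up for illustration; in the real job the maximum would presumably have to be read from the Oracle table before each run.

```python
def assign_surrogate_keys(existing_max_key, new_values):
    """Number new rows starting right after the highest key already loaded."""
    # enumerate(..., start=1) plays the role of row_number() within the batch;
    # adding existing_max_key shifts the batch past the keys already in the table.
    return [(value, existing_max_key + n) for n, value in enumerate(new_values, start=1)]


# Rows already in the target table (the "Current table" above).
current_table = [("Jon", 1), ("doe", 2)]

# Highest key loaded so far; 0 if the table is empty, so a first load starts at 1.
existing_max_key = max((key for _, key in current_table), default=0)

new_rows = assign_surrogate_keys(existing_max_key, ["Jean"])
print(current_table + new_rows)
# [('Jon', 1), ('doe', 2), ('Jean', 3)]
```

I assume the PySpark equivalent would be to add that maximum as a literal to the `row_number()` result, but I don't know whether that is the right way to do it in Glue.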