0

I am replicating my data from Azure SQl DB TO Azure SQL DB. I have some tables with date columns and some tables with just the ID columns which are assigning primary key. While performing incremental load in ADF, I can select date as watermark column for the tables which have date column and id as watermark column for the tables which has id column, But the issue is my id has guid values, So can I i take that as my watermark column ? and if yes while copy activity process it gives me following error in ADF

Please see the image for above reference

Please see the image for above reference

How can I overcome this issue. Help is appreciated

Thank you Gp

I have tried dynamic mapping https://martinschoombee.com/2022/03/22/dynamic-column-mapping-in-azure-data-factory/ from here but it does not work it still gives me same error.

burnsi
  • 6,194
  • 13
  • 17
  • 27
gmp
  • 1
  • 3
  • Is copy activity failing only during incremental copy? – Aswin Jan 25 '23 at 05:43
  • can you show your dynamic mapping output and data types of the underlying source and sink columns? – Geezer Jan 25 '23 at 09:41
  • @Aswin yes it fails during copy activity – gmp Jan 25 '23 at 10:02
  • @Geezer the data types at both source and sink are uniqueidentifier and the value which I was passing for column mapping parameter was : {"type": "TabularTranslator","mappings": [{"source": {"name": "RKey","type": "guid"},{"sink": {"name": "RKey","type": "String"}}]} – gmp Jan 25 '23 at 10:03
  • What is the datatype and constraints in Sink SQL table for id column? – Aswin Jan 25 '23 at 10:03
  • @Aswin its guid at sink and ideally constraint should be primary key but during auto creation table through adf primary key is not being set on the id column – gmp Jan 25 '23 at 10:09
  • @Gauri if the datatype in source is `uniqueidentifier` then why are using `String` datatype in sink the mapping? – Geezer Jan 25 '23 at 10:15
  • @Geezer I was trying to test it, but I did try keeping same datatype i.e. guid at both sides, still it fails. I have created paramater columnmapping with type object and in the copy activity I have added mapping as @pipeline().parameters.ColumnMappping – gmp Jan 25 '23 at 10:27

2 Answers2

0

Regarding your question about watermak: A watermark is a column that has the last updated time stamp or an incrementing key So GUID column would not be a good fit. Try to find a date column, or an integer identity which is ever incrementing, to use as watermark. Since your source is SQL server, you can also use change data capture. Links: Incremental loading in ADF Change data capture

Regards, Chen

Chen Hirsh
  • 736
  • 1
  • 1
  • 13
  • is there any documentation where I can go through the cons of using guid as watermark column ? – gmp Jan 26 '23 at 09:56
  • The link I wrote above shows what are good candidate for a watermark column. I don't think guid can be used. – Chen Hirsh Jan 26 '23 at 11:18
0

The watermark logic takes advantange of the fact that all the new records which are inserted after the last watermark saved should only be considered for copying from source A to B , basically we are using ">=" operator to our advantage here .

In case of guid you cannot use that logic as guid cann surely be unique but not ">=" or "=<" will not work.

HimanshuSinha
  • 1,650
  • 2
  • 6
  • 10
  • that is true, I agree on that point, but I dont have any other column in the table on which I can set incremental load other than the guid , is there any workaround for this other than CT or CDC – gmp Jan 31 '23 at 09:27
  • How big is the table ? What is the incremental growth we are are looking in terms of record count ? – HimanshuSinha Jan 31 '23 at 22:04
  • At the moment table is not big, records are less than lakhs but the size can increase later – gmp Feb 01 '23 at 09:27
  • is there any way out for this as the CT approach would not work in my case due to some restrictions – gmp Feb 06 '23 at 10:11
  • A far fetched Idea and let me be honest never tried in practice . A uniqueidentifier (GUID) contains text, so it's only logical that you cannot convert it directly to an INT. What you can do is create a mapping table. Generate your GUID and insert it into a table containing an IDENTITY column . Basically you join the mapping table on the GUID and use INT as an watermark . Good luck ;) – HimanshuSinha Feb 08 '23 at 05:08