0

I wanted to replace populate property address data the null values in propertyaddress column with the help of self join. But it runs for over thirty minutes without returning and I have to cancel the task. Can anyone help me?

MERGE 
INTO housedata a
      USING 
      (select distinct b.id
             , b.parcelid
             , b.propertyaddress
             , NVL(a.propertyaddress, b.propertyaddress)
      from housedata a
          ,housedata b
             -- WHERE a.propertyaddress is null
               ) src
              ON (a.parcelid = src.parcelid and a.id != src.id)
  WHEN MATCHED THEN 
  UPDATE 
  SET propertyaddress = NVL(a.propertyaddress,src.propertyaddress)
  WHERE a.propertyaddress is null

my code

Slava Rozhnev
  • 9,510
  • 6
  • 23
  • 39
  • 1
    What does "it does not work" mean? It works, but doesn't produce result you wanted. It doesn't work at all. It raises an error (if so, which one)? Sample data might help us help you. – Littlefoot May 27 '21 at 16:08
  • @Littlefoot Thanks. I mean that Query is running for more than 30 mins and I have to cancel the task. – Mohammad Liton Hossain May 27 '21 at 16:48
  • 1
    Is this a one off task, or a regularly occuring one? How long do you expect it to take? Can you provide a SQL Monitor report of your run. – BobC May 27 '21 at 17:23
  • Tuning SQL is a matter of many things. We need far more details. Please read [this thread on asking Oracle tuning questions](https://stackoverflow.com/a/34975420/146325). – APC May 27 '21 at 18:17

1 Answers1

1

Your USING subquery has no join criteria for the two instances of the table, so it is going to generate a product of all the rows in both tables. You don't want that. Perhaps what you meant to code was something like this?

MERGE 
INTO housedata a
      USING 
      (select b.parcelid
             , max(b.propertyaddress) as propertyaddress
      from housedata b
      where b.propertyaddress is not null
      group by b.parcelid
               ) src
   ON (a.parcelid = src.parcelid)
  WHEN MATCHED THEN 
  UPDATE 
  SET a.propertyaddress = src.propertyaddress
  WHERE a.propertyaddress is null

Note this logic handles multiple instances of parcelid with difference values of propertyaddress. Without seeing your data or knowing your business rules it's hard to know whether than is a valid use case, and whether max() is the appropriate solution. Only you can decide that.


I have given my screenshot of my orginal table which contains null values in property address.

To give a proper answer we need to see a representative sample of the whole data set. So it would be better if you showed some data filtered or sorted by parcelid, including rows where propertyaddress is not null. As it is, I have to guess what your problem might be.

I do not understand how my use case is solved by the max() function.

If you only have one row per parcelid with a populated propertyaddress then you don't need MAX(). However, if you have more than one such instance then you need some logic to choose one. Apart from anything else, MERGE requires it to generate a stable set. It doesn't have to be MAX(): MIN() would also work, maybe some other aggregate or analytic functions too. Again, you need to choose the the best fit for your business rules.

APC
  • 144,005
  • 19
  • 170
  • 281
  • Thanks a lot @APC. It works for me and now there are no more null values in my desired column. I have given my screenshot of my orginal table which contains null values in property address. I wanted to repalce those null values by the property address which have the same parcel ID. I do not understand how my use case is solved by the max() function. could u please help me to learn this? Thanks again. – Mohammad Liton Hossain May 28 '21 at 12:16
  • @MohammadLitonHossain - I have added a more detailed explanation of this point. – APC May 29 '21 at 04:38