-1
| num_A | num_B | name                      | birth_date | id |
|-------|-------|---------------------------|------------|----|
| 1234  | abcd  | M Rita Costa Santos       | 01/01/2000 | 1  |
| 3333  | uvwx  | M Rita Costa Santos       | 01/01/2000 | 1  |
| 5678  | efgh  | Maria Rita C Santos       | 01/01/2000 |    |
| 9101  | ijkl  | Rita Costa Santos         | 01/01/2000 | 1  |
| 1111  | mnop  | Maria Leonor Santos Silva | 02/03/2001 |    |
| 2222  | qrst  | Leonor Santos Silva       | 02/03/2001 | 2  |
| 4444  | yzab  | Leonor Santos Silva       | 30/08/1999 |    |

Imagine I have this table 1, but in a large scale. I want to find similar and equal values in the name column and if they are similar/equal, go to the column birth_date and see if they are equal. If yes, create the same id for this cases. So i want the final output to be like in table 2:

| num_A | num_B | name                      | birth_date | id |
|-------|-------|---------------------------|------------|----|
| 1234  | abcd  | M Rita Costa Santos       | 01/01/2000 | 1  |
| 3333  | uvwx  | M Rita Costa Santos       | 01/01/2000 | 1  |
| 5678  | efgh  | Maria Rita C Santos       | 01/01/2000 | 1  |
| 9101  | ijkl  | Rita Costa Santos         | 01/01/2000 | 1  |
| 1111  | mnop  | Maria Leonor Santos Silva | 02/03/2001 | 2  |
| 2222  | qrst  | Leonor Santos Silva       | 02/03/2001 | 2  |
| 4444  | yzab  | Leonor Santos Silva       | 30/08/1999 | 3  |

Thank you in advance.

Reeza
  • 20,510
  • 4
  • 21
  • 38
fcf
  • 5
  • 4
  • 1
    Most people here want sample table data and expected result as formatted text, not as images or links to images. – jarlh Nov 04 '20 at 12:43
  • Does this answer your question? [Getting the closest string match](https://stackoverflow.com/questions/5859561/getting-the-closest-string-match) – Bill Huang Nov 04 '20 at 12:47
  • how large a scale ? Comparing names pairwise for nearness (similarity) using say an expensive function such as `COMPGED` over N items (large scale) is O(N^2) and requires `n(n-1)/2` comparisons. What if an exactly same name has a multiple or different birthdates ? – Richard Nov 04 '20 at 14:23
  • @jarlj I am new here, I tried to post a table, like I did in another website, but I was not able to. Thank you :) – fcf Nov 04 '20 at 14:51
  • @Richard what I mean by large scale is that I have nearly 12 000 records. If an exactly same name has a multiple or different birthdates, it will have a new ID, like I put in the last record :) Thank you – fcf Nov 04 '20 at 14:53
  • thank you @BillHuang :) I am trying to understand it now! – fcf Nov 04 '20 at 14:55
  • Use a web tool such as https://thisdavej.com/copy-table-in-excel-and-paste-as-a-markdown-table/ to create a text based table from an Excel or spreadsheet range copied to the clipboard. – Richard Nov 04 '20 at 14:58
  • In python look up the dedupe package. – Reeza Nov 04 '20 at 15:14

1 Answers1

0

I think the DENSE_RANK() function can solve your problem.

SELECT
   num_A,
   num_B,
   name,
   birth_date,
   DENSE_RANK() OVER(ORDER BY birth_date) AS rank
FROM Table1

Result from query above