I have some tabular data as follows.
| | A | B | C | D |
|---|----------|--------------|------------------|---------------------------------------------------|
| | | p1 | p2 | pn |
| 1 | Lanterns | Bruce Wayne | Jean-Paul Valley | Dick Grayson; Terry McGinnis; Jean-Paul Valley |
| 2 | Bats | Alan Scott | Hal Jordan | Guy Gardner; John Stewart; Kyle Rayner; Simon Baz |
| 3 | Fates | Kent Nelson | Khalid Nassour | Hector Hall; Khalid Nassour; Khalid Ben-Hassin |
| 4 | Supes | Clark Kent | John Henry Irons | Conner Kent; Hank Henshaw; Kong Kenan |
| 5 | Spideys | Peter Parker | Peter Parker | Ben Reilly; Miles Morales |
| 6 | Irons | Tony Stark | Happy Hogan | James Rhodes; Eddie March; James Rhodes |
For each row, I want to find if duplication exists between columns B, C as well as semi colon separated values of column D.
How to do this in excel?
The desired output will be as follows.
| X | A | B | C | D | E |
|---|----------|--------------|------------------|---------------------------------------------------|-------|
| | | p1 | p2 | pn | |
| 1 | Lanterns | Bruce Wayne | Jean-Paul Valley | Dick Grayson; Terry McGinnis; Jean-Paul Valley | TRUE |
| 2 | Bats | Alan Scott | Hal Jordan | Guy Gardner; John Stewart; Kyle Rayner; Simon Baz | FALSE |
| 3 | Fates | Kent Nelson | Khalid Nassour | Hector Hall; Khalid Nassour; Khalid Ben-Hassin | TRUE |
| 4 | Supes | Clark Kent | John Henry Irons | Conner Kent; Hank Henshaw; Kong Kenan | FALSE |
| 5 | Spideys | Peter Parker | Peter Parker | Ben Reilly; Miles Morales | TRUE |
| 6 | Irons | Tony Stark | Happy Hogan | James Rhodes; Eddie March; James Rhodes | TRUE |
Edit There was an error in the column names in the question causing lack in clarity. Fixed it now.
Update
Here is my attempt with VBA as suggested by @Foxfire And Burns And Burns
.
It is adapted from https://superuser.com/a/1005497/460054
Public Function HasDuplicates(list As String, delimiter As String) As String
Dim arrSplit As Variant, i As Long, tmpDict As Object, tmpOutput As Boolean
Set tmpDict = CreateObject("Scripting.Dictionary")
arrSplit = Split(list, delimiter)
tmpOutput = False
For i = LBound(arrSplit) To UBound(arrSplit)
If tmpDict.Exists(Trim(arrSplit(i))) Then
tmpOutput = True
Exit For
Else
tmpDict.Add Trim(arrSplit(i)), Trim(arrSplit(i))
End If
Next i
HasDuplicates = tmpOutput
'housekeeping
Set tmpDict = Nothing
End Function
Here is all possibel use cases again as suggested by @Foxfire And Burns And Burns
.
+---+-----+----+-----------+--------------------+-------+
| | A | B | C | D | E |
+---+-----+----+-----------+--------------------+-------+
| 1 | A | B | | A; B; | False |
| 2 | A | | | A; ; | True |
| 3 | | | | ; ; | True |
| 4 | G | K | G | G; K; G | True |
| 5 | N | M | O | N; M; O | False |
| 6 | N | N | O | N; N; O | True |
| 7 | V | U | X; Y; X | V; U; X; Y; X | True |
| 8 | P J | VK | P; J; V K | P J; VK; P; J; V K | False |
| 9 | VK | O | R; VK | VK; O; R; VK | True |
+---+-----+----+-----------+--------------------+-------+
The formula for columnD is =CONCATENATE(B2,"; ",C2, "; ",D2)
and for E is =HasDuplicates(E2, ";")
.
But here it is not handling empty cells. Rows 2 and 3 should also be False
.