7

I want to DeleteDuplicates from a list of lists keeping the sublist structure intact.

E.g.

{{1, 7, 8}, {4, 3}, {4, 1, 9}, {9, 2}}

becomes

{{1, 7, 8}, {4, 3}, {9}, {2}}

This is the extent of nesting, and empty sublists are ok.

Nakilon
  • 34,866
  • 14
  • 107
  • 142
  • Welcome to stack exchange. Nice first question. You may also want to know about the Mathematica (beta) site: http://mathematica.stackexchange.com/ – DavidC Feb 16 '12 at 22:17

3 Answers3

11

This is a classic trick:

list = {{1, 7, 8}, {4, 3}, {4, 1, 9}, {9, 2}}

Module[{f},
 f[x_] := (f[x] = Sequence[]; x);
 Map[f, list, {2}]
]

To understand how it works, consider what happens when f[1] is evaluated for the first time. f[1] will evaluate to (f[1]=Sequence[]); 1, then to just 1. So now a definitions has been made for f[1]. The next time f[1] is evaluated, it will simply evaluate to Sequence[].

So, the first time f is evaluated for an argument, it returns that argument. The second (or third, etc.) time it is evaluated, it returns Sequence[]. Sequence[] has the property that it will get stripped out of expressions completely, i.e. {1, Sequence[], 3} will evaluate to {1, 3}.

To summarize, what the function does is: the first time it sees an element, it replaces the element with itself (leaves it alone). The second time it sees the same element, it removes it. I mapped this function at "level 2", so it will act only on the elements of sublists.

Szabolcs
  • 24,728
  • 9
  • 85
  • 174
  • @Szabolics Thanks, when I was trying to attack this I did come across some code that indicated `Sequence[]` might be useful. I will try to unpack this... any pedagogical comments on the way it works appreciated! – stackexchange.michael Feb 17 '12 at 01:57
  • @David the idea to to use a function, `f` that acts like `Identity` but simultaneously redefines itself for that specific expression to `Sequence[]`, so that if it is called again for that expression it collapses (is removed from the list). See [this page from the v4.0 documentation](http://reference.wolfram.com/legacy/v4/RefGuide/Union_ex.html) for the origin of the function (AFAIK). – Mr.Wizard Feb 17 '12 at 05:36
  • @stackexchange.michael and Mr.Wizard Yep, I saw this trick in a pre-6 version of the documentation! – Szabolcs Feb 17 '12 at 09:14
  • Yes, your explanation helps. First time I see this. Nice. – DavidC Feb 17 '12 at 11:48
  • @Szabolcs & Mr. Wizard. Aha! all is clear now, thanks. This is exactly the sort of thing I expected to happen when I (naively) typed `DeleteDuplicates[list, {2}]`. – stackexchange.michael Feb 17 '12 at 15:08
3

You can use Complement for this and the only thing you have to do is, you have to store all elements you have already seen. At the beginning the list of all seen elements a is empty and then you add step by step all numbers which are in the sublists. This cumulated list is used in every step to delete all numbers which have already been seen:

DeleteDuplicatesList[l_List] :=
  Block[{a = {}},
   Table[With[{b = a},
     a = Union[a, c];
     Complement[c, b]], {c, l}]
   ];

l = {{1, 7, 8}, {4, 3}, {4, 1, 9}, {9, 2}};
DeleteDuplicatesList[l]
halirutan
  • 4,281
  • 18
  • 44
  • Thank you, I like the approach of using `Complement`. Although it is not obvious from the way I stated the initial problem, there is a set-theoretic background to all this. – stackexchange.michael Feb 17 '12 at 01:49
2

It's not elegant but it gets the job done:

t = {{1, 7, 8}, {4, 3}, {4, 1, 9}, {9, 2}}; 

Delete[t, Flatten[Rest[Position[t, #]] & /@ (DeleteDuplicates[Flatten[t]]), 1]]

DeleteDuplicates...will return the set, not multi-set, of numbers in t. Rest[Position[...will return the positions of duplicates. Delete removes said duplicates.

DavidC
  • 3,056
  • 1
  • 20
  • 30