-3

I have a Python list that contains URLs as strings. Some of these URL strings are duplicates:

[u'it/crag/830/ai-falconi.html', u'/it/crag/830/ai-falconi.html', u'it/crag/751/alonte.html', u'/it/crag/751/alonte.html']

How can I remove the duplicate URL strings? Thanks

APPGIS
  • 4
    [Convert to a set](https://docs.python.org/2/library/functions.html#func-set). – khelwood May 09 '17 at 09:24
  • 2
    Possible duplicate of [How do you remove duplicates from a list in whilst preserving order?](http://stackoverflow.com/questions/480214/how-do-you-remove-duplicates-from-a-list-in-whilst-preserving-order) – Christian König May 09 '17 at 09:25

2 Answers

2

This should do it:

l = [u'it/crag/830/ai-falconi.html', u'/it/crag/830/ai-falconi.html', u'it/crag/751/alonte.html', u'/it/crag/751/alonte.html']
# Keep an item only if it is not a substring of any later item in the list.
result = [j for i, j in enumerate(l) if all(j not in k for k in l[i + 1:])]
zipa
  • I suppose it works - but it's *remarkably* inefficient compared to `list(set(l))` – Jon Clements May 09 '17 at 09:39
  • @JonClements Of course it's slower than `set` but set doesn't remove substrings, _as it shouldn't_, and that was the question :) – zipa May 09 '17 at 09:42
  • Why not `{'/' + el.lstrip('/') for el in l}` then? (That also standardises duplicates to retain a leading `/` if required.) – Jon Clements May 09 '17 at 09:46
  • @JonClements Fair, but this would work on general substrings, not just `/` case. One example that would rule out your solution is `.htm` – zipa May 09 '17 at 09:50
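The normalization idea from the comments can be sketched as follows, assuming the only difference between duplicates is a leading `/`. The helper name `dedupe_urls` is hypothetical (it does not appear in the thread), and the sketch keeps first-seen order, which a plain set does not guarantee.

```python
def dedupe_urls(urls):
    """Return urls without duplicates, treating 'a/b' and '/a/b' as the same URL.

    Hypothetical helper for illustration; assumes duplicates differ only
    by leading slashes.
    """
    seen = set()
    result = []
    for url in urls:
        # Normalize to exactly one leading slash before comparing.
        key = u'/' + url.lstrip(u'/')
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


l = [u'it/crag/830/ai-falconi.html', u'/it/crag/830/ai-falconi.html',
     u'it/crag/751/alonte.html', u'/it/crag/751/alonte.html']
print(dedupe_urls(l))
```

This runs in a single pass with set lookups, whereas the substring comparison above checks every item against all later items. It does not handle general substrings, which is the trade-off discussed in the comments.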
1

As mentioned, use a set, because a set cannot contain duplicates. That translates into:

s = set([u'it/crag/830/ai-falconi.html', u'/it/crag/830/ai-falconi.html', u'it/crag/751/alonte.html', u'/it/crag/751/alonte.html'])
Ludisposed
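A quick check, offered as a sketch: a set removes only strings that are exactly equal. In the sample list from the question, the variants with and without the leading `/` are different strings, so a set keeps all four. The names `unique` and `exact` are just for illustration.

```python
l = [u'it/crag/830/ai-falconi.html', u'/it/crag/830/ai-falconi.html',
     u'it/crag/751/alonte.html', u'/it/crag/751/alonte.html']

# The slash and no-slash variants are distinct strings, so nothing is removed.
unique = set(l)
print(len(unique))

# Exactly equal strings, by contrast, collapse to a single entry.
exact = [u'/it/crag/751/alonte.html', u'/it/crag/751/alonte.html']
print(len(set(exact)))
```

If the two variants should count as the same URL, the strings need to be normalized before they go into the set.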