Functional differences between converting (List -> Dictionary -> List) vs (List -> Set -> List)?

Question

A list of unique elements can be attained by converting a list to a dictionary or a set, then back to a list.

>>> original_list = [1,2,3,4,5,6,7,8,9,2,4,6,8]
>>> original_list
[1, 2, 3, 4, 5, 6, 7, 8, 9, 2, 4, 6, 8]
>>> 
>>> unique_via_set = list(set(original_list))
>>> unique_via_dict = list(dict.fromkeys(original_list))
>>> 
>>> unique_via_set
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> unique_via_dict
[1, 2, 3, 4, 5, 6, 7, 8, 9]

^Python 3.7.3

Main Question:

What is the difference between these two strategies?

Sub Questions:

Is one more efficient than the other?
Is there a use case that is more optimized for one or the other?
Is there a practical difference in the run time between these two strategies?

Note: I'm not looking for the best or fastest way to get unique elements from a list, I am looking to compare the two methods above

Edit: Again, I am not looking for HOW to remove duplicates, I am merely looking for the practical differences in the methods above, if any

Possibly related: https://softwareengineering.stackexchange.com/questions/112662/how-costly-are-the-python-dict-and-set-in-built-types, but fundamentally there is no difference. Dictionaries use membership testing for adding elements to it, the same as for sets so theoretically it should be constant time or O(1). — rayryeng, Jun 10 '19 at 18:38
Well building a dictionary using the list elements as keys works, as there cannot be duplicate keys (although there is the overhead of creating a dict first), however why not just use `sets` as they are precisely intended for this purpose, i.e for collections of unique items? — yatu, Jun 10 '19 at 18:38
You have asked so many questions. Moreover, it is more opinion based. You can check the timings to see which one is faster. I would mark it duplicate of https://stackoverflow.com/questions/7961363/removing-duplicates-in-python-lists — Sheldore, Jun 10 '19 at 18:39
Possible duplicate of [Removing duplicates in Python lists](https://stackoverflow.com/questions/7961363/removing-duplicates-in-python-lists) — DSC, Jun 10 '19 at 18:40
This doesn't look like a dupe to me, at least not of the linked questions - this is around performance of the (known) methods. — Peter DeGlopper, Jun 10 '19 at 18:42
Thanks for the suggestions. I also don't believe it is a duplicate, I'm not looking for optimal strategies to remove duplicates, I am only looking to understand the functional differences between the two strategies in the question. @rayryeng Thanks for the related link. — Eddie, Jun 10 '19 at 18:43
I think it's worth pointing out that your intention is more clear when using sets for anyone that may read your code. — busybear, Jun 10 '19 at 18:53

Mad Physicist · Accepted Answer · 2019-06-10T18:46:05.683

2

While the implementations of the hash tables for dictionaries and sets may differ slightly, you will not see a functional difference between the keys of the dictionary and the set. The only major difference will be that the dictionary allocates a bunch of values referencing None, which the set does not need. That might make the dictionary implementation less efficient, since it is unnecessary overhead.

edited Jun 10 '19 at 18:46

answered Jun 10 '19 at 18:40

Mad Physicist

107,652
25
181
264

Functional differences between converting (List -> Dictionary -> List) vs (List -> Set -> List)?

Main Question:

What is the difference between these two strategies?

1 Answers1