My understanding of list and set in Python are mainly that list allows duplicates, list allows ordered information, and list has position information. I found while I was trying to search if an element is with in a list, the runtime is much faster if I convert the list to a set first. For example, I wrote a code trying find the longest consecutive sequence in a list. Use a list from 0 to 10000 as an example, the longest consecutive is 10001. While using a list:
start_time = datetime.now()
nums = list(range(10000))
longest = 0
for number in nums:
if number - 1 not in nums:
length = 0
##Search if number + 1 also in the list##
while number + length in nums:
length += 1
longest = max(length, longest)
end_time = datetime.now()
timecost = 'Duration: {}'.format(end_time - start_time)
print(timecost)
The run time for above code is "Duration: 0:00:01.481939" With adding only one line to convert the list to set in third row below:
start_time = datetime.now()
nums = list(range(10000))
nums = set(nums)
longest = 0
for number in nums:
if number - 1 not in nums:
length = 0
##Search if number + 1 also in the set(was a list)##
while number + length in nums:
length += 1
longest = max(length, longest)
end_time = datetime.now()
timecost = 'Duration: {}'.format(end_time - start_time)
print(timecost)
The run time for above code by using a set is now "Duration: 0:00:00.005138", Many time shorter than search through a list. Could anyone help me to understand the reason for that? Thank you!