A dict/mapping object maps hashable values(keys) to arbitrary objects. By default all the primitive immutable objects in python have this hash function implemented. For custom objects you have to implement them for correct functionality. Refer - Object of custom type as dictionary key for more details.
Internals of 'hashtable/hashmap/dictionary/dict' DataStructure:
The important aspect of 'dict' datastructure is - it provides complexity of 'dictionary lookups' in O(1). While others need atleast O(log n) or O(n) for lookup.
For providing this O(1) complexity, it requires the key objects to provide a "hash" function.
- This hash function takes the information in a key object and uses it
to produce an integer, called a "hash value".
- This hash value is
then used to determine which "bucket" this (key, value) pair should
be placed. Then, go to the bucket and get the value directly or by
traversing the list to get correct value for given key.
Following code should run in O(n), where n is the no.of lines in your 2nd File.
fin = open('x.csv')
file1 = open("y.txt","r")
file_output = open("z.txt","w")
lines = file1.readlines()
a = {}
for line in lines:
key = line.split("\n")[0]
a[key] = 1
for line in fin.readlines():
id=line.split(',')[0]
if id in a:
file_output.write(line)