You should be able to do something simple like this:
file_c = list(set(file_a) - set(file_b))
This should be fairly low overhead since it only uses builtins. I suppose it may be the same as
list(set(file_a).difference(file_b))
from erhesto's answer. I'm not sure whether the difference() method is faster than the overloaded '-' operator on set.
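To show that the two spellings give the same result, here is a minimal sketch. The sample lists are made-up data standing in for the contents of file_a and file_b, not anything from the question. One difference I am sure of: the '-' operator needs a set on both sides, while difference() accepts any iterable.

```python
# Hypothetical sample data standing in for the lines of file_a and file_b.
file_a = ["alpha", "beta", "gamma", "delta"]
file_b = ["beta", "delta"]

# Operator form: both operands must be sets.
via_operator = set(file_a) - set(file_b)

# Method form: the argument can be any iterable, so no second set() call is needed.
via_method = set(file_a).difference(file_b)

# Both forms hold the same elements.
print(via_operator == via_method)

# Mixing a set and a list with '-' raises TypeError.
try:
    set(file_a) - file_b
except TypeError as exc:
    print("TypeError:", exc)
```

Either way, the result is a set, so wrapping it in list() gives no guaranteed order.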
Okay, after testing, this is what I found. I set up two different files, sub.py and dif.py.
Outputs:
swift@pennywise practice $ time python sub.py
[27, 17, 19, 31]
real 0m0.055s
user 0m0.044s
sys 0m0.008s
swift@pennywise practice $ time python dif.py
[17, 19, 27, 31]
real 0m0.056s
user 0m0.032s
sys 0m0.016s
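The two runs print the same four numbers in a different order, which is expected because sets are unordered. If the order of the first list matters, one possible sketch (not what sub.py or dif.py do) is to filter with a list comprehension against a set built from the second list:

```python
lsta = [2, 3, 5, 7, 9, 13, 17, 19, 27, 31]
lstb = [2, 3, 5, 7, 9, 13]

# Build the lookup set once so each membership test is O(1) on average.
exclude = set(lstb)

# Keep elements of lsta in their original order, skipping anything in lstb.
lstc = [x for x in lsta if x not in exclude]
print(lstc)  # [17, 19, 27, 31]
```

Unlike the set-based versions, this keeps duplicates that appear in the first list.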
Body of the .py files:
sub.py:
#!/usr/bin/python3.6
# -*- coding: utf-8 -*-
def test():
    lsta = [2, 3, 5, 7, 9, 13, 17, 19, 27, 31,]
    lstb = [2, 3, 5, 7, 9, 13,]
    # Set subtraction: both operands are converted to sets first.
    lstc = list(set(lsta) - set(lstb))
    return lstc
if __name__ == '__main__':
    print(test())
dif.py:
#!/usr/bin/python3.6
# -*- coding: utf-8 -*-
def test():
    lsta = [2, 3, 5, 7, 9, 13, 17, 19, 27, 31,]
    lstb = [2, 3, 5, 7, 9, 13,]
    # difference() accepts any iterable, so lstb is passed as a plain list.
    lstc = list(set(lsta).difference(lstb))
    return lstc
if __name__ == '__main__':
    print(test())
Edited because I realized an error - forgot to execute the programs!
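Going by the numbers above, the two versions take essentially the same wall-clock time (0.055s vs 0.056s real), and the user/sys split on a single run this short is within noise, so I can't claim either one is faster from this test. I would probably stick with '-' over set.difference anyway; it's easier for me to read what's going on.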
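For a measurement that is not dominated by interpreter startup, a sketch using the standard-library timeit module might look like this. The function names and the repeat counts are arbitrary choices of mine, and the numbers will vary by machine and Python version.

```python
import timeit

lsta = [2, 3, 5, 7, 9, 13, 17, 19, 27, 31]
lstb = [2, 3, 5, 7, 9, 13]

def with_operator():
    # Same expression as sub.py
    return list(set(lsta) - set(lstb))

def with_method():
    # Same expression as dif.py
    return list(set(lsta).difference(lstb))

NUMBER = 50_000  # calls per timing run (arbitrary)
REPEAT = 5       # timing runs per function (arbitrary)

# Take the best of several repeats to reduce noise from other processes.
t_op = min(timeit.repeat(with_operator, number=NUMBER, repeat=REPEAT))
t_method = min(timeit.repeat(with_method, number=NUMBER, repeat=REPEAT))

print(f"operator:   {t_op:.4f}s for {NUMBER} calls")
print(f"difference: {t_method:.4f}s for {NUMBER} calls")
```

With lists this small, any gap between the two is likely to be tiny, so readability is a reasonable tie-breaker.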
Source for the set() - set() functionality:
https://stackoverflow.com/a/3462160/9268051