I have a set of files, each with two columns. The first column contains words, and the second column contains numbers.
I want to extract all the unique words from the files and sum the numbers for each word. This part I was able to do.
The second task is to count the number of files in which each word was found. I am having trouble with this part. I am using a dictionary for this.
Here is my code:
import os
from typing import TextIO

currentdir = " "  # CHANGE INPUT PATH
resultdir = " "  # CHANGE OUTPUT ACCORDINGLY
if not os.path.exists(resultdir):
    os.makedirs(resultdir)

systemcallcount = {}
for root, dirs, files in os.walk(currentdir):
    for name in files:
        outfile2 = open(root + "/" + name, 'r')
        for line in outfile2:
            words = line.split(" ")
            if words[0] not in systemcallcount:
                systemcallcount[words[0]] = int(words[1])
            else:
                systemcallcount[words[0]] += int(words[1])
        outfile2.close()

for keys, values in systemcallcount.items():
    print(keys)
    print(values)
For example, I have two files:
file1    file2
a 2      a 3
b 3      b 1
c 1
So the output would be:
a 5 2
b 4 2
c 1 1
To explain the last column of the output: a is 2 because it occurs in both files, whereas c is 1 because it appears only in file1.
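
One possible way to get the file count, offered as a minimal sketch rather than a definitive fix: keep a second dictionary keyed by word, and use a per-file set so that a word repeated within one file is counted only once for that file. The function name `summarize` and the names `totals`, `file_counts`, and `seen` are hypothetical and not part of the code above. The sketch also assumes each line is a word followed by an integer, separated by whitespace, and it skips lines that have fewer than two fields.

```python
import os
from collections import defaultdict


def summarize(currentdir):
    """Return {word: [total, file_count]} for all files under currentdir.

    Hypothetical helper: assumes each line looks like "<word> <integer>".
    """
    totals = defaultdict(int)       # word -> sum of the numbers
    file_counts = defaultdict(int)  # word -> number of files containing it
    for root, dirs, files in os.walk(currentdir):
        for name in files:
            seen = set()  # words already counted for the current file
            with open(os.path.join(root, name)) as infile:
                for line in infile:
                    fields = line.split()
                    if len(fields) < 2:
                        continue  # skip blank or malformed lines
                    word, number = fields[0], int(fields[1])
                    totals[word] += number
                    if word not in seen:
                        seen.add(word)
                        file_counts[word] += 1
    return {word: [totals[word], file_counts[word]] for word in totals}
```

With the two example files in one directory, `summarize` should map a to [5, 2], b to [4, 2], and c to [1, 1]. Looping over `summarize(currentdir).items()` and printing the word followed by the two numbers gives the three-column output shown above.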