I am trying to build an executable (.exe) from my Jupyter notebook code. The code reads all the files in folder A and folder B, finds the differences between the two sets of files, and writes the results to a CSV.

Where should I look to learn how to set up a configuration file that the executable reads to get the paths of the input folders to compare? The configuration file can be either a JSON or a plain text file that the user edits to point to the directory on their machine where the two folders are located. In my code, the folder paths are hard-coded for my own machine: directory_A points to C:\Users\Bilal\Python\Task1\OlderVersionFiles\ and directory_B to C:\Users\Bilal\Python\Task1\NewVersionFiles\.

I know how to convert a Jupyter notebook to a Python executable thanks to "Is it possible to generate an executable (.exe) of a jupyter-notebook?". This creates a build folder with a lot of files and an application file that does not do anything in my case.
- How can I make the executable create the Record.csv file that my code generates in Jupyter, just by double-clicking it, while still using the static paths in the Python file that refer to folders on my system?
- How can I have an application file that reads the paths from a configuration file and outputs a CSV with the differences between the folders?
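To make the second point concrete, here is a minimal sketch of the kind of setup I have in mind, not a working solution. The file name `config.json`, its keys `directory_A` and `directory_B`, and the helper names `app_dir` and `load_config` are placeholders I made up. It assumes the config file sits next to the script or the built executable; the `sys.frozen` check relies on the attribute that PyInstaller sets on a bundled app.

```python
import json
import os
import sys


def app_dir():
    """Return the folder holding the executable (frozen) or this script."""
    if getattr(sys, "frozen", False):
        # PyInstaller sets sys.frozen; sys.executable is then the .exe itself.
        return os.path.dirname(os.path.abspath(sys.executable))
    try:
        return os.path.dirname(os.path.abspath(__file__))
    except NameError:
        # __file__ is undefined inside a notebook; fall back to the working directory.
        return os.getcwd()


def load_config(path=None):
    """Read the two input folders from a JSON file (hypothetical layout)."""
    if path is None:
        path = os.path.join(app_dir(), "config.json")
    with open(path, "r", encoding="utf-8") as f:
        config = json.load(f)
    for key in ("directory_A", "directory_B"):
        if key not in config:
            raise KeyError(f"{key} is missing from {path}")
    return config["directory_A"], config["directory_B"]
```

The user-edited file would then look something like `{"directory_A": "C:/Users/Bilal/Python/Task1/OlderVersionFiles", "directory_B": "C:/Users/Bilal/Python/Task1/NewVersionFiles"}`. JSON requires backslashes to be doubled, so forward slashes are probably easier for a user to type.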
My code for finding the differences is as follows:
import os
import csv
import pandas as pd
import io
import re
dir_A_dict = dict()
# A raw string takes single backslashes; os.path.join adds the separator.
directory_A = r"C:\Users\Bilal\Python\Task1\OlderVersionFiles"
dir_A_files = [os.path.join(directory_A, x) for x in os.listdir(directory_A) if x.endswith('.csv')]
dir_B_dict = dict()
directory_B = r"C:\Users\Bilal\Python\Task1\NewVersionFiles"
dir_B_files = [os.path.join(directory_B, x) for x in os.listdir(directory_B) if x.endswith('.csv')]
# Index every data row of the older files by its joined text.
for file_ in dir_A_files:
    with open(file_, 'r', newline='') as f:
        reader = csv.reader(f)
        header = next(reader)  # skip the header row
        for line in reader:
            key = ''.join(line)
            if key not in dir_A_dict:
                dir_A_dict[key] = {
                    "record": line,
                    "file_name": os.path.basename(file_),
                    "folder": "OlderVersion",
                    "row": reader.line_num
                }
# Same indexing for the newer files.
for file_ in dir_B_files:
    with open(file_, 'r', newline='') as f:
        reader = csv.reader(f)
        header = next(reader)  # skip the header row
        for line in reader:
            key = ''.join(line)
            if key not in dir_B_dict:
                dir_B_dict[key] = {
                    "record": line,
                    "file_name": os.path.basename(file_),
                    "folder": "NewVersion",
                    "row": reader.line_num
                }
aset = set()
for v in dir_A_dict.values():
    aset.add(tuple(v['record']))
bset = set()
for v in dir_B_dict.values():
    bset.add(tuple(v['record']))
in_a_not_b = aset - bset
in_b_not_a = bset - aset
# Records that appear in exactly one of the two folders.
diff = in_a_not_b.union(in_b_not_a)
record_ = []
for val in diff:
    file_ = ''.join(val)
    record_.append(file_)
# Writing dictionary values to a text file
with open("Report2.txt", 'w') as f:
    # Loop over the actual differences instead of a hard-coded count.
    for key in record_:
        if key not in dir_A_dict:
            f.write('%s\n' % ', '.join(str(x) for x in dir_B_dict[key].values()))
        else:
            f.write('%s\n' % ', '.join(str(x) for x in dir_A_dict[key].values()))
# regular expression to capture contents of balanced brackets
location_regex = re.compile(r'\[([^\[\]]+)\]')
# Read back the report written above, using the same relative path.
with open("Report2.txt", 'r') as fi:
    # replace brackets with quotes, pipe into file-like object
    fo = io.StringIO()
    fo.writelines(re.sub(location_regex, r'"\1"', line) for line in fi)
# rewind file to the beginning
fo.seek(0)
# read transformed CSV into data frame; header=None keeps the first row as data
df = pd.read_csv(fo, header=None, names=['Record', 'Filename', 'Folder', 'Row'])
# print(df)
df.to_csv('Records2Arranged.csv')
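For the executable, I assume the comparison would need to be callable with the two folder paths as arguments rather than as hard-coded globals. Below is a rough sketch of that idea, not my actual notebook code: the helper names `read_rows` and `write_diff` are made up, and it writes the CSV directly with `csv.writer` instead of going through Report2.txt and pandas, which is a simplification I swapped in. It also keys rows by tuple rather than by joined text.

```python
import csv
import os


def read_rows(folder, label):
    """Map each data row (as a tuple) to where it was first seen."""
    rows = {}
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".csv"):
            continue
        with open(os.path.join(folder, name), "r", newline="") as f:
            reader = csv.reader(f)
            next(reader, None)  # skip the header row, tolerate empty files
            for line in reader:
                rows.setdefault(tuple(line), (name, label, reader.line_num))
    return rows


def write_diff(directory_a, directory_b, out_path="Record.csv"):
    """Write rows found in only one of the two folders to out_path."""
    rows_a = read_rows(directory_a, "OlderVersion")
    rows_b = read_rows(directory_b, "NewVersion")
    only_one_side = set(rows_a) ^ set(rows_b)  # symmetric difference
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["Record", "Filename", "Folder", "Row"])
        for record in sorted(only_one_side):
            file_name, folder, row = rows_a.get(record) or rows_b[record]
            writer.writerow([", ".join(record), file_name, folder, row])
    return len(only_one_side)
```

Called as `write_diff(directory_A, directory_B)`, this would write Record.csv to the current working directory, which for a double-clicked executable is usually the folder it was launched from.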