if df is None:
    # Using np.column_stack for performance reasons
    df = pd.DataFrame(np.column_stack([paths, file_names, hashes, filesizes]),
                      columns=['path', 'file name', 'sha256', 'file size (MB)'])
    # Save the df just in case we want to continue later
    print(f"Saving progress to {pickle_path}")
    try:
        df.to_pickle(pickle_path)
I get this error on the line with np.column_stack:
Exception has occurred: ValueError
all the input array dimensions for the concatenation axis must match exactly, but along dimension 0, the array at index 0 has size 156738 and the array at index 3 has size 156735
This line works most of the time, but every once in a while I get this error. Judging by the message, paths (index 0) has 156738 entries while filesizes (index 3) has only 156735. The script can run for almost half an hour, so hitting this at the end is pretty frustrating. Is there some way I could fill in NA values or something similar so that the dimensions match?
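
For illustration, here is a minimal sketch of the kind of padding being asked about. It assumes the four inputs are plain Python lists. The helper name build_frame and the tiny demo data are hypothetical and not part of the original script. It swaps np.column_stack for itertools.zip_longest, which fills the shorter lists with None instead of raising.

```python
from itertools import zip_longest

import pandas as pd


def build_frame(paths, file_names, hashes, filesizes):
    """Build the DataFrame, padding any shorter list with None (hypothetical helper)."""
    columns = ['path', 'file name', 'sha256', 'file size (MB)']
    # zip_longest yields one tuple per row and fills missing entries with fillvalue
    rows = list(zip_longest(paths, file_names, hashes, filesizes, fillvalue=None))
    return pd.DataFrame(rows, columns=columns)


# Made-up example data: filesizes is one element short
demo = build_frame(['/a', '/b'], ['a.txt', 'b.txt'], ['h1', 'h2'], [1.5])
print(demo)
```

Note that padding only adds the missing values at the end of the shorter list. If an entry was skipped somewhere in the middle (for example, a file whose size could not be read), every later row in that column would be shifted and paired with the wrong file. Appending one complete record per file, with an explicit None where a value is unavailable, would likely be the safer route.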