I saved a set of features to a single .pkl file. This is the code that was used to write the file initially:
import pickle

with open('variables.pkl', 'wb') as output:
    pickle.dump(embedding_weights, output, 2)
    pickle.dump(X1, output, 2)
    pickle.dump(X2, output, 2)
    pickle.dump(Y, output, 2)
    pickle.dump(X1_test, output, 2)
    pickle.dump(X2_test, output, 2)
    pickle.dump(Y_test, output, 2)
    pickle.dump(X1_nli, output, 2)
    pickle.dump(X2_nli, output, 2)
    pickle.dump(Y_nli, output, 2)
    pickle.dump(X1_test_nli, output, 2)
    pickle.dump(X2_test_nli, output, 2)
    pickle.dump(Y_test_nli, output, 2)
    pickle.dump(X1_test_matched, output, 2)
    pickle.dump(X2_test_matched, output, 2)
    pickle.dump(Y_test_matched, output, 2)
    pickle.dump(X1_test_mismatched, output, 2)
    pickle.dump(X2_test_mismatched, output, 2)
    pickle.dump(Y_test_mismatched, output, 2)
    pickle.dump(X2_two_sentences, output, 2)
    pickle.dump(X2_test_two_sentences, output, 2)
    pickle.dump(tokenizer, output, 2)
NOTE: I received this data as-is, and this was the code used to produce it. I cannot rerun the above code because these are deep learning features that take hours to compute. Hence I won't be able to make any changes to how variables.pkl is produced.
The file size was approximately 1.93GB. After this, I wanted to update X1_test and X2_test using the following code:
with open('variables.pkl', 'wb') as output:
    pickle.dump(X1_test, output, 2)
    pickle.dump(X2_test, output, 2)
My understanding was that this would update just those two objects inside the file. Instead, everything else has been deleted and only these two objects remain. The file size is only 12.6KB now. What did I do wrong? How can I update just these two objects while keeping everything else in the file the same?
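
Here is a minimal sketch that reproduces the behaviour on a small scale. All names and values in it (variables_demo.pkl, dump_all, load_all, the tiny lists) are hypothetical stand-ins, not my real data. As far as I can tell, opening a file in 'wb' mode truncates it before anything is written, so the only approach I can think of is to load every object in order, replace the ones I want, and dump them all again. That assumes the original file is still intact, which may not be the right way to do it:

```python
import os
import pickle
import tempfile

# Hypothetical stand-ins for the real (much larger) features.
names = ["embedding_weights", "X1", "X1_test", "X2_test"]
objects = {
    "embedding_weights": [0.1, 0.2],
    "X1": [1, 2, 3],
    "X1_test": [4, 5],
    "X2_test": [6, 7],
}

# Hypothetical demo file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "variables_demo.pkl")


def dump_all(path, values):
    # Mode 'wb' truncates the file to zero length before writing.
    with open(path, "wb") as output:
        for value in values:
            pickle.dump(value, output, 2)


def load_all(path):
    # Call pickle.load repeatedly until the end of the file is reached.
    loaded = []
    with open(path, "rb") as source:
        while True:
            try:
                loaded.append(pickle.load(source))
            except EOFError:
                break
    return loaded


# Write all four objects, as in the original save.
dump_all(path, [objects[n] for n in names])
before = load_all(path)

# Reproduce the problem: reopening with 'wb' and dumping only two objects
# leaves just those two in the file.
dump_all(path, [objects["X1_test"], objects["X2_test"]])
after_overwrite = load_all(path)

# Possible approach, starting again from an intact file: load everything
# in order, replace the objects to update, then dump everything back.
dump_all(path, [objects[n] for n in names])
updated = dict(zip(names, load_all(path)))
updated["X1_test"] = [40, 50]
dump_all(path, [updated[n] for n in names])
after_update = load_all(path)

print(len(before), len(after_overwrite), len(after_update))
```

The order of the names list has to match the order in which the objects were originally dumped, since the file itself stores no names.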