Looking for the right solution to store filenames

Question

Just started out with my hobby project and now I am here to get help with making the correct database design/query. I have made a simple Java program that loops through the contents of a folder. I want to save this content to a MySQL database, so I added a connector to my database in Java and created a table with the columns "file", "path", "id" and "date" in MySQL.

So now to the important/fun part: every time I want to add the filenames to MySQL from Java, I do this (when the GUI button is pressed, I call a method that does the following):

DELETE all entries with the same file path. This ensures that the new entries are exactly the same as the content of the path.
Java loop: INSERT the file info into the columns id, path, filename and date (when the file was added to the database).

This way I can always ensure that the filenames in the database are up to date. It doesn't matter if I rename a file or remove it; the table will still be up to date, since its entries are deleted and the new info is written. Old info -> DELETE old info -> INSERT new info -> Up-to-date.

I know this is probably not the best solution, but it works. Now I am stuck on the next thing I want to do: I want to record the difference between the files in order to know which files have been added and deleted between two inserts. Here is my problem: since the entries are deleted before a new INSERT, I have nothing to compare against. How would you change the design or the solution? All ideas are welcome, and since I am so new to this I would really appreciate it if you could show me what the query could look like.

You could simply query your DB before you delete everything: query old info -> compare with new info -> delete old info -> insert new. — rptmat57, Mar 19 '17 at 21:15
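
The "query old info -> compare with new info" step from the comment above could look roughly like this. It is only a sketch: it assumes the old names have already been read into a set (for example with a SELECT on the "file" column for the given path, run before the DELETE) and the new names come from the folder loop. The class and method names are made up for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

class FolderDiff {
    /** Names found in the new folder scan that were not in the database before. */
    static Set<String> added(Set<String> oldNames, Set<String> newNames) {
        Set<String> result = new TreeSet<>(newNames);
        result.removeAll(oldNames);
        return result;
    }

    /** Names that were in the database before but are missing from the new scan. */
    static Set<String> removed(Set<String> oldNames, Set<String> newNames) {
        Set<String> result = new TreeSet<>(oldNames);
        result.removeAll(newNames);
        return result;
    }

    public static void main(String[] args) {
        Set<String> old = Set.of("a.txt", "b.txt"); // stand-in for rows read from the table
        Set<String> now = Set.of("b.txt", "c.txt"); // stand-in for the current folder content
        System.out.println("added: " + added(old, now));     // prints "added: [c.txt]"
        System.out.println("removed: " + removed(old, now)); // prints "removed: [a.txt]"
    }
}
```

Note that a renamed file shows up here as one removed name plus one added name, since only the names are compared.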

score 0 · Answer 1 · edited May 23 '17 at 11:46

Do not remove all rows first. Remove only the rows whose files were removed (or even better, just mark them "inactive", as I suggest below). Query your DB first to see what was there last time.

I would maintain an additional column in your table called "inactive". It would be FALSE by default and TRUE for removed files. Keep in mind that since a file is uniquely identified by file+path+id, renaming a file is effectively an operation of deleting the old one and creating a new one.
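
A rough JDBC sketch of what that could look like is below. Several things in it are assumptions: the table name `files` is made up (the question does not give one), "id" is assumed to be AUTO_INCREMENT, and "date" is assumed to accept NOW(). The sets of removed and added names are expected to have been computed beforehand by comparing the old rows with the current folder content.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Set;

class InactiveSync {
    // "files" is a hypothetical table name; "file", "path" and "date" come from the
    // question, "inactive" is the extra column suggested in this answer.
    static final String MARK_INACTIVE =
        "UPDATE files SET inactive = TRUE WHERE path = ? AND file = ? AND inactive = FALSE";
    static final String INSERT_FILE =
        "INSERT INTO files (file, path, date, inactive) VALUES (?, ?, NOW(), FALSE)";

    /** Flags removed files as inactive and inserts newly found files, in one transaction. */
    static void apply(Connection conn, String path, Set<String> removed, Set<String> added)
            throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement mark = conn.prepareStatement(MARK_INACTIVE);
             PreparedStatement insert = conn.prepareStatement(INSERT_FILE)) {
            for (String name : removed) {
                mark.setString(1, path);
                mark.setString(2, name);
                mark.addBatch();
            }
            for (String name : added) {
                insert.setString(1, name);
                insert.setString(2, path);
                insert.addBatch();
            }
            mark.executeBatch();
            insert.executeBatch();
            conn.commit();
        } catch (SQLException e) {
            conn.rollback(); // leave the table as it was if anything fails
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

With this design nothing is ever deleted, so the files removed since the last run are simply the rows that were just flipped to inactive, and the files added are the rows that were just inserted.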

Removing things from the DB is not a good idea, as you might always remove something by accident (a bug in the code) and would not be able to get the data back.

Another thing to do is to add a hash column to your table. This way you will be able to check whether a file has really changed. There is no need to re-add the file to the DB if it has not changed. See "Getting a File's MD5 Checksum in Java" for more info.
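
For illustration, an MD5 hash can be computed with the standard `java.security.MessageDigest` class, roughly as follows. The class and method names are made up, and reading the whole file into memory is a simplification that is only reasonable for small files; a streaming approach would suit large ones better.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class FileHash {
    /** Returns the MD5 digest of the given bytes as a lowercase hex string. */
    static String md5Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a standard algorithm name, so this is not expected to happen.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    /** Hashes a file's content; loads the whole file into memory. */
    static String md5Hex(Path file) throws IOException {
        return md5Hex(Files.readAllBytes(file));
    }
}
```

The stored hash can then be compared with the hash from the current scan: if both are equal, the row can be left alone.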

score 0 · Answer 2 · answered Mar 20 '17 at 01:03

One way to achieve this is to implement auditing of your table. A common approach is to create a copy of the table where you are storing the folder contents and name that copy using a convention that indicates it stores audit information (e.g. an _AUD suffix). You then add additional columns to the AUD table, like "REV" (revision) and "REV_TYPE" (inserted, deleted, modified). Whenever you insert, update or delete any rows in your main table, you insert a row into the AUD table to describe what you've done. Then you can find the operations associated with each revision by looking them up in the AUD table. A Java framework that provides this feature is Hibernate Envers (http://hibernate.org/orm/envers/).
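
Done by hand with plain JDBC rather than Envers, the idea could be sketched like this. The table name `files_AUD`, the lowercase column names `rev` and `rev_type`, and the use of a plain integer as the revision number are all assumptions made for the example, not something Envers or the question prescribes.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.Set;

class FileAudit {
    /** Kind of change recorded in the audit table. */
    enum RevType { ADDED, DELETED }

    // "files_AUD" is a hypothetical audit table with columns file, path, rev, rev_type.
    static final String INSERT_AUDIT =
        "INSERT INTO files_AUD (file, path, rev, rev_type) VALUES (?, ?, ?, ?)";
    // Lists what changed for one path in one revision.
    static final String CHANGES_IN_REV =
        "SELECT file, rev_type FROM files_AUD WHERE path = ? AND rev = ?";

    /** Writes one audit row per file name for the given revision and change type. */
    static void record(Connection conn, String path, int rev, Set<String> names, RevType type)
            throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(INSERT_AUDIT)) {
            for (String name : names) {
                ps.setString(1, name);
                ps.setString(2, path);
                ps.setInt(3, rev);
                ps.setString(4, type.name());
                ps.addBatch();
            }
            ps.executeBatch();
        }
    }
}
```

Each press of the GUI button would use a new revision number, so running the second query for two revisions shows what was added and deleted in each of them.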
