I'm trying to find the most efficient code to catch duplicate file names in a parent folder. I'm storing file tree in a MySQL database using parent-child model with lineage:
CREATE TABLE `filetable` (
`id` int(11) NOT NULL auto_increment,
`name` varchar(50) NOT NULL,
`parent_id` int(11) NOT NULL,
`path_num` varchar(255) NOT NULL,
`path_string` text() NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
I am considering two methods:
1. Run a query:
INSERT INTO filetable (name, parent_id, path_num, path_string)
SELECT '$name','$pid','$path_num','$path_string' FROM DUAL
WHERE NOT EXISTS
(SELECT name FROM filetable WHERE name='$name');
and in PHP:
if (mysql_affected_rows() === 0) takeAction($name);
PROS: simple sql, no need to add rows to table
CONS: double query that can get snow on large table
2.Creat unique index on a field and use:
INSERT INTO filetable (name, parent_id, path_num, path_string)
VALUES ('$name','$pid','$path_num','$path_string')
ON DUPLICATE KEY UPDATE 0=0;
and in PHP:
if (mysql_affected_rows() === 2) takeAction($name); // Yes 2 for updates
PROS: UPDATE triggered on a small fraction of queries when name
is duplicated
CONS: an extra row in a table
The field that is unique is a path_string, a full path to the file. In my file system it can get very long, therefore a TEXT type of field. In MySQL one can not create unique index on a TEXT field. So what I am thinking to do is to create column
`problemsolver` varchar(62) = $parent_id . '_'.$name
and crete unique index on it.
I am expecting those queries to constitute 10% of all the queries to database.
My question is which of those two methods would you use and why? Or any better options available?