Say I have a table like the one below:
CREATE TABLE `hadoop_apps` (
`clusterId` smallint(5) unsigned NOT NULL,
`appId` varchar(35) COLLATE utf8_unicode_ci NOT NULL,
`user` varchar(64) COLLATE utf8_unicode_ci NOT NULL,
`queue` varchar(35) COLLATE utf8_unicode_ci NOT NULL,
`appName` varchar(255) COLLATE utf8_unicode_ci DEFAULT NULL,
`submitTime` datetime NOT NULL COMMENT 'App submission time',
`finishTime` datetime DEFAULT NULL COMMENT 'App completion time',
`elapsedTime` int(11) DEFAULT NULL COMMENT 'App duration in milliseconds',
PRIMARY KEY (`clusterId`,`appId`,`submitTime`),
KEY `hadoop_apps_ibk_finish` (`finishTime`),
KEY `hadoop_apps_ibk_queueCluster` (`queue`,`clusterId`),
KEY `hadoop_apps_ibk_userCluster` (`user`(8),`clusterId`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
mysql> SELECT COUNT(*) FROM hadoop_apps;
This returns a count of 158593816.
So I am trying to understand what is inefficient about the query below and how I can improve it.
mysql> SELECT * FROM hadoop_apps WHERE DATE(finishTime)='10-11-2013';
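My guess is that wrapping the column in DATE() stops MySQL from using the `hadoop_apps_ibk_finish` index on `finishTime`, so every row has to be examined. One rewrite I have been considering, purely as a sketch, compares the raw column against a half-open range instead. I am assuming here that the intended day is 10 November 2013 (the literal above is ambiguous), written in the year-month-day order that MySQL expects for date literals.

```sql
-- Sketch only: assumes the target day is 2013-11-10.
-- The column is left unwrapped so the index on finishTime can be considered.
SELECT *
FROM hadoop_apps
WHERE finishTime >= '2013-11-10 00:00:00'   -- start of the day, inclusive
  AND finishTime <  '2013-11-11 00:00:00';  -- start of the next day, exclusive
```

The half-open upper bound avoids having to guess the last representable instant of the day.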
Also, what's the difference between these two queries?
mysql> SELECT * FROM hadoop_apps WHERE user='foobar';
mysql> SELECT * FROM hadoop_apps HAVING user='foobar';
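To compare the two myself, I assume running EXPLAIN on each would show the difference. My expectation, which I have not confirmed on this table, is that the WHERE version can use the `hadoop_apps_ibk_userCluster` prefix index, while the HAVING version filters only after all rows have been read.

```sql
-- Sketch: compare the plans; the key and rows columns are the ones to look at.
EXPLAIN SELECT * FROM hadoop_apps WHERE user = 'foobar';

-- HAVING without GROUP BY is accepted by MySQL, but it is applied
-- after the rows have been retrieved rather than during the lookup.
EXPLAIN SELECT * FROM hadoop_apps HAVING user = 'foobar';
```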