This is kind of a follow up to my previous question link.
Some time has passed so I wasn't sure if I needed to update that one or create a new one, especially since the parameters of my issues have changed a little.
--
I have a large (simple) table of phone numbers (>5 million and growing). The table was set up as follows:
| AreaCode | Local | Created |
- 'Local' is the 7 digits following the 3 digit area code
- 'Created' is just a timestamp
- AreaCode and Local are both indexed
- I used to have an ID column as my primary key but removed it when I partitioned my table. I set up 10 partitions by range on AreaCode ( <100, <200, <300, ~ <999)
I am uploading csv files of numbers (up to 250k rows), sanitizing and cleaning the input via PHP, then creating a temp table and inserting data into it.
After this point, I am having a lot of trouble. With smaller data sizes (10k-25k rows), I don't really have any problems. But when I'm trying to compare my temp table containing 250k+ rows against my master database, it takes incredibly long.
I've tried the 2 following queries, and not much luck with any of them.
Using inner join
SELECT a.* FROM master_table a
INNER JOIN temp_table b
ON a.AreaCode = b.AreaCode
AND a.Local = b.Local;
I found this suggestion on a website and tried it
SELECT b.* FROM temp_table b
WHERE b.AreaCode
IN (
SELECT a.AreaCode
FROM master_table a
WHERE a.AreaCode = b.AreaCode
AND a.Local = b.Local
);
I apologize for the long list questions but my mastery of mysql is weak.
- Am I making a mistake by not having a primary and/or unique key? Since every phone number is unique, I wasn't sure that having an ID column would benefit me.
- Am I right to partition my master table or is this slowing me down?
- I have indexes on my AreaCode and Local columns. When I create temporary tables, should I be creating indexes on the same columns there as well?
- Please help me fix my query so it doesn't take so long!!