increment count field of a row if the same row exists in another table

Question

I have a table with one column (about 1 million rows) :

Table A (InnoDB)

ColA (no index, no primary key)

I also have a table (3 million rows):

Table B (InnoDB)

ColA (PK INDEX) | count

Table B is a Superset of Table A. Table B contains every row I need, while Table A only contains some rows. I want to increment the count field of Table B by 1 for every row which exists in Table A. The problem is Table A is quite large so I can't use a normal UPDATE command with a where clause because the where clause would have 1 million conditions. I am really looking for the most optimized way in terms of performance to increment these fields. The solution I can think of is :

UPDATE TABLE_B set count = count + 1 where ColA IN (SELECT ColA FROM TABLE_A)

But that would require two full index scans on both tables so I am not sure if it is the fastest way to do this task.

score 2 · Answer 1 · answered Sep 14 '14 at 20:51

2

First, add an index on table A:

create index idx_tablea_cola on table_a(cola)

Then phrase your query as:

update table_b b
    set count = count + 1
    where exists (select 1 from table_a a where a.colA = b.colA);

This assumes that there are no duplicates in tablea. Or, at least if there are, you still only want the counter incremented by 1.

WIthout an index, this will be a painful operation.

answered Sep 14 '14 at 20:51

Gordon Linoff

1,242,037
58
646
786

how does `This assumes that there are no duplicates in tablea` this happen? and what assumes it? – George G Sep 14 '14 at 21:09
Using the explain this seems to produce the exact same result that @GeorgeGarchagudashvili first introduced – user2924127 Sep 14 '14 at 21:16

score 2 · Answer 2 · answered Sep 14 '14 at 22:22

2

I found a faster way. It seems to be about 30% -35% faster than the other answer:

INSERT
INTO   TABLE_B
SELECT colA, 1 FROM TABLE_A
ON DUPLICATE KEY
UPDATE  count = count + 1

answered Sep 14 '14 at 22:22

user2924127

6,034
16
78
136

Are there records in `table_a` that are not in `table_b`? – Gordon Linoff Sep 14 '14 at 22:47
No, all records in table A are in table B. I understand this isn't used for this case usually, but it produces the same result and seems to be much quicker. – user2924127 Sep 14 '14 at 23:14
That is quite interesting. Originally, I was thinking "this answer doesn't answer the question being asked". But, with no new records it does and the performance is better. – Gordon Linoff Sep 15 '14 at 02:22

score 1 · Answer 3 · answered Sep 14 '14 at 20:07

1

Ensuring ColA is indexed in both tables, this would work:

UPDATE B set count = count + 1 
WHERE (
    SELECT 1 FROM A 
    WHERE A.ColA = B.ColA LIMIT 1
) IS NOT NULL

answered Sep 14 '14 at 20:07

George G

7,443
12
45
59

1

You should use `exists` rather than the construct you have. – Gordon Linoff Sep 14 '14 at 20:48
you are wrong, this is most optimal way, just think man `exists` is equal something to check if required statement is met even once, so I've already made my sub-query as much optimal that `exists` can not add more, even it's even extra calling for function – George G Sep 14 '14 at 20:57
1

This seems to be the fastest way. Do you think the InnoDB is the best engine for Table B that is only used for updates (incrementing one field) by only one user at a time? – user2924127 Sep 14 '14 at 21:14
Yes `InnoDB` is better, for more see http://stackoverflow.com/questions/20148/myisam-versus-innodb – George G Sep 14 '14 at 21:18

increment count field of a row if the same row exists in another table

3 Answers3