5

Database example:

| country | animal | size   | x_id* |
|---------+--------+--------+-------|
|  777    | 1001   | small  |   1   |
|  777    | 2002   | medium |   2   |
|  777    | 7007   | medium |   3   |
|  777    | 7007   | large  |   4   |
|  42     | 1001   | small  |   1   |
|  42     | 2002   | medium |   2   |
|  42     | 7007   | large  |   4   |

I need to generate the x_id continuously based on entries in (animal, size) and if x_id for the combination x_id exist use it again.

Currently i use the following PHP script for this action, but on a large db table it is very slow.

query("UPDATE myTable SET x_id = -1");

$i = $j;
$c = array();
$q = query("
    SELECT animal, size
      FROM myTable
     WHERE x_id = -1
  GROUP BY animal, size");

while($r = fetch_array($q)) {

  $hk = $r['animal'] . '-' . $r['size'];

  if( !isset( $c[$hk] ) ) $c[$hk] = $i++;

  query("
      UPDATE myTable
       SET x_id = {$c[$hk]}
     WHERE animal = '".$r['animal']."'
       AND size = '".$r['size']."'
       AND x_id = -1");

}

Is there a way to convert the PHP script to one or two mysql commands?

edit:

CREATE TABLE `myTable` (
`country` int(10) unsigned NOT NULL DEFAULT '1', -- country
`animal` int(3) NOT NULL,
`size` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`lang_id` tinyint(4) NOT NULL DEFAULT '1',
`x_id` int(10) NOT NULL,
KEY `country` (`country`),
KEY `x_id` (`x_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Sebas
  • 21,192
  • 9
  • 55
  • 109
HQ5
  • 81
  • 10
  • create a single table housing animal and size. Use that in support of the table shown in question. That is a hint – Drew Dec 25 '15 at 19:57
  • @Drew i know that this is the right way, but **i can't change this db table** an the complexity is much more then this small example – HQ5 Dec 25 '15 at 20:18
  • I can help you but it is going to be incredibly confusing, and you say you can't change the schema. So idk. Why blow 20min on it. It would be of no use to anyone :P – Drew Dec 25 '15 at 20:21
  • i can't change the rows or split the table, but i can add indexes or something else – HQ5 Dec 27 '15 at 19:47
  • If you think outside the box, my solution is a tack-on table or two. You just have to apply the concept. You have the rep to dwell in my [mysql chat room](http://chat.stackoverflow.com/rooms/95290/mysql-stuff) if you want to hash it out. Good luck. – Drew Dec 27 '15 at 19:58
  • why you don't use auto increment for this field? – Max P. Dec 27 '15 at 20:33
  • because it say duplicate entry where i need the same id of this same entry – HQ5 Dec 27 '15 at 20:57
  • I think you are missing the point of the auto_increment with the combination of a unique index (key) like I showed in my example a day or two ago. I showed `insert ignore` which expects it to fail on occasion or often. In the end, you are still left with your unique id (the PK auto_inc) that you use in subsequent calls – Drew Dec 27 '15 at 22:43
  • The PK (auto_inc) in combination with the unique composite key (the Enforcer of no duplicates) will render the id to use. The insert ignore says *I don't care about the error of it already being there because in the end I know it will be there.* – Drew Dec 27 '15 at 22:44
  • You also haven't shown a schema. This isn't even bounty material according to the Help Center. It doesn't matter that you say in comments here or there to people that this is not your exact example or table or that you are simplifying it for an illustration versus your real schema. The point is that you are not even in a position to Assist someone to help you. – Drew Dec 27 '15 at 22:54
  • i've added the table-schema to this question, sorry – HQ5 Dec 27 '15 at 23:32
  • now modify the first table block with all sample data and describe the sample data. How it should work in words too. Right now, your top visible table chunk does not mirror your schema just provided. If you want someone to write the code to solve the flow of operations, you need to describe in words the flow of operations. – Drew Dec 27 '15 at 23:56
  • 1
    Thanks to Drew, Sebas and Mihai - good job! – HQ5 Dec 28 '15 at 01:01

5 Answers5

2

This is a conceptual. Worm it into your world if useful.

Schema

create table AnimalSize
(   id int auto_increment primary key,
    animal varchar(100) not null,
    size varchar(100) not null,
    unique key(animal,size) -- this is critical, no dupes
);

create table CountryAnimalSize
(   id int auto_increment primary key,
    country varchar(100) not null,
    animal varchar(100) not null,
    size varchar(100) not null,
    xid int not null -- USE THE id achieved thru use of AnimalSize table
);

Some queries

-- truncate table animalsize; -- clobber and reset auto_increment back to 1
insert ignore AnimalSize(animal,size) values ('snake','small'); -- id=1
select last_insert_id(); -- 1
insert ignore AnimalSize(animal,size) values ('snake','small'); -- no real insert but creates id GAP (ie blows slot 2)
select last_insert_id(); -- 1
insert ignore AnimalSize(animal,size) values ('snake','small'); -- no real insert but creates id GAP (ie blows slot 3)
select last_insert_id(); -- 1
insert ignore AnimalSize(animal,size) values ('frog','medium'); -- id=4
select last_insert_id(); -- 4
insert ignore AnimalSize(animal,size) values ('snake','small'); -- no real insert but creates id GAP (ie blows slot 3)
select last_insert_id(); -- 4

Note: insert ignore says do it, and ignore the fact that it may die. In our case, it would fail due to unique key (which is fine). In general, do not use insert ignore unless you know what you are doing.

It is often thought of in connection with an insert on duplicate key update (IODKU) call. Or should I say thought about, as in, How can I solve this current predicament. But, that (IODKU) would be a stretch in this case. Yet, keep both in your toolchest for solutions.

After insert ignore fires off, you know, one way or the other, that the row is there.

Forgetting the INNODB GAP aspect, what the above suggests is that if the row already exists prior to insert ignore, that

You cannot rely on last_insert_id() for the id

So after firing off insert ignore, go and fetch the id that you know has to be there. Use that in subsequent calls against CountryAnimalSize

continue along this line of reasoning for your CountryAnimalSize table inserts where the row may or may not already be there.

There is no reason to formalize the solution here because, as you say, these aren't even your tables anyway in the Question.

Also, back to INNODB GAP. Google that. Figure out whether or not you can live with gaps created.

Most people have bigger fish to fry that keeping id's tight and gapless.

Other people (read: OCD) are so consumed by the perceived gap problem that they blow days on it.

So, these are general comments meant to help a broader audience, than to answer your question, which, as you say, isn't even your schema.

Praveen Kumar Purushothaman
  • 164,888
  • 24
  • 203
  • 252
Drew
  • 24,851
  • 10
  • 43
  • 78
  • it's Christmas. Figured I'd leave you with something. Hope it isn't a lump of coal. – Drew Dec 25 '15 at 20:56
  • Merry Christmas, you get a +1 for your work, but it is not the solution for my question – HQ5 Dec 25 '15 at 21:06
2

User variables are awkward but should do the trick,tested on my machine

CREATE TABLE t
( animal VARCHAR(20),
 size VARCHAR(20),
x_id INT);

 INSERT INTO T(animal,size) VALUES('crocodile','small'),
   ('elephant','medium'),
 ('giraffe','medium'),
 ('giraffe','large'),
 ('crocodile','small'),
 ('elephant','medium'),
 ('giraffe','large');




 UPDATE t  RIGHT JOIN
 (SELECT animal,size,
MIN(CASE WHEN @var:=CONCAT(animal,size) THEN @id ELSE @id:=@id+1  END)id
FROM t,
(SELECT  @var:=CONCAT(animal,size) FROM t)x ,
                      (SELECT @id:=0)y
                     GROUP BY animal,size)q
                     ON t.animal=q.animal AND t.size=q.size
                     SET x_id=q.id

Results

"animal"    "size"     "x_id"
"crocodile" "small"     "1"
"elephant"  "medium"    "2"
"giraffe"   "medium"    "3"
"giraffe"   "large"     "4"
"crocodile" "small"     "1"
"elephant"  "medium"    "2"
"giraffe"   "large"     "4"

You want these indexes added for (a lot) faster access

ALTER TABLE `yourtable` ADD INDEX `as_idx` (`animal`,`size`);
ALTER TABLE `yourtable` ADD INDEX `id_idx` (`x_id`);
Mihai
  • 26,325
  • 7
  • 66
  • 81
  • its run over 40 minutes before i kill this process, thats not better than my php-script on over 18k entries, thanks for your time – HQ5 Dec 27 '15 at 21:57
  • @HQ5 Show the table definition in your question,this query should certainly not take more than a sec on 18 k rows `SHOW CREATE TABLE yourtable` – Mihai Dec 27 '15 at 21:58
  • what you mean with table definition? – HQ5 Dec 27 '15 at 22:04
  • @HQ5 Run the query I posted in the comment, edit your question with the results of that query – Mihai Dec 27 '15 at 22:04
  • thats to complex and is not the exact example here – HQ5 Dec 27 '15 at 22:09
  • @HQ5 I suspect you are missing indexes,so thats the problem not the query.But I can ony guess without the table definition – Mihai Dec 27 '15 at 22:09
  • this script runs to long but the indexes helps very much, thanks +1 – HQ5 Dec 28 '15 at 00:27
2
UPDATE myTable m
    JOIN (
        SELECT animal, size, @newid := @newid + 1 AS x_id
        FROM myTable a
            CROSS JOIN (SELECT @newid := 0) b
        WHERE x_id = -1
        GROUP BY animal, size
    ) t ON m.animal = t.animal AND m.size = t.size
SET m.x_id = t.x_id
;

http://sqlfiddle.com/#!9/5525ba/1

The group by in the subquery is not needed. It generates useless overhead. If it's fast enough, leave it like this, otherwise we can use distinct+another subquery instead.

Sebas
  • 21,192
  • 9
  • 55
  • 109
  • this command runs fine, thats exactly what i want, but it runs 30-40 minutes an my php-script runs over one hour – HQ5 Dec 27 '15 at 22:53
  • Affected rows: 18k (Query took 5.9308 sec) with the indexes @Mihai - thank you !! – HQ5 Dec 28 '15 at 00:33
  • I did not suggest anything ddl related based on your comment stating, in bold `i can't change this db table`. But I'm glad you got it sorted out regardless. Cheers – Sebas Dec 28 '15 at 01:39
1

You can use x_id as this:

CONCAT(`animal`, '_', `size`) AS `x_id`

And then compare it with x_id, so that you will get something like:

+---------+-----------+--------+------------------+
| country | animal    | size   | x_id*            |
+---------+-----------+--------+------------------+
| africa  | crocodile | small  | crocodile_small  |
| africa  | elephant  | medium | elephant_medium  |
| africa  | giraffe   | medium | giraffe_medium   |
| africa  | giraffe   | large  | giraffe_large    |
| europe  | crocodile | small  | crocodile_small  |
| europe  | elephant  | medium | elephant_medium  |
| europe  | giraffe   | large  | giraffe_large    |
+---------+-----------+--------+------------------+
Praveen Kumar Purushothaman
  • 164,888
  • 24
  • 203
  • 252
0

As I see, you are already using MyISAM engine type, You can just define both country and x_id field as PRIMARY KEY (jointly) and you can set the AUTO_INCREMENT for x_id field. Now MySQL will do the rest for you! BINGO!

Here is the SQL Fiddle for you!

CREATE TABLE `myTable` (
    `country` int(10) unsigned NOT NULL DEFAULT '1', -- country
    `animal` int(4) NOT NULL,
    `size` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
    `lang_id` tinyint(4) NOT NULL DEFAULT '1',
    `x_id` int(10) NOT NULL AUTO_INCREMENT,
    PRIMARY KEY (country,x_id)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;


INSERT INTO `myTable` (`country`, `animal`, `size`) VALUES 
    (777, 1001, 'small'),
    (777, 2002, 'medium'),
    (777, 7007, 'medium'),
    (777, 7007, 'large'),
    (42, 1001, 'small'),
    (42, 2002, 'medium'),
    (42, 7007, 'large')

The result will be like this:

| country | animal | size   |lang_id | x_id  | 
|---------+--------+--------+--------+-------|
|  777    | 1001   | small  |   1    |   1   |
|  777    | 2002   | medium |   1    |   2   |
|  777    | 7007   | medium |   1    |   3   |
|  777    | 7007   | large  |   1    |   4   |
|  42     | 1001   | small  |   1    |   1   |
|  42     | 2002   | medium |   1    |   2   |
|  42     | 7007   | large  |   1    |   4   |

NOTE: This will only work for MyISAM and BDB tables, for other engine types you will get error saying "Incorrect table definition; there can be only one auto column and it must be defined as a key!" See this answer for more on this : https://stackoverflow.com/a/5416667/5645769.

Community
  • 1
  • 1
  • ALTER TABLE `myTable` ADD PRIMARY KEY ( `country`, `animal` ); returns #1062 - Duplicate entry '###' for key 'PRIMARY' – HQ5 Dec 28 '15 at 12:00
  • @HQ5, Your error already told you about the problem, you have a duplicate entry for same country, which should be removed anyhow. If possible, clean up the table and start from the beginning again. – Tᴀʀᴇǫ Mᴀʜᴍᴏᴏᴅ Dec 28 '15 at 16:19
  • @HQ5, Well, did you actually try with `ADD PRIMARY KEY ( country, animal ); ` instead of `ADD PRIMARY KEY ( country, x_id ); ` ? Why `animal` place of `x_id`? – Tᴀʀᴇǫ Mᴀʜᴍᴏᴏᴅ Dec 28 '15 at 16:21