0

I have a table that has approximately 4 million records. I would like to make it have 240 million like so:

  1. Add an additional column of type BIGINT,
  2. Import 59 times the data I already have,
  3. And for each 4 million group of records, have the additional column to have a different value

The value of the additional column would come from another table.

So I have these records (except that I have 4 millions of them and not just 3):

| id | value |
+----+-------+
| 1  | 123   |
| 2  | 456   |
| 3  | 789   |

And I want to achieve this (except that I want 60 copies and not just 3):

| id | value | data |
+----+-------+------+
| 1  | 123   | 1    |
| 2  | 456   | 1    |
| 3  | 789   | 1    |
| 4  | 123   | 2    |
| 5  | 456   | 2    |
| 6  | 789   | 2    |
| 7  | 123   | 3    |
| 8  | 456   | 3    |
| 9  | 789   | 3    |

I tried to export my data (using SELECT .. INTO OUTFILE ...), then re-import it (using LOAD DATA INFILE ...) but it is really painfully slow.

Is there a fast way to do this?

Thank you!

Mathieu
  • 195
  • 1
  • 6
  • Create a new table where `id` is an `AUTO_INCREMENT` then just run this sixty times? `INSERT INTO new_table (value, data) SELECT value, id FROM old_table` – MatBailie May 15 '14 at 13:41

4 Answers4

1

First, I would recommend that you create a new table. You can do this using a cross join:

create table WayBigTable as 
    select t.*, n
    from table t cross join
         (select 1 as n union all select 2 union all select 3 union all select 4 union all select 5 union all
         . . .
          select 60
         ) n;

I'm not sure why you would want a bigint for this column. If you really need that, you can cast to unsigned.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
  • Are you sure that a single transaction of 240 million rows would be better than 60 transactions of 4 million rows? – MatBailie May 15 '14 at 13:43
  • @MatBailie It'll surely be faster than 60 outfiles. – Vesper May 15 '14 at 13:48
  • @MatBailie . . . I don't actually know for MySQL. For a new table with a single transaction, there is no need to log the inserts (the table can just be deleted if the statement fails). I don't know if MySQL has this optimization, but it would be in `create table as` instead of in `insert`. – Gordon Linoff May 15 '14 at 13:50
1

Sounds like you'd like to take the cartesian product of 2 tables and create a new table since you say The value of the additional column would come from another table? If so, something like this should work:

create table yourtable (id int, value int);
create table yournewtable (id int, value int, data int);
create table anothertable (data int);

insert into yourtable values (1, 123), (2, 456), (3, 789);
insert into anothertable values (1), (2), (3);

insert into yournewtable
select t.id, t.value, a.data
from yourtable t, anothertable a

Results:

ID  VALUE DATA
1   123   1
2   456   1
3   789   1
1   123   2
2   456   2
3   789   2
1   123   3
2   456   3
3   789   3

Edit, Side Note -- it looks like your ID field in your new table is not suppose to keep repeating the same ids? If so, you can use an AUTO_INCREMENT field instead. However, this could mess up the original rows if they aren't sequential.

sgeddes
  • 62,311
  • 6
  • 61
  • 83
  • Those results don't match the ops results – MatBailie May 15 '14 at 13:44
  • @MatBailie -- just sample data :) – sgeddes May 15 '14 at 13:44
  • Thank you @sgeddes, I didn't know that what I wanted was called "cartesion product". I have marked your answer as accepted, but I could not run your query on my DB because my machine doesn't have enough RAM. So I did this: `insert into yournewtable select t.id, t.value, a.data from yourtable t, 1` then `insert into yournewtable select t.id, t.value, a.data from yourtable t, 2` and so on until I couldn't anymore. – Mathieu May 23 '14 at 11:23
0

Hmm. You need a cross join of your table with a range. Something in a line of this:

INSERT INTO table (id,value,data) SELECT id, value from table 
    CROSS JOIN (SELECT 2 UNION SELECT 3 UNION ... SELECT 60) AS data;

Use this answer Generating a range of numbers in MySQL as reference on number range.

Community
  • 1
  • 1
Vesper
  • 18,599
  • 6
  • 39
  • 61
0

Here's one idea...

DROP TABLE IF EXISTS my_table;

CREATE TABLE my_table
(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
,value INT NOT NULL
);

INSERT INTO my_table VALUES
(1  ,123),
(2  ,456),
(3  ,789);

ALTER TABLE my_table ADD COLUMN data INT NOT NULL DEFAULT 1;

SELECT * FROM my_table;
+----+-------+------+
| id | value | data |
+----+-------+------+
|  1 |   123 |    1 |
|  2 |   456 |    1 |
|  3 |   789 |    1 |
+----+-------+------+


SELECT * FROM ints;
+---+
| i |
+---+
| 0 |
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
| 6 |
| 7 |
| 8 |
| 9 |
+---+

INSERT INTO my_table SELECT NULL,value,data+i2.i*10+i1.i+1 FROM my_table,ints i1,ints i2;


SELECT * FROM my_table;
+-----+-------+------+
| id  | value | data |
+-----+-------+------+
|  1  |   123 |    1 |
|  2  |   456 |    1 |
|  3  |   789 |    1 |
|  4  |   123 |    2 |
|  5  |   456 |    2 |
|  6  |   789 |    2 |
|  7  |   123 |    3 |
|  8  |   456 |    3 |
...
... 
| 296 |   456 |   97 |
| 297 |   789 |   97 |
| 298 |   123 |   98 |
| 299 |   456 |   98 |
| 300 |   789 |   98 |
| 301 |   123 |   99 |
| 302 |   456 |   99 |
| 303 |   789 |   99 |
+-----+-------+------+
303 rows in set (0.00 sec)

Note, for 240 million rows, this is still going to be a bit slow :-(

Strawberry
  • 33,750
  • 13
  • 40
  • 57