63

This has been asked a few times, but I cannot find a resolution to my problem. When I dump a database with extended inserts using mysqldump, the tool built into MySQL Workbench administration, I get massively long lines of data. I understand why it does this, as it speeds up inserts by inserting the data as one command (especially on InnoDB), but the formatting makes it REALLY difficult to actually look at the data in a dump file, or to compare two files with a diff tool if you are storing them in version control, etc. In my case I am storing them in version control, as we use the dump files to keep track of our integration test database.

Now I know I can turn off extended inserts, so I will get one insert per line, which works, but any time you do a restore with the dump file it will be slower.

My core problem is that in the OLD tool we used to use (MySQL Administrator), when I dump a file, it does basically the same thing but it FORMATS the INSERT statement to put one row per line, while still doing bulk inserts. So instead of this:

INSERT INTO `coupon_gv_customer` (`customer_id`,`amount`) VALUES (887,'0.0000'),(191607,'1.0300');

you get this:

INSERT INTO `coupon_gv_customer` (`customer_id`,`amount`) VALUES 
 (887,'0.0000'),
 (191607,'1.0300');

No matter what options I try, there does not seem to be any way to get a dump like this, which is really the best of both worlds. Yes, it takes a little more space, but in situations where you need a human to read the files, it makes them MUCH more useful.

Am I missing something and there is a way to do this with mysqldump, or have we all gone backwards, and is this feature of the old (now deprecated) MySQL Administrator tool no longer available?

Kendall Bennett
  • This is a known shortcoming in `mysqldump`. It was [first reported in 2004](https://bugs.mysql.com/bug.php?id=4328). In 2011, both [Tim Riker](https://bugs.mysql.com/bug.php?id=4328) & [Lon Binder](https://stackoverflow.com/a/20046484) suggested a 1-line patch to fix it. Mindbogglingly, this *still* has not been implemented by the `mysqldump` developers/maintainers. Because the original bug report was closed (wrongly and unhelpfully), the issue is now being tracked [here](https://bugs.mysql.com/bug.php?id=65465). –  Jun 03 '18 at 09:17
  • Possible duplicate of [How to deal with enormous line lengths created by mysqldump](https://stackoverflow.com/questions/1293529/how-to-deal-with-enormous-line-lengths-created-by-mysqldump) –  Jun 03 '18 at 16:09
  • It looks like this _may_ have been fixed: https://github.com/MariaDB/server/pull/1865 (though I still have this issue with 10.6.11-MariaDB) – Sean Burlington Dec 16 '22 at 16:09

9 Answers

49

Try using the following option: --skip-extended-insert

It worked for me.
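For example (user and database names are placeholders), a minimal invocation would be:

mysqldump --skip-extended-insert -u myuser -p mydatabase > mydatabase.sql

which writes each row as its own statement:

INSERT INTO `coupon_gv_customer` VALUES (887,'0.0000');
INSERT INTO `coupon_gv_customer` VALUES (191607,'1.0300');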

Mohammed H
Eric Tan
  • `--skip-extended-insert` was ruled out in the question because it makes restores slower. – seanf Mar 06 '15 at 05:29
  • it makes restores slower BUT it is easy and safer. Other answers may fail because their complex and error-prone substitution approach could corrupt data, or because of out-of-memory issues while running sed or while executing a very, very long mysql query. – Lev Lukomsky Jul 15 '16 at 23:05
  • +1 I am looking for the answer "Using mysqldump to format one insert per line?". This answer is good enough for me. – Mohammed H Mar 30 '17 at 08:31
  • Slow maybe, but it depends on the use case... A benefit of `--skip-extended-insert` is that it provides a really fast way to limit your recordset to a given number of records. You may need this when you prepare test datasets of exponentially increasing size: 1k, 10k, 100k, 1M records, etc. If this is your use case, use `| head -` smartly. – Fabien Haddadi Oct 07 '20 at 14:19
38

With the default mysqldump format, each record dumped will generate an individual INSERT command in the dump file (i.e., the sql file), each on its own line. This is perfect for source control (e.g., svn, git, etc.) as it makes the diff and delta resolution much finer, and ultimately results in a more efficient source control process. However, for significantly sized tables, executing all those INSERT queries can potentially make restoration from the sql file prohibitively slow.

Using the --extended-insert option fixes the multiple-INSERT problem by wrapping all the records into a single INSERT command on a single line in the dumped sql file. However, the source control process becomes very inefficient. The entire contents of the table are represented on a single line in the sql file, and if a single character changes anywhere in that table, source control will flag the entire line (i.e., the entire table) as the delta between versions. And, for large tables, this negates many of the benefits of using a formal source control system.

So ideally, for efficient database restoration, in the sql file, we want each table to be represented by a single INSERT. For an efficient source control process, in the sql file, we want each record in that INSERT command to reside on its own line.

My solution to this is the following backup script:

#!/bin/bash

cd my_git_directory/

ARGS="--host=myhostname --user=myusername --password=mypassword --opt --skip-dump-date"
/usr/bin/mysqldump $ARGS --databases mydatabase | sed 's$VALUES ($VALUES\n($g' | sed 's$),($),\n($g' > mydatabase.sql

git fetch origin master
git merge origin/master
git add mydatabase.sql
git commit -m "Daily backup."
git push origin master

The result is an sql file whose INSERT commands are formatted like this:

INSERT INTO `mytable` VALUES
(r1c1value, r1c2value, r1c3value),
(r2c1value, r2c2value, r2c3value),
(r3c1value, r3c2value, r3c3value);

Some notes:

  • password on the command line ... I know, not secure, different discussion.
  • --opt: Among other things, turns on the --extended-insert option (i.e., multi-row INSERT statements).
  • --skip-dump-date: mysqldump normally puts a date/time stamp in the sql file when created. This can become annoying in source control when the only delta between versions is that date/time stamp. The OS and source control system will date/time stamp the file and version. It's not really needed in the sql file.
  • The git commands are not central to the fundamental question (formatting the sql file), but show how I get my sql file back into source control; something similar can be done with svn. When combining this sql file format with your source control of choice, you will find that when your users update their working copies, they only need to move the deltas (i.e., changed records) across the internet, and they can take advantage of diff utilities to easily see which records in the database have changed (see the sample diff below).
  • If you're dumping a database that resides on a remote server, if possible, run this script on that server to avoid pushing the entire contents of the database across the network with each dump.
  • If possible, establish a working source control repository for your sql files on the same server you are running this script from; check them into the repository from there. This will also help prevent having to push the entire database across the network with every dump.
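With one record per line, a single changed row shows up as a one-line delta in your diff tool. A hypothetical git diff hunk against the example above:

-(r2c1value, r2c2value, r2c3value),
+(r2c1value, r2c2value, r2c3newvalue),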
Todd Blumer
  • Good approach, but... the issue is that the backed-up data could contain "VALUES (" or "),(" inside a row. So the right way to prevent unexpected issues is to use another tool (not mysqldump) or a fixed mysqldump: http://stackoverflow.com/a/20046484/751932 – Maxim Kholyavkin Aug 27 '14 at 13:09
  • You are correct @Speakus, if either of those two strings exist in the table, it will trip up the sed parsing. You are also correct with respect to mysqldump needing to be fixed, and until then finding another tool may be a good idea. However, the specific question was can mysqldump do this, and the short answer is yes, with some help from sed. – Todd Blumer Oct 02 '14 at 19:25
  • A note that `--extended-insert` doesn't limit itself to generating only one bulk INSERT for each table, as suggested under `--opt` in the solution. You can refer to the MySQL doc for more info: http://dev.mysql.com/doc/refman/5.7/en/mysqldump.html#option_mysqldump_extended-insert – Sian Lerk Lau Apr 11 '16 at 06:27
  • Don't forget that `--opt` is enabled by default ;) – Luke Peterson Jul 17 '17 at 07:42
  • Note that as of today (2020), the _default_ dump (both mysqldump AND mariadb-dump) does extended inserts, contrary to what may have been true back at the time of this answer. – Fabien Haddadi Oct 07 '20 at 14:19
12

As others have said, using sed to replace "),(" is not safe, as that string can appear as content in the database. There is a way to do it correctly, however: if your database name is my_database, then run the following:

$ mysqldump -u my_db_user -p -h 127.0.0.1 --skip-extended-insert my_database > my_database.sql
$ sed ':a;N;$!ba;s/)\;\nINSERT INTO `[A-Za-z0-9$_]*` VALUES /),\n/g' my_database.sql > my_database2.sql

You can also use "sed -i" to replace in place.

Here is what this code is doing:

  1. --skip-extended-insert will create one INSERT INTO for every row you have.
  2. Now we use sed to stitch the rows back together. Note that a regular search/replace with sed applies to a single line, so we cannot match the "\n" character, as sed works one line at a time. That is why we put ":a;N;$!ba;" at the front, which basically tells sed to buffer each next line into the pattern space so the search can run across multiple lines; see the before/after sketch below.
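For illustration, with a hypothetical table `t`, two single-row inserts like these:

INSERT INTO `t` VALUES (1,'a');
INSERT INTO `t` VALUES (2,'b');

are joined into:

INSERT INTO `t` VALUES (1,'a'),
(2,'b');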

Hope this helps

Ace.Di
  • On Windows, `sed` works if `git` is installed with its bundled Unix commands. – Vladimir Vukanac Jul 05 '15 at 20:19
  • Didn't actually do anything on my side. No diff between input and output. – dcsordas Jan 12 '16 at 14:30
  • Needed something to make `grep` give useful results. Complemented the above with sed 's/\(INSERT INTO `[A-Za-z0-9$_]*` VALUES \)/\1\n/g' test2.sql > test3.sql to avoid false positives from table names and the like. – Samuel Åslund Feb 11 '19 at 10:48
10

What about storing the dump in a CSV file with mysqldump, using the --tab option, like this?

mysqldump --tab=/path/to/serverlocaldir --single-transaction <database> table_a

This produces two files:

  • table_a.sql, which contains only the table create statement; and
  • table_a.txt, which contains the tab-separated data (see the sample below).
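For example, the two rows shown in the question would appear in the .txt file as tab-separated lines (values illustrative):

887	0.0000
191607	1.0300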

RESTORING

You can restore your table via LOAD DATA:

LOAD DATA INFILE '/path/to/serverlocaldir/table_a.txt' 
  INTO TABLE table_a FIELDS TERMINATED BY '\t' ...

LOAD DATA is usually 20 times faster than using INSERT statements.

If you have to restore your data into another table (e.g. for review or testing purposes) you can create a "mirror" table:

CREATE TABLE table_for_test LIKE table_a;

Then load the CSV into the new table:

LOAD DATA INFILE '/path/to/serverlocaldir/table_a.txt' 
  INTO TABLE table_for_test FIELDS TERMINATED BY '\t' ...
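The trailing clauses elided above depend on how the file was produced. For mysqldump --tab's default output format, a complete statement would look roughly like this (the ESCAPED BY and LINES TERMINATED BY values are assumptions based on those defaults):

LOAD DATA INFILE '/path/to/serverlocaldir/table_a.txt'
  INTO TABLE table_for_test
  FIELDS TERMINATED BY '\t' ESCAPED BY '\\'
  LINES TERMINATED BY '\n';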

COMPARE

A CSV file is simplest for diffing, for looking inside, or for non-SQL technical users, who can use common tools like Excel, Access, or the command line (diff, comm, etc.).

Cristian Porta
  • That's useful, just keep in mind that it creates the file **on the database server** if you're running it from a remote host. – Alex Ciminian Oct 30 '13 at 18:30
  • @AlexCiminian: yes, only the sysadmin (or the dba, or a user with the right permission on the dbserver) can take and provide the file – Cristian Porta Oct 31 '13 at 05:52
6

I'm afraid this won't be possible. In the old MySQL Administrator I wrote the code for dumping db objects, which was completely independent of the mysqldump tool and hence offered a number of additional options (like this formatting, or progress feedback). In MySQL Workbench it was decided to use the mysqldump tool instead which, besides being a step backwards in some regards and producing version problems, has the advantage of always staying up to date with the server.

So the short answer is: formatting is currently not possible with mysqldump.

Mike Lischke
1

Try this (-c produces complete INSERT statements including column names, -t omits the CREATE TABLE statements, and --skip-extended-insert puts each row in its own INSERT):

mysqldump -c -t --add-drop-table=FALSE --skip-extended-insert -uroot -p<Password> databaseName tableName >c:\path\nameDumpFile.sql
jschoi
0

I found this tool very helpful for dealing with extended inserts: http://blog.lavoie.sl/2014/06/split-mysqldump-extended-inserts.html

It parses the mysqldump output and inserts line breaks after each record, while still using the faster extended inserts. Unlike with a sed script, there shouldn't be any risk of breaking lines in the wrong place if the regex happens to match inside a string.

seanf
0

I liked Ace.Di's solution with sed, until I got this error: sed: Couldn't re-allocate memory

Thus I had to write a small PHP script

mysqldump -u my_db_user -p -h 127.0.0.1 --skip-extended-insert my_database | php mysqlconcatinserts.php > db.sql

The PHP script also starts a new INSERT after every 10,000 rows, again to avoid memory problems.

mysqlconcatinserts.php:

#!/usr/bin/php
<?php
/* Assumes a mysqldump created with --skip-extended-insert.
   Collapses consecutive single-row INSERTs for the same table into one
   multi-row INSERT, starting a fresh statement every $maxinserts rows
   so no single statement grows too large. */
$last = '';
$count = 0;
$maxinserts = 10000;
while ($l = fgets(STDIN)) {
  if (preg_match('/^(INSERT INTO .* VALUES) (.*);/', $l, $s)) {
    if ($last != $s[1] || $count > $maxinserts) {
      if ($last != '') // close the statement we were building
        echo ";\n";
      echo "$s[1] ";   // start a new multi-row INSERT header
      $comma = '';
      $last = $s[1];
      $count = 0;
    }
    echo "$comma$s[2]"; // append this row's value tuple
    $comma = ",\n";
    $count++;
  } elseif ($last != '') {
    // A non-INSERT line ends the statement we were building.
    $last = '';
    echo ";\n";
    echo $l; // pass the line through unchanged
  } else {
    echo $l; // schema statements, comments, etc. pass through unchanged
  }
}
if ($last != '') // close a statement left open at end of input
  echo ";\n";
Kjeld Flarup
-1

add

set autocommit=0;

to the first line of your sql script file, then import with:

mysql -u<user> -p<password> --default-character-set=utf8 db_name < <path>\xxx.sql

and it will be about 10x faster. Be sure to also add COMMIT; at the end of the file: with autocommit disabled, anything left uncommitted is rolled back when the session ends.
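A minimal sketch of the wrapped dump file (table and values are placeholders):

SET autocommit=0;

INSERT INTO `mytable` VALUES (1,'a');
INSERT INTO `mytable` VALUES (2,'b');

COMMIT;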

Peyman Mohamadpour
jinzuchi