
I'm trying to upload huge SQL files (over a couple of GB) to a remote MySQL server using C#.

Is there any standard way to:

  1. Start reading an SQL file to memory
  2. Stop once we have something usable
  3. Execute that bit
  4. Remove that bit from memory
  5. Read to the next usable part
  6. Etc.

Or do I have to write a regex or something myself? The files are too big to read into memory in one go.
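This is roughly the shape I have in mind, if it makes the question clearer. A rough sketch, assuming the MySql.Data connector; the naive split on ';' is exactly the part I don't trust, as explained below the dump excerpt:

using System.Collections.Generic;
using System.IO;
using System.Text;
using MySql.Data.MySqlClient;

class DumpStreamer
{
    // Reads the file one character at a time and yields one chunk per ';'
    // found, so only the current chunk is ever held in memory (steps 1-4).
    // Naive: a ';' inside a quoted string would split too early.
    static IEnumerable<string> ReadStatements(string path)
    {
        var current = new StringBuilder();
        using (var reader = new StreamReader(path))
        {
            int ch;
            while ((ch = reader.Read()) != -1)
            {
                current.Append((char)ch);
                if ((char)ch == ';')
                {
                    yield return current.ToString();
                    current.Clear(); // drop the executed bit from memory
                }
            }
        }
    }

    public static void Import(string path, string connectionString)
    {
        using (var connection = new MySqlConnection(connectionString))
        {
            connection.Open();
            foreach (var statement in ReadStatements(path))
            {
                using (var command = new MySqlCommand(statement, connection))
                {
                    command.ExecuteNonQuery(); // execute that bit
                }
            }
        }
    }
}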

Here is the start of one of the SQL files, in case it helps:

-- MySQL dump 10.13  Distrib 5.1.49, for debian-linux-gnu (x86_64)
--
-- Host: localhost    Database: qqqq
-- ------------------------------------------------------
-- Server version   5.1.49-3

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

--
-- Current Database: `qqqq`
--

CREATE DATABASE /*!32312 IF NOT EXISTS*/ `qqqq` /*!40100 DEFAULT CHARACTER SET latin1 */;

USE `qqqq`;

--
-- Table structure for table `config`
--

DROP TABLE IF EXISTS `config`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `config` (
  `name` varchar(200) NOT NULL,
  `value` varchar(200) NOT NULL,
  `timestamp` int(11) NOT NULL,
  UNIQUE KEY `name` (`name`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `config`
--

LOCK TABLES `config` WRITE;
/*!40000 ALTER TABLE `config` DISABLE KEYS */;
INSERT INTO `config` VALUES ('someConfigValue','324',0),('someConfigValue2','27',0),('someConfigValue3','0',0);
/*!40000 ALTER TABLE `config` ENABLE KEYS */;
UNLOCK TABLES;

I don't think I can just split it each time a ; occurs, because a semicolon may appear inside a text string as well.
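If I do end up rolling my own, I imagine tracking quote state is the core of it. A sketch only: it handles single- and double-quoted literals plus backslash escapes, but ignores backtick identifiers, comments containing a ';', and DELIMITER changes:

using System.Collections.Generic;
using System.IO;
using System.Text;

static class StatementSplitter
{
    // Yields one statement at a time, treating ';' as a separator only
    // when we are not inside a string literal. Escaped characters
    // (e.g. \' and \") are skipped so they don't toggle the in-string
    // state.
    public static IEnumerable<string> Split(TextReader reader)
    {
        var current = new StringBuilder();
        char quote = '\0'; // '\0' means "not inside a string literal"
        int ch;
        while ((ch = reader.Read()) != -1)
        {
            char c = (char)ch;
            current.Append(c);

            if (quote != '\0')
            {
                if (c == '\\') // skip whatever character is escaped
                {
                    int next = reader.Read();
                    if (next != -1) current.Append((char)next);
                }
                else if (c == quote)
                {
                    quote = '\0'; // string literal closed
                }
            }
            else if (c == '\'' || c == '"')
            {
                quote = c; // string literal opened
            }
            else if (c == ';')
            {
                yield return current.ToString();
                current.Clear();
            }
        }
        if (current.Length > 0)
            yield return current.ToString(); // trailing text without ';'
    }
}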


4 Answers


Since you created this dump with the mysqldump utility, I'd recommend dumping in parts instead. Here are some tricks for partial dumps.

Split by table

If you have many tables, you can split the dumping process by table:

mysqldump database table1 > table.sql
mysqldump database table2 table3 > table2-3.sql

Split by rows

If some tables have millions of rows, you can split them by rows while dumping. Say you have a table bigtable with an auto-increment column id; the following two commands split it into odd- and even-numbered rows:

mysqldump --where="id%2=1" database bigtable  > bigtable_part1.sql
mysqldump --where="id%2=0" database bigtable  > bigtable_part2.sql

If no auto-increment column is present, you have to rely on other heuristics:

mysqldump --where="gender='M'" database users  > users_male.sql
mysqldump --where="gender='F'" database users  > users_female.sql
Shiplu Mokaddim

Try looking for common break points, or similar places where it would be safe to split the SQL file. For example:

--
-- (command here)
--

This seems to be used to denote an 'important' command that comes before a block of operations. You could try splitting on every --\n-- as a common 'safe spot', since line breaks inside queries should be escaped.
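For example, something like this rough C# sketch; it assumes the usual mysqldump layout, where each section header is a run of -- comment lines, and that your connection accepts several statements in one command text:

using System.Collections.Generic;
using System.IO;
using System.Text;

static class CommentBlockSplitter
{
    // Accumulates lines and cuts a batch whenever two consecutive "--"
    // comment lines appear, i.e. at the section headers mysqldump writes
    // before each table. A batch is only cut once it contains something
    // other than comments, so header-only runs never yield empty batches.
    public static IEnumerable<string> SplitOnHeaders(TextReader reader)
    {
        var batch = new StringBuilder();
        bool hasStatements = false;
        string previous = null;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (hasStatements
                && previous != null
                && previous.StartsWith("--")
                && line.StartsWith("--"))
            {
                yield return batch.ToString();
                batch.Clear();
                hasStatements = false;
            }
            batch.AppendLine(line);
            if (line.Length > 0 && !line.StartsWith("--"))
                hasStatements = true;
            previous = line;
        }
        if (hasStatements)
            yield return batch.ToString();
    }
}

Each batch is then a run of complete statements and can be executed as one command.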

Death
  • I was hoping for something more robust. Something that would work on any SQL file without me knowing its exact formatting etc. Was hoping something like that existed already. – natli May 28 '12 at 13:20
  • In which case it really depends on the formatting. The only other 'safe-ish' way would be to count opening and closing unescaped quotation marks using regex, and if a ; occurs when there's an even number of unescaped quotation marks, then split the file, as this should denote that the ; isn't inside a query. You'd also need to take comments into account if you used this. – Death May 28 '12 at 13:22

You could try to load a bit of the SQL into memory and check it on the server using "SET PARSEONLY ON" (note that PARSEONLY is SQL Server syntax; MySQL has no direct equivalent, so there you'd rely on the server rejecting an invalid chunk). If it parses OK, execute it and move on; if it does not, load a further small chunk of text (up to the next ';') and try again.

Of course, the fact that it parses OK does not mean it will execute OK, but I assume you already have error handling for failed executions, right? :)
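In C# the retry loop could look something like this. A sketch with a made-up ReadUpToSemicolon helper; since MySQL lacks PARSEONLY, the "parse check" here is simply the server rejecting the chunk, which the catch block treats as a signal to extend it:

using System.IO;
using System.Text;
using MySql.Data.MySqlClient;

static class TryAndExtend
{
    public static void Import(StreamReader reader, MySqlConnection connection)
    {
        string chunk = ReadUpToSemicolon(reader);
        while (chunk != null)
        {
            try
            {
                using (var command = new MySqlCommand(chunk, connection))
                    command.ExecuteNonQuery();
                chunk = ReadUpToSemicolon(reader); // worked; go next
            }
            catch (MySqlException)
            {
                string more = ReadUpToSemicolon(reader);
                if (more == null) throw; // no more text: a genuine error
                chunk += more; // extend to the next ';' and retry
            }
        }
    }

    // Reads characters up to and including the next ';', or null at EOF.
    static string ReadUpToSemicolon(TextReader reader)
    {
        var sb = new StringBuilder();
        int ch;
        while ((ch = reader.Read()) != -1)
        {
            sb.Append((char)ch);
            if ((char)ch == ';') return sb.ToString();
        }
        return sb.Length > 0 ? sb.ToString() : null;
    }
}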

Tallmaris
0

I had a similar problem, and what I ended up doing was splitting the SQL file using the system's split command.

split -l 1000 import.sql splited_file

The above splits the SQL file into chunks of 1000 lines each.

Hope this helps someone.

jodu