50

I've a database table of timesheets with some common feilds.

id, client_id, project_id, task_id, description, time, date 

There are more but thats the gist of it.

I have an export running on that table to a CSV file overnight to give the user a backup of their data. It also is used as a data import for a macro Excel file with some custom reports.

This all works with me looping through the timesheets with php and printing the lines to a file.

The problem is with a big database it can take hours to run which isn't acceptable. So I rewrote it with the MySQL INTO OUTFILE command and it reduced it down to a few seconds to run which was great.

The problem now is I can't seem to escape all the new line characters, etc., in the description field. Really, a user can type potentially any combination of characters in here including carriage returns/new lines.

This is a snippet of the MySQL code I have:

SELECT id, 
       client,
       project,
       task,
       REPLACE(REPLACE(ifnull(ts.description,''),'\n',' '),'\r',' ') AS description, 
       time,
       date  
      INTO OUTFILE '/path/to/file.csv'
      FIELDS ESCAPED BY '""'
      TERMINATED BY ',' ENCLOSED BY '"'
      LINES TERMINATED BY '\n'
      FROM ....

But...

When I try look at the source of the output file, newlines still exist in the file, therefore the CSV import for the Excel breaks all the fancy macros and pivot tables the Excel wizard has created.

Any thoughts on a best course of action?

Buttle Butkus
  • 9,206
  • 13
  • 79
  • 120
Derek Organ
  • 8,323
  • 17
  • 56
  • 75

6 Answers6

84

I think your statement should look like:

SELECT id, 
   client,
   project,
   task,
   description, 
   time,
   date  
  INTO OUTFILE '/path/to/file.csv'
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\n'
  FROM ts

Mainly without the FIELDS ESCAPED BY '""' option, OPTIONALLY ENCLOSED BY '"' will do the trick for description fields etc and your numbers will be treated as numbers in Excel (not strings comprising of numerics)

Also try calling:

SET NAMES utf8;

before your outfile select, that might help getting the character encodings inline (all UTF8)

Let us know how you get on.

Daniel Serodio
  • 4,229
  • 5
  • 37
  • 33
Question Mark
  • 3,557
  • 1
  • 25
  • 30
  • yes this is the statement i'm using now actually and it appears to be working. I am replacing the \n and , in description though as well just in case. – Derek Organ Jul 29 '09 at 11:31
  • How exactly do you use the `SET NAMES utf8;`? – dayne Aug 07 '13 at 15:42
  • dayne, use SET NAMES utf8; as a normal query (just as you would query a SELECT or UPDATE statement, i.e. in the mysql command line or via PHP or Ruby or.... ) before your SELECT statement – Question Mark Aug 07 '13 at 16:31
  • If you're having file permission issues, just write to /tmp like @Felix Z did and that should solve any issues. – zmonteca Apr 06 '15 at 18:45
  • `OPTIONALLY` was the missing piece of the puzzle! `OPTIONALLY ENCLOSED BY '"' ESCAPED BY '"'` did the trick for me (Windows Office style escapes). – CodeManX Aug 28 '15 at 23:33
18

Here is what worked here: Simulates Excel 2003 (Save as CSV format)

SELECT 
REPLACE( IFNULL(notes, ''), '\r\n' , '\n' )   AS notes
FROM sometables
INTO OUTFILE '/tmp/test.csv' 
FIELDS TERMINATED BY ',' ENCLOSED BY '"' ESCAPED BY '"'
LINES TERMINATED BY '\r\n';
  1. Excel saves \r\n for line separators.
  2. Excel saves \n for newline characters within column data
  3. Have to replace \r\n inside your data first otherwise Excel will think its a start of the next line.
Felix Z
  • 181
  • 1
  • 2
2

What happens if you try the following?

Instead of your double REPLACE statement, try:

REPLACE(IFNULL(ts.description, ''),'\r\n', '\n')

Also, I think it should be LINES TERMINATED BY '\r\n' instead of just '\n'

stealthyninja
  • 10,343
  • 11
  • 51
  • 59
devuxer
  • 41,681
  • 47
  • 180
  • 292
2

Without actually seeing your output file for confirmation, my guess is that you've got to get rid of the FIELDS ESCAPED BY value.

MySQL's FIELDS ESCAPED BY is probably behaving in two ways that you were not counting on: (1) it is only meant to be one character, so in your case it is probably equal to just one quotation mark; (2) it is used to precede each character that MySQL thinks needs escaping, including the FIELDS TERMINATED BY and LINES TERMINATED BY values. This makes sense to most of the computing world, but it isn't the way Excel does escaping.

I think your double REPLACE is working, and that you are successfully replacing literal newlines with spaces (two spaces in the case of Windows-style newlines). But if you have any commas in your data (literals, not field separators), these are being preceded by quotation marks, which Excel treats much differently than MySQL. If that's the case, then the erroneous newlines that are tripping up Excel are actually newlines that MySQL had intended as line terminators.

John Y
  • 14,123
  • 2
  • 48
  • 72
0

Probably won't help but you could try creating a CSV table with that content:

DROP TABLE IF EXISTS foo_export;
CREATE TABLE foo_export LIKE foo;
ALTER TABLE foo_export ENGINE=CSV;
INSERT INTO foo_export SELECT id, 
   client,
   project,
   task,
   REPLACE(REPLACE(ifnull(ts.description,''),'\n',' '),'\r',' ') AS description, 
   time,
   date
  FROM ....
Jorge Bernal
  • 265
  • 3
  • 11
0

Below procedure worked for me to resolve all the escaping issues and have the procedure more a generic utility.

CREATE PROCEDURE `export_table`(
IN tab_name varchar(50), 
IN select_columns varchar(1000),
IN filename varchar(100),
IN where_clause varchar(1000),
IN header_row varchar(2000))

BEGIN
INSERT INTO impl_log_activities(TABLE_NAME, LOG_MESSAGE,CREATED_TS) values(tab_name, where_clause,sysdate());
COMMIT;
SELECT CONCAT( "SELECT ", header_row,
    " UNION ALL ",
    "SELECT ", select_columns, 
    " INTO OUTFILE ", "'",filename,"'"
    " FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' ESCAPED BY '""' ",
    " LINES TERMINATED BY '\n'"
    " FROM ", tab_name, " ",
    (case when where_clause is null then "" else where_clause end)
) INTO @SQL_QUERY;

INSERT INTO impl_log_activities(TABLE_NAME, LOG_MESSAGE,CREATED_TS) values(tab_name, @SQL_QUERY, sysdate());
COMMIT;

PREPARE stmt FROM @SQL_QUERY;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;

END