3

I am importing some Excel spreadsheets into a MS SQL Server. I load the spreadsheets, cleanse the data and then export it to SQL using Alteryx. Some files have text columns where the cells span multiple lines (i.e. with new line characters, like when you press ALT + ENTER in Excel). When I export the tables to SQL and then query the table, I see lots of '_x000D_' which are not in the original file.

Is it some kind of newline character encoding? How do I get rid of it?

I haven't been able to replicate the error. The original file contains some letters with accents (à á etc); I created multi-line spreadsheets with accented letters, but I managed to export these to SQL just fine, with no 'x000D'.

If these were CSV files I would think of character encoding, but Excel spreadsheets? Any ideas? Thanks!

Pythonista anonymous
  • 8,140
  • 20
  • 70
  • 112
  • "Is it some kind of newline character encoding?" - Yes, that's what it is. "How do I get rid of it?" - String replace. Either in Excel (VBA) or in the DB. – JimmyB Oct 19 '15 at 15:54
  • But why cannot I replicate the issue, then? If I create a new multi-line spreadsheet myself, these weird characters do not appear. Is it some option in the OleDB connection used to write to SQL? Is this documented anywhere? – Pythonista anonymous Oct 19 '15 at 16:00
  • Where does the file come from? A tool exporting data to Excel may introduce those characters while you cannot input them manually. – JimmyB Oct 19 '15 at 16:16
  • This is ASCII 13, carriage return. – Shanerk Jul 25 '22 at 21:49

2 Answers2

2

I know this is old, but: if you're using Alteryx, just run it through the "Data Cleansing" tool as the last thing prior to your export to SQL. For the field in question, tell the tool to remove new lines by checking the appropriate checkbox.

If that still doesn't work... 0x000D is basically ASCII 13; (Hex "D" = Int 13)... so try running your data through a regular Formula tool, and for the [field] in question, just use the expression Replace([field],CharFromInt(13),""), which should remove that character by replacing it with the empty string.

johnjps111
  • 1,160
  • 9
  • 24
0

This worked for me:

REGEX_REPLACE([field],"_x000D_","") 
Lauren Rutledge
  • 1,195
  • 5
  • 18
  • 27
  • This only works post import as the actual character (ASCII 13) is being translated into that string value. So you'd have to scrub it again after importing which is non-ideal. – Shanerk Jul 25 '22 at 21:51