1

I am importing some data from a CSV with the intention of then doing a lookup on one of the column names to get their location from another table (we're having to do it by name rather than an ID as we aren't provided with one unfortunately from the import data).

However when I do a left join on the names which look the same they aren't matching. If I do a select on the columns from each table using CAST(ColumnName AS VARBINARY) I can see the binary data is actually different.

How can I either do a left join so they link or update the data to strip whatever the CSV is adding? Looking in notepad at the CSV it looks clean weirdly. I'm currently doing:

BULK
    INSERT [tbl_Import]
FROM 
    'C:\data.csv'
WITH
    (
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
    )
GO

Update [tbl_Import]
set Name = LTRIM(RTRIM(Name))

But when I try and do a join, it returns nothing

Select 
ci.Name
,p.NorthSouth
,p.Name
From 
tbl_Import ci
left join lu_NorthSouth p
on p.Name = ci.Name

Any help appreciated. Thanks

  • Sample data would greatly help your question. – Tim Biegeleisen Jul 09 '21 at 09:07
  • 1
    "I can see the binary data is actually different." So what is the difference? – HoneyBadger Jul 09 '21 at 09:07
  • without seeing an example of the data that's not matching it's really impossible to say. Maybe you have white space, carriage return or tab characters, unprintable spaces, you need to isolate 1 row and check the values in each column and the `datalength` – Stu Jul 09 '21 at 10:30
  • Does this answer your question? [SQL Server - Remove all non-printable ASCII characters](https://stackoverflow.com/questions/43148767/sql-server-remove-all-non-printable-ascii-characters) – SMor Jul 09 '21 at 15:58
  • While fixing your data is good, fixing the process that populates the table is far better. Seems that might involve the process that generates your imported file? – SMor Jul 09 '21 at 15:58
  • Remember that `ltrim()`, `rtrim()` and `trim()` only remove space characters (ASCII 32). Your data could contain other control characters such as tabs (9), extra linefeeds (10) or carriage returns (13), or for the first field on the first row maybe even Unicode Preambles (aka. Byte Order Marks). – AlwaysLearning Jul 09 '21 at 16:24

0 Answers0