3

I have prepared an SQL query that I will have to run on several databases (Oracle and Sybase) where some data might be stored differently.

I have noticed that one of the differences in data storage is the blank string. For example, in the column PRODUCT_TYPE below, please have a look at the second record:

enter image description here

This "empty string" (the data type is CHAR(15)) circled in red is equal to '' in some of the databases, whereas it's equal to ' ' to some others. The length is never constant and there are several fields that behave as such.

So, since I need to filter on these "empty strings", I should change the following statement in my WHERE clause:

WHERE PRODUCT_TYPE = ''

...because the above will take the ' ' string as different than '' even if "functionally" speaking is not. I would hence like to make the statement in a way that it "ignores white spaces", i.e. ' ' is equal to '' that is equal to ' ' etc. How should I do this change in order to make it work?

I have tried the simple replacing approach:

WHERE REPLACE(PRODUCT_TYPE,' ','') = ''

...but it doesn't seem to work, probably because I should use a different character. For sake of testing, inside the ' below there is a copied-pasted example of what I find in these "empty strings":

'               '

Ideally, it should be a "non-specific SQL" solution since I will have to run the same query on both Oracle and Sybase RDBMS. Any idea?

Matteo NNZ
  • 11,930
  • 12
  • 52
  • 89
  • Why does it have to be the exact same query for both databases? – Thilo Nov 17 '15 at 14:53
  • 1
    If `REPLACE(PRODUCT_TYPE,' ','')` is not removing all the spaces then you have a different character than space like a carriage return or tab or some other unprintable character. – Atri Nov 17 '15 at 14:55
  • @Thilo it's just a matter of how the results will come. If we run two types of queries (one on all Oracle databases and one on all Sybase ones, we'll get two separate files to analyze). Unless is not possible to make a IF Oracle ELSE statement? I admit I'm not very good at SQL, would love your adivse on that. – Matteo NNZ Nov 17 '15 at 14:58
  • The idea of having two separate queries is to be able to get the same standardized file out of them (in a fashion that's easier to achieve than trying to have a single query that works for both). I am not sure which way is more appropriate here. Either way, working with multiple database vendors is messy, especially when you cannot control how the data got into the DB in the first place. – Thilo Nov 17 '15 at 15:01
  • ... and it sounds like that in your case, you not only have Oracle vs Sybase, but also "old version of the schema" vs "new version of the schema" (with possible many in-betweens). :-( – Thilo Nov 17 '15 at 15:03
  • @Thilo yeah, I see your point. Sometimes the data models have changed, so far this is the only difference that I've found but it's true that I might need to choose a RDBMS-dependent solution in the end if more differences come up. – Matteo NNZ Nov 17 '15 at 15:06
  • Why do you use `CHAR` **data type**? It is a fixed length data type, which means that you will *always* consume extra storage even if you don't need it. – Lalit Kumar B Nov 17 '15 at 15:20
  • @LalitKumarB, true, but it's not me having the right to change the data type unfortunately. – Matteo NNZ Nov 17 '15 at 15:22
  • 1
    @MatteoNNZ I understand. Now you have a strong technical idea to explain your stake holders. I have upvoted the answers that I found good and helpful. Also, upvoted your question. By the way, please make sure that the next time you provide the create and insert statements as sample data. Or, at least a SQL Fiddle with a sample schema. All the best :-) – Lalit Kumar B Nov 17 '15 at 15:28
  • @LalitKumarB I'll do for the SQL fiddle :) As for my bosses, there's no need to explain them, they've already changed this in newer versions of the database. But unfortunately, what has been released before can't be changed any longer. Thanks a lot for your contribution to the thread. – Matteo NNZ Nov 17 '15 at 15:29

3 Answers3

5

You can use trim on the column.

where trim(product_type) is null

The above is not DBMS-independent, since Sybase does not provide the trim function. However, the below approach will work both in Sybase and Oracle:

where rtrim(ltrim(product_type)) is null
Matteo NNZ
  • 11,930
  • 12
  • 52
  • 89
Vamsi Prabhala
  • 48,685
  • 4
  • 36
  • 58
  • Thanks vkp. This worked and it's the approach that I'm going to follow since it's the most readable. – Matteo NNZ Nov 17 '15 at 15:04
  • Does that "trim to NULL" on Sybase as well? – Thilo Nov 17 '15 at 15:09
  • @Thilo actually, it's telling me that "Function 'trim' not found". I will have to try the REPLACE approach, although I preferred this one. – Matteo NNZ Nov 17 '15 at 15:18
  • @MatteoNNZ Have you tried to check if LTRIM and RTRIM exists independently? Some DBMS only implements those 2 and not the combination of them (TRIM). – Julien Blanchard Nov 17 '15 at 15:43
  • @JulienBlanchard yes, I was going to write exactly this. By combining the LTRIM and RTRIM, I get the DBMS-independent solution I was looking for. – Matteo NNZ Nov 17 '15 at 15:48
  • vkp, I have edited and accepted your answer accordingly. Thanks again for the help – Matteo NNZ Nov 17 '15 at 15:49
  • DBMS-independent is probably not the right term considering that there are probably some DBMS who don't implement `LTRIM`/`RTRIM` (especially since some implement `TRIM(LEADING ...)` and `TRIM (TRAILING ...)`. – Julien Blanchard Nov 17 '15 at 15:55
2

You can use the replace statement you've tried but you should test for "is null" instead of =''

WHERE REPLACE(PRODUCT_TYPE,' ','') is null

See also: null vs empty string in Oracle

Community
  • 1
  • 1
Rene
  • 10,391
  • 5
  • 33
  • 46
  • Thanks Rene, that worked. I prefer the solution of vkp using `trim` (just for readibility), but thanks a lot for your help. This is still a valid answer, I would accept it if I could accept 2 :) – Matteo NNZ Nov 17 '15 at 15:04
2

The simple (and non-DBMS specific) answer is:

Do not use CHAR(15).

char(n) is a fixed length data type. So no matter what you store in there, the value will always be padded to the defined length. If you store a single character, the DBMS will store that single character and 14 spaces.

Change your columns to use varchar(15) and you should not have any problems.

  • 1
    Conversely, how come that a single space can exist in `char(15)` ? – Thilo Nov 17 '15 at 14:55
  • 2
    @Thilo: _theoretically_ it can't, but some DBMS do not comply with the ANSI standard and do **not** pad the stored values to the defined length (Microsoft and Sybase come to mind). But no DBMS I know of pads a `varchar` column to the defined length –  Nov 17 '15 at 14:56
  • 1
    OTOH, in Oracle, empty string cannot be stored at all (it becomes NULL). Not much better, either. – Thilo Nov 17 '15 at 14:57
  • 1
    Unfortunately, changing the data type is out of discussion on my side (I can only read the databases). But I agree with you that this is the wrong data type, it has been in fact improved but unfortunately I still need to run my query on some older databases where this problem still exists. – Matteo NNZ Nov 17 '15 at 15:00
  • I completely agree with @a_horse_with_no_name, I don't see any reason to use `CHAR` **data type**? It is a fixed length data type, which means that you will *always* consume extra storage even if you don't need it. +1 as usual – Lalit Kumar B Nov 17 '15 at 15:22
  • @a_horse_with_no_name Just a FYI, "*Change your columns to use varchar(15)*" I am sure you mean `VARCHAR2(15)`. So, edited your answer :-) – Lalit Kumar B Nov 17 '15 at 15:23
  • @LalitKumarB: no, I **did** mean VARCHAR as the question was supposed to be DBMS independent. And for Oracle it doesn't make a difference. Oracle has been claiming that they _might_ change the behaviour of `varchar` vs. `varchar2` in the future for over 20 years now, and in Oracle 12 both are still the same. So I don't think this will ever happen. And even if it does, tables created _today_ with `varchar` will be tables with `varchar2` - even tomorrow. And _if_ they change it, `varchar` will be a standard compliant type where `NULL <> ''` which is the better thing to do here anyway –  Nov 17 '15 at 15:25
  • @a_horse_with_no_name Ok, no problem. I can't assure about the future of `varchar`, but trust me it is one of the hot topics for the next release(timeline not yet decided) – Lalit Kumar B Nov 17 '15 at 15:33
  • @LalitKumarB I'll only believe it when I see it ;) –  Nov 17 '15 at 15:35