1

I've got an Oracle Siebel database which is currently the subject of a data cleansing exercise.

In the S_CONTACT table there are a load of invalid email addresses which are to be replaced by either a null or an empty string.

As I understand it, this activity will not reduce the size of the table, due to it being a fixed length column.

However, is this activity likely to reduce the size of the indexes?

That is, replacing a few million "no email" / "asdf" / "no@email.com" values with blanks or nulls (I am led to believe that blanks and nulls are the same in Oracle).

Ste
  • 1,136
  • 2
  • 10
  • 30
  • You should also consider that this `S_CONTACT` column *might* be mapped to a required field in some business component. If that's the case and you nullify the invalid data, then you will be unable to update the record from Siebel unless you specify an email address in the same update. It could be problematic if said business component is used in a workflow, web service, etc. – AJPerez Aug 20 '14 at 10:05

1 Answers1

2

Oracle treats empty strings as null.

Take a look at this link: null vs empty string in Oracle

NULL values, as a rule of thumb, are never indexed, so yes, that could reduce the size of your index, IF the nullable column is the only column on said index.

If you wanted to index NULL values, you would have to either create a function based index using NVL (and adjust your queries accordingly), or create and index on ([fixed value],[nullable column])

[EDIT] as pointed out by @a_horse_with_no_name , "tuples containing only NULL values are never indexed", which is what I meant to say, but way clearer.

Community
  • 1
  • 1
Daniel Stolf
  • 245
  • 3
  • 11
  • "*NULL values [...] are never indexed*" should be: tuples containing *only* NULL values are never indexed –  Aug 18 '14 at 19:15
  • you are absolutely right, sir. It is what I meant to say at first, but in a clearer manner. I have edited the answer to reflect that. – Daniel Stolf Aug 18 '14 at 19:55
  • Thank you. The largest index in the sytem is a LAST_UPDATED column so we'll not save much space there. I'll do some analysis into other all nullable column indexes now. – Ste Aug 27 '14 at 08:22