2

Let's say I have a table in Oracle 12c with columns:

create table t1 (
a number (5,0),
b varchar (5,0)
d ...
e ...
);

Then I insert 100,000,000 records in both columns which have the same values - e.g.

20151 and '20152' ... (for a first record)
20152 and '20152' ... (for a second record)
20153 and '20153' ... (for a third record)
...

Then I add index 1 on column 'a' and index 2 on column 'b'.

Question is - would the query perform equally fast when executing against column 'a' as on column 'b' (e.g. join query with other table based on a column 'a' or based on a column 'b' or WHERE clause on either of columns)?

Also - would using index on a 'varchar' column - use more CPU than using index on a 'number' column?

Thanks.

Kaushik Nayak
  • 30,772
  • 5
  • 32
  • 45
Joe
  • 11,983
  • 31
  • 109
  • 183
  • I don't think this is a realistic concern. Both will perform approximately identical. Choose the data type that makes sense for the data (in particular if you need to sort, compare and use ranges). – Thilo Nov 01 '17 at 13:24
  • I am about to redesign database (currently reviewing) - so there is no performance difference? How about resource usage? – Joe Nov 01 '17 at 13:29
  • Store numbers as numbers (as long as they stay within the support bounds and precision). Everything else is silly. – Thilo Nov 01 '17 at 13:38
  • I agree with that. However - I did not design database with many tables but am involved in redesigning as I see needed. Now if I decide to redesign from Varchar to a Number - there might be a lot of work doing changes on all applications involved. But if difference is very very small - maybe this effort is not worth it.. – Joe Nov 01 '17 at 13:42
  • There will be no significant performance difference. But if you currently store numbers as varchar that seems like a very strange design decision. – Thilo Nov 01 '17 at 13:44
  • I agree - it seemed strange to me as well when I saw it first time. (think of it as a 'legacy' design). That's why I am asking here before proposing a change.. – Joe Nov 01 '17 at 13:55
  • Please do not ask the [same question twice](https://stackoverflow.com/q/47043564/1509264) - you can edit a question to make it clearer or put a bounty on it to get more feedback. I opted to close the previous one since it has less detail in than this one but if you keep doing it the duplicates are likely to get closed and downvoted by the community. – MT0 Nov 01 '17 at 14:25

1 Answers1

1

[TL;DR] Use dates to store dates, numbers to store numbers and strings to store strings.

How about resource usage?

Oracle stores the NUMBER data type as 1 byte per 2 digits.

Oracle stores the CHAR data type as 1 byte per ASCII character (UTF-8 and other encodings may take more for characters in extended sets) and will right-pad the string with space characters so the strings are all exactly the same length.

Oracle stores the VARCHAR2 data type as 1 byte per ASCII character plus a small overhead (1 or 2 bytes) for the string length.

Oracle stores the DATE data type as 7 bytes (2 for year and 1 for each of month, day, hour, minute, second).

Based on your previous question you appear to be storing year and quarter and assuming that you are always going to have 4-digit years and 1-digit quarters then:

  • NUMBER(5,0) would take 3 bytes;
  • CHAR(5 CHARACTER) would take 5 bytes;
  • VARCHAR2(5 CHARACTER) would take 6 bytes; and
  • DATE would take 7 bytes.

So only considering memory a NUMBER(5,0) would be the most efficient.

However

As soon as you start to do arithmetic on year/quarters stored as numbers/strings then you get into performance issues:

For example, getting the next quarter :

  • If quarter is a NUMBER data type then you could use: CASE WHEN MOD(quarter,10) = 4 THEN quarter + 7 ELSE quarter + 1 END but this doesn't handle when you want to add 5 quarters or start subtracting quarters and then the logic starts to get much more complicated.
  • If quarter is a CHAR data type then you could convert it to a number or a date and use either of those methods (string manipulation is not likely to be performant).
  • If quarter is a DATE then you just need to use ADD_MONTHS( quarter, 3 ).

The DATE method is self-documenting and already exists whereas the NUMBER method would just become a custom function for your approximation of a QUARTER data type and once you implement all the comparison and manipulation functions you need you will have effectively rewritten the DATE data type as a UDT for quarters and those functions will be less perfomant than the optimised date functions.

Don't use inappropriate data types - just store dates as dates; numbers as numbers; and strings as string.

Community
  • 1
  • 1
MT0
  • 143,790
  • 11
  • 59
  • 117