
I've been using this formula as a standard for table size calculation, as part of a capacity planning effort. We are on TD 14:

     ( rc * ( rsz / ( blocksize - 38 ) ) * blocksize )
       + ( (HASHAMP()+1) * 1024 )

rsz: row size in bytes; rc: row count, i.e. COUNT(*); HASHAMP()+1 is the number of AMPs, obtained via SQL.
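The runtime inputs can be pulled with SQL. A minimal sketch, assuming the table being sized is `MyDB.MyTable` (a placeholder name):

     -- rc: row count of the table being sized
     SELECT COUNT(*) AS rc FROM MyDB.MyTable;

     -- number of AMPs, for the table-header term (1024 bytes of header per AMP)
     SELECT HASHAMP() + 1 AS amps;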

Here,

 (blocksize - 38) / rsz

is nothing but rows per block. It comes out as a fraction less than 1, which I think is bad because it would mean a single row spans several blocks. My questions are:

  • Do the formulas need any further honing? The latter part, after the addition sign, provides for the table header. There are no SIs on this table: just 2 DATE columns, 1 INTEGER and 1 VARCHAR(50), with a NUPI that is NPPI. None of the columns are nullable, and obviously without the data nothing is compressible to begin with (there isn't enough data information to include compression now, but we'd run compression scripts later).
  • Because a row would span several blocks, should I be upping the block size? By how much, i.e. what is the ideal number of rows per block? The table data will get a full refresh every quarter, and nothing will happen to it in the interim.

1 Answer


A row in Teradata never spans blocks.

You simply got your calculation wrong: you talk about (blocksize - 38) / rsz, but your actual formula uses rsz / (blocksize - 38).

As the block overhead in newer releases increased to 74 bytes, this should be the correct calculation:

     ( rc / ( ( blocksize - 74 ) / rsz ) * blocksize )
       + ( (HASHAMP()+1) * 1024 )

It's found in *Sizing Base Tables, Hash Indexes, and Join Indexes* in the Teradata documentation.
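To make this concrete, here is the corrected formula run through with made-up numbers (blocksize = 65,500 bytes, rsz = 200 bytes, rc = 1,000,000, 100 AMPs; all four values are illustrative only):

     rows per block = (65500 - 74) / 200  = 327.13
     blocks         = 1000000 / 327.13    ≈ 3056.9
     size           ≈ 3056.9 * 65500 + 100 * 1024
                    ≈ 200,227,000 + 102,400  ≈ 200.3 MB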

But you will notice that this approaches rc * rsz for larger tables. As nobody cares about small tables, I usually use that simplified calculation to size a table (you might add 1 or 2 percent to get a maximum possible size).
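In SQL, that back-of-the-envelope version might look like this (a sketch; the table name and the assumed rsz of 100 bytes are placeholders):

     -- estimated maximum size: rc * rsz plus a 2 percent safety margin
     SELECT CAST(COUNT(*) AS FLOAT) * 100 * 1.02 AS est_max_bytes
     FROM MyDB.MyTable;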

Edit:

It's not that the calculation is wrong; it's the base data types used (probably truncation of a DECIMAL). Cast to FLOAT or NUMBER:

     ( rc * ( rsz / CAST(blocksize - 74 AS FLOAT) ) * blocksize )
       + ( (HASHAMP()+1) * 1024 )
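A minimal illustration of why the cast matters (the literals are made up): with integer or DECIMAL operands, rsz / (blocksize - 74) truncates toward zero, and the whole estimate collapses with it.

     SELECT 100 / (65500 - 74) AS truncated;              -- 0: integer division
     SELECT 100 / CAST(65500 - 74 AS FLOAT) AS correct;   -- ≈ 0.00153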
  • Hi Dieter, thank you for your answer. The formula I gave is what finally falls out of this basic calculation: `Table size = (# blocks * block size) + header`. The header calculation checks out as expected. The number of blocks is rc / rows-per-block, and rows per block is `(block size - 38) / rsz`. If you put all these together, since dividing by a fraction is multiplying by its inverse, you get exactly the formula above. So if the calculation approach in this comment is okay, then, as I verified, it comes down to the formula above. – user1874594 Jan 07 '16 at 07:47
  • @user1874594: Now I see. Yep, mathematically correct. But you run into trouble because of rounding/truncating :-) I'll modify my answer... – dnoeth Jan 07 '16 at 08:00
  • Thanks again. The links posted refer to version 15; I'm on 14. Would that have the same overhead, or the older 38? – user1874594 Jan 07 '16 at 08:41
  • @user1874594: Check the manual version matching your release; it should be 74, too. The number is wrong in several places in the manuals, still showing 38, while this *Block Overhead = Block Header + Block Trailer = 72 bytes + 2 bytes = 74 bytes* is correct. – dnoeth Jan 07 '16 at 09:07
  • Thanks. All arrows up on you! – user1874594 Jan 07 '16 at 09:56
  • Had one question though on DB block size: I am taking 63.5, based on the load and usage pattern. Any pointers on how to approach determining this value? TY – user1874594 Jan 07 '16 at 16:48
  • @user1874594: Block size depends on a *dbscontrol* setting, Filesystem 3: PermDBSize. Based on the hardware (1xxx, 2xxx, 6xxx) and TD release this might be up to 2047 sectors. On your system it's probably either 127 or 254 sectors, i.e. 63.5 KB or 127 KB. When a table is initially loaded (and Freespace still defaults to 0), all but the last block per AMP will be of the maximum size. When it's maintained by Insert/Update/Delete it will be lower, but it's always hard to know exactly; you need to run *Ferret* to get exact details. – dnoeth Jan 07 '16 at 17:25
  • Hi Dieter, TY for that valuable info. I was actually talking about customizing the block size at the table level (here we can't run the console). So I wanted to identify typical situations where we'd double the table's data block size to 2x the default vs. situations where we'd reduce it below the default. Our default is 65, I believe. Let's say the table gets refreshed just quarterly with a full refresh: how would a custom table-level data block size impact performance? E.g., does an update-heavy table benefit from a custom data block size below the default? – user1874594 Jan 07 '16 at 17:51
  • Here is what TD gives on choosing a custom table data block size: `typical block size = ROUNDUP(Max * 3/4, 512)`. What's the context of the max block size here: is it always 130,560 bytes, or does it come from the system? But system-wide there's just one data block value, which is the default data block size? – user1874594 Jan 07 '16 at 18:05
  • I had one last question regarding table overhead. The TD 14.10 and TD 15 manuals both give this as 74 like you said, but notwithstanding that value, 38 is still used for the rows-per-block calculation as `rpb = (blocksz - 38) / rsz`; see http://www.info.teradata.com/htmlpubs/DB_TTU_14_10/index.html#page/Database_Management/B035_1094_112A/ch14.064.165.html#ww15240899 I guess this is what you said was the error? – user1874594 Jan 07 '16 at 23:23
  • @user1874594: Yes, 74 is correct; 38 was correct in older releases. You shouldn't care about block sizes, it's just a maximum. Only in some very specific cases, for some very large tables, might you think about changing it. – dnoeth Jan 08 '16 at 06:58
  • Hi Dieter. If 74 is the new number, then a data block size < 74 KB doesn't work for an rpb calculation; e.g., we have something like 65.xx KB. With 38 we can get a value for rpb as (65.5 - 38) / 85, where rsz was 85. With 74 as the overhead it'd be (65.5 - 74), a negative number. How would one interpret this? TY again – user1874594 Jan 08 '16 at 13:29
  • @user1874594: 74 **bytes** is the overhead per data block. The `blocksize` in the formula is also in bytes, not kilobytes. – dnoeth Jan 08 '16 at 13:31
  • Thanks for the catch; that explains why rpb was coming out < 1, which made no sense given that a row never spans multiple blocks (it's the other way around: a block holds multiple rows). So please confirm this example (spelled out again after these comments): `rc is 5,000,000, blocksize is 65.5 KB or 65,500 bytes, rsz came out to 100. rpb is (65500 - 74) / 100 = 654.26, so # blocks is 7642.221747 (kept as a float), and so the table size would be 7642.221747 * 65500 + (1024 * 1008). The result rounds to 502 MB.` Does that sound right? – user1874594 Jan 08 '16 at 15:11
  • @user1874594: Should be OK; now compare this to `5000000 * 100` plus *1 or 2 percent* :-) Btw, 1008 AMPs is a huge system... – dnoeth Jan 08 '16 at 15:20
  • Thank you so much for those confirmations. I just picked random values to exemplify and simplify, with round 100 and 1000 figures; our system is far smaller than that, you are right. Thanks again – user1874594 Jan 08 '16 at 15:44
  • And also, yep, the 1% is right on the dot, at +4 MB. Makes sense to just thumb it using the 1-2 percent rule. – user1874594 Jan 08 '16 at 15:52
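For completeness, here is the example confirmed in the comments, spelled out (rc = 5,000,000, blocksize = 65,500 bytes, rsz = 100, 1008 AMPs, all taken from the thread above):

     rpb    = (65500 - 74) / 100  = 654.26      rows per block
     blocks = 5000000 / 654.26    ≈ 7642.22     (kept as a float, not rounded)
     size   ≈ 7642.22 * 65500 + 1008 * 1024
            ≈ 500,565,400 + 1,032,192  ≈ 501.6 MB, i.e. the ~502 MB above
     simplified: rc * rsz = 5000000 * 100 = 500 MB, within the 1-2 percent rule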