Will data bus width size change when word size changes?
I think that it will change because data bus width is usually in multiples of word size. So if word size changes, data bus width also changes.
Am I correct?
Not necessarily. The 8086 and 8088 both had a 16-bit word size. The 8086 had a 16-bit data bus, but the 8088's was only 8 bits wide. The 80186/80188 and 80386/80386SX pairs were similar.
Yes, if you made a variant of x86 with 9-bit bytes and a 36-bit "dword", then its internal and external buses would be multiples of that instead of multiples of 64 bits.
But otherwise no: the ratio between word size and internal/external bus width is flexible. You can let that ratio change as you widen various buses or increase the "word size" (or register width, for non-integer registers).
x86 since the P5 Pentium has been architecturally required to provide atomic 64-bit loads/stores for aligned 64-bit operands. By far the easiest way to implement this is with 64-bit / 128-bit / 256-bit / 512-bit data buses. Intel was able to make that atomicity guarantee basically for free in P5 because they widened its external and internal data buses to 64 bits. So even for "32-bit" x86 CPUs of that generation and later, 32-bit buses weren't really an option if they wanted to be compatible with Pentium.
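To make that guarantee concrete, here's a minimal sketch (my own example, not part of the original answer): on a 32-bit x86 build, a C++ std::atomic<std::uint64_t> can be lock-free precisely because aligned 8-byte loads/stores are atomic in hardware; compilers typically emit a single 8-byte SSE/x87 move or cmpxchg8b rather than taking a lock.

```cpp
#include <atomic>
#include <cstdint>

// A 64-bit value shared between threads. On 32-bit x86 targets that assume
// P5 or later, std::atomic<std::uint64_t> can be lock-free because the
// hardware guarantees aligned 8-byte loads/stores are atomic.
std::atomic<std::uint64_t> shared_value{0};

void writer() {
    // A single atomic 64-bit store: no other thread can observe a
    // "torn" value with only half of the 8 bytes updated.
    shared_value.store(0x1122334455667788ULL, std::memory_order_relaxed);
}

std::uint64_t reader() {
    // A single atomic 64-bit load, again relying on the aligned-8-byte
    // atomicity guarantee rather than a lock.
    return shared_value.load(std::memory_order_relaxed);
}
```

Whether the implementation is actually lock-free still depends on the compiler and target flags, so checking `shared_value.is_lock_free()` is the portable way to confirm it.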
Modern x86 CPUs have internal data paths up to 512 bits (64 bytes) wide. For example, Skylake has a 64-byte-wide path between L2 and L1 cache, and Skylake-AVX512 has 64-byte load/store units, i.e. it can load or store a whole cache line at once. (The external data bus is 64-bit DDR3/4 DRAM that does burst transfers of whole 64-byte cache lines. Of course, for non-DRAM access, transfers go over PCIe.)
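As a hedged illustration of that 64-byte path (my example, not from the answer), the standard AVX-512F intrinsics from immintrin.h can move a whole cache line with one load and one store, assuming 64-byte-aligned buffers and a compiler targeting AVX-512F (e.g. -mavx512f):

```cpp
#include <immintrin.h>
#include <cstdint>

// 64-byte-aligned buffers, matching the cache-line size assumed here.
alignas(64) std::uint8_t src_buf[64];
alignas(64) std::uint8_t dst_buf[64];

// Copy one 64-byte cache line with a single 512-bit load and a single
// 512-bit store. On Skylake-AVX512 the load/store units are 64 bytes wide,
// so each intrinsic can map to one full-width access of the data path.
void copy_cache_line(const void* src, void* dst) {
    __m512i line = _mm512_load_si512(src);  // aligned 64-byte load
    _mm512_store_si512(dst, line);          // aligned 64-byte store
}
```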
Sandybridge / Ivybridge do AVX 256-bit loads/stores as two 128-bit (16-byte) halves, because the data path between the execution units and L1D is only half as wide as the register size. See How can cache be that fast?
AMD Bulldozer-family and Ryzen split all 256-bit ops into 128-bit halves, so a 256-bit load is really two separate loads into two separate vector registers that get treated as one architectural YMM register. This is different from SnB/IvB, where vaddps ymm is a single uop; it's just that loads/stores need two cycles in the load/store execution unit because that data path isn't as wide as the physical registers.
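For reference, vaddps ymm is what a compiler emits for a 256-bit packed-float add. A minimal sketch with the standard AVX intrinsics (my example; it assumes 32-byte-aligned arrays and a length that's a multiple of 8):

```cpp
#include <immintrin.h>

// 256-bit vector add over float arrays. The vaddps ymm this compiles to is
// a single uop on SnB/IvB, but each 256-bit load/store below is performed
// as two 128-bit halves internally there, and Bulldozer-family / Ryzen
// split all of these 256-bit ops (including the add) into 128-bit halves.
void add_arrays(const float* a, const float* b, float* out, int n) {
    for (int i = 0; i < n; i += 8) {
        __m256 va   = _mm256_load_ps(a + i);   // 256-bit aligned load (vmovaps ymm)
        __m256 vb   = _mm256_load_ps(b + i);   // 256-bit aligned load
        __m256 vsum = _mm256_add_ps(va, vb);   // vaddps ymm, ymm, ymm
        _mm256_store_ps(out + i, vsum);        // 256-bit aligned store
    }
}
```

Either way, the architectural YMM register is 256 bits wide; how many internal bus trips a load or store takes is a microarchitectural detail the ISA doesn't pin down.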
With different FPU and SIMD register widths, the integer register width and "word size" are not as meaningful as they used to be! The same concepts apply, but it's register width, not "word size", that matters.