Questions tagged [cpu-architecture]

The hardware architecture and ISA (x86, x86_64, ARM, ...) and the micro-architectural implementation of a CPU or microcontroller.

Some of the key architectures:

arm - 32-bit Advanced RISC Machine.
arm64 - 64-bit Advanced RISC Machine.
ia32 - 32-bit Intel Architecture.
mips - 32-bit MIPS (big-endian).
mipsel - 32-bit MIPS (little-endian).
ppc - PowerPC Architecture.
ppc64 - 64-bit PowerPC Architecture.

Use this tag for questions regarding features, bugs and details concerning the inner working of specific CPU architectures.

3996 questions

27072

votes

25 answers

Why is processing a sorted array faster than processing an unsorted array?

In this C++ code, sorting the data (before the timed region) makes the primary loop ~6x faster: #include <algorithm> #include <ctime> #include <iostream> int main() { // Generate data const unsigned arraySize = 32768; int…

asked Jun 27 '12 at 13:51

GManNickG

494,350
52
494
543
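
The effect asked about above can be illustrated with a minimal sketch. The excerpt is truncated, so the helper names `sum_large` and `sum_large_sorted` and the threshold of 128 are assumptions made for illustration, not the asker's exact code. The point is that the conditional inside the loop is what a branch predictor has to guess: with sorted input the outcome changes only once, with shuffled input it is close to random.

```cpp
#include <algorithm>
#include <vector>

// Sums every element that is at least 128 (threshold assumed for illustration).
// The `if` below is the data-dependent branch whose predictability changes
// when the input is sorted.
long long sum_large(const std::vector<int>& data) {
    long long sum = 0;
    for (int value : data) {
        if (value >= 128) {
            sum += value;
        }
    }
    return sum;
}

// Same computation, but on a sorted copy: the branch is false for a prefix
// of the data and true for the rest, so it is easy to predict.
long long sum_large_sorted(std::vector<int> data) {
    std::sort(data.begin(), data.end());
    return sum_large(data);
}
```

Both functions return the same sum for the same input; any timing difference comes from how the hardware handles the branch, and it may vanish if the compiler emits branchless or vectorized code.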

696

votes

4 answers

How do I achieve the theoretical maximum of 4 FLOPs per cycle?

How can the theoretical peak performance of 4 floating point operations (double precision) per cycle be achieved on a modern x86-64 Intel CPU? As far as I understand, it takes three cycles for an SSE add and five cycles for a mul to complete on most…

c++ assembly x86-64 cpu-architecture flops

asked Dec 05 '11 at 17:54

user1059432

7,518
3
19
16
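
One general idea behind approaching peak throughput when each add has multi-cycle latency is to keep several independent dependency chains in flight. The sketch below is only an illustration of that idea, not the method from the question or its answers; the function names and the choice of four accumulators are assumptions.

```cpp
#include <cstddef>
#include <vector>

// One accumulator: every add depends on the previous one, so the loop is
// limited by add latency rather than add throughput.
double sum_single_chain(const std::vector<double>& v) {
    double s = 0.0;
    for (double x : v) {
        s += x;
    }
    return s;
}

// Four accumulators (count chosen for illustration): the four adds in one
// iteration do not depend on each other, so they can overlap in the pipeline.
double sum_four_chains(const std::vector<double>& v) {
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    const std::size_t n = v.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += v[i];
        s1 += v[i + 1];
        s2 += v[i + 2];
        s3 += v[i + 3];
    }
    for (; i < n; ++i) {  // leftover elements when n is not a multiple of 4
        s0 += v[i];
    }
    return (s0 + s1) + (s2 + s3);
}
```

Because floating-point addition is not associative, the two functions can differ in the last bits for general inputs; they agree exactly when every partial sum is exactly representable.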

344

votes

4 answers

Deoptimizing a program for the pipeline in Intel Sandybridge-family CPUs

I've been racking my brain for a week trying to complete this assignment and I'm hoping someone here can lead me toward the right path. Let me start with the instructor's instructions: Your assignment is the opposite of our first lab assignment,…

c++ optimization x86 intel cpu-architecture

asked May 21 '16 at 09:29

Cowmoogun

2,507
4
12
17

319

votes

7 answers

Why does this code execute more slowly after strength-reducing multiplications to loop-carried additions?

I was reading Agner Fog's optimization manuals, and I came across this example: double data[LEN]; void compute() { const double A = 1.1, B = 2.2, C = 3.3; int i; for(i=0; i…

assembly optimization x86-64 cpu-architecture simd

asked May 19 '22 at 14:39

ttsiodras

10,602
6
55
71
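
The transformation named in the title can be sketched as follows. The excerpt is cut off before the loop body, so the quadratic `A*i*i + B*i + C` and the function names `compute_direct` and `compute_incremental` are assumptions used only to illustrate strength reduction; only the constants `A`, `B` and `C` come from the excerpt.

```cpp
#include <cstddef>
#include <vector>

// Direct form: each element is computed from its index alone, so the
// iterations are independent of one another.
std::vector<double> compute_direct(std::size_t len) {
    const double A = 1.1, B = 2.2, C = 3.3;
    std::vector<double> data(len);
    for (std::size_t i = 0; i < len; ++i) {
        const double x = static_cast<double>(i);
        data[i] = A * x * x + B * x + C;
    }
    return data;
}

// Strength-reduced form: multiplications are replaced by two running sums.
// With f(i) = A*i*i + B*i + C, we have f(i+1) - f(i) = A*(2i+1) + B, and
// that difference itself grows by 2*A per step. Each iteration now depends
// on the previous one (a loop-carried dependency).
std::vector<double> compute_incremental(std::size_t len) {
    const double A = 1.1, B = 2.2, C = 3.3;
    std::vector<double> data(len);
    double y = C;                 // f(0)
    double delta = A + B;         // f(1) - f(0)
    const double step = 2.0 * A;  // constant second difference
    for (std::size_t i = 0; i < len; ++i) {
        data[i] = y;
        y += delta;
        delta += step;
    }
    return data;
}
```

The incremental version does less arithmetic per element but serializes the loop on the latency of the additions, and it accumulates rounding error, so its results match the direct form only approximately.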

278

votes

3 answers

What is a retpoline and how does it work?

In order to mitigate against kernel or cross-process memory disclosure (the Spectre attack), the Linux kernel will be compiled with a new option, -mindirect-branch=thunk-extern introduced to gcc to perform indirect calls through a so-called…

security assembly x86 cpu-architecture spectre

asked Jan 04 '18 at 05:52

BeeOnRope

60,350
16
207
386

263

votes

5 answers

How does the ARM architecture differ from x86?

Is the x86 Architecture specially designed to work with a keyboard while ARM expects to be mobile? What are the key differences between the two?

x86 arm cpu-architecture

asked Feb 10 '13 at 03:39

user1922878

2,833
3
13
7

257

votes

7 answers

Difference between core and processor

What is the difference between a core and a processor? I've already looked for it on Google, but I only get definitions for multi-core and multi-processor, which is not what I am looking for.

cpu core cpu-architecture

asked Oct 07 '13 at 13:13

Saad Achemlal

3,616
5
16
17

249

votes

3 answers

How much of ‘What Every Programmer Should Know About Memory’ is still valid?

I am wondering how much of Ulrich Drepper's What Every Programmer Should Know About Memory from 2007 is still valid. Also I could not find a newer version than 1.0 or an errata. (Also in PDF form on Ulrich Drepper's own site:…

optimization memory x86 cpu-architecture cpu-cache

asked Nov 14 '11 at 18:30

Framester

33,341
51
130
192

236

votes

4 answers

What is the purpose of the "Prefer 32-bit" setting in Visual Studio and how does it actually work?

It is unclear to me how the compiler will automatically know to compile for 64-bit when it needs to. How does it know when it can confidently target 32-bit? I am mainly curious about how the compiler knows which architecture to target when…

c# .net visual-studio compilation cpu-architecture

asked Aug 22 '12 at 05:13

Aaron

10,386
13
37
53

217

votes

10 answers

What is the difference between Trap and Interrupt?

What is the difference between Trap and Interrupt? If the terminology is different for different systems, then what do they mean on x86?

x86 operating-system kernel interrupt cpu-architecture

asked Jun 30 '10 at 12:23

David

3,190
8
25
31

209

votes

13 answers

Why is a boolean 1 byte and not 1 bit of size?

In C++, why is a boolean 1 byte and not 1 bit in size? Why aren't there types like 4-bit or 2-bit integers? I miss these when writing an emulator for a CPU

c++ boolean byte cpu-architecture abi

asked Jan 07 '11 at 15:02

Asm

2,101
2
13
4
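
For the emulator use case mentioned above, a common workaround is to pack several one-bit flags into a byte by hand, since a byte is the smallest unit C++ can address. The helpers below are a hypothetical sketch (the names `set_bit` and `test_bit` are made up here) and assume the bit index is in the range 0 to 7.

```cpp
#include <cstdint>

// Returns `flags` with bit number `bit` (0..7, assumed) set or cleared.
inline std::uint8_t set_bit(std::uint8_t flags, unsigned bit, bool value) {
    const unsigned mask = 1u << bit;
    const unsigned result = value ? (flags | mask) : (flags & ~mask);
    return static_cast<std::uint8_t>(result);
}

// Returns true if bit number `bit` (0..7, assumed) of `flags` is set.
inline bool test_bit(std::uint8_t flags, unsigned bit) {
    return ((flags >> bit) & 1u) != 0;
}
```

Eight flags then share one byte, at the cost of a shift and a mask on every access. The standard library's `std::bitset` and bit-field members are alternatives built on the same idea.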

190

votes

4 answers

What happens when a computer program runs?

I know the general theory but I can't fit in the details. I know that a program resides in the secondary memory of a computer. Once the program begins execution it is entirely copied to the RAM. Then the processor retrieves a few instructions (it…

c++ memory operating-system x86 cpu-architecture

asked Mar 02 '11 at 01:50

gaijinco

2,146
4
17
16

179

votes

2 answers

What is difference between sjlj vs dwarf vs seh?

I can't find enough information to decide which compiler should I use to compile my project. There are several programs on different computers simulating a process. On Linux, I'm using GCC. Everything is great. I can optimize code, it compiles fast…

c++ compiler-construction mingw cpu-architecture mingw-w64

asked Mar 27 '13 at 21:48

sorush-r

10,490
17
89
173

175

votes

1 answer

Why is processing an unsorted array the same speed as processing a sorted array with modern x86-64 clang?

I discovered this popular ~9-year-old SO question and decided to double-check its outcomes. So, I have AMD Ryzen 9 5950X, clang++ 10 and Linux, I copy-pasted code from the question and here is what I got: Sorted - 0.549702s: ~/d/so_sorting_faster$…

c++ performance clang cpu-architecture branch-prediction

asked Mar 07 '21 at 20:57

DimanNe

1,791
3
12
19

170

votes

5 answers

Write-back vs Write-Through caching?

My understanding is that the main difference between the two methods is that in "write-through" method data is written to the main memory through the cache immediately, while in "write-back" data is written in a "later time". We still need to wait…

caching cpu-architecture cpu-cache

asked Nov 23 '14 at 10:25

triple fault

13,410
8
32
45
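
The distinction described in the excerpt above can be made concrete with a toy model. This is a deliberately simplified sketch, not a description of any real cache: `ToyCache`, its direct-mapped layout with one value per line, and the write counter are all assumptions for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

enum class WritePolicy { WriteThrough, WriteBack };

// Toy direct-mapped cache holding one int per line (hypothetical model).
class ToyCache {
public:
    ToyCache(std::size_t num_lines, WritePolicy policy)
        : lines_(num_lines), policy_(policy) {}

    void write(std::uint32_t addr, int value) {
        Line& line = lines_[addr % lines_.size()];
        if (line.valid && line.addr != addr) {
            evict(line);  // a different address occupies this line
        }
        line.valid = true;
        line.addr = addr;
        line.value = value;
        if (policy_ == WritePolicy::WriteThrough) {
            store_to_memory(addr, value);  // memory is updated immediately
        } else {
            line.dirty = true;  // memory is updated later, on eviction or flush
        }
    }

    // Writes back every dirty line and invalidates the whole cache.
    void flush() {
        for (Line& line : lines_) {
            evict(line);
        }
    }

    std::size_t memory_writes() const { return memory_writes_; }

    // Value currently stored in backing memory, or 0 if never written.
    int memory_at(std::uint32_t addr) const {
        const auto it = memory_.find(addr);
        return it == memory_.end() ? 0 : it->second;
    }

private:
    struct Line {
        bool valid = false;
        bool dirty = false;
        std::uint32_t addr = 0;
        int value = 0;
    };

    void evict(Line& line) {
        if (line.valid && line.dirty) {
            store_to_memory(line.addr, line.value);
        }
        line.valid = false;
        line.dirty = false;
    }

    void store_to_memory(std::uint32_t addr, int value) {
        memory_[addr] = value;
        ++memory_writes_;
    }

    std::vector<Line> lines_;
    WritePolicy policy_;
    std::unordered_map<std::uint32_t, int> memory_;
    std::size_t memory_writes_ = 0;
};
```

Writing the same address ten times costs ten memory writes under write-through, but only one under write-back, and that one happens when the line is evicted or flushed. The model ignores reads, line sizes, and timing.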
