I wonder if the MIPS `lw` instruction can work on a variable. I read a book that said the compiler will associate registers with variables. Then, don't I have to move a variable from memory into a register?

- The `lw` instruction operates on addresses. You can choose what address you use. Could be the address of a variable. Could be the address of an array. Could be an address that doesn't correspond to an array or a variable at all. – Raymond Chen Sep 18 '22 at 13:13
- Then we have to code `lw` and `sw` every time we want to move data between memory and a register? – parabola Sep 18 '22 at 14:07
- MIPS is a load-store architecture. You access memory by loading and storing. (Some assemblers may provide shortcuts or other macros, but at the CPU level, it's just loads and stores.) – Raymond Chen Sep 18 '22 at 14:15
- You can only avoid loading by not keeping the variable in memory in the first place, *only* in a register. Like a local loop counter or something. Look at actual compiler output with optimization enabled (like `-O1` at least), e.g. https://godbolt.org/ has MIPS GCC. ([How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116)) – Peter Cordes Sep 18 '22 at 16:41
1 Answer
Algorithms in pseudocode and in high-level languages have logical variables, while the equivalent algorithm in a machine code program has physical storage.
Logical variables have names, types, scope, and lifetime, and at runtime they have a location and hold a value. In a program or algorithm, a name typically refers to the content held by a variable. Scope determines which variables are reachable from any given line of code (or data). Lifetime is the duration of a variable: global variables have full program lifetime, whereas function parameters live as long as the function is active and cease to exist when the function returns. Some variables are referred to indirectly by the program, and a variable can hold a value at runtime that can be changed and later recalled by the program.
Physical storage of the machine consists of CPU registers and main memory; both allow values to be stored and later retrieved. Physical storage is essentially unnamed, has no real type or scope, and has permanent (full program) lifetime.
CPU registers are fast and limited in count and cannot be indexed, whereas main memory is vast, and indexable or addressable. Thus, for main memory there is a notion of address that is first-class: you can identify a memory location by a number (an address), and use that value (that address), say as a parameter.
Addressing or indexing is not possible with the CPU registers; they can only be named in machine code instructions.
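To make the distinction concrete, here is a minimal sketch of a toy load-store machine in Python. It is not a real MIPS simulator: the memory size, the two registers, the little-endian byte order, and the helper names `lw` and `sw` are all assumptions chosen for illustration. The point is only that memory is reached through a number (an address), while registers are reached by name.

```python
import struct

# Toy model, not a real MIPS simulator: sizes, names, and the
# little-endian byte order are illustrative assumptions.
memory = bytearray(64)              # main memory: addressable by number
registers = {"$t0": 0, "$t1": 0}    # registers: reachable only by name

def sw(rt, offset, base):
    """Store the 32-bit word in register rt at address registers[base] + offset."""
    addr = registers[base] + offset
    struct.pack_into("<i", memory, addr, registers[rt])

def lw(rt, offset, base):
    """Load the 32-bit word at address registers[base] + offset into register rt."""
    addr = registers[base] + offset
    registers[rt] = struct.unpack_from("<i", memory, addr)[0]

registers["$t1"] = 16      # an address is just a number held in a register
registers["$t0"] = 1234
sw("$t0", 0, "$t1")        # memory bytes 16..19 now hold 1234
registers["$t0"] = 0       # clobber the register
lw("$t0", 0, "$t1")        # reload the value through its address
```

Note that the address `16` can be computed, passed around, or stored like any other value, whereas `"$t0"` has to be spelled out in each instruction.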
One job of the compiler or an assembly language writer is to map (to translate) the logical variables from our algorithms into the physical storage available on the processor. Any mapping that works is acceptable, though some will be more efficient than others. Logical variables that have overlapping lifetimes require separate physical storage. Physical storage is frequently reused and repurposed for different logical variables: as one logical variable's lifetime ends, its storage can be repurposed for another logical variable whose lifetime is just beginning.
Data structures that require indexing must live in main memory; however, main memory can also be used any way that the compiler or programmer likes, so it can be used for simple logical variables as well. Data structures such as arrays, trees, and linked lists inherently require indexing or addressing: arrays because we are selecting one of many elements, and the others because we use pointers (references), so the items being pointed (referred) to must have addresses and so must live in main memory.
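The mapping described above can be sketched as a small greedy allocator. This is only an illustration under stated assumptions, not how any particular compiler works: the function name `assign_storage`, the lifetime intervals, and the two-register budget are all hypothetical. Variables whose lifetimes overlap get different registers, a register is reused once its variable's lifetime has ended, and anything that does not fit is mapped to memory.

```python
def assign_storage(lifetimes, registers):
    """Map each variable to a register, reusing registers whose variable has died.

    lifetimes: dict of name -> (first_use, last_use), inclusive line numbers.
    Variables that do not fit in any register are mapped to "memory".
    """
    mapping = {}
    busy_until = {}                 # register -> last line on which it is in use
    by_start = sorted(lifetimes.items(), key=lambda item: item[1][0])
    for name, (start, end) in by_start:
        free = [r for r in registers if busy_until.get(r, -1) < start]
        if free:
            mapping[name] = free[0]
            busy_until[free[0]] = end
        else:
            mapping[name] = "memory"
    return mapping

# Hypothetical lifetimes: `i` ends before `tmp` begins, so they can share.
lifetimes = {"i": (1, 4), "sum": (1, 9), "tmp": (5, 8), "extra": (6, 7)}
mapping = assign_storage(lifetimes, ["$t0", "$t1"])
```

Here `i` and `tmp` end up sharing `$t0` because their lifetimes do not overlap, `sum` holds `$t1` throughout, and `extra` is mapped to memory because both registers are occupied while it is live.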
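As a rough illustration of why an array needs addressable storage, the sketch below computes an element's address from a base address and an index, then reads the word at that address. The base address `ARRAY_BASE`, the 4-byte element size, and the helper `load_element` are assumptions for this example; the same arithmetic is what a load with a computed address would perform.

```python
import struct

memory = bytearray(64)
ARRAY_BASE = 8            # hypothetical address where the array starts
WORD = 4                  # bytes per 32-bit element

# Initialise a 5-element array in memory.
for i, value in enumerate([10, 20, 30, 40, 50]):
    struct.pack_into("<i", memory, ARRAY_BASE + i * WORD, value)

def load_element(base, index):
    """Select one of many elements: address = base + index * element size."""
    address = base + index * WORD
    return struct.unpack_from("<i", memory, address)[0]

third = load_element(ARRAY_BASE, 3)   # word at address 8 + 3 * 4 = 20
```

No comparable computation exists for registers, because a register number cannot be produced at runtime and then used to pick a register.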
Since the CPU registers are precious, fast resources, they are mostly used for logical variables that have short lifetimes. Such variables are function parameters and local variables (locals).
Sometimes a logical variable has to be moved from one physical storage location to another. This happens with some parameters passed in registers: if a parameter's register will be clobbered before the program's final use of the variable, the variable has to be mapped to another storage location, and the machine code must copy its value there before the original register is clobbered.
Assembly language is like machine code though with named labels and separated sections. Sections subdivide main memory for the program — separating code from global data — both code and data are initialized with values as per the program prior to program start.
Other, uninitialized main memory, is also available to the program: configured as the stack and as the heap. While both the stack and heap refer to physical storage, each has a different usage model conventionally applied by programs and functions; the usage model allows for sharing of physical storage among multiple functions — this usage model has a notion of allocation, initialization (and usage) and eventual deallocation.
By convention stack memory allows for allocation upon function entry and deallocation upon function exit, and in this manner the stack memory is repeatedly repurposed and reused by one function after another.
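The stack convention can be sketched with a single stack pointer that moves down on function entry and back up on exit. Everything here is a simplified assumption (the 64-byte memory, the 8-byte frames, and the helpers `push_frame` and `pop_frame`), but it shows two consequences: consecutive functions reuse the same bytes, and a newly allocated frame still contains whatever the previous function left there.

```python
import struct

memory = bytearray(64)
sp = len(memory)          # stack pointer starts at the top and grows downward

def push_frame(size):
    """Allocate a frame on function entry by lowering the stack pointer."""
    global sp
    sp -= size
    return sp

def pop_frame(size):
    """Deallocate the frame on function exit by raising the stack pointer."""
    global sp
    sp += size

def f():
    base = push_frame(8)
    struct.pack_into("<i", memory, base, 111)      # f's local variable
    pop_frame(8)
    return base

def g():
    base = push_frame(8)
    leftover = struct.unpack_from("<i", memory, base)[0]   # stale value from f
    struct.pack_into("<i", memory, base, 222)      # g initialises its own local
    pop_frame(8)
    return base, leftover

f_base = f()
g_base, leftover = g()
```

Both calls receive the same frame address, and `g` finds `f`'s old value there until it writes its own, which is why stack storage has to be initialized by the code that uses it.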
Also by convention, heap memory allows for explicit allocation and deallocation of storage — storage that does not have to correspond to function activation & deactivation, so is used for data structures that are to outlive functions that create them.
Labels are used to identify locations in the code and data: locations in code for branching and calling, and locations in data for the storage of global variables. Labels in data identify the storage locations of global variables mapped to physical storage. It is up to the programmer to reserve sufficient storage for each kind of global variable. A label is equivalent to the constant value that is the address of the start of physical storage for an item (rather than to the logical variable as a whole, as the name of a variable would imply in a high-level language). It is up to the compiler and assembly programmer to access physical storage in a manner that is wholly consistent with the intent of the logical variables mapped there. The processor does not read data declarations, and as the physical storage is constantly being repurposed, it is the machine code program's job to inform the processor how to treat storage.
Labels are removed during build of (assembly) source code into machine code programs. Labels are not seen by the processor, and in the program, do not separate what comes before from what comes after — they are just a convenience for the assembly programmer. Labels alone, in code, do not affect flow of control and in data they do not prevent or preclude access that goes past or prior (by memory address) the intended logical variable there.
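A small sketch of that last point, under the assumption of two hypothetical word-sized globals `x` and `y` placed back to back: the label table exists only at build time, each label is just a number, and nothing stops a load from reading past the variable a label was meant to name.

```python
import struct

memory = bytearray(32)

# Stand-in for the assembler's symbol table; the addresses are illustrative.
# At run time only the numbers survive, not the names.
labels = {"x": 0, "y": 4}

struct.pack_into("<i", memory, labels["x"], 7)
struct.pack_into("<i", memory, labels["y"], 9)

def load_word(address):
    """Read the 32-bit word at the given address, whatever lives there."""
    return struct.unpack_from("<i", memory, address)[0]

at_x = load_word(labels["x"])          # the intended variable
past_x = load_word(labels["x"] + 4)    # one word past x: this is y's storage
```

The second load succeeds and returns `y`'s value, because the machine sees only the address `4`, not a boundary between `x` and `y`.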
> Then don't I have to move variable in memory to register?
As an assembly programmer you can map a logical variable to a CPU register alone without using main memory, with the caveat that this is inappropriate under some circumstances:
- if the register will not survive the lifetime of the logical variable. This can happen, for example, if the logical variable is a global variable, or if the register will otherwise be clobbered by code, such as with function calling.
- if the logical variable is of a nature that it requires indexing, which is not possible among CPU registers.
If you use main memory for a variable (as is appropriate for a global variable), then to recall its value or store a new value there you will have to use loads and stores, respectively. Global data will be initialized prior to program start, but other physical storage (such as CPU registers, stack memory, heap memory) needs to be initialized by execution of machine code in the program itself.
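The trade-off can be illustrated by counting memory accesses in a toy model. The address `TOTAL`, the counting helpers `lw` and `sw`, and the two summing functions are assumptions made for this sketch; Python locals stand in for registers. One version keeps a running total in memory and pays a load and a store per update, while the other keeps it in a "register" and stores it once at the end.

```python
import struct

memory = bytearray(16)
TOTAL = 0                       # hypothetical address of a global `total`
stats = {"loads": 0, "stores": 0}

def lw(address):
    """Load a word from memory and count the access."""
    stats["loads"] += 1
    return struct.unpack_from("<i", memory, address)[0]

def sw(value, address):
    """Store a word to memory and count the access."""
    stats["stores"] += 1
    struct.pack_into("<i", memory, address, value)

def sum_via_memory(n):
    """total lives in memory: every update is a load, an add, and a store."""
    sw(0, TOTAL)
    for i in range(1, n + 1):   # i plays the role of a register-only counter
        t0 = lw(TOTAL)
        t0 += i
        sw(t0, TOTAL)

def sum_via_register(n):
    """total is kept in a register during the loop and stored once at the end."""
    t0 = 0
    for i in range(1, n + 1):
        t0 += i
    sw(t0, TOTAL)

sum_via_memory(5)
memory_cost = dict(stats)
stats.update(loads=0, stores=0)
sum_via_register(5)
register_cost = dict(stats)
```

Both versions leave the same value at `TOTAL`; they differ only in how many loads and stores were needed, and the loop counter `i` never touches memory in either one.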
