279

What is the difference between object code, machine code and assembly code?

Can you give a visual example of their difference?

mmcdole
  • 2
    I'm also curious about where the "object code" name came from. What is the word "object" supposed to mean in it? Is it somehow related to object-oriented programming or just a coincidence of names? – SasQ Apr 04 '16 at 01:52
  • 1
    @SasQ: [Object code](https://en.wikipedia.org/wiki/Object_code). – Jesse Good Aug 22 '16 at 01:18
  • 2
    I'm not asking about what is an object code, Captain Obvious. I'm asking about where did the name come from and why is it called "object" code. – BarbaraKwarc Aug 23 '16 at 08:14
  • I’d say it’s object code because it’s an object that the linker can use to link together with other objects to produce the machine code (an executable or a shared library). – Maf Oct 02 '22 at 20:35

10 Answers

355

Machine code is binary (1's and 0's) code that can be executed directly by the CPU. If you open a machine code file in a text editor you would see garbage, including unprintable characters (no, not those unprintable characters ;) ).

Object code is a portion of machine code not yet linked into a complete program. It's the machine code for one particular library or module that will make up the completed product. It may also contain placeholders or offsets not found in the machine code of a completed program. The linker will use these placeholders and offsets to connect everything together.
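The placeholder idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the module layout (`code`, `defines`, `relocs`), the made-up opcodes (`0x01` for a call with a 2-byte address, `0x02` for a load, `0xFF` for return) and the base address `0x1000` do not correspond to any real object file format or CPU.

```python
import struct

# Hypothetical object modules: code bytes, the symbols each one defines
# (as offsets within the module), and relocations (offset of a 2-byte
# placeholder -> name of the symbol whose address belongs there).
main_obj = {
    "code": bytes([0x01, 0x00, 0x00, 0xFF]),  # CALL <placeholder>; RET
    "defines": {"main": 0},
    "relocs": [(1, "helper")],
}
helper_obj = {
    "code": bytes([0x02, 0x2A, 0xFF]),        # LOAD 42; RET
    "defines": {"helper": 0},
    "relocs": [],
}

def link(modules, base=0x1000):
    """Concatenate modules and patch every placeholder with a final address."""
    image = bytearray()
    symbols = {}
    starts = []
    # Step 1: lay the modules out one after another and record where
    # each defined symbol ends up in the final image.
    for module in modules:
        start = len(image)
        starts.append(start)
        for name, offset in module["defines"].items():
            symbols[name] = base + start + offset
        image += module["code"]
    # Step 2: fill each placeholder with the address of the symbol it names.
    for module, start in zip(modules, starts):
        for offset, name in module["relocs"]:
            if name not in symbols:
                raise KeyError(f"unresolved external reference: {name}")
            struct.pack_into("<H", image, start + offset, symbols[name])
    return bytes(image), symbols

image, symbols = link([main_obj, helper_obj])
print(" ".join(f"{b:02X}" for b in image))
```

Before linking, `main_obj` holds zeros where the address of `helper` should be; after linking, those two bytes hold the little-endian address the linker assigned. Real linkers handle many more relocation types, but the two steps (assign addresses, then patch) have the same shape.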

Assembly code is plain-text, (somewhat) human-readable source code that mostly has a direct 1:1 analog with machine instructions. This is accomplished using mnemonics for the actual instructions, registers, or other resources. Examples include JMP and MULT for the CPU's jump and multiplication instructions. Unlike machine code, the CPU does not understand assembly code. You convert assembly code to machine code with an assembler or a compiler, though we usually associate compilers with high-level programming languages that are abstracted further from the CPU instructions.


Building a complete program involves writing source code for the program in either assembly or a higher-level language like C++. The source code is assembled (for assembly code) or compiled (for higher-level languages) to object code, and individual modules are linked together to become the machine code for the final program. In the case of very simple programs the linking step may not be needed. In other cases, such as with an IDE (integrated development environment), the linker and compiler may be invoked together. In still other cases, a complicated make script or solution file may be used to tell the environment how to build the final application.

There are also interpreted languages that behave differently. Interpreted languages rely on the machine code of a special interpreter program. At the basic level, an interpreter parses the source code and immediately converts the commands to new machine code and executes them. Modern interpreters are now much more complicated: evaluating whole sections of source code at a time, caching and optimizing where possible, and handling complex memory management tasks.
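As a rough illustration of the basic level described above, here is a minimal Python sketch of an interpreter that reads one command at a time and acts on it immediately. The three-command language (`set`, `add`, `print`) is invented for this example and is not taken from any real interpreter.

```python
def interpret(source):
    """Execute a tiny made-up language line by line; return printed values."""
    variables = {}
    output = []
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        op, args = parts[0], parts[1:]
        if op == "set":        # set NAME VALUE
            variables[args[0]] = int(args[1])
        elif op == "add":      # add NAME VALUE
            variables[args[0]] += int(args[1])
        elif op == "print":    # print NAME
            output.append(variables[args[0]])
        else:
            raise ValueError(f"unknown command: {op}")
    return output

program = "set x 5\nadd x 3\nprint x"
result = interpret(program)
print(result)
```

Note that no new machine code is produced for the program being interpreted in this sketch: the only machine code that runs is that of the interpreter itself, which decides what to do for each command.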

One final type of program involves the use of a runtime environment or virtual machine. In this situation, a program is first pre-compiled to a lower-level intermediate language or byte code. The byte code is then loaded by the virtual machine, which just-in-time compiles it to native code. The benefit here is that the virtual machine can take advantage of optimizations available at the time the program runs and for that specific environment. A compiler belongs to the developer, and therefore must produce relatively generic (less-optimized) machine code that could run in many places. The runtime environment or virtual machine, however, is located on the end user's computer and therefore can take advantage of all the features provided by that system.
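The two-stage idea (compile ahead of time to byte code, then hand the byte code to a virtual machine) can be sketched in Python. This is a toy under stated assumptions: the three opcodes `PUSH`, `ADD` and `MUL` are invented, only integer `+` and `*` expressions are supported, and the virtual machine here simply interprets the byte code rather than just-in-time compiling it to native code as a real JVM or .NET runtime would.

```python
import ast

# Invented opcodes for a tiny stack machine.
PUSH, ADD, MUL = 0, 1, 2

def compile_expr(text):
    """Stage 1: translate an arithmetic expression into a flat byte code list."""
    code = []

    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            code.extend([PUSH, node.value])
        elif isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
            emit(node.left)
            emit(node.right)
            code.append(ADD if isinstance(node.op, ast.Add) else MUL)
        else:
            raise ValueError("unsupported expression")

    emit(ast.parse(text, mode="eval").body)
    return code

def run(code):
    """Stage 2: a virtual machine that executes the byte code on a stack."""
    stack = []
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == PUSH:
            stack.append(code[pc + 1])
            pc += 2
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == ADD else left * right)
            pc += 1
    return stack.pop()

bytecode = compile_expr("1 + 2 * 3")
value = run(bytecode)
print(bytecode, value)
```

The byte code list is independent of the CPU it eventually runs on; only the `run` function (the virtual machine) would need to exist as native machine code on the end user's computer.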

Joel Coehoorn
  • 33
    +1: nice, but somewhat simplifying answer - not all assembly instructions are translated 1:1 to machine instructions, and object files may also contain other data (relocation information, symbol tables, ...) – Christoph Jan 21 '09 at 20:25
  • 7
    Added a weasel word for your first issue, edited to make the 2nd clearer. – Joel Coehoorn Jan 21 '09 at 20:35
  • 4
    @Christoph: you say "not all assembly instructions are translated 1:1 to machine instructions" please give an example. – Olof Forshell Feb 22 '11 at 17:47
  • 7
    @Olof: RISC architectures sometimes provide an assembly-level virtual instruction set - eg MIPS pseudo-instructions ( http://en.wikipedia.org/wiki/MIPS_architecture#Pseudo_instructions ) – Christoph Feb 23 '11 at 19:39
  • Is any extra overhead whatsoever created by using assembly, instead of machine code? I just read somewhere that there isn't and want to confirm that. You said something in this answer that is almost a "yes", but not quite there. – Panzercrisis Mar 22 '13 at 12:52
  • 2
    @Panzercrisis Machine code is the only code that ever runs. Before assembly code can run on a computer, an assembler program must first convert it to machine code. In that context, your question has no meaning, because at run-time they are the same thing. – Joel Coehoorn Mar 22 '13 at 14:53
  • 1
    In other words, does going through assembly and an assembler create overhead in the finished product, as does going through a high-level programming language and a compiler? Does extra, unnecessary machine code get put in by the assembler, in the same way that extra, unnecessary assembly code gets put in by a compiler? – Panzercrisis Mar 22 '13 at 20:43
  • 4
    @Panzercrisis Nothing gets added by the assembler. It's a direct translation of what you wrote to actual machine instructions. And I wouldn't call the extra code put in by compilers "unnecessary". – Joel Coehoorn Mar 22 '13 at 20:54
  • @JoelCoehoorn Is it valid to say that the machine code is a collection of the object code files for your design? – ha9u63a7 Jun 27 '13 at 08:08
  • @hagubear From a certain point of view, but it's not the way I'd normally think of it. – Joel Coehoorn Jun 27 '13 at 13:25
  • @JoelCoehoorn what creates the machine code and object code? Is it the assembler, compiler, or something else? The wiki page for assembly language says the assembler creates object code, but then the wiki page page for object code says the compiler creates object code. – cokedude Apr 26 '15 at 15:55
  • 1
    @cokedude both an assembler and compiler can create object code. If you start from assembly code, you use an assembler. If you start from a higher level language, you use a compiler. – Joel Coehoorn Apr 26 '15 at 21:28
  • @JoelCoehoorn what about the machine code? What creates the machine code? I'm having trouble understanding the differences machine code and object code. – cokedude Apr 26 '15 at 23:46
  • @cokedude It's usually the linker or some dynamic loader. It takes object code and fills these gaps with actual addresses in memory. It can also relocate the object code to some other address. – SasQ Apr 04 '16 at 01:44
  • - - everything on a computer is binary. – Celeritas Jul 27 '16 at 03:39
  • so object code is more low level than assembly code? And object code is somewhat an incomplete machine code? Also don't you have to convert assembly code to object code then machine code? Or you can convert assembly to machine code directly? – mfaani Mar 12 '23 at 14:06
151

The other answers gave a good description of the difference, but you asked for a visual also. Here is a diagram showing the journey from C code to an executable.

Archy Will He 何魏奇
Graphics Noob
  • 7
    I find this really helpful, but it is missing the "Machine code" label – Alexx Roche Aug 06 '13 at 09:12
  • So when it's at the executable code level, is that equivalent to machine code? – CMCDragonkai Dec 14 '14 at 06:39
  • 4
    In the context of this diagram, the "object code" is the machine code. – Graphics Noob Dec 14 '14 at 23:36
  • 12
    Actually, both the object code and executable code are machine codes. the difference is that object code is not the completed program. It needs to be combined with other helper library/module codes as indicated in the diagram to form a complete executable program/code. – okey_on Mar 26 '16 at 20:24
  • @okeyxyz at what level would it be correct to say it is directly executed by the processor? After the assembler, after the linker, after the loader, after it gets converted to microcontroller? – Celeritas Jul 27 '16 at 03:45
58

Assembly code is a human readable representation of machine code:

mov eax, 77
jmp anywhere

Machine code is pure hexadecimal code:

5F 3A E3 F1
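Hexadecimal is one of several notations for the same byte values. A minimal Python sketch (the variable names are illustrative only) shows the four bytes above written in hex, binary and decimal:

```python
# The same four byte values, written in three different notations.
machine_code = bytes([0x5F, 0x3A, 0xE3, 0xF1])

as_hex = " ".join(f"{b:02X}" for b in machine_code)
as_binary = " ".join(f"{b:08b}" for b in machine_code)
as_decimal = " ".join(str(b) for b in machine_code)

print(as_hex)      # 5F 3A E3 F1
print(as_binary)   # 01011111 00111010 11100011 11110001
print(as_decimal)  # 95 58 227 241
```

Each hex digit corresponds to exactly four bits, which is why hex is the usual shorthand for showing machine code.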

I assume you mean object code as in an object file. This is a variant of machine code, with a difference that the jumps are sort of parameterized such that a linker can fill them in.

An assembler is used to convert assembly code into machine code (object code). A linker links several object (and library) files to generate an executable.

I once wrote an assembly program in pure hex (no assembler available). Luckily, this was way back on the good old (ancient) 6502. But I'm glad there are assemblers for the Pentium opcodes.

endolith
Toon Krijthe
  • 90
    No no no no. Machine code is not hex code. it's pure binary. Hex code is just a convenient representation of binary. – Breton Jan 21 '09 at 21:05
  • 71
    If we are really going into extremes, its not binary, it is an amount of stored electricity in a circuit. ;-) – Toon Krijthe Jan 21 '09 at 21:13
  • @Breton: it is certainly possible to write a program binary using a hex editor. Of course, the hex digits entered will correspond to the intended instructions and data. Messy but doable. I've used it for patching binary (firmware) images with bug fixes. I prefer to have a bit more than just the hex editor though :-) – Olof Forshell Feb 22 '11 at 17:54
  • 22
    Yes of course. There is a relationship between the hexadecimal, and what you would call "Machine Code", but it's not quite accurate to say hexadecimal *is* machine code. That's all I'm trying to say. – Breton Feb 23 '11 at 03:41
  • 1
    I used to program Z80 machine code in decimal. In those days I never understood the point of hex. I never programmed in binary but I did write several disassemblers and those of course do have to deal in the ones and zeros that make up the instructions. – hippietrail Apr 30 '11 at 11:45
  • 12
    @Breton In that sense, there is no such thing as "hex code" right? "Hex code" is just a way of viewing the machine code. You can view the machine code in hexadecimal, binary, octal, decimal, or however you like. Also again in that sense, there is no "binary code" as well. Again, "binary code" is just a way of viewing the machine code. – Utku Nov 15 '15 at 11:22
  • 11
    @Breton What you say does not really make much sense.. Binary is a way of representation, just like hex. If it is not hex, it is not binary either. – Koray Tugay Mar 09 '16 at 06:12
  • 3
    @hippietrail The point of hex is that it is a neat shorthand notation for binary. Each hexadecimal digits maps directly into 4 bits: 0000=0, 0001=1, 0010=2, 0011=3, 0100=4, 0101=5, 0110=6, 0111=7, 1000=8, 1001=9, 1010=A, 1011=B, 1100=C, 1101=D, 1110=E, 1111=F. So you can encode each nibble of a byte by one hex digit, two hex digits for a byte. If you know the mapping, you can pretty much figure out the binary bits by just looking at the hex digits. It's not that easy with decimal, though, because for decimal digits the mapping isn't 1:1. That's why hex is preferable. – SasQ Apr 04 '16 at 01:50
  • @SasQ: Yes eventually I came to understand the point of hex, but 12-year-old-me had no use of it (-: – hippietrail Apr 05 '16 at 08:38
  • 77 should be 4D (64+13), was the text representation intentionally gibberish, or is there a reason the specific hex is there – MrMesees Apr 17 '17 at 14:30
23

8B 5D 32 is machine code

mov ebx, [ebp+32h] is assembly

lmylib.so containing 8B 5D 32 is object code

Quassnoi
  • Hex isn't really machine code, just an easier way of representing it – madladzen Nov 02 '20 at 17:29
  • I think it's just binary getting translated into different amounts of electricity, I'm not sure. I just know hex isn't actual machine code, it's like representing C++ with the English language – madladzen Nov 04 '20 at 23:24
  • @madladzen Actually you can say hex is machine code.. hex, binary, they are actually the same, simply saying. Well, you can represent it with decimal, though it's not convenient since it is not 2^N. – starriet Sep 15 '21 at 02:11
12

Source code, Assembly code, Machine code, Object code, Byte code, Executable file and Library file.

These terms often confuse people because they assume the terms are mutually exclusive. See the diagram to understand their relations. The description of each term is given below.


Types of code


Source code

Instructions in human readable (programming) language


High-level code

Instructions written in a high level (programming) language
e.g., C, C++ and Java programs


Assembly code

Instructions written in an assembly language (a kind of low-level programming language). As the first step of the compilation process, high-level code is converted into this form. The assembly code is then converted into actual machine code. On most systems, these two steps are performed automatically as part of the compilation process.
e.g., program.asm


Object code

The product of a compilation process. It may be in the form of machine code or byte code.
e.g., file.o


Machine code

Instructions in machine language.
e.g., a.out


Byte code

Instructions in an intermediate form which can be executed by an interpreter such as the JVM.
e.g., Java class file


Executable file

The product of the linking process: machine code that can be directly executed by the CPU.
e.g., an .exe file.

Note that in some contexts a file containing byte-code or scripting language instructions may also be considered executable.


Library file

Some code is compiled into this form for different reasons such as re-usability and later used by executable files.

Community
Bertram Gilfoyle
  • 1
    I would argue that not all assembly is truly *source* in the strictest sense of code written and/or maintained by humans. Often it's machine-generated from source, and never intended for human consumption (for example, gcc really does create asm text that it feeds to a separate assembler, instead of having a built-in assembler inside the `cc1` executable). I think the asm circle should stick out the left side of the "source" circle, because some asm is just asm, not source. It's never *object* code, of course, but some asm is a step on the way from source to object files. – Peter Cordes Apr 08 '19 at 23:25
  • @PeterCordes Thank you very much for the comment. I wasn't aware of what you said about the working of gcc. However, I am afraid if I can agree with you completely. What I mean is, source code is something written using a human-readable programming language. It may or may not be written or maintained by humans. I am sure that you will be aware of transcompilers. From your point of view, to which category will you put the product of such a compiler? Source code or something else? Please correct me if i'm wrong. Further comments are always welcome. – Bertram Gilfoyle Apr 09 '19 at 13:07
  • Machine-generated code in any language is often not considered "source". e.g. a GUI builder might emit a bunch of C++ code that implements the button handlers, and while you *could* edit that by hand, it's not a good starting point for something maintainable. Same with compiler-generated asm text. Or for example, the output of the C preprocessor is also C, but not maintainable C. So yes, your Venn diagram could have a 3rd category: machine-generated text as an intermediate product during compilation from true human-edited source to object code. – Peter Cordes Sep 25 '20 at 19:21
  • But another definition of the word "source" could include any text language. You certainly *can* use compiler output as the starting point of a hand-written asm function, just by adding some comments, giving the labels meaningful names, etc. So there's no hard division. (IDK if I missed your earlier comment a year ago, just happened to see it now.) – Peter Cordes Sep 25 '20 at 19:23
10

One point not yet mentioned is that there are a few different types of assembly code. In the most basic form, all numbers used in instructions must be specified as constants. For example:

$1902: BD 37 14 : LDA $1437,X
$1905: 85 02    : STA $02
$1907: 85 09    : STA $09
$1909: CA       : DEX
$190A: 10 F6    : BPL $1902

The above bit of code, if stored at address $1902 in an Atari 2600 cartridge, will display a number of lines in different colors fetched from a table which starts at address $1437. On some tools, typing in an address, along with the rightmost part of the line above, would store to memory the values shown in the middle column, and start the next line with the following address. Typing code in that form was much more convenient than typing in hex, but one had to know the precise addresses of everything.

Most assemblers allow one to use symbolic addresses. The above code would be written more like:

rainbow_lp:
  lda ColorTbl,x
  sta WSYNC
  sta COLUBK
  dex
  bpl rainbow_lp

The assembler would automatically adjust the LDA instruction so it would refer to whatever address was mapped to the label ColorTbl. Using this style of assembler makes it much easier to write and edit code than would be possible if one had to hand-key and hand-maintain all addresses.
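To make the label handling concrete, here is a minimal two-pass assembler sketch in Python for just the five instructions used above. It is an illustration, not a real tool: the opcode table covers only these addressing modes, and the origin `$1902` and the symbol values (`ColorTbl = $1437`, `WSYNC = $02`, `COLUBK = $09`) are assumptions based on the Atari 2600 example.

```python
# Opcodes for the only instruction/addressing-mode pairs this sketch handles.
OPCODES = {
    ("lda", "abs,x"): 0xBD,  # LDA absolute,X (3 bytes)
    ("sta", "zp"): 0x85,     # STA zero page  (2 bytes)
    ("dex", "imp"): 0xCA,    # DEX implied    (1 byte)
    ("bpl", "rel"): 0x10,    # BPL relative   (2 bytes)
}
SIZES = {"abs,x": 3, "zp": 2, "imp": 1, "rel": 2}

def parse(line):
    """Split a line into (mnemonic, addressing mode, operand symbol)."""
    parts = line.split()
    mnemonic = parts[0].lower()
    operand = parts[1] if len(parts) > 1 else None
    if operand is None:
        return mnemonic, "imp", None
    if mnemonic == "bpl":
        return mnemonic, "rel", operand
    if operand.lower().endswith(",x"):
        return mnemonic, "abs,x", operand[:-2]
    return mnemonic, "zp", operand

def assemble(source, origin, symbols):
    symbols = dict(symbols)
    program = []
    pc = origin
    # Pass 1: record the address of every label and every instruction.
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":"):
            symbols[line[:-1]] = pc
            continue
        mnemonic, mode, operand = parse(line)
        program.append((pc, mnemonic, mode, operand))
        pc += SIZES[mode]
    # Pass 2: emit bytes, replacing each symbol with its numeric value.
    out = bytearray()
    for pc, mnemonic, mode, operand in program:
        out.append(OPCODES[(mnemonic, mode)])
        if mode == "abs,x":
            address = symbols[operand]
            out += bytes([address & 0xFF, address >> 8])  # little-endian
        elif mode == "zp":
            out.append(symbols[operand])
        elif mode == "rel":
            # Branch offsets are relative to the address after the branch.
            out.append((symbols[operand] - (pc + 2)) & 0xFF)
    return bytes(out)

SOURCE = """
rainbow_lp:
  lda ColorTbl,x
  sta WSYNC
  sta COLUBK
  dex
  bpl rainbow_lp
"""
code = assemble(SOURCE, 0x1902, {"ColorTbl": 0x1437, "WSYNC": 0x02, "COLUBK": 0x09})
print(" ".join(f"{b:02X}" for b in code))
```

If `ColorTbl` moved to a different address, only the symbol table passed to `assemble` would change; the source text stays the same, which is the convenience the symbolic style provides.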

supercat
  • 1
    +1. One more additional point: there are also different assembly language [syntaxes](http://en.wikipedia.org/wiki/X86_assembly_language#Syntax), most famous being [Intel and AT&T](http://www.imada.sdu.dk/Courses/DM18/Litteratur/IntelnATT.htm). – informatik01 Jan 07 '14 at 15:40
  • 1
    @informatik01: How about Intel 8080 mnemonics vs Zilog Z80? I would guess that predates the Intel vs AT&T syntax war. – supercat Jan 07 '14 at 16:39
  • Not arguing, I just mentioned that aspect (different syntax) and gave an example of two most popular/well known/famous syntaxes. – informatik01 Jan 07 '14 at 16:46
3

Assembly consists of short, descriptive terms that humans can understand and that can be directly translated into the machine code that a CPU actually uses.

While somewhat understandable by humans, assembly is still low level. It takes a lot of code to do anything useful.

So instead we use higher-level languages such as C, BASIC, FORTRAN (OK, I know I've dated myself). When compiled, these produce object code. Early languages had machine language as their object code.

Many languages today, such as Java and C#, usually compile into a bytecode that is not machine code, but one that can easily be interpreted at run time to produce machine code.

Jim C
  • Your comment about Java and C# - both use Just In Time compilation so that bytecodes are not interpreted. C# (.NET generally) compiles to Intermediate Language (IL) which is then JITed into native machine language for the target CPU. – Craig Shearer Jan 21 '09 at 20:52
2

Assembly code is discussed here.

"An assembly language is a low-level language for programming computers. It implements a symbolic representation of the numeric machine codes and other constants needed to program a particular CPU architecture."

Machine code is discussed here.

"Machine code or machine language is a system of instructions and data executed directly by a computer's central processing unit."

Basically, assembler code is the language and it is translated to object code (the native code that the CPU runs) by an assembler (analogous to a compiler).

rbrayb
2

I think these are the main differences

  • readability of the code
  • control over what your code is doing

Readable code can be improved or replaced six months after it was written with little effort. On the other hand, if performance is critical, you may want to use a low-level language that targets the specific hardware you will have in production, to get faster execution.

IMO, today's computers are fast enough that a programmer can get fast execution with OOP.

Alberto Zaccagni
0

The source files of your programs are compiled into object files, and then the linker links those object files together, producing an executable file containing your architecture's machine code.

Both the object file and the executable file contain the architecture's machine code, which shows up as a mix of printable and non-printable characters when the file is opened in a text editor.

The difference between the two is that an object file may contain unresolved external references (such as printf, for instance), so it may need to be linked against other object files. That is to say, the unresolved external references must be resolved, by linking with other object files such as those of the C/C++ runtime library, in order to get a runnable executable file.