40

While researching the differences in meaning between source code, bytecode, assembly code, machine code, compilers, linkers, interpreters, assemblers and all the rest, I got confused only about the difference between bytecode and assembly code.

In particular, the introduction of this Wikipedia article describing CIL confused me, since it seems to use both terms (assembly code and bytecode) interchangeably, making me think they might mean exactly the same thing.

jldupont
Aaron Gusman
  • If you have access to Andrew Tanenbaum's Structured Computer Organization book, he will have a technically correct definition of the two terms. – J. Polfer Nov 23 '09 at 14:57
  • See also e.g. [this answer](http://stackoverflow.com/a/2203296/874188) to a similar question about Java. – tripleee Sep 13 '15 at 15:02
  • See also http://stackoverflow.com/questions/17511931/what-exactly-is-bytecode – tripleee Sep 13 '15 at 15:05

7 Answers

23

Assembly code normally means the human-readable form of a machine's native language (the so-called machine language). Bytecode, on the other hand, is normally a language that is interpreted by a bytecode interpreter, so it is not the processor's native language.

Why the confusion, then? You can't compare assembly language and bytecode this way. A bytecode can also have an assembly code, meaning a human-readable form of it, because "assembly language" does not necessarily mean that it is for a real machine. It is a human-readable form of some native language. For processors, this native language is the machine code, but you can also have assembly code for a pseudo (or interpreted) machine, as with bytecode.
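To make that concrete, here is a minimal sketch using Python, whose standard `dis` module prints a human-readable, assembly-like listing of CPython bytecode. The function `add` and the variable names are made up for illustration, and the exact bytes and mnemonics differ between CPython versions, so treat any output as an example only.

```python
import dis

def add(a, b):
    return a + b

# The binary form that the CPython virtual machine actually executes.
raw_bytecode = add.__code__.co_code

# The "assembly" view of the same bytecode: one mnemonic per instruction.
mnemonics = [ins.opname for ins in dis.get_instructions(add)]

print(raw_bytecode.hex())  # opaque bytes, meant for the VM
print(mnemonics)           # readable names, meant for humans
```

The two printed values describe the same program: the first is the bytecode itself, the second is what one could reasonably call its assembly form.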

See also: Assembly Language

Further confusion arises, as you can see throughout the discussion here, because IT people (myself included) tend to be lax in their wording. "Assembly language" is often used when speaking about machine code. That is not strictly correct, because assembly language is only the human-readable form of some machine's code.

Wai Ha Lee
Juergen
  • What IT people do (hopefully) is to abstract from form. The mapping between machine code and assembly is performed to optimise for the intended audience, CPU or human. – rsp Nov 23 '09 at 15:51
3

I remember that since the beginning of microcontrollers and microprocessors, the word "assembly" has been used to designate machine code in human-readable form. It seems to me that Microsoft caused confusion by using the same word, "Assembly", to name what would be the bytecode produced by their .NET Framework compilers. So in this case I'd say that what "bytecode" means to the Java runtime is similar to what this new use of the word "Assembly" means for the Microsoft .NET runtime environment. Am I wrong to assume that?

Ade_Oliv
  • Pretty sure that conclusion about Java is incorrect (everything else is correct, though). IDK if Java has a name for collections of bytecode (in a `.jar` file?), but if it does I don't think it's called "an assembly". Maybe "package" or "library". There is stuff like http://maven.apache.org/plugins/maven-assembly-plugin/ that collects a bunch of stuff including documentation into an "assembly", in a similar sense to .net, but AFAIK unrelated. For manipulating Java bytecode (including at runtime via reflection?) there's [tag:java-bytecode-asm] for a Java package. – Peter Cordes Dec 07 '20 at 13:15
3

Assembly code normally refers to code that, once assembled into machine code, can be executed by a CPU, whereas bytecode is executed by a virtual machine.

The source of confusion over CIL might be related to the fact that machine code for CPU X can be interpreted by a Virtual Machine running on CPU Y (for example).

Note that a Virtual Machine implementation can be crafted to interpret any machine code and/or bytecode: it is left to the developers and their aspiration (and time on their hands) ;-)

jldupont
  • Once again: Assembly code is not executed by a real CPU. What is executed is "Machine Code". Assembly code is the human readable form of a machine code (or in some cases: byte code). – Juergen Nov 23 '09 at 13:40
1

Assembled code is runnable on a CPU with a specific instruction set, while bytecode can be executed in a virtual machine (such as the Java runtime) on any CPU that can run the VM.
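As a rough illustration of the virtual machine side, here is a minimal sketch of a bytecode interpreter in Python. The opcode values (`PUSH`, `ADD`, `MUL`, `HALT`) and the `run` function belong to a made-up stack machine, not to the JVM or any real VM. The same `program` bytes run unchanged on any CPU for which a Python interpreter exists.

```python
# Hypothetical opcodes for a made-up stack machine (not any real VM's encoding).
PUSH, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF

def run(bytecode: bytes) -> int:
    """Interpret the bytecode one instruction at a time and return the result."""
    stack = []
    pc = 0  # program counter: index of the next byte to decode
    while True:
        op = bytecode[pc]
        pc += 1
        if op == PUSH:
            stack.append(bytecode[pc])  # the operand is the next byte
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op:#04x}")

# Bytecode for (2 + 3) * 4; nothing in it depends on the host CPU.
program = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])
result = run(program)
print(result)  # 20
```

A real VM may also compile bytecode to machine code at run time (JIT) instead of interpreting it, which this sketch does not attempt.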

Matt Stephenson
  • "Assembled code" is also called "machine code" -- just for clarification. See link in my answer. – Juergen Nov 23 '09 at 14:01
  • If you meant "Assembly Code", that is of course not machine code, since it must first be "assembled" by an Assembler; the result is then machine code. – Juergen Nov 23 '09 at 14:03
0

Assembler is a macro language. It is a set of instructions used to instruct the CPU or another device. It is translated into machine code, which consists of instructions the CPU can read.

Bytecodes are instructions to be interpreted by the virtual machine, and they still need to be translated into machine code before being executed.

guyumu
  • Assembler is not necessarily a macro language. Assembler in its basic form is just a human readable form of machine code. – Juergen Nov 23 '09 at 13:37
-1

Bytecode is mainly for platform independence and needs a virtual environment to run.

Assembly code is human-readable machine code (at a slightly higher level) that is run directly by the CPU.

Bytecode is not machine/hardware specific (directly handling hardware) but assembly code is machine/hardware specific.

Wai Ha Lee
omkar
  • As other answers have pointed out, you can have an assembly language for bytecode, i.e. a human-readable text version of bytecode. There's even an SO tag for [tag:java-bytecode-asm]. (So yes, there's often a distinction between asm for hardware machine code vs. other assembly languages.) – Peter Cordes Jan 13 '20 at 03:17
-3

Assembly code is (represents) the native code for the processor you are programming.

Bytecode is a term for the binary version of the "commands" that are compiled to be executed by an interpreter, or a virtual machine.

In essence bytecodes define the opcodes for a virtual processor, while assembly consists of the opcodes of a physical processor. (we will ignore the microcode inside the CPU for now :-) )
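The relationship between mnemonics and opcodes can be sketched with a toy assembler and disassembler in Python. Everything here is hypothetical: the `OPCODES` table, the `assemble` and `disassemble` helpers, and the instruction set itself describe a made-up virtual processor. The same kind of table-driven mapping is, in simplified form, what an assembler for a physical processor performs.

```python
# Hypothetical mnemonic-to-opcode table for a made-up virtual processor.
OPCODES = {"PUSH": 0x01, "ADD": 0x02, "MUL": 0x03, "HALT": 0xFF}
MNEMONICS = {code: name for name, code in OPCODES.items()}
HAS_OPERAND = {"PUSH"}  # instructions followed by a one-byte operand

def assemble(source: str) -> bytes:
    """Translate human-readable text into the binary opcode stream."""
    out = bytearray()
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        name = parts[0].upper()
        out.append(OPCODES[name])
        if name in HAS_OPERAND:
            out.append(int(parts[1]))
    return bytes(out)

def disassemble(code: bytes) -> list[str]:
    """Translate the binary opcode stream back into readable text."""
    lines, pc = [], 0
    while pc < len(code):
        name = MNEMONICS[code[pc]]
        pc += 1
        if name in HAS_OPERAND:
            lines.append(f"{name} {code[pc]}")
            pc += 1
        else:
            lines.append(name)
    return lines

source = """
PUSH 2
PUSH 3
ADD
HALT
"""
binary = assemble(source)
print(binary.hex())         # 0102010302ff
print(disassemble(binary))  # ['PUSH 2', 'PUSH 3', 'ADD', 'HALT']
```

The text in `source` plays the role of assembly code, and `binary` plays the role of bytecode (or, for a physical processor, machine code).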

rsp
  • Not totally correct. Assembly code is the human readable form of a machine code. Machine code is the native code for a processor. – Juergen Nov 23 '09 at 13:38
  • @Juergen, you mix form and content. It is a matter of detail or context that decides between terminology like "cpu opcodes", "machine language", "assembler". In the context of OP's question they are equivalent imho. – rsp Nov 23 '09 at 14:56
  • IMHO you mix things up. Since IT people are often lax in their wording, things get confusing. Assembly language is a human-readable representation of some machine language (which can also be that of a virtual machine, e.g. bytecode), period. See the Wikipedia article I linked in my answer. – Juergen Nov 23 '09 at 15:20