
So I know the very basics: a compiler turns source code into assembly code, and an assembler turns assembly code into machine code. What I haven't been able to google properly, though, is where they are actually located.

I'm assuming that the compiler is just somewhere on the hard drive since you can download compilers from the web and use them for various programming languages.

Is the assembler located on the hard drive, built into the operating system, or somewhere in the actual CPU? Is it possible to select a different assembler to use, or do they come preinstalled in the hardware? Also, is an assembler language-specific like a compiler, depending on how the assembly code originated, or is there only one assembler for an entire system?

Austin
  • Compilers/assemblers are themselves software, and reside wherever they were installed on the computer. That also implies that you can have as many or as few of each as you want. – Marc B Jun 16 '16 at 20:30
  • The assembler is more confusing for me because it seems that it has to exactly match a particular processor, or is this incorrect? If it does need to match a processor why isn't it just in a processor register or something? – Austin Jun 16 '16 at 20:31
  • No, it doesn't. E.g. you can trivially compile/assemble code for an ARM CPU running Android while working away on an Intel x86 CPU running Windows. Compilers/assemblers are really just translation programs. They take code in language X (and sometimes Y, Z, P, Q, etc.) and output code in a different language. E.g. GNU GCC can compile code in a large number of languages, not just C and C++. – Marc B Jun 16 '16 at 20:33
  • Also is there a different assembler used for each programming language or usually just one for the system? – Austin Jun 16 '16 at 20:33
  • Wrong way of looking at it: there can be an assembler for every COMPILER. E.g. if you're using Visual C++ from Microsoft, you get an MS compiler and assembler, ditto for an Intel compiler, ditto for gcc, etc. As long as the compiler and assembler agree on file formats, you don't even need to have one of each from the same vendor; they can be from entirely different vendors. – Marc B Jun 16 '16 at 20:34
  • Read the [assembly tag wiki](http://stackoverflow.com/tags/assembly/info), and also http://stackoverflow.com/questions/6463938/how-do-assembly-languages-work. For example, all x86 CPUs understand the same machine-language; that's what makes them x86. – Peter Cordes Jun 16 '16 at 20:35
  • Thanks for the replies, sort of beginning to understand. So an assembler isn't more specific to a particular system than a compiler? So you can download a compiler/assembler package for an x86 system and the assembler will be able to translate the assembly to machine code for all x86 processors? – Austin Jun 16 '16 at 20:43
  • @Jake Exactly. That's right. – fuz Jun 16 '16 at 21:10
  • Maybe a stupid question, but since the assembly has a direct translation into machine code, why isn't a single assembler just included with the operating system to translate assembly code generated from any language? – Austin Jun 16 '16 at 21:16
  • @Jake On many systems there is, but compilers are usually independent of the operating system and might want to use the same assembly dialect on all systems. – fuz Jun 16 '16 at 21:57
  • @Jake: separate assembler SW can be sold (profit). There were times when computers were sold with a programming language in ROM (often a BASIC variant) and no OS. You could start writing your own code right out of the box, but you had to buy an OS separately for some commercial SW (most SW in that case worked on bare HW without an additional OS). I bought an assembler (a full IDE, coupled with an editor and debugger) for the ZX Spectrum. x86 assemblers produce machine code for all x86, but x86 is a family of CPUs extended over time; a 686-specific instruction will not work on a 486, etc. – Ped7g Jun 17 '16 at 10:44

2 Answers


You are overcomplicating this. A compiler takes text in some format and converts it, typically, to text in another format. For example, a C compiler turns C into assembly. A compiler is just a program; there is nothing special about it. Your web browser is a program, the text editor you use for writing your programs is a program, and the command line/console, if you use one, is a program. No magic.

An assembler is just a program that takes text in and typically outputs some form of binary file. There are many such formats, just as there are many binary formats for images and videos (bmp, jpg, png, gif, tiff, m4v, mpeg, etc). No magic, just a program that does a job like any of the ones listed above.
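
To make the "text in, binary out" idea concrete, here is a minimal sketch of a toy assembler in Python. It is an illustration only: the names `OPCODES` and `assemble` are made up for this example, and it covers just four operand-less x86 instructions. Real assemblers also deal with operands, labels, directives and object file formats.

```python
# Toy "assembler": a lookup table from mnemonic text to machine-code bytes.
# Hypothetical example, not how any real assembler is structured.
OPCODES = {
    "nop": b"\x90",   # no operation
    "ret": b"\xc3",   # near return
    "int3": b"\xcc",  # breakpoint trap
    "hlt": b"\xf4",   # halt
}


def assemble(source: str) -> bytes:
    """Translate assembly text into raw bytes, one instruction per line."""
    output = bytearray()
    for line_number, line in enumerate(source.splitlines(), start=1):
        text = line.split(";", 1)[0].strip().lower()  # drop ';' comments
        if not text:
            continue  # skip blank lines
        if text not in OPCODES:
            raise ValueError(f"line {line_number}: unknown instruction {text!r}")
        output += OPCODES[text]
    return bytes(output)


if __name__ == "__main__":
    machine_code = assemble("nop ; do nothing\nret")
    print(machine_code.hex())  # prints 90c3
```

Note that this runs on whatever machine has Python, regardless of its CPU. The output bytes are x86 machine code, but producing them is plain text processing.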

The same goes for the linker: it takes binary files in and typically outputs a binary file.

Like all other programs, these typically live on your hard drive, or at least on a drive you have mounted and can access, just like the web browser and the text editor. To run them you ideally need them "in the path". If they are part of an IDE, the IDE might not need them in the path, since it may know where they are relative to itself. Likewise the compiler, which often calls the assembler and linker for you, might not need the path; it may know or assume where they are relative to its own location. They live on the file system like any other program or file, but to execute them something has to be able to find them. Depending on the operating system and the installer for the toolchain there are often different choices, not one global rule.
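
If you want to see where these programs live on your own machine, a small sketch like the following searches the path for them. The tool names in `TOOLS` are only an assumed example list, and `locate_tools` is a made-up helper name; `shutil.which` is the standard-library call that does the actual lookup.

```python
import shutil

# Assumed example list of tool names; adjust it to your own toolchain.
TOOLS = ["gcc", "clang", "as", "nasm", "ld"]


def locate_tools(names):
    """Map each tool name to its full path on PATH, or None if not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in locate_tools(TOOLS).items():
        print(f"{name:6} -> {path or 'not found on PATH'}")
```

A tool reported as not found may still be installed somewhere else, for example inside an IDE's own directory, which is the case described above.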

There is no reason why you can't have as many different compilers and assemblers as will fit on your filesystem. They are just programs like any other, so you have to find a place for them and have a way to run them. There is no reason to assume that any two compilers produce the same binary from the same source code; likewise there is no reason to assume that any assembler is able to assemble the output of any compiler. That is where the term toolchain comes from: a set of tools that link together in a chain. The compiler outputs something the assembler in the toolchain knows how to deal with, and the assembler outputs something the linker knows how to deal with. You might have some cross-compatibility among different toolchains/vendors, but they are not required to have it; where it exists it could be by design or by dumb luck.
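
As a rough sketch of such a chain, the following Python drives the three stages as separate programs. It assumes a GNU-style toolchain with `gcc` and `as` on the path; `build_commands` and `run_toolchain` are made-up helper names, and other toolchains use different commands and flags.

```python
import shutil
import subprocess


def build_commands(stem: str):
    """Return the three toolchain stages as argument lists."""
    return [
        ["gcc", "-S", f"{stem}.c", "-o", f"{stem}.s"],  # compile: C -> assembly text
        ["as", f"{stem}.s", "-o", f"{stem}.o"],         # assemble: text -> object file
        ["gcc", f"{stem}.o", "-o", stem],               # link (gcc invokes the linker)
    ]


def run_toolchain(stem: str) -> bool:
    """Run each stage in order; return False if any tool is missing."""
    commands = build_commands(stem)
    if any(shutil.which(cmd[0]) is None for cmd in commands):
        return False
    for cmd in commands:
        subprocess.run(cmd, check=True)  # raises if a stage fails
    return True
```

Calling `run_toolchain("hello")` assumes a file `hello.c` exists in the current directory. Each stage only works because the next tool understands what the previous one wrote, which is the point of keeping them together as one toolchain.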

old_timer

Let's skip the details and assume that the execution process is simple (compilation => assembly => CPU). To answer your main question: a compiler and an assembler are both programs (which ones you need depends on your CPU architecture), so you can install them yourself or use the ones built in to the system.

For example, on GNU/Linux systems gcc is a compiler and NASM is an assembler.

They are located on your hard drive, for example under /usr/bin/.

An assembler can use CPU interrupts and system calls, so it can go through the kernel to perform more complex actions.

You could use any compiler or assembler from this list (https://en.wikipedia.org/wiki/Comparison_of_assemblers), but keep in mind that it has to support the architecture you are using (ARM, x86, x86-64/AMD64, ...).

Badr Bellaj
  • All programs need to use system calls to produce any observable effects (unless they have direct access to hardware). There's nothing special about an assembler in this respect. I can run an ARM assembler on my x86 desktop to assemble ARM source into ARM machine code. There's nothing special about the non-cross-compiling situation, even though it's the most common case. (That's a slight simplification: NASM only supports x87 80-bit floating point constants when running on hardware with native 80-bit floating point, because it doesn't have software floating point emulation for it.) – Peter Cordes Jun 17 '16 at 00:33