How does the assembler transform an ASCII string of 1s and 0s to the electrical pattern that matches that particular Opcode?

Question

For the sake of simplicity, let's assume I just want to translate the following assembly instruction:

AND      R2, R2, #0

Let's again assume the binary representation for the Opcode to that instruction is:

0101010010100000

If I've understood correctly what I've read, the assembler translates the former to the latter, but the string of 1s and 0s that results from the translation is not actual machine code; it is an ASCII string of 1s and 0s.

At the electrical level, this string is completely different to the electrical pattern that corresponds to the Opcode, so it does not behave like the Opcode it represents.

My question is: how does this particular string of 1s and 0s become the actual pattern of electricity that commands the ALU to perform the correct Opcode? How does the assembler transform that ASCII string of 1s and 0s to the electrical pattern that matches that particular Opcode?

I have searched high and low for a clear, beginner-friendly answer to this question, so I would appreciate any insights. Thanks a lot.

The simple answer is that the assembler does not use ASCII strings of 1s and 0s; there is no point. It produces the raw binary directly. Even if it did, it's a fairly trivial transformation to turn the text representation into actual binary. Just iterate over the digits, and shift a 0 or 1 bit into your partial result as appropriate. — Jester, Jan 28 '21 at 00:05
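The digit-by-digit shifting described in this comment can be sketched as follows. This is only an illustration: the function name `bits_from_text` is hypothetical, and a real assembler would not normally go through a text string of digits at all.

```python
def bits_from_text(digits: str) -> int:
    """Turn a text string of '0'/'1' characters into the integer it spells."""
    value = 0
    for ch in digits:
        if ch not in ("0", "1"):
            raise ValueError(f"not a binary digit: {ch!r}")
        # Shift the partial result left by one bit and put the new bit in the low position.
        value = (value << 1) | int(ch == "1")
    return value


text = "0101010010100000"
word = bits_from_text(text)

# The text form occupies 16 bytes (one ASCII character per digit);
# the value it represents fits in 2 bytes.
print(len(text.encode("ascii")))  # 16
print(hex(word))                  # 0x54a0
print(word.to_bytes(2, "big"))    # the two raw bytes 0x54 0xA0
```

The 16-character string and the 2-byte value are different data: the first is sixteen character codes (0x30 or 0x31 each), the second is the bit pattern itself.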
Other than your misconception about an *ASCII* string of 0s and 1s existing, [How does an assembly instruction turn into voltage changes on the CPU?](https://stackoverflow.com/q/3706022) is about how bit-patterns in bytes are represented electrically, and how CPUs work. A single byte has 8 bits, each of which is a separate 0 or 1. — Peter Cordes, Jan 28 '21 at 00:12
... and please don't bring 'electrical patterns' here - they have really nothing to do with Assembler. This is hardware stuff. Assembler translates opcode mnemonics into binary representations of opcodes and there its work ends. It knows nothing about electronics and/or electricity )) — tum_, Jan 28 '21 at 00:13
The file it reads the code from is ASCII, so it does have to deal with it. You said the transformation is trivial, as all textbooks do, but I don't see anything trivial about it. Saying the transformation is trivial does not answer my question. Thanks anyway for taking the time to comment. — simón, Jan 28 '21 at 00:13
I will bring electrical patterns into the question because that is precisely my concern. The assembler has everything to do with hardware — simón, Jan 28 '21 at 00:15
I did specifically describe the algorithm. Also, it does take ASCII input, but that is the assembler source. It does not process strings of binary numbers. In your simulated architecture the assembler could do `if (mnemonic == "AND") emit(0x5020 | (reg1 << 9) | (reg2 << 6) | (imm & 0x1F));` — Jester, Jan 28 '21 at 00:19
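The `emit(...)` idea in this comment can be made concrete with a small sketch. Everything here is an assumption for illustration: the function name `assemble_and_imm` is hypothetical, and the field layout (4-bit opcode `0101` in the top bits, two 3-bit register fields, a flag bit, and a 5-bit immediate) is chosen only so that the result reproduces the bit pattern given in the question; it resembles the LC-3 teaching ISA but is not taken from any specification.

```python
import re

# Assumed layout: opcode 0101 in bits 15-12, immediate-mode flag in bit 5.
AND_IMM_BASE = 0x5020

# Matches lines such as "AND R2, R2, #0" (registers R0-R7, decimal immediate).
AND_IMM_PATTERN = re.compile(r"\s*AND\s+R([0-7])\s*,\s*R([0-7])\s*,\s*#(-?\d+)\s*")


def assemble_and_imm(line: str) -> int:
    """Encode one 'AND Rd, Rs, #imm' source line as a 16-bit integer."""
    match = AND_IMM_PATTERN.fullmatch(line)
    if match is None:
        raise ValueError(f"cannot parse: {line!r}")
    dest, src, imm = (int(group) for group in match.groups())
    if not -16 <= imm <= 15:
        raise ValueError("immediate does not fit in 5 bits")
    # Place each field at its assumed bit position and OR them together.
    return AND_IMM_BASE | (dest << 9) | (src << 6) | (imm & 0x1F)


word = assemble_and_imm("AND      R2, R2, #0")
print(format(word, "016b"))     # 0101010010100000
print(word.to_bytes(2, "big"))  # the raw bytes an assembler would write to its output file
```

The assembler reads ASCII source text, but what it computes and writes out is the number itself. No intermediate string of `'0'` and `'1'` characters is needed; `format(word, "016b")` here is only for display.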
So now you're also asking how the assembler turns the ASCII text `AND R2, R2, #0` into the 32-bit ARM instruction `e2022000`? It's not "trivial", but it's fairly straightforward for most ISAs, and is separate from how the bit-pattern is physically represented as voltage levels. — Peter Cordes, Jan 28 '21 at 00:19
I suggest you try to improve your question based on the comments given so far. You've already been told that the paragraph starting with "If I've understood correctly what I've read..." is wrong, you have not understood correctly. There is an [Edit](https://stackoverflow.com/posts/65929213/edit) button under the question for this. Chats in comments are not welcomed on this resource. — tum_, Jan 28 '21 at 00:22
@Jester, I appreciate your reply, but I do not understand your answer — simón, Jan 28 '21 at 00:23
@PeterCordes, that is what I am asking. I don't understand how that transformation takes place — simón, Jan 28 '21 at 00:24
A few of the linked duplicates include some details and examples of how assemblers work, e.g. [How do assembly languages work?](https://stackoverflow.com/a/6464159) has a specific example for a very simple MIPS instruction. That task of an assembler has nothing to do with voltage levels, just reading text input and writing binary output to a file. Not fundamentally different from a program that takes a checksum of an input file and writes out a 4-byte binary output, in terms of the system calls it makes (file read and write). It's just parsing and lookup tables of opcodes, and register codes — Peter Cordes, Jan 28 '21 at 00:33
@PeterCordes I understand that an assembler is just a translator, but a string of 1s is just a representation of the Opcode, it is not the actual machine instruction. How does the transformation from the binary representation of the machine instruction happen? — simón, Jan 28 '21 at 02:02
All bit-patterns that exist in a CPU already are represented by voltage states. Simply producing the right binary value in a register and storing it to memory already has created those voltage states in memory, ready for the CPU to fetch them as instructions. For example, the 5 hex bytes `68 65 6C 6C 6F` would be decoded by an x86 CPU as a `push 0x6f6c6c65` instruction, with 0x68 (= 0b01101000) being the opcode, and the next 4 bytes being immediate data to be pushed. Fun fact, that sequence of bytes also represents (as ASCII) the string `hello`. — Peter Cordes, Jan 28 '21 at 02:09
In your web browser's memory right now, there's a 5-byte sequence of bytes `hello`. Those bytes are valid x86 machine code for that push instruction. (They're probably not in an executable page, and you might not be using an x86 CPU...) — Peter Cordes, Jan 28 '21 at 02:10
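The `hello` example can be checked with a few lines of Python. This is a hand-written sketch that recognizes only the single opcode mentioned in the comment (0x68, push with a 32-bit immediate); it is not a general x86 decoder, and the variable names are made up for illustration.

```python
data = "hello".encode("ascii")

# The same five bytes, shown as hex.
print(data.hex(" "))  # 68 65 6c 6c 6f

opcode = data[0]
# x86 stores multi-byte immediates least-significant byte first (little-endian).
imm32 = int.from_bytes(data[1:5], "little")

if opcode == 0x68:
    print(f"push {imm32:#x}")  # push 0x6f6c6c65
```

Nothing about the bytes changes between the two readings. Whether they act as text or as an instruction depends only on what the hardware or software does with them.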
@PeterCordes I understand what you are saying. My confusion stems from the fact that the assembler gets its input in ASCII format, so the bit patterns in the cpu are different from the machine instruction. That's my confusion — simón, Jan 28 '21 at 02:14
*so the bit patterns in the cpu are different from the machine instruction* - That's not true. Bit-patterns in the CPU and in memory are identical to machine instructions. That's kind of the whole point of stored-program digital computers, that instructions are just another form of data. But you brought up ASCII text asm source lines. Those are *not* machine instructions, those are assembly-language text that merely represents them. — Peter Cordes, Jan 28 '21 at 02:28
I feel like you just want to show me that I'm wrong (which I already know I am), instead of clearing my confusion. I don't intend to prove I'm right, I just want to understand, so I guess I'll have to figure it out on my own. I do appreciate the time you've taken to reply. — simón, Jan 28 '21 at 02:35
Didn't see your reply earlier since you didn't @peter notify me. Anyway, yeah that's a fair criticism, I see why you'd get that impression. 0s and 1s literally are low and high voltage levels on busses or in logic gates. I think your mental model is based on some misconceptions, like there are things / distinctions you think you need to understand which don't even exist. By tearing down those misconceptions, you can hopefully start from scratch reading some of those duplicate links with an open mind, instead of trying to fit them into a broken mental model. — Peter Cordes, Jan 28 '21 at 21:17
Perhaps [What Every Programmer Should Know About Memory](https://www.akkadia.org/drepper/cpumemory.pdf) will help: it shows circuit diagrams for SRAM and DRAM cells, which may help you connect voltage levels to logic states. (It also discusses caches and performance, and is [still relevant](https://stackoverflow.com/questions/8126311/what-every-programmer-should-know-about-memory) today.) — Peter Cordes, Jan 28 '21 at 21:20
@PeterCordes thanks again for taking the time to reply. I for sure have severe misconceptions, no doubt about it. However, I do believe that my main problem is that I haven't stated my question clearly enough. I cannot restate it in a comment, so I will take my time to edit this question or make a new one and link it to you, if you don't mind. I am genuinely interested in continuing this conversation. — simón, Jan 29 '21 at 13:12
