
I'm working on an assignment that links C and NASM. The C program passes me a string representing a 32-bit number in binary (e.g. "000...0011"). I need to print its value as a decimal string (the string "3" for the example above), using C's printf with %s.

Note: to make life easier, I'll ignore the case of negative numbers for now.

I'm new to NASM, so I have little idea what is going wrong where. I tried converting the given string to a number, storing it somewhere, and then printing it, but this simply prints the binary representation.

Here's my code:

section .rodata
    format_string: db "%s", 10, 0   ; format string

section .bss
    an: resb 12     ; enough to store integer in [-2,147,483,648 (-2^31) : 2,147,483,647 (2^31-1)]

section .text
convertor:
    push ebp
    mov ebp, esp    
    pushad          

    mov ecx, dword [ebp+8]  ; get function argument (pointer to string)

    mov eax, 1                  ; initialize eax with 1 - this will serve as a multiplier
    mov dword [an], 0           ; initialize an with 0
ecx_To_an:
        cmp eax, 0                  ; while eax != 0
        jz done                     ; do :
        shr dword [ecx], 1          ;
        jnc carry_flag_not_set      ; if carry isn't set, lsb was 0
        add [an], eax               ; else - lsb was 1 - an += eax 
carry_flag_not_set:
        shl eax, 1                  ; eax = eax*2
        jmp ecx_To_an               ; go to the loop
done:
    push an         ; call printf with 2 arguments -  
    push format_string  ; pointer to str and pointer to format string
    call printf

I don't see how it can be possible to print the int value, given that I can't change the %s argument that is given to printf.

Help will be much appreciated.

Jester
J. Doe
  • You will need to convert to decimal digits. Plenty of examples for that. – Jester Mar 26 '19 at 00:50
  • @Jester I tried looking up many other questions and sites, but I really don't know much about asm yet, so I don't know what I'm looking for. Could you point me at some things relevant? – J. Doe Mar 26 '19 at 00:55
  • And by the way, if I need to convert it to decimal digits, is the whole idea of converting it to a number incorrect? – J. Doe Mar 26 '19 at 00:56
  • No, it's easier if you first convert to number (binary) as you have done and go from there to decimal digits. – Jester Mar 26 '19 at 01:01
  • Sample decimal conversion codes: [1](https://stackoverflow.com/a/28524951/547981) [2](https://stackoverflow.com/a/42014324/547981) [3](https://stackoverflow.com/a/27654888/547981) You can find plenty more. – Jester Mar 26 '19 at 01:11
  • You never increment ecx, so you are only looking at the first 4 bytes of the string. – prl Mar 26 '19 at 01:11
  • @prl This is what I intended. As stated in the question, the string represent a 32bit number. – J. Doe Mar 26 '19 at 01:36
  • @Jester I literally spent 15 minutes trying to understand, but I don't. Either these pieces of code are terribly written, or I am just stupid. Either way, I can't seem to understand how to add these and apply them to my code. – J. Doe Mar 26 '19 at 01:38
  • Let me clarify - I can't understand which part is the equivalent of `an` in my code (which holds the binary value), and where their results are stored. – J. Doe Mar 26 '19 at 01:41
  • Well, can you write it in C? Obviously without using standard library functions. The logic is just repeated division by 10. PS: the first one I linked literally says in the comments ... _"Convert EAX to ASCII and store it onto the stack"_ and _"Pointer to the first ASCII digit"_ Heck, it even prints the string although using `write()` not `printf("%s", ...)` but those are basically equivalent. – Jester Mar 26 '19 at 01:48
  • @J.Doe: your code doesn't have a 32-bit binary number as input, it only has a base-2 ASCII string (a serialization format for binary numbers that uses 1 byte per bit, and is thus up to 32 *bytes* long). See [NASM Assembly convert input to integer?](//stackoverflow.com/a/49548057) for a base-10 -> binary integer input function, easily adapted for base 2. Some more optimizations are maybe possible with `adc same,same` to shift in a new bit from CF , but `total = total*2 + next_digit` is straightforward with `lea eax, [eax*2 + ecx]` or whatever registers you pick. – Peter Cordes Mar 26 '19 at 03:08
  • Your loop might work correctly (but slowly) if you did have a binary integer in `[ecx]`. Or better, in `ecx`, no need to ever store it to memory. Keep stuff in registers in loops, it's much faster. – Peter Cordes Mar 26 '19 at 03:11
  • @Jester I can't write it in C. But I still don't get how I am supposed to make a string out of it. Getting the number in `an` is one thing, but making a string value from it is the part I honestly don't understand, even though I guess it's in those links you attached. – J. Doe Mar 26 '19 at 19:49
  • @PeterCordes I do understand this part "your code doesn't have a 32-bit binary number as input, it only has a base-2 ASCII string... and is thus up to 32 bytes long" and so I can now see why my loop isn't working as it's supposed to. As for the rest, I didn't understand a single word you said. Also, right now getting the code to be faster is the least of my concerns, as you can see, I can't even get it done right. And by the way, I don't know how to tell the difference between registers, and what I've used. I was pretty sure `an` is a register name. – J. Doe Mar 26 '19 at 19:57
  • @PeterCordes I realized why prl said what he said here earlier. This fixes one bug. But the problem I explained in my question still remains unanswered. – J. Doe Mar 26 '19 at 20:00
  • Keeping your data in registers is simpler, as well as more efficient. Since string->integer will leave the result in a register, you should keep it there instead of storing to memory. (Also, finding ways to use fewer instructions and/or cheaper instructions is the fun part of writing in asm.) But anyway, I think my answer that I linked about base10string-> integer is pretty understandable, and easy to adapt for base2string -> integer. Then you feed that integer to an integer->base10string function. Jester linked 3 existing Q&As about that part. – Peter Cordes Mar 26 '19 at 20:10
  • Or my answer on [How do I print an integer in Assembly Level Programming without printf from the c library?](//stackoverflow.com/a/46301894) explains how / why the repeated-division algorithm for integer -> base10string works. (Just leave out the part at the end that actually prints it, if you just want the string.) – Peter Cordes Mar 26 '19 at 20:14
  • @PeterCordes I see exactly what you mean, but I don't know how this is supposed to work. I've literally been mutating code pieces for the better part of this day, and nothing works. When I tried simply copy pasting your piece, it didn't even compile (not sure if that's the right terminology) - many errors arose. When I try to read it, I don't even see your condition to stop the loop. To me it seems like you stop only when you get invalid input, which is really not the case. Same goes for the pieces Jester sent here. So as I said before, I still can't get this to work... – J. Doe Mar 26 '19 at 21:10
  • My code is for x86-64. You can port it to 32-bit by simply changing `r` to `e` in all the register names, like `rcx` becomes `ecx`. (That's not sufficient in the general case, but it does just work for this function because I wrote it using only 32-bit operand-size.) I updated the answer to say so. As explained in the answer, the loop condition is seeing a non-digit. Your command line arg will be stored as a 0-terminated string, so this is exactly what you want. – Peter Cordes Mar 26 '19 at 21:58
  • @PeterCordes Alright, I finally got this part to work (Hoorah!). However, the part which Jester tried to help me with, is a disaster. – J. Doe Mar 26 '19 at 23:00
  • I can't seem to understand what they've done in their solutions so I can't apply it to my code. I've tried to make a function out of it but I failed miserably (Sig faults). I'm guessing it's because the solutions mess with `esp`, and therefore the return is corrupted (or perhaps I'm just uttering nonsense). PS - I used the first link in Jester's message, the others are total gibberish to me. – J. Doe Mar 26 '19 at 23:01
  • [How do I print an integer in Assembly Level Programming without printf from the c library?](//stackoverflow.com/a/46301894) explains the int->base10string algorithm, and again can be trivially ported to 32-bit. – Peter Cordes Mar 26 '19 at 23:03
  • @PeterCordes This goes to a sig fault as well (in the return statement, using gdb it says "0x00000001 in ?? ()"). I tried copying the entire thing, except the part between `;;; rsi points to the first digit` and `ret`. Any suggestion? – J. Doe Mar 26 '19 at 23:12
  • oh, yeah I forgot that there is a dependency on 64-bitness. push ecx is only 4 bytes, vs. `push rcx` being 8. So at the end, you need `add esp, 20` not 24. (Updated my answer there to mention that.) You would get a segfault if you `ret` with ESP pointing to a number like `0x00000001`, because that's not a valid code address. – Peter Cordes Mar 26 '19 at 23:19
  • @PeterCordes Ok, no sigmentation fault now. But I don't know how to get the value into `an` (I know you might have said I should work with registers, but this is a part of the assignment specs). I tried `lea an, esi` or `mov [an], esi` (and a few similar variants), but it seems like I always print garbage. `an` is defined with `an: resb 12` in `.bss`. P.S - are these pieces of code supposed to work only for unsigned ints? – J. Doe Mar 26 '19 at 23:36
  • You need to copy the string from the tmp buffer into `an`, or have the int->string code start `mov edi, an + 12` (a pointer to the end of the buffer). And yes, my functions are for unsigned integers. To handle signed inputs, check for negative and then convert the absolute value, prepending a `-` if it was negative. (And BTW, it's *segmentation* not *sigmentation*. Perhaps you got mixed up with SIGSEGV, where SIG stands for signal - segmentation violation.) – Peter Cordes Mar 26 '19 at 23:53
  • **Moderator Note:** Please reserve comments for soliciting clarification or suggesting improvements. They're not meant to be "mini answers". When the comments get this long, they become messy and hard to understand. If you have advice or suggestions, please post an answer. – Cody Gray - on strike Mar 26 '19 at 23:59
  • @CodyGray I'm sorry for this long chain of comments; I know we shouldn't, but this question turned out to be far bigger than I expected. I hope this won't take more than a few more comments, and perhaps Peter will want to sum it up eventually, so I can mark this as answered. – J. Doe Mar 27 '19 at 00:05
  • @PeterCordes What is the "tmp buffer"? Or what do you mean by the `mov edi, an + 12`? Eventually I think I need to have the pointer to the string (first letter) in `an`, and this command only changes edi. And about the segmentation part, you are absolutely correct, I know it's segment, but I mixed it up with SIGSEGV. – J. Doe Mar 27 '19 at 00:09
  • I should have said `mov esi, an+12`, because the code in my loop uses `esi` as the pointer to its temporary buffer, not edi. I meant doing that *instead* of `mov esi, esp`, so the convert loop would store directly into `an` instead of needing to copy later. `an+12` is the address one past the last byte of the buffer, computed at assemble time (actually link time). Anyway no, `an` is a 12-byte buffer that needs to hold the ASCII bytes, not a pointer to characters somewhere else. Remember you're printing it with `printf("%s\n", an)`, so `an` has to *be* a char array, not hold a pointer to an array. – Peter Cordes Mar 27 '19 at 00:17
  • @PeterCordes I tried adding it to your code, right before the `add esp,20` and it doesn't seem to work. When I run it, it prints an empty string (for any input). So I guess this doesn't change `an`, or perhaps I don't understand once again. Why would this command change the value of `an`? – J. Doe Mar 27 '19 at 00:47
  • I said to put it *before* the loop, so the loop converts into `an` when it stores to `[esi]`. But that won't line the start of the string up with the start of `an`, so it doesn't really solve the problem. Instead, copy bytes from the temporary buffer on the stack to `an`. – Peter Cordes Mar 27 '19 at 00:53
  • @PeterCordes Hey, sorry for disappearing for so long. As for the assignment - I failed it. I will be talking to my teacher and have him guide me thoroughly because I feel like I lack a basic understanding of whatever is going on here. Anyway, I came to say thank you for all your help. I genuinely appreciate that you bore with me while I was trying to figure it all out. Thanks a lot! – J. Doe Apr 02 '19 at 22:25
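
The first half of the approach discussed in the comments, treating the input as a base-2 ASCII string (one byte per bit) and accumulating `total = total*2 + next_digit` until a non-digit is seen, can be sketched in C. This is only an illustration of the logic, not the code from the linked answers; the function name `parse_binary_string` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: parse a 0-terminated string of '0'/'1' characters
 * into an unsigned 32-bit value. Each character occupies one byte, so the
 * pointer has to advance by one byte per bit of the result. */
uint32_t parse_binary_string(const char *s)
{
    uint32_t total = 0;

    /* Stop at the first non-digit, which includes the terminating 0 byte. */
    while (*s == '0' || *s == '1') {
        total = total * 2 + (uint32_t)(*s - '0');  /* total*2 + next_digit */
        s++;                                       /* move to the next character */
    }
    return total;
}
```

The `s++` is the step the original loop was missing: without it, only the first bytes of the string are ever examined.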

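The second half discussed in the comments, repeated division by 10 into a temporary buffer followed by a copy into a 12-byte array such as `an`, might look like this in C. It is a sketch under assumptions: the name `int32_to_decimal` is made up for illustration, and the sign handling follows the suggestion to convert the absolute value and prepend a `-`.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write the decimal text of a signed 32-bit value into
 * out, which must hold at least 12 bytes ("-2147483648" plus the 0 byte). */
void int32_to_decimal(int32_t value, char out[12])
{
    char tmp[12];
    char *p = tmp + sizeof tmp - 1;   /* points at the last byte of tmp */

    /* Work on the magnitude as unsigned so INT32_MIN does not overflow. */
    uint32_t magnitude = (value < 0) ? 0u - (uint32_t)value : (uint32_t)value;

    *p = '\0';                        /* terminator goes in first */
    do {
        /* The remainder is the lowest decimal digit; store it right to left. */
        *--p = (char)('0' + magnitude % 10);
        magnitude /= 10;
    } while (magnitude != 0);

    if (value < 0)
        *--p = '-';

    /* Copy digits plus terminator so the text starts at out[0]. */
    memcpy(out, p, (size_t)(tmp + sizeof tmp - p));
}
```

With `an` declared as `char an[12]`, calling `int32_to_decimal(3, an)` leaves the characters `'3'` and a 0 byte at the start of `an`, which is the form `printf("%s\n", an)` expects.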