7

I just started learning assembly language. In Java, if we have an array, we can always use array.length to get its length. Is there such a thing in assembly? If so, can someone please guide me here?

Edit:

My apologies, I know assembly doesn't have arrays, I was trying to simplify things.

What I meant was, if for example I have a variable

data DB 1,2,3,5,7,8,9,10

Given that DB can contain any number of elements, how can I check the total number of elements it contains?

Something like Java, where we use an int array to store this:

int[] data = {1,2,3,4,57,8,9,10};

We can just use data.length to find the total number of elements.

Mike

8 Answers

14

The best way to answer this is to use C examples. In C, there are two ways of keeping track of the length of an array:

  1. You store a variable telling you how long you made the array.
  2. You do what strings do and have the last element as 0. Then, you can implement a "string" length function that loops over the array until it finds zero.

For the first example, depending on what assembler you're using, you might be able to use some tricks. For example, in nasm you can do this:

SECTION .data       

msg:    db "Hello World",10,0  ; the 0-terminated string.
len:    equ $-msg              ; "$" means current address.

As you can see, we use the equ directive to get nasm to calculate the difference between the current address and the start of msg, which equals its length in bytes (here including the newline and the terminating 0). Alternatively, you could just write the length in there as a literal number.
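For comparison, C can do a similar build-time calculation with sizeof. The following is only a sketch: the names MSG_LEN and msg_scanned_length are made up for illustration, and msg simply mirrors the bytes of the nasm example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same bytes as the nasm example: the text, a newline (10) and a 0 terminator. */
static const char msg[] = "Hello World\n";

/* sizeof is evaluated at compile time, much like `$ - msg` at assembly time.
   It counts every byte of the array, including the terminating 0. */
enum { MSG_LEN = sizeof msg };

/* Hypothetical helper: the length found by scanning for the 0 byte at run time. */
static size_t msg_scanned_length(void)
{
    size_t n = strlen(msg);            /* stops at the terminator */
    assert(n + 1 == (size_t)MSG_LEN);  /* so it is one less than sizeof */
    return n;
}
```

Here MSG_LEN is 13 (11 letters and a space, the newline, and the terminator), while the scanned length is 12.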

For the second case, you could easily write a small function to do it. Roughly speaking, something like this:

SECTION .text

global _mystrlen

_mystrlen:

    push    ebp        ; conform to C calling conventions.
    mov     ebp, esp

    xor     eax, eax       ; eax = element count, starts at zero
    mov     ecx, [ebp+8]   ; load the pointer to the array (first argument) into ecx
    jecxz   .done          ; jump if the pointer itself is NULL.

.loop:
    cmp     dword [ecx], 0 ; is the current element the zero terminator?
    je      .done          ; if it is zero, we're done.
    add     eax, 1         ; could use inc eax as well.
    add     ecx, 4         ; always increment by (sizeof(int)). Change as appropriate
    jmp     .loop          ; otherwise keep scanning.

.done:
    leave                  ; restore stack frame
    ret                    ; return. eax is retval

Note that I haven't tested that. It's just to give you an idea.

Edit: I've tested the x86_64 version on Linux, using rdi as param1 and passing in int arr[10] = {1,2,3,4,5,6,7,8,9,0};. It returns 9 as expected. Note that on Linux the underscore preceding mystrlen is unnecessary.
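As a reference point, the same sentinel scan can be written in C. This is a sketch, not the tested assembly: the name int_strlen is made up here, and it assumes the caller passes a non-NULL pointer to an array that really does end with a 0.

```c
#include <assert.h>
#include <stddef.h>

/* Count the ints that come before the first 0.
   Assumption: a 0 terminator exists; otherwise the scan runs past the array. */
static size_t int_strlen(const int *p)
{
    size_t n = 0;

    assert(p != NULL);   /* a NULL pointer is treated as a caller bug in this sketch */
    while (p[n] != 0)    /* check the current element before counting it */
        n++;
    return n;
}
```

With the array from the edit above, int_strlen(arr) gives 9, and an array whose first element is 0 gives 0.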

  • Though the `$` feature does apply, it is worth noting that the length of zero-terminated strings (in contrast to Pascal strings) isn't calculated that way. Instead, the string is iterated over looking for a zero byte, and the number of iterations is returned. This method, however, cannot be used with arrays, since they should allow zero values too. For them there's really no other way than keeping track of their length in a variable. – Powerslave May 26 '13 at 13:41
7

Assembly is a lot more low-level than Java. This, among other things, means there's no such thing as an "array", at least not in the safe Java form you know.

The equivalent of an array is allocating a chunk of memory and treating it as an array. The length and such you'll have to manage yourself, though, as all you have is a chunk of memory containing your data. If you want to store any metadata, such as length, you'll have to do that yourself.

Arrays as you know them in Java contain metadata such as length, and do bounds checking. These do the same thing you'll have to do, only they hide it so you won't have to worry about those things.
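To make that bookkeeping concrete, here is a rough C sketch of a chunk of memory that carries its own length and checks bounds on reads. The struct int_array and the function int_array_get are hypothetical names for illustration, not something Java or any assembler provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "array with metadata": the length travels with the data pointer. */
struct int_array {
    size_t len;    /* number of elements, maintained by us */
    int   *data;   /* the raw chunk of memory */
};

/* Bounds-checked read: stores the element in *out and returns true,
   or returns false when index is not below len. */
static bool int_array_get(const struct int_array *a, size_t index, int *out)
{
    assert(a != NULL && out != NULL);
    if (index >= a->len)
        return false;
    *out = a->data[index];
    return true;
}
```

In assembly you would do the same thing by hand: keep the length in a register or memory location and compare the index against it before each access.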

I suggest you take a look at the following for an introduction on how one would commonly create and use what is equivalent to an array in assembly:

Sebastian Paaske Tørholm
  • mmm Sebastian, because i'm expecting a random amount of integer to be given by a user, therefore i'm trying to code my code in a more dynamic way. i tried – Mike Jan 31 '11 at 15:40
  • If you need a variable number of elements, you could do a few things. You could use a [linked list](http://en.wikipedia.org/wiki/Linked_list), or you could do an array in chunks, each with a size as their first element, and a pointer to the next chunk as their second. (Mixture of linked list and array) – Sebastian Paaske Tørholm Jan 31 '11 at 15:45
  • Bad link at http://www.arl.wustl.edu/~lockwood/class/cs306/books/artofasm/Chapter_5/CH05-2.html#HEADING2-4; if you can find a page with the same contents please replace the link, or else remove it – Hawken Oct 21 '12 at 04:53
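The chunked layout suggested in the comments above (each chunk holding its size first and a pointer to the next chunk second) could be sketched in C roughly as follows. The names chunk, chunk_append, chunk_total and chunk_free, and the capacity of 4, are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_CAPACITY 4   /* illustrative; pick whatever suits the data */

struct chunk {
    size_t        used;                   /* first field: elements stored in this chunk */
    struct chunk *next;                   /* second field: next chunk, or NULL */
    int           items[CHUNK_CAPACITY];
};

/* Append value to the last chunk, allocating a new chunk when the current
   one is full. Returns 0 on success, -1 if allocation fails. */
static int chunk_append(struct chunk **head, struct chunk **tail, int value)
{
    assert(head != NULL && tail != NULL);
    if (*tail == NULL || (*tail)->used == CHUNK_CAPACITY) {
        struct chunk *c = malloc(sizeof *c);
        if (c == NULL)
            return -1;
        c->used = 0;
        c->next = NULL;
        if (*tail != NULL)
            (*tail)->next = c;   /* link after the old tail */
        else
            *head = c;           /* first chunk of the list */
        *tail = c;
    }
    (*tail)->items[(*tail)->used++] = value;
    return 0;
}

/* Total element count: the sum of the per-chunk sizes. */
static size_t chunk_total(const struct chunk *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        n += head->used;
    return n;
}

/* Release every chunk in the list. */
static void chunk_free(struct chunk *head)
{
    while (head != NULL) {
        struct chunk *next = head->next;
        free(head);
        head = next;
    }
}
```

This lets the collection grow as the user supplies more integers, at the cost of walking the chunks to find the total length.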
2

There is no such thing as an array in assembly (as far as I know). You are free to invent how an array works if you wish.

If you're reading the assembly generated by a compiler then you will have to ask specifically about that compiler.

One way of doing this would be to use the first byte of the array to store its length (the number of elements). Another way would be to null-terminate the array (this is generally how strings are maintained).
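A length-prefixed layout might look like the following C sketch. The names prefixed_data, prefixed_length and prefixed_items are hypothetical, and the one-byte prefix is an assumption that caps the array at 255 elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout: buf[0] holds the element count, buf[1..count] hold the elements. */
static const uint8_t prefixed_data[] = { 8, 1, 2, 3, 5, 7, 8, 9, 10 };

/* Read the element count from the prefix byte. */
static size_t prefixed_length(const uint8_t *buf)
{
    assert(buf != NULL);
    return buf[0];
}

/* Pointer to the first element, just past the length byte. */
static const uint8_t *prefixed_items(const uint8_t *buf)
{
    assert(buf != NULL);
    return buf + 1;
}
```

Unlike a terminator, the prefix allows zero values inside the data, but whoever builds the array has to keep the prefix in sync with the contents.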

Joe Phillips
2

You could use the following program (AT&T assembly syntax) to determine the number of elements in an array:

.section .data
array:
    .long 3,5,8             # create an array of 32-bit integers
    arrsize = . - array     # in bytes (assemble-time constant, not stored)
    arrlen = (. - array)/4  # in dwords

.section .text
.globl _start
_start:

    mov   $arrlen, %edi
    mov   $60, %eax      # __NR_exit  in asm/unistd_64.h
    syscall              # sys_exit(arrlen)

Assemble and link the program:

gcc -c array.s && ld array.o
  #or
gcc -nostdlib -static array.s

The result is 3, as expected:

>./a.out            # run program in shell
>echo $?            # check the exit status (which is the number of elements in the array)
3
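The bytes-versus-elements distinction in arrsize and arrlen has a close analogue in C. A minimal sketch, with ARRSIZE, ARRLEN and array_at as illustrative names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same data as the .long 3,5,8 line: three 32-bit integers. */
static const int32_t array[] = { 3, 5, 8 };

/* Both values are compile-time constants, like arrsize and arrlen. */
enum {
    ARRSIZE = sizeof array,                    /* size in bytes */
    ARRLEN  = sizeof array / sizeof array[0]   /* size in elements */
};

/* Hypothetical accessor that uses the computed length as its bound. */
static int32_t array_at(size_t i)
{
    assert(i < (size_t)ARRLEN);
    return array[i];
}
```

This only works where the array itself is in scope. Once it has decayed to a pointer, for example as a function parameter, sizeof yields the size of the pointer and the length must be passed along separately.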
builder-7000
  • 1
    You don't need `div` to divide by 4! Use a right shift. Or better, use an assemble-time operator, like `len = (. - array) / 4`. Also, the 32-bit `int 0x80` ABI takes the first arg in EBX, so your code does `sys_exit(4)`. I confirmed on my own desktop. If you were thinking of the 64-bit ABI for `syscall`, you'd use `movzbl %ax, %edi`, not `%si`. (Unless you're using a 32-bit install, `gcc -c` and `ld` will make 64-bit code, so [What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?](https://stackoverflow.com/q/46087730) also applies.) – Peter Cordes Oct 18 '18 at 02:30
  • Thanks for pointing out the problems with the original answer. I find the updated answer much clearer by avoiding the use of the `div` instruction. – builder-7000 Oct 18 '18 at 03:21
  • BTW, your original also depended on `%dx` being zero for `div` to not overflow and raise a #DE exception (causing Linux to deliver a SIGFPE). So it crashed if you linked it as a dynamic executable. But anyway, `shr $2, %edi` is what you should use for runtime unsigned division by 4 in cases where you want that, never `div`. (Also, 16-bit operand-size is generally slower, especially in 32 and 64-bit mode. See https://agner.org/optimize/) – Peter Cordes Oct 18 '18 at 03:35
1

I found this thread very enlightening on the subject: Asm Examples: Finding the size of an array with different data types

karlphillip
1
a db 10h,20h,30h,40h,50h,60h
n db n-a

In the above code, 'a' is an array with 6 elements. 'a' points to the first element of the array, and 'n' is placed immediately after the array, so the value 'n-a' equals the length of the array in bytes, which we store in n. Don't define other variables between a and n, as that would give wrong results.

rishuverma
0

MOV EAX,LENGTHOF data

In MASM, the LENGTHOF operator returns the number of items in an array variable.

vengy
0

I think this will help you....

.data

num db 2,4,6,8,10

.code

main proc
mov eax,0 ; initialize with zero
mov ax,lengthof num 

Result: ax = 5 (the number of elements in num)

chiwangc