How to check an "array's length" in Assembly Language (ASM)

Question

I just started learning assembly language. In Java, if we have an array, we can always use array.length to get its length. Is there such a thing in assembly? If so, can someone please guide me here?

Edit:

My apologies, I know assembly doesn't have arrays, I was trying to simplify things.

What I meant was, if for example I have a variable

data DB 1,2,3,5,7,8,9,10

Given that DB can contain any number of elements, how can I check how many elements it contains?

Something like Java, where we use an int array to store this:

int[] data = {1,2,3,4,57,8,9,10};

We can just use data.length to find the total number of elements.

There are so many assembly languages; which one are you asking about? – Abimaran Kugathasan, Jan 31 '11 at 15:22
There is no such thing as an inherent length of arrays. Arrays are just a chunk of memory you put things in. You don't have any way of knowing what size it is; you'll need to keep track of this yourself. – Sebastian Paaske Tørholm, Jan 31 '11 at 15:37
No one ever indicated what architecture the OP is asking about. – Jonathon Reinhart, Dec 19 '14 at 01:43

score 14 · Answer 1 · 2011-01-31T17:42:04.783

The best way to answer this is to use C examples. In C, there are two ways of keeping track of the length of an array:

You store a variable telling you how long you made the array.
You do what strings do and have the last element as 0. Then, you can implement a "string" length function that loops over the array until it finds zero.
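A minimal C sketch of the two approaches, for illustration only. The function names sum_with_length and length_until_zero are made up for this example and are not from any library.

```c
#include <stddef.h>

/* Approach 1: the caller passes in the length it recorded when it
   created the array. Nothing in the data itself says how long it is. */
long sum_with_length(const int *arr, size_t len)
{
    long total = 0;
    for (size_t i = 0; i < len; i++)
        total += arr[i];
    return total;
}

/* Approach 2: scan until a 0 sentinel, the way strlen does for strings.
   This only works if 0 can never be a real element of the array. */
size_t length_until_zero(const int *arr)
{
    size_t n = 0;
    while (arr[n] != 0)
        n++;
    return n;
}
```

With int a[] = {1,2,3,5,7,8,9,10}, the first function needs the 8 passed in explicitly, while the second one would only work if the array were followed by a terminating 0.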

For the first approach (storing the length), depending on what assembler you're using, you might be able to use some tricks. For example, in nasm you can do this:

SECTION .data       

msg:    db "Hello World",10,0  ; the 0-terminated string.
len:    equ $-msg              ; "$" means current address.

As you can see, we use the equ directive to get nasm to calculate the difference between the current address and the start of msg, which equals the length of msg in bytes (13 here, counting the trailing 10 and 0). Alternatively, you could just write the length in there by hand as a number.
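For comparison, C has a close analogue of this trick: as long as the array definition is in scope, sizeof lets the compiler work out the size, much as $ - msg lets the assembler do it. This is only a sketch; the names MSG_LEN and msg_len are invented for the example.

```c
#include <stddef.h>

/* Same bytes as the nasm example: 11 characters, a newline (10),
   and the implicit terminating 0, so 13 bytes in total. */
const char msg[] = "Hello World\n";

/* Worked out by the compiler, never typed in by hand. This is only valid
   where msg is visible as an array, not once it has decayed to a pointer. */
enum { MSG_LEN = sizeof msg / sizeof msg[0] };

size_t msg_len(void)
{
    return MSG_LEN;
}
```

As with equ, nothing is stored in the program's data for MSG_LEN; it is a constant that exists only at build time.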

For the second case, you could easily write a small function to do it. Roughly speaking, something like this:

SECTION .text

global _mystrlen

_mystrlen:

    push    ebp            ; conform to C calling conventions (cdecl).
    mov     ebp, esp

    xor     eax, eax       ; eax = number of elements counted so far
    mov     ecx, [ebp+8]   ; load the array pointer (first argument) into ecx
    jecxz   .done          ; jump if the pointer itself is NULL.

.next:
    cmp     dword [ecx], 0 ; is the current element zero?
    je      .done          ; if it is, we're done.
    add     eax, 1         ; could use inc eax as well.
    add     ecx, 4         ; always advance by sizeof(int). Change as appropriate
    jmp     .next          ; otherwise keep scanning.

.done:
    leave                  ; restore stack frame
    ret                    ; return. eax is retval

Note that I haven't tested that. It's just to give you an idea.

Edit: I've tested the x86_64 version on Linux, using rdi as param1 and passing in int arr[10] = {1,2,3,4,5,6,7,8,9,0};. It returns 9 as expected. Note that on Linux the underscore preceding mystrlen is unnecessary.

Though the `$` feature does apply, it is worth noting that the length of zero-terminated strings (in contrast to Pascal strings) isn't calculated that way. Instead, the string itself is scanned for a zero byte and the number of iterations is returned. This method, however, cannot be used with arrays, since they should allow zero values too. For them there's really no other way than keeping track of their length in a variable. – Powerslave, May 26 '13 at 13:41

Sebastian Paaske Tørholm · Answer 2 · 2014-12-19T01:24:21.730

7

Assembly is a lot lower-level than Java. Among other things, this means there's no such thing as an "array", at least not in the safe Java form you know.

The equivalent of an array is to allocate a chunk of memory and treat it as an array. All you have is a chunk of memory containing your data, so if you want any metadata, such as the length, you'll have to store and manage it yourself.

Arrays as you know them in Java carry metadata such as the length and do bounds checking. They do the same things you'll have to do by hand; they just hide it so you don't have to worry about it.
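To make the "manage it yourself" point concrete, here is a rough C sketch of what a language runtime does behind the scenes: the length travels alongside the raw memory, and every access is checked against it. The struct int_array and the function int_array_get are hypothetical names invented for this illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* A chunk of memory plus the metadata you have to carry around yourself. */
struct int_array {
    size_t length;  /* number of valid elements in data */
    int   *data;    /* the raw chunk of memory */
};

/* Bounds-checked read: reports failure instead of reading out of range. */
bool int_array_get(const struct int_array *a, size_t index, int *out)
{
    if (index >= a->length)
        return false;
    *out = a->data[index];
    return true;
}
```

In assembly the same layout would just be a length word stored next to (or in front of) the data, with the comparison written out by hand before each access.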

I suggest you take a look at the following for an introduction on how one would commonly create and use what is equivalent to an array in assembly:

The Art of Assembly Language Programming - Arrays

edited Dec 19 '14 at 01:24

answered Jan 31 '11 at 15:25

Sebastian Paaske Tørholm

mmm Sebastian, because i'm expecting a random amount of integer to be given by a user, therefore i'm trying to code my code in a more dynamic way. i tried – Mike Jan 31 '11 at 15:40
If you need a variable number of elements, you could do a few things. You could use a [linked list](http://en.wikipedia.org/wiki/Linked_list), or you could do an array in chunks, each with a size as their first element, and a pointer to the next chunk as their second. (Mixture of linked list and array) – Sebastian Paaske Tørholm Jan 31 '11 at 15:45
Bad link at http://www.arl.wustl.edu/~lockwood/class/cs306/books/artofasm/Chapter_5/CH05-2.html#HEADING2-4; if you can find a page with the same contents please replace the link, or else remove it – Hawken Oct 21 '12 at 04:53

Joe Phillips · Answer 3 · 2011-01-31T15:33:47.397

There is no such thing as an array in assembly (as far as I know). You are free to invent how an array works if you wish.

If you're reading the assembly generated by a compiler then you will have to ask specifically about that compiler.

One way of doing this would be to use the first byte of the array to store the number of elements (a length prefix). Another way would be to null-terminate the array (this is generally how strings are handled).
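A small C sketch of the length-prefix idea, assuming the first byte holds the number of elements and the elements follow immediately after it. The layout and the names prefixed_length and prefixed_elements are assumptions made for this example.

```c
#include <stddef.h>

/* Assumed layout: buf[0] = element count, buf[1..count] = the elements.
   A one-byte count limits the array to 255 elements. */
size_t prefixed_length(const unsigned char *buf)
{
    return buf[0];
}

/* The elements start right after the count byte. */
const unsigned char *prefixed_elements(const unsigned char *buf)
{
    return buf + 1;
}
```

For the data in the question this would be stored as 8,1,2,3,5,7,8,9,10, and unlike the null-terminated form it still works when 0 is a legitimate element.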

builder-7000 · Answer 4 · 2018-10-18T03:12:05.597

2

You could use the following program (AT&T assembly syntax) to determine the number of elements in an array:

.section .data
array:
    .long 3,5,8             # create an array of 32-bit integers
    arrsize = . - array     # in bytes (assemble-time constant, not stored)
    arrlen = (. - array)/4  # in dwords

.section .text
.globl _start
_start:

    mov   $arrlen, %edi
    mov   $60, %eax      # __NR_exit  in asm/unistd_64.h
    syscall              # sys_exit(arrlen)

Assemble and link the program:

gcc -c array.s && ld array.o
  #or
gcc -nostdlib -static array.s

The result is 3, as expected:

>./a.out            # run program in shell
>echo $?            # check the exit status (which is the number of elements in the array)
3

edited Oct 18 '18 at 03:12

answered Oct 18 '18 at 02:20

builder-7000

1

You don't need `div` to divide by 4! Use a right shift. Or better, use an assemble-time operator, like `len = (. - array) / 4`. Also, the 32-bit `int 0x80` ABI takes the first arg in EBX, so your code does `sys_exit(4)`. I confirmed on my own desktop. If you were thinking of the 64-bit ABI for `syscall`, you'd use `movzwl %ax, %edi`, not `%si`. (Unless you're using a 32-bit install, `gcc -c` and `ld` will make 64-bit code, so [What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?](https://stackoverflow.com/q/46087730) also applies.) – Peter Cordes Oct 18 '18 at 02:30
Thanks for pointing out the problems with the original answer. I find the updated answer much clearer by avoiding the use of the `div` instruction. – builder-7000 Oct 18 '18 at 03:21
BTW, your original also depended on `%dx` being zero for `div` to not overflow and raise a #DE exception (causing Linux to deliver a SIGFPE). So it crashed if you linked it as a dynamic executable. But anyway, `shr $2, %edi` is what you should use for runtime unsigned division by 4 in cases where you want that, never `div`. (Also, 16-bit operand-size is generally slower, especially in 32 and 64-bit mode. See https://agner.org/optimize/) – Peter Cordes Oct 18 '18 at 03:35

score 1 · Answer 5 · edited Jan 31 '11 at 17:45

I found this thread very enlightening on the subject: Asm Examples: Finding the size of an array with different data types

edited Jan 31 '11 at 17:45

answered Jan 31 '11 at 15:34

karlphillip

score 1 · Answer 6 · answered Mar 25 '18 at 16:30

a db 10h,20h,30h,40h,50h,60h
n db n-a

In the code above, a is an array of 6 elements. The label a points to the first element of the array, and n is placed immediately after the array, so n-a equals the length of the array (6 here, since each element is one byte), which is the value stored in n. Don't define any other variables between a and n, or you will get the wrong result.

score 0 · Answer 7 · answered Oct 27 '13 at 02:12

MOV EAX,LENGTHOF data

LENGTHOF is a MASM operator that returns the number of items in an array variable.

answered Oct 27 '13 at 02:12

vengy

Compiler EMU8086 doesn't seem to recognize `lengthof` keyword. What compiler do you use? – Jose Manuel Abarca Rodríguez May 06 '16 at 15:30

score 0 · Answer 8 · edited Feb 01 '15 at 03:53

I think this will help you....

.data

num db 2,4,6,8,10

.code

main proc
mov eax,0 ; initialize with zero
mov ax,lengthof num

Result: ax = 5, the number of elements in num.

edited Feb 01 '15 at 03:53

chiwangc

answered Feb 01 '15 at 03:00

omorepro

Compiler EMU8086 doesn't seem to recognize `lengthof` keyword. What compiler do you use? – Jose Manuel Abarca Rodríguez May 06 '16 at 15:30
