Can I turn this into a loop through some 16-Bit Magic?

Question

I'm just starting out with 6502 assembly and am having trouble wrapping my head around loops that deal with numbers bigger than 8 bits.

Specifically, I want to loop through some memory locations. In pseudo-c-code, I want to do this:

    // address is a pointer to a byte in memory
    unsigned char* address = (unsigned char*)0x44AD;
    for (x = 0; x < 20; x++) {
        // Set memory location to 0x01
        *address = 0x01;
        // Move pointer forward 40 bytes
        address += 0x28;
    }

So, starting at address $44AD, I want to write $01 into RAM, jump forward $28, write $01 there, jump forward $28 again, and so on until I've done 20 writes (the last address written is $47A5).

My current approach is loop unrolling which is tedious to write (even though I guess an Assembler can make that simpler):

    ldy #$01
    // Start from $44AD for the first row,
    // then increase by $28 (40 dec) for each of the next 19
    sty $44AD
    sty $44D5
    sty $44FD
    [...snipped..]
    sty $477D
    sty $47A5

I know about absolute indexed addressing (with the accumulator instead of the Y register: sta $44AD, x), but the index register only gives me an offset between 0 and 255. What I really think I want is something like this:

       lda #$01
       ldx #$14 // 20 Dec
loop:  sta $44AD, x * $28
       dex
       bne loop

Basically, start at the highest address, then loop down. Problem is that $14 * $28 = $320 or 800 dec, which is more than I can actually store in the 8-Bit X register.

Is there an elegant way to do this?

Raymond Chen · Accepted Answer · score 10 · 2014-02-15T00:04:32.797

The 6502 is an 8-bit processor, so you aren't going to be able to calculate 16-bit addresses entirely in registers. You will need to indirect through page zero.

      // set $00,$01 to $44AD + 20 * $28 = $47CD
      LDA #$CD
      STA $00
      LDA #$47
      STA $01

      LDX #20  // Loop 20 times
      LDY #0
loop: LDA #$01 // the value to store
      STA ($00),Y // store A to the address held in $00,$01
      // subtract $28 from $00,$01 (16-bit subtraction)
      SEC
      LDA $00
      SBC #$28
      STA $00
      LDA $01
      SBC #0
      STA $01
      // do it 19 more times
      DEX
      BNE loop
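
As a sanity check on the 16-bit arithmetic, here is a minimal C sketch that models the loop above. It is not 6502 code: `simulate_stores`, `lo`, `hi`, and `borrow` are names made up for illustration, with `lo`/`hi` standing in for the two zero-page bytes and `borrow` for the borrow that SBC propagates through the carry flag. The caller is assumed to pass a `count` of at least 1 and a buffer with room for `count` entries.

```c
#include <stddef.h>
#include <stdint.h>

/* Records, in order, every address the STA ($00),Y would hit. */
size_t simulate_stores(uint16_t start, uint8_t count, uint16_t *out)
{
    uint8_t lo = (uint8_t)(start & 0xFF);  /* pointer low byte  ($00) */
    uint8_t hi = (uint8_t)(start >> 8);    /* pointer high byte ($01) */
    uint8_t x = count;                     /* LDX #count */
    size_t n = 0;

    do {
        /* STA ($00),Y with Y = 0 */
        out[n++] = (uint16_t)((hi << 8) | lo);

        /* SEC / LDA $00 / SBC #$28 / STA $00 */
        uint8_t borrow = (lo < 0x28) ? 1 : 0;
        lo = (uint8_t)(lo - 0x28);

        /* LDA $01 / SBC #0 / STA $01: only the borrow reaches the high byte */
        hi = (uint8_t)(hi - borrow);
    } while (--x != 0);                    /* DEX / BNE loop */

    return n;
}
```

Under this model, `simulate_stores(0x47CD, 20, out)` records $47CD first and $44D5 last. Starting from $47A5 instead makes the 20 stores run from $47A5 down to $44AD, which is the range listed in the question.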

Alternatively, you could use self-modifying code. This is a dubious technique in general, but common on embedded processors like the 6502 because they are so limited.

      // set the instruction at "patch" to "STA $47CD"
      LDA #$CD
      STA patch+1
      LDA #$47
      STA patch+2

      LDX #20  // Loop 20 times
loop: LDA #$01 // the value to store
patch:STA $FFFF
      // subtract $28 from the address in "patch"
      SEC
      LDA patch+1
      SBC #$28
      STA patch+1
      LDA patch+2
      SBC #0
      STA patch+2
      // do it 19 more times
      DEX
      BNE loop
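
To see why the code writes to patch+1 and patch+2, it may help to picture the instruction as three bytes in memory: STA absolute is encoded as opcode $8D followed by the target address, low byte first. The C sketch below models just that; the 3-byte `patch` buffer and the helper names `set_operand`, `get_operand`, and `step_back` are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/* Overwrite the operand of the modeled "STA $xxxx" instruction. */
void set_operand(uint8_t patch[3], uint16_t addr)
{
    patch[1] = (uint8_t)(addr & 0xFF);  /* STA patch+1 (low byte)  */
    patch[2] = (uint8_t)(addr >> 8);    /* STA patch+2 (high byte) */
}

/* Read back the address the instruction would currently store to. */
uint16_t get_operand(const uint8_t patch[3])
{
    return (uint16_t)(patch[1] | (patch[2] << 8));
}

/* One pass of SEC / SBC #$28 / SBC #0 applied to the operand bytes. */
void step_back(uint8_t patch[3])
{
    uint8_t borrow = (patch[1] < 0x28) ? 1 : 0;
    patch[1] = (uint8_t)(patch[1] - 0x28);
    patch[2] = (uint8_t)(patch[2] - borrow);
}
```

Starting from the bytes `{0x8D, 0xFF, 0xFF}`, calling `set_operand(patch, 0x47CD)` leaves `{0x8D, 0xCD, 0x47}`, and each `step_back` moves the target $28 bytes lower, borrowing from the high byte whenever the low byte wraps.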

edited Feb 15 '14 at 00:04

answered Feb 05 '14 at 05:32

Thank you so much! Reading up on the indirect mode, that eluded me. The self-modifying code is also interesting, I need to wrap my head around the whole "code = memory" idea that the old machine had. – Michael Stum Feb 05 '14 at 06:06
STC?? SEC, surely. Incidentally, as a readability-aid in self-modifying code, I use $C0DE instead of $FFFF and have my syntax highlighter fluoresce it in yellow - makes it very easy to spot places where you're doing something gnarly. – Eight-Bit Guru Feb 05 '14 at 16:01
@EightBitGuru Sorry got my instruction sets confused. – Raymond Chen Feb 05 '14 at 16:34
Nitpick: The second STA in the first code snippet should be to $01. – rettvest Feb 14 '14 at 20:32
Since this question is tagged `c64`, you want to avoid using zeropage addresses `$00` and `$01`, as they are used for I/O ports and ROM/RAM configuration by the 6510 CPU. If you want your code to be run safely from BASIC through a `SYS` command, your best bet is to use the `$FB` through `$FE` range. – Lars Haugseth Apr 24 '14 at 09:24
I used to use self-modifying code on the C64. Sure it's ugly, but when you need it to run on exactly *one* computer, it works. – David Crowell Sep 16 '14 at 18:24

score 2 · Answer 2 · answered May 07 '14 at 18:43

A more efficient way to copy 1 KB of data:

    ldy #0
nextvalue:
    ; source and dest are placeholder labels for the two 1 KB regions
    lda source, y
    sta dest, y

    lda source+$100, y
    sta dest+$100, y

    lda source+$200, y
    sta dest+$200, y

    lda source+$300, y
    sta dest+$300, y
    iny
    bne nextvalue       ; Y wraps back to 0 after 256 iterations

A few notes:

This is faster because the loop overhead (iny/bne) is paid once per four bytes copied, at the cost of more code space for the extra instructions.
If your assembler supports macros, you can easily make the number of 256-byte blocks the code handles configurable.

Might not be 100% relevant to this, but here's another way to have longer-than-255 loops:

nextblock:
    ldy #0
nextvalue:
    lda address, y
    iny
    bne nextvalue

;Insert code to be executed between each block here:

    dec numblocks
    bpl nextblock
    rts                 ; or jump elsewhere, so execution doesn't run into the data byte

numblocks:
    .byte 3

A few notes:

As written, the code doesn't do anything meaningful; it just runs the inner loop numblocks + 1 times (four passes for the value 3), because bpl keeps branching until dec takes the counter from 0 to $FF. "Add your own code" :-) (I often combine this with self-modifying code that, for example, increments the operand address of an sta ...,y instruction.)
bpl can be dangerous if you don't know how it works: it branches while bit 7 of the result is clear. That is fine here, but if numblocks started above $80, the first dec would leave bit 7 set and the loop would exit after a single pass.
If you need to run the same code again, numblocks has to be reset first.
The code can be made slightly faster by placing numblocks in zero page.
If the X register isn't needed for something else (it often is), you can use it instead of a memory location.
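
The pass count of the dec/bpl loop is easy to get wrong, so here is a small C sketch that models it. `passes_for` is a hypothetical helper, not part of the assembly; it only counts how often the 256-byte inner loop would run for a given starting value of numblocks.

```c
#include <stdint.h>

/* Number of inner-loop passes for a given initial numblocks value. */
unsigned passes_for(uint8_t numblocks)
{
    unsigned passes = 0;

    do {
        passes++;                               /* one ldy #0 ... iny / bne pass */
        numblocks = (uint8_t)(numblocks - 1);   /* dec numblocks */
    } while ((numblocks & 0x80) == 0);          /* bpl: branch while bit 7 is clear */

    return passes;
}
```

By this model, `passes_for(3)` is 4 (the counter goes 2, 1, 0, $FF), while `passes_for(0x81)` is 1 because the first decrement already leaves bit 7 set.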
