9

I'm starting out with 6502 assembly right now and am having trouble wrapping my head around loops that need to deal with numbers wider than 8 bits.

Specifically, I want to loop through some memory locations. In pseudo-C code, I want to do this:

    // address is a byte pointer into memory
    unsigned char* address = (unsigned char*)0x44AD;
    for(x = 0; x < 21; x++){
        // Move pointer forward 40 bytes
        address += 0x28;
        // Set memory location to 0x01
        *address = 0x01;
    }

So starting at address $44AD, I want to write $01 into RAM, then jump forward $28, write $01 there, then jump forward $28 again, until I've done that 20 times (the last address to write is $47A5).

My current approach is loop unrolling which is tedious to write (even though I guess an Assembler can make that simpler):

    ldy #$01
    // Start from $44AD for the first row,
    // then increase by $28 (40 dec) for the next 20
    sty $44AD
    sty $44D5
    sty $44FD
    [...snipped..]
    sty $477D
    sty $47A5

I know about absolute indexed addressing (sta $44AD, x, using the accumulator instead of the Y register), but that only gives me an offset between 0 and 255. What I really think I want is something like this:

       lda #$01
       ldx #$14 // 20 Dec
loop:  sta $44AD, x * $28
       dex
       bne loop

Basically, start at the highest address, then loop down. The problem is that $14 * $28 = $320, or 800 decimal, which is more than I can store in the 8-bit X register.
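
To make the arithmetic concrete, here is a small Python sketch (a model for checking the numbers, not 6502 code). The names `BASE`, `STRIDE`, and `ROWS` are made up for illustration; the values come from the unrolled listing above.

```python
BASE = 0x44AD    # first address written in the unrolled listing
STRIDE = 0x28    # 40 bytes between rows
ROWS = 20        # number of writes in the unrolled listing

# Offset of each row from BASE, and the absolute address it lands on.
offsets = [row * STRIDE for row in range(ROWS)]
addresses = [BASE + offset for offset in offsets]

# Offsets that no longer fit into an 8-bit index register (0..255).
too_big = [offset for offset in offsets if offset > 0xFF]

for address in addresses:
    print(f"${address:04X}")
```

Only the first seven offsets ($00 to $F0) fit into an 8-bit index, which is why a single sta $44AD, x cannot reach the later rows.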

Is there an elegant way to do this?

Michael Stum

2 Answers

10

The 6502 is an 8-bit processor, so you aren't going to be able to calculate 16-bit addresses entirely in registers. You will need to indirect through page zero.

      // set $00,$01 to $44AD + 20 * $28 = $47CD
      LDA #$CD
      STA $00
      LDA #$47
      STA $01

      LDX #20  // Loop 20 times
      LDY #0
loop: LDA #$01 // the value to store
      STA ($00),Y // store A to the address held in $00,$01
      // subtract $28 from $00,$01 (16-bit subtraction)
      SEC
      LDA $00
      SBC #$28
      STA $00
      LDA $01
      SBC #0
      STA $01
      // do it 19 more times
      DEX
      BNE loop
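
The 16-bit subtraction is the part that is easiest to get wrong, so here is a Python model of the loop above (a sketch for checking the address sequence, not something that runs on a 6502). The helper names `sbc` and `run_loop` are invented for this illustration, and decimal mode is assumed to be off.

```python
def sbc(a, m, carry):
    """Model of 6502 SBC in binary mode: A - M - (1 - C).

    Returns (8-bit result, carry out). Carry out is 1 when no borrow occurred.
    """
    diff = a - m - (1 - carry)
    return diff & 0xFF, 1 if diff >= 0 else 0


def run_loop(start=0x47CD, count=20, stride=0x28):
    """Replay the loop and return (addresses written, final pointer value)."""
    zp = {0x00: start & 0xFF, 0x01: start >> 8}   # pointer low/high bytes
    written = []
    for _ in range(count):                        # LDX #20 ... DEX / BNE loop
        written.append(zp[0x00] | (zp[0x01] << 8))  # STA ($00),Y with Y = 0
        lo, carry = sbc(zp[0x00], stride, 1)      # SEC / LDA $00 / SBC #$28
        zp[0x00] = lo                             # STA $00
        hi, _ = sbc(zp[0x01], 0, carry)           # LDA $01 / SBC #0 (takes the borrow)
        zp[0x01] = hi                             # STA $01
    return written, zp[0x00] | (zp[0x01] << 8)


written, final_pointer = run_loop()
for address in written:
    print(f"${address:04X}")
```

Under these assumptions the loop writes 20 addresses, from $47CD down to $44D5, and leaves the pointer at $44AD. The SBC #0 on the high byte does nothing except consume the borrow from the low byte, which is what carries the subtraction across a page boundary (for example from $4705 to $46DD).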

Alternatively, you could use self-modifying code. This is a dubious technique in general, but common on embedded processors like the 6502 because they are so limited.

      // set the instruction at "patch" to "STA $47CD"
      LDA #$CD
      STA patch+1
      LDA #$47
      STA patch+2

      LDX #20  // Loop 20 times
loop: LDA #$01 // the value to store
patch:STA $FFFF
      // subtract $28 from the address in "patch"
      SEC
      LDA patch+1
      SBC #$28
      STA patch+1
      LDA patch+2
      SBC #0
      STA patch+2
      // do it 19 more times
      DEX
      BNE loop
Raymond Chen
  • 1
    Thank you so much! Reading up on the indirect mode, that eluded me. The self-modifying code is also interesting, I need to wrap my head around the whole "code = memory" idea that the old machine had. – Michael Stum Feb 05 '14 at 06:06
  • 2
    STC?? SEC, surely. Incidentally, as a readability-aid in self-modifying code, I use $C0DE instead of $FFFF and have my syntax highlighter fluoresce it in yellow - makes it very easy to spot places where you're doing something gnarly. – Eight-Bit Guru Feb 05 '14 at 16:01
  • @EightBitGuru Sorry got my instruction sets confused. – Raymond Chen Feb 05 '14 at 16:34
  • Nitpick: The second STA in the first code snippet should be to $01. – rettvest Feb 14 '14 at 20:32
  • 3
    Since this question is tagged `c64`, you want to avoid using zeropage addresses `$00` and `$01`, as they are used for I/O ports and ROM/RAM configuration by the 6510 CPU. If you want your code to be run safely from BASIC through a `SYS` command, your best bet is to use the `$FB` through `$FE` range. – Lars Haugseth Apr 24 '14 at 09:24
  • I used to use self-modifying code on the C64. Sure it's ugly, but when you need it to run on exactly *one* computer, it works. – David Crowell Sep 16 '14 at 18:24
2

A more efficient way to copy 1 KB of data:

    ldy #0
nextvalue:
    lda source, y       ; source and dest are placeholder labels for two 1 KB buffers
    sta dest, y

    lda source+$100, y
    sta dest+$100, y

    lda source+$200, y
    sta dest+$200, y

    lda source+$300, y
    sta dest+$300, y
    iny
    bne nextvalue       ; Y wraps to 0 after 256 iterations

A few notes:

  • Faster, because the loop overhead is shared between four copies. Takes more space because it uses more instructions.

  • If the assembler you use supports macros, you can easily make the number of blocks the code handles configurable.
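
As a sanity check on the coverage, here is a Python model of the copy loop (a sketch only, not 6502 code). It assumes distinct source and destination buffers; the function name `copy_1k` and the buffer addresses in the usage lines are hypothetical.

```python
def copy_1k(memory, source, dest):
    """Model of the four-pages-per-iteration copy loop, with Y as an 8-bit index."""
    y = 0                                   # ldy #0
    while True:
        for page in (0x000, 0x100, 0x200, 0x300):
            memory[dest + page + y] = memory[source + page + y]  # lda / sta pair
        y = (y + 1) & 0xFF                  # iny wraps from $FF to $00
        if y == 0:                          # bne falls through once Y is zero again
            break


memory = bytearray(0x10000)                 # 64 KB address space
SOURCE, DEST = 0x2000, 0x4000               # hypothetical buffer addresses
for i in range(0x400):
    memory[SOURCE + i] = (i * 7 + 1) & 0xFF # arbitrary test pattern
copy_1k(memory, SOURCE, DEST)
```

The loop body runs 256 times and each pass stores four bytes, so exactly 1024 bytes are copied with only one iny and one bne per four bytes.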

This might not be 100% relevant to the question, but here's another way to write loops of more than 256 iterations:

nextblock:
    ldy #0
nextvalue:
    lda address, y
    iny
    bne nextvalue

;Insert code to be executed between each block here:

    dec numblocks
    bpl nextblock

numblocks:
    .byte 3

A few notes:

  • For now, the code doesn't do anything meaningful; it just runs the inner loop numblocks + 1 times (four times with .byte 3, because bpl still branches when the counter reaches zero). "Add your own code" :-) (I often use this together with some self-modifying code that increments the address of an sta address, y instruction, for example.)

  • bpl can be dangerous if you don't know how it works: it branches only while the decremented value is in the range $00 to $7F. It works well enough in this case, but if numblocks started at $81 or higher, the loop would exit after the first block.

  • If you need to execute the same code again, numblocks needs to be reset.

  • The code can be made a little faster by placing numblocks in zero page.

  • If the X register is not needed for something else (it often is), you can use it instead of a memory location.
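
The behaviour of the dec / bpl pair is easy to misjudge, so here is a short Python model of just the outer counter (a sketch, not 6502 code). The function name `blocks_executed` is invented for this illustration.

```python
def blocks_executed(numblocks):
    """Count how many 256-iteration blocks run for a given starting numblocks."""
    count = 0
    while True:
        count += 1                          # one full pass of the inner iny / bne loop
        numblocks = (numblocks - 1) & 0xFF  # dec numblocks (8-bit wraparound)
        if numblocks & 0x80:                # N flag set, so bpl is not taken
            return count


for start in (0, 3, 0x80, 0x81):
    print(f"numblocks = ${start:02X}: {blocks_executed(start)} block(s)")
```

In this model a starting value of 3 gives four blocks, $80 gives the maximum of 129 blocks, and anything from $81 to $FF gives a single block, because the first decrement already leaves bit 7 set.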

Jupp3