
I am working on an arm-linux board that has a couple of PCI slots on it.

I wanted to check the vendor IDs and device IDs of the PCI modules from UBoot, so I ported the initialization portion of the PCI driver from Linux to UBoot.

The Hack: Since the PCI topology on my board is fixed, I took the liberty of hardcoding the bus numbers (primary, secondary, subordinate) in UBoot, so I don't have to port the enumeration code to UBoot. To get the bus numbers, I wrote a small loadable kernel module that reports each device's bus numbers once the kernel has finished enumerating devices on the PCI bus.

Problem: Now, if the modules are present in PCI slots, I am able to read their IDs successfully. But if a module is not present and I try to read its IDs, I get hit by ARM's data abort handler.

Is there a way around this data abort exception, or a way to know in advance whether a slot is populated before trying to read the IDs?


Update 1: I modified UBoot source according to auselen's input as follows:

start.S

//Added following macro

     .macro  irq_restore_user_regs_mod
     ldmia   sp, {r0 - lr}^                  @ Calling r0 - lr
     mov     r0, r0
     ldr     lr, [sp, #S_PC]                 @ Get PC
     add     sp, sp, #S_FRAME_SIZE
     mov     pc, lr                 @ return & move spsr_svc into cpsr
     .endm

Changed data_abort code as follows

data_abort:
    get_bad_stack
    irq_save_user_regs
    bl  do_data_abort
    irq_restore_user_regs_mod

interrupts.c: modified do_data_abort to

void do_data_abort (struct pt_regs *pt_regs)
{   
    if (flag == 1)
    {
        flag = 0;
        return;
    }
    printf ("data abort handler\n");
    printf ("Originally installed by U-Boot\n");
    show_regs (pt_regs);
    bad_mode ();
}

mypcie.c: portion of code that attempts to read a possibly invalid address

    printf("Trying possibly invalid address\n");
    flag = 1;
    data = *((volatile unsigned int *)(0xbe200000))  ;
    if (flag == 0) printf("Bad address \n");
    flag = 1;

Concerned Portion of UBoot Log:

Trying possibly invalid address
data abort handler
Originally installed by U-Boot
pc : [<00012150>]    lr : [<00012144>]
sp : 46069a00  ip : 78000000  fp : 00000000
r10: 07f7eca4  r9 : 00000000  r8 : 07f7efdc
r7 : 00000000  r6 : 000000f8  r5 : 00000001  r4 : bb000000
r3 : be200000  r2 : 00020b28  r1 : 00000020  r0 : 07f7ea49
Flags: nzcv  IRQs on  FIQs on  Mode USER_32
U-Boot::Resetting CPU ...

I suspect that *irq_restore_user_regs_mod* is sending UBoot back into *do_data_abort*. The first time do_data_abort executes, flag is 1, so it sets flag to 0 and returns. irq_restore_user_regs_mod then sends UBoot back into do_data_abort, and since flag is now 0, UBoot enters bad mode.

Kindly tell me whether I should use

MOVS PC, LR

or

MOV PC, LR

in irq_restore_user_regs_mod (command in code snippet is different from the text).

Also, please elaborate on why you used MOV(S) PC, LR instead of SUBS PC, LR, #4.


Update 2: (in light of auselen's comments)

i) Changed flag from a plain int to volatile.

ii) Added printf(s) in interrupts.c for debugging purposes as follows:

printf("flag = %d\n",flag);
    if (flag == 1)
    {
        flag = 0;
        printf("FLAG = %d\n",flag);
        return;
    }

iii) Added asm volatile("" ::: "memory"); before and after the access that may abort, in mypcie.c:

flag = 1;
asm volatile("" ::: "memory");
data = *((volatile unsigned int *)(0xbe200000))  ;
asm volatile("" ::: "memory");
if (flag == 0) printf("Bad address \n");

Results

UBoot Log 1:

Trying possibly invalid address
flag = 1
FLAG = 0
flag = 1
FLAG = 0
(continues forever)

It seems that control kept returning to the flag = 1; statement in mypcie.c. If I comment out this statement and initialize flag to 1 outside of this function, I get the following log:

UBoot Log 2:

Trying possibly invalid address
flag = 1
FLAG = 0
flag = 0
data abort handler
Originally installed by U-Boot
pc : [<00012174>]    lr : [<5306b01e>]
sp : c6a69a08  ip : 78000000  fp : 00000000
r10: 07f7eca1  r9 : 00000000  r8 : 07f7efdc
r7 : 00000000  r6 : 000000fb  r5 : 00000001  r4 : bb000000
r3 : be200000  r2 : 00000000  r1 : 00000020  r0 : 07f7ea4d
Flags: nzcv  IRQs on  FIQs on  Mode USER_32
U-Boot::Resetting CPU ...

Now it looks as if the following instruction executed twice:

data = *((volatile unsigned int *)(0xbe200000))  ;

On the second execution flag was already 0, so we hit the data abort.


Update 3: (in light of auselen's comment regarding MOV, MOVS and SUBS) Removed the -O2 flag from the config.mk file in the UBoot directory.

UBoot Logs

Using subs pc, lr, #4

Trying possibly invalid address
flag = 1
FLAG = 0
prefetch abort handler
Originally installed by U-Boot
pc : [<90000004>]    lr : [<00012174>]
sp : 07f7eb80  ip : 78000000  fp : 00000000
r10: 00000000  r9 : 00000000  r8 : 07f7efdc
r7 : 00000000  r6 : 00000000  r5 : 00000000  r4 : 00008e00
r3 : 00000000  r2 : c6a68e1c  r1 : 00010001  r0 : 00000003
Flags: nZCv  IRQs on  FIQs on  Mode USER_32
U-Boot::Resetting CPU ...

Using subs pc, lr, #8

Trying possibly invalid address
flag = 1
FLAG = 0
flag = 0
data abort handler
Originally installed by U-Boot
pc : [<00012174>]    lr : [<00008e7c>]
sp : c6a68cf4  ip : 78000000  fp : 00000000
r10: 07f7eca1  r9 : 00000000  r8 : 07f7efdc
r7 : 00000000  r6 : 000000fb  r5 : 00000001  r4 : bb000000
r3 : be200000  r2 : 00000000  r1 : 00000020  r0 : 07f7ea4d
Flags: nzcv  IRQs on  FIQs on  Mode USER_32
U-Boot::Resetting CPU ...
microMolvi
  • You can change the data abort handler as well to jump to next instruction. – auselen Jan 20 '14 at 14:16
  • Thanks auselen, but wouldn't it be better not to get the exception in the first place? For example, if there's a way to check whether the target address is valid before making the transaction. Also, the processor will be in abort mode now. – microMolvi Jan 20 '14 at 14:25
  • Talk to the root complex, which is chip/vendor specific, to detect the boards. You may not have to do a full enumeration (which isn't that complicated anyway), but do enough of it to avoid blindly talking to something that isn't there, which is likely the cause of the data abort. I assume the PCIe controller is causing the abort and will continue to do so as long as you try to talk to something that isn't there. – old_timer Jan 21 '14 at 02:28
  • @dwelch my understanding of PCI enumeration algo is that it brute forces all possible addresses of devices on a bus, **including those which are invalid**, and tries to read VID/PID from function 0 register 0. If it gets a valid address then the algo adds that device to some data structure. Kindly tell me if I have missed something. – microMolvi Jan 21 '14 at 10:14
  • Add some printfs to check whether the data abort handler really is called twice as you think; for example, always print the status of the flag. That part should be easily debuggable. There is no difference for your case between mov and movs. The reason we don't use `subs PC, LR, #4` is that call would re-execute the same instruction (data aborting one) and `mov pc, lr` continues with next instruction. One funny issue might be the compiler reordering accesses to flag. I would put `asm volatile("" ::: "memory");` before and after making the data abortable access to make sure the compiler respects access order. – auselen Jan 21 '14 at 20:52
  • http://stackoverflow.com/questions/14950614/working-of-asm-volatile-memory – auselen Jan 21 '14 at 20:53
  • btw, flag should be volatile as well, there is no chance compiler can anticipate such code flow. – auselen Jan 21 '14 at 21:26

1 Answer


I haven't tried this myself, but one should be able to modify u-boot to handle a data abort during accesses to certain addresses.

arch/arm/cpu/armv7/start.S contains

data_abort:
        get_bad_stack
        bad_save_user_regs
        bl      do_data_abort

From the code, it seems that bad_save_user_regs needs to be replaced with the irq_save_user_regs / irq_restore_user_regs* pair, just as IRQ/FIQ are handled, making data_abort read like

data_abort:
        get_bad_stack
        irq_save_user_regs
        bl      do_data_abort
        irq_restore_user_regs*

do_data_abort is located at arch/arm/lib/interrupts.c

  void do_data_abort (struct pt_regs *pt_regs)
  {
          printf ("data abort\n\n    MAYBE you should read doc/README.arm-unaligned-accesses\n\n");
          show_regs (pt_regs);
          bad_mode ();
  } 

and bad_mode resets the cpu.

One approach might be to raise a flag before trying a possibly aborting address. do_data_abort then checks the flag: if it is raised, it lowers the flag and returns instead of calling bad_mode, and execution continues with the next instruction, which should check whether the flag was lowered.

[*] Returning to the next instruction can be handled with subs PC, LR, #4 in a modified copy of irq_restore_user_regs, making it read as

          .macro  irq_restore_user_regs_mod
          ldmia   sp, {r0 - lr}^                  @ Calling r0 - lr
          mov     r0, r0
          ldr     lr, [sp, #S_PC]                 @ Get PC
          add     sp, sp, #S_FRAME_SIZE
          subs PC, LR, #4                         @ return & move spsr_svc into
                                                  @ cpsr
          .endm
auselen
  • @microMolvi It looks like (from documentation) I was wrong about mov/movs. You should use movs for returning from exception. It seems I was also wrong about using subs. `subs PC, LR, #4` seems right way to return to next instruction and `subs PC, LR, #8` would re-execute data aborting instruction. – auselen Jan 22 '14 at 12:35
  • irq_restore_user_regs_mod=irq_restore_user_regs now? – microMolvi Jan 23 '14 at 07:03
  • @microMolvi it looks like, however I still didn't test it myself. – auselen Jan 23 '14 at 07:39
  • I've attached the logs using `subs PC, LR, #4` & `subs PC, LR, #8`. Kindly tell me if you need anything else. I've also removed -O2 flag from config.mk file. – microMolvi Jan 23 '14 at 11:38
  • @microMolvi I would like to try this on my BeagleboneBlack which would probably take a few days. Eventually I hope I'll be clarifying my answer. – auselen Jan 23 '14 at 22:08
  • I got a BeagleboneBlack too. If there is something you want me to try out there, let me know. – microMolvi Jan 23 '14 at 22:32