I have an issue similar to the one mentioned here, but with different behavior.
We have an FPGA (from Altera) that acts as a 32 KB memory on the PCIe bus of an iMX8M-Plus CPU (ARM Cortex-A53).
I wrote a simple driver to access the FPGA's memory. As you can see from the lspci output below, the 32 KB memory is mapped to Region 4 (BAR4), and I use avalon_ioctl_set_operation() and avalon_ioctl_get_operation() to set and get the BAR content.
static int avalon_ioctl_get_operation(unsigned long arg, uint8_t op_size)
{
    struct avalon_pcie_operation o;

    if (copy_from_user(&o, (void __user *)arg, sizeof(o)))
        return -EFAULT;
    if (o.bar >= PCI_SRIOV_NUM_BARS)
        return -EFAULT;
    if (!io_dev.bar_addrs[o.bar])
        return -ENOMEM;
    if (!IS_ALIGNED(o.offset, op_size))
        return -EFAULT;

    switch (op_size)
    {
    case 1:
        o.data8 = ioread8(io_dev.bar_addrs[o.bar] + o.offset);
        break;
    case 2:
        o.data16 = ioread16(io_dev.bar_addrs[o.bar] + o.offset);
        break;
    case 4:
        o.data32 = ioread32(io_dev.bar_addrs[o.bar] + o.offset);
        break;
    case 8:
        o.data64 = ioread64(io_dev.bar_addrs[o.bar] + o.offset);
        break;
    default:
        return -EFAULT;
    }

    if (copy_to_user((void __user *)arg, &o, sizeof(o)))
        return -EFAULT;
    return 0;
}
static int avalon_ioctl_set_operation(unsigned long arg, uint8_t op_size)
{
    struct avalon_pcie_operation o;

    if (copy_from_user(&o, (void __user *)arg, sizeof(o)))
        return -EFAULT;
    if (o.bar >= PCI_SRIOV_NUM_BARS)
        return -EFAULT;
    if (!io_dev.bar_addrs[o.bar])
        return -ENOMEM;
    if (!IS_ALIGNED(o.offset, op_size))
        return -EFAULT;

    switch (op_size)
    {
    case 1:
        iowrite8(o.data8, io_dev.bar_addrs[o.bar] + o.offset);
        break;
    case 2:
        iowrite16(o.data16, io_dev.bar_addrs[o.bar] + o.offset);
        break;
    case 4:
        iowrite32(o.data32, io_dev.bar_addrs[o.bar] + o.offset);
        break;
    case 8:
        iowrite64(o.data64, io_dev.bar_addrs[o.bar] + o.offset);
        break;
    default:
        return -EFAULT;
    }

    return 0;
}
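Note that the two handlers check the BAR index and the alignment, but not whether o.offset lies inside the BAR. Below is a minimal user-space sketch of the complete argument check. The layout of struct avalon_pcie_operation, the NUM_BARS constant, the helper name access_is_valid() and the bar_len parameter are all assumptions made for illustration; the real definitions are not shown in this post.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: the real struct avalon_pcie_operation is not shown here. */
struct avalon_pcie_operation {
    uint32_t bar;     /* BAR index */
    uint64_t offset;  /* byte offset into the BAR */
    union {
        uint8_t  data8;
        uint16_t data16;
        uint32_t data32;
        uint64_t data64;
    };
};

#define NUM_BARS 6 /* assumption: six standard BARs per function */

/*
 * Returns true when the access has a supported size, is naturally
 * aligned, and fits entirely inside a BAR of bar_len bytes.
 */
static bool access_is_valid(const struct avalon_pcie_operation *o,
                            size_t op_size, uint64_t bar_len)
{
    if (o->bar >= NUM_BARS)
        return false;
    if (op_size != 1 && op_size != 2 && op_size != 4 && op_size != 8)
        return false;
    /* Same test as IS_ALIGNED() for power-of-two sizes. */
    if (o->offset & (op_size - 1))
        return false;
    /* Bounds check written so that it cannot overflow. */
    if (o->offset > bar_len || bar_len - o->offset < op_size)
        return false;
    return true;
}
```

With a 32 KB BAR, an 8-byte access at offset 0 passes this check, so the failure described below is not caused by an out-of-range or misaligned offset.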
For my testing, I used BAR4 and offset 0. The 8-, 16-, and 32-bit variants of read/write all work fine: I can read back whatever I write.
But when I use iowrite64() and ioread64() at offset 0, I read garbage data (0xFFFFFFFFFFFFFFFF), the PCIe config space is altered, and the PCIe device stops functioning (the lspci output in the altered state is at the bottom). This happens immediately after the ioread64() call.
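Since the 32-bit accessors work, one experiment would be to replace the single 64-bit access with two 32-bit accesses. As far as I can tell, the kernel offers ioread64_lo_hi() and iowrite64_lo_hi() in <linux/io-64-nonatomic-lo-hi.h> for exactly this; I have not verified that they avoid the failure here. The user-space sketch below only models the composition on plain memory (not MMIO), and the names mock_read32() and mock_read64_lo_hi() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Read one 32-bit little-endian word from a mock BAR (plain memory, not MMIO). */
static uint32_t mock_read32(const uint8_t *bar, size_t offset)
{
    return (uint32_t)bar[offset] |
           ((uint32_t)bar[offset + 1] << 8) |
           ((uint32_t)bar[offset + 2] << 16) |
           ((uint32_t)bar[offset + 3] << 24);
}

/* Compose a 64-bit value from two 32-bit reads, low word first. */
static uint64_t mock_read64_lo_hi(const uint8_t *bar, size_t offset)
{
    uint64_t lo = mock_read32(bar, offset);
    uint64_t hi = mock_read32(bar, offset + 4);

    return lo | (hi << 32);
}
```

If two 32-bit transactions succeed where one 64-bit transaction fails, that would point at how the 8-byte request is handled on the bus or in the FPGA rather than at the driver logic.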
I stepped into ioread64() and saw that it ends up in __raw_readq(), which is defined as:
static inline u64 __raw_readq(const volatile void __iomem *addr)
{
    u64 val;
    asm volatile(ALTERNATIVE("ldr %0, [%1]",
                             "ldar %0, [%1]",
                             ARM64_WORKAROUND_DEVICE_LOAD_ACQUIRE)
                 : "=r" (val) : "r" (addr));
    return val;
}
We use Linux v5.4 on aarch64, so I believe that 64-bit accesses to the bus should be fine.
uname -a
Linux 5.4.193-0+git.a301219f58a2 #1 SMP PREEMPT Thu Dec 15 14:09:11 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
Here is the lspci -vv output at the beginning of the test:
00:00.0 PCI bridge: Synopsys, Inc. DWC_usb3 / PCIe bridge (rev 01) (prog-if 00 [Normal decode])
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 218
Region 0: Memory at 18000000 (32-bit, non-prefetchable) [size=1M]
Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
I/O behind bridge: [disabled]
Memory behind bridge: [disabled]
Prefetchable memory behind bridge: 18100000-181fffff [size=1M]
Secondary status: 66MHz- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- <SERR- <PERR-
Expansion ROM at 18200000 [virtual] [disabled] [size=64K]
BridgeCtl: Parity- SERR+ NoISA- VGA- VGA16- MAbort- >Reset- FastB2B-
PriDiscTmr- SecDiscTmr- DiscTmrStat- DiscTmrSERREn-
Capabilities: [40] Power Management version 3
Flags: PMEClk- DSI- D1+ D2- AuxCurrent=375mA PME(D0+,D1+,D2-,D3hot+,D3cold+)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [50] MSI: Enable+ Count=1/1 Maskable+ 64bit-
Address: bc022000 Data: 0000
Masking: 00000000 Pending: 00000000
Capabilities: [70] Express (v2) Root Port (Slot-), MSI 00
DevCap: MaxPayload 128 bytes, PhantFunc 0
ExtTag- RBE+
DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop-
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
LnkCap: Port #0, Speed 8GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <1us, L1 <8us
ClockPM- Surprise- LLActRep+ BwNot+ ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s (ok), Width x1 (ok)
TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt+
RootCap: CRSVisible+
RootCtl: ErrCorrectable- ErrNon-Fatal- ErrFatal- PMEIntEna+ CRSVisible+
RootSta: PME ReqID 0000, PMEStatus- PMEPending-
DevCap2: Completion Timeout: Range ABCD, TimeoutDis+, NROPrPrP+, LTR-
10BitTagComp-, 10BitTagReq-, OBFF Not Supported, ExtFmt-, EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS-, LN System CLS Not Supported, TPHComp-, ExtTPHComp-, ARIFwd-
AtomicOpsCap: Routing- 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled ARIFwd-
AtomicOpsCtl: ReqEn- EgressBlck-
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
Capabilities: [100 v2] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr+ BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
AERCap: First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
HeaderLog: 00000000 00000000 00000000 00000000
RootCmd: CERptEn+ NFERptEn+ FERptEn+
RootSta: CERcvd+ MultCERcvd+ UERcvd- MultUERcvd-
FirstFatal- NonFatalMsg- FatalMsg- IntMsg 0
ErrorSrc: ERR_COR: 0000 ERR_FATAL/NONFATAL: 0000
Capabilities: [148 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn-, PerformEqu-
LaneErrStat: 0
Capabilities: [158 v1] L1 PM Substates
L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2- ASPM_L1.1+ L1_PM_Substates+
PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
T_CommonMode=10us
L1SubCtl2: T_PwrOn=10us
Kernel driver in use: pcieport
01:00.0 RAM memory: Altera Corporation Device e001
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0
Interrupt: pin A routed to IRQ 219
Region 0: Memory at 18108000 (64-bit, prefetchable) [size=512]
Region 4: Memory at 18100000 (64-bit, prefetchable) [size=32K]
Capabilities: [50] MSI: Enable+ Count=1/4 Maskable- 64bit+
Address: 00000000bc022000 Data: 0001
Capabilities: [78] Power Management version 3
Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [80] Express (v2) Endpoint, MSI 00
DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
LnkCap: Port #1, Speed 8GT/s, Width x2, ASPM not supported
ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk-
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 8GT/s (ok), Width x1 (downgraded)
TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
DevCap2: Completion Timeout: Not Supported, TimeoutDis+, NROPrPrP-, LTR-
10BitTagComp-, 10BitTagReq-, OBFF Not Supported, ExtFmt+, EETLPPrefix-
EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
FRS-, TPHComp-, ExtTPHComp-
AtomicOpsCap: 32bit- 64bit- 128bitCAS-
DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
AtomicOpsCtl: ReqEn-
LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
Compliance De-emphasis: -6dB
LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+, EqualizationPhase1+
EqualizationPhase2+, EqualizationPhase3+, LinkEqualizationRequest-
Capabilities: [100 v1] Virtual Channel
Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
Arb: Fixed- WRR32- WRR64- WRR128-
Ctrl: ArbSelect=Fixed
Status: InProgress-
VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
Status: NegoPending- InProgress-
Capabilities: [200 v1] Vendor Specific Information: ID=1172 Rev=0 Len=044 <?>
Capabilities: [300 v1] Secondary PCI Express
LnkCtl3: LnkEquIntrruptEn-, PerformEqu-
LaneErrStat: 0
Kernel driver in use: avalon-dma
Kernel modules: avalon_drv
Corrupted state after ioread64():
00:00.0 PCI bridge: Synopsys, Inc. DWC_usb3 / PCIe bridge (rev 01) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0, IRQ 218
Memory at 18000000 (32-bit, non-prefetchable) [size=1M]
Bus: primary=00, secondary=01, subordinate=ff, sec-latency=0
I/O behind bridge: [disabled]
Memory behind bridge: [disabled]
Prefetchable memory behind bridge: 18100000-181fffff [size=1M]
Expansion ROM at 18200000 [virtual] [disabled] [size=64K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable+ Count=1/1 Maskable+ 64bit-
Capabilities: [70] Express Root Port (Slot-), MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [148] Secondary PCI Express
Capabilities: [158] L1 PM Substates
Kernel driver in use: pcieport
01:00.0 RAM memory: Altera Corporation Device e001 (rev ff) (prog-if ff)
!!! Unknown header type 7f
Kernel driver in use: avalon-dma
Kernel modules: avalon_drv
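As far as I understand, the "(rev ff) (prog-if ff)" and "Unknown header type 7f" fields are what lspci prints when every config-space read of the device returns all ones, which usually means the device no longer responds on the link. The sketch below decodes an all-ones config header to show where those values come from. The offsets are the standard PCI config-space offsets; the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Standard PCI configuration-space offsets. */
#define PCI_REVISION_ID 0x08
#define PCI_CLASS_PROG  0x09
#define PCI_HEADER_TYPE 0x0e

/* Hypothetical container for the three fields shown by lspci. */
struct decoded_header {
    uint8_t rev;
    uint8_t prog_if;
    uint8_t header_type;
};

/* Decode the fields from the first 64 bytes of config space. */
static struct decoded_header decode_header(const uint8_t cfg[64])
{
    struct decoded_header h = {
        .rev = cfg[PCI_REVISION_ID],
        .prog_if = cfg[PCI_CLASS_PROG],
        /* Bit 7 is the multi-function flag, so it is masked off. */
        .header_type = cfg[PCI_HEADER_TYPE] & 0x7f,
    };

    return h;
}
```

Feeding this function a buffer filled with 0xff yields rev ff, prog-if ff and header type 7f, matching the output above.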
Do you know of any restriction against accessing the PCIe bus with 64-bit accesses on a 64-bit CPU running a 64-bit OS? Or is there a special procedure that I need to follow to use ioread64()?
Explanation for @0andriy's comment: I went through the iMX8M-Plus reference manual, and it says that the hardware supports 32- and 64-bit PCI Express addresses as well as 64-bit MSI, but I'm not sure whether this is supported at the Linux driver level. Do you know how I can verify that?
As for your question, the reason I chose the ioreadXX/iowriteXX APIs in the first place is that I have an example Avalon DMA driver that accesses the DMA registers in the FPGA, which are mapped to BAR0, using ioread32/iowrite32. DMA and all MSI interrupts work fine, so I followed the same approach. The only difference is that my custom driver uses ioread64/iowrite64, which the Avalon DMA driver does not. That is where I hit the problem, and I am trying to figure out why.