Questions tagged [ofed]

14 questions
3
votes
2 answers

Java Sockets on RDMA (JSOR) vs jVerbs performance in Infiniband

I have a basic understanding of both JSOR and jVerbs. Both work around the limitations of JNI and use a fast path to reduce latency. Both use the user-space Verbs RDMA interface to avoid context switches and provide fast-path access. Both also have options for…
phoenix
  • 3,069
  • 3
  • 22
  • 29
2
votes
1 answer

How can I increase OpenFabrics memory limit for Torque jobs?

When I run an MPI job over InfiniBand, I get the following warning. We use Torque Manager. -------------------------------------------------------------------------- WARNING: It appears that your OpenFabrics subsystem is configured to only allow…
kojiwell
  • 872
  • 1
  • 10
  • 10
1
vote
0 answers

mpirun : ORTE daemon has unexpectedly failed

I'm on a fresh install of a Slurm (version 20.11.9) cluster with 4 nodes on CentOS 8 Stream, with a Mellanox InfiniBand connection. The Mellanox drivers have been built from this ISO: https://network.nvidia.com/products/infiniband-drivers/linux/mlnx_ofed/…
DSX
  • 139
  • 4
1
vote
1 answer

Is it possible to use RDMA Mellanox libraries from within a kernel module?

I want to develop a kernel module that is able to send/receive RDMA messages. I am wondering if the Mellanox libraries can be called from kernel space. Can I call Mellanox RDMA functions from a kernel module? Answer: I have some working code here:…
JC1
  • 657
  • 6
  • 21
1
vote
1 answer

Socket Direct Protocol vs FTP Java Library

Currently I am using Apache's Commons Net library to transfer application files (2KB to 200MB) from one AIX server to another over FTP. I learned that another protocol exists, namely SDP (Socket Direct Protocol)…
0
votes
0 answers

Issues with using KafkaDirect for Kafka RDMA communication

I'm attempting to install KafkaDirect from the GitHub repository to enable RDMA communication in Kafka. My environment is as follows: Ubuntu 20.04; cluster: Node1, Node2, Node3; Mellanox ConnectX-3 InfiniBand. KafkaDirect is an adaptation…
0
votes
0 answers

Unknown symbol __nvme_submit_sync_cmd (err -22)

I tried to install the MLNX-OFED driver (version 5.5-1.0.3.2) for Ubuntu 20.04 on Linux kernel 5.4.0-124-generic, but encountered the following problem when trying to modprobe nvme-rdma: I…
0
votes
1 answer

How does SEND bandwidth improve when the registered memory is aligned to system page size? (In Mellanox IBD)

Operating System: RHEL CentOS 7.9 Latest. Operation: Sending 500MB chunks 21 times from one system to another, connected via Mellanox cables. (Ethernet controller: Mellanox Technologies MT28908 Family [ConnectX-6]) (The registered memory region…
Vaishakh
  • 67
  • 5
0
votes
1 answer

What is the difference between OFED, MLNX OFED and the inbox driver?

I'm setting up InfiniBand networks, and I do not fully understand the difference between the different software stacks. OFED https://www.openfabrics.org/ofed-for-linux/ MLNX OFED…
Jounathaen
  • 803
  • 1
  • 9
  • 23
0
votes
0 answers

How can DAPL offer more functionality than OFA does if DAPL relies solely on OFA as the only layer beneath it?

As I understand it, if a system has only underlying InfiniBand connectivity (i.e. not iWARP or anything else that DAPL could use as an alternative), then DAPL exists solely as an abstraction layer on top of OFA/InfiniBand. If this is the case,…
Brayme Guaman
  • 175
  • 2
  • 12
0
votes
1 answer

Error using verbs Memory Windows (ibv_alloc_mw)

I am trying to use memory windows and I am getting EPERM (errno=1) when calling ibv_alloc_mw (with both types of MWs). I have Mellanox ConnectX-3 cards and the following OFED: ofed_info | head -n 1 MLNX_OFED_LINUX-3.2-2.0.0.0 (OFED-3.2-2.0.0): It…
JC1
  • 657
  • 6
  • 21
0
votes
1 answer

RDMA Fast Memory Registration (FMR)

I'm developing a system that uses RDMA extensively (on Mellanox hardware) and would like to register memory regions more efficiently. I have looked into Fast Memory Registration and I have a few questions: Is FMR going away?…
JC1
  • 657
  • 6
  • 21
0
votes
1 answer

rdma connection manager driver pattern

I'm using the OFED 3.18r2 implementation of the InfiniBand drivers for my application. In particular, I'm using the rdma connection manager wrapper functions. To better understand what's going on under the hood, I usually look at the source code. Doing…
Antonio
  • 35
  • 4
0
votes
1 answer

InfiniBand SDP EAGAIN error while using a TCP non-blocking socket

I'm using Mellanox ConnectX-3 QDR cards on RHEL 6.2. I have OFED 1.5.4 installed because it includes SDP. I get an EAGAIN error when using SDP in LD_PRELOAD mode for a TCP app that configures the socket in non-blocking mode. Any thoughts?
Sumant
  • 4,286
  • 1
  • 23
  • 31