Questions tagged [offloading]

This tag is for questions about software that uses mechanisms to reduce the workload on the CPU. This can be done by aggregating work before further processing and/or by processing part of the workload on dedicated hardware.

Common offloads include network stack offloads such as LRO, GRO, and TSO. Others are CPU offloads such as Intel's AES-NI, which accelerates AES encryption for workloads like IPsec.
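As a rough illustration of the "aggregate before further processing" idea behind receive offloads such as GRO, the sketch below merges in-order segments of the same flow into one larger buffer, so that later stages handle fewer units. It is a toy model, not how the Linux kernel implements GRO: the `coalesce` function, the `(flow_id, seq, payload)` tuple layout, and the `max_bytes` limit are all assumptions made for this example.

```python
from collections import OrderedDict


def coalesce(packets, max_bytes=65535):
    """Merge contiguous segments of the same flow into larger buffers.

    Each packet is a hypothetical (flow_id, seq, payload) tuple, where seq is
    the byte offset of the payload within its flow.
    """
    pending = OrderedDict()  # flow_id -> [start_seq, bytearray]
    merged = []
    for flow, seq, payload in packets:
        entry = pending.get(flow)
        if entry is not None:
            start, buf = entry
            contiguous = seq == start + len(buf)
            fits = len(buf) + len(payload) <= max_bytes
            if contiguous and fits:
                # Same flow, next expected byte: append instead of emitting.
                buf.extend(payload)
                continue
            # Gap or size limit reached: flush what was aggregated so far.
            merged.append((flow, start, bytes(buf)))
            del pending[flow]
        pending[flow] = [seq, bytearray(payload)]
    # Flush whatever is still held back at the end of the batch.
    for flow, (start, buf) in pending.items():
        merged.append((flow, start, bytes(buf)))
    return merged


if __name__ == "__main__":
    batch = [("a", 0, b"xx"), ("a", 2, b"yy"), ("b", 0, b"z"), ("a", 4, b"ww")]
    for item in coalesce(batch):
        print(item)
```

In this model, four incoming segments are handed to the next stage as two buffers, which is where the saving in per-packet processing would come from.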

Offloads can be multi-layered. For example, OVS (Open vSwitch) has a user-space service that identifies packets and creates steering rules for them. The user-space service offloads the steering to the kernel, and where specific hardware can process the steering itself, the kernel may in turn offload it to the hardware.

Common questions about offloads

  • How do specific offloads work?
  • How can offloads be enabled for specific cases?
  • What is the benefit of using a specific offload?
111 questions
10
votes
1 answer

Why is GRO more efficient?

Generic Receive Offload (GRO) is a software technique in Linux to aggregate multiple incoming packets belonging to the same stream. The linked article claims that CPU utilization is reduced because, instead of each packet traversing the network…
user1202136
  • 11,171
  • 4
  • 41
  • 62
8
votes
1 answer

OpenMP offloading to Nvidia wrong reduction

I am interested in offloading work to the GPU with OpenMP. The code below gives the correct value of sum on the CPU //g++ -O3 -Wall foo.cpp -fopenmp #pragma omp parallel for reduction(+:sum) …
Z boson
  • 32,619
  • 11
  • 123
  • 226
8
votes
1 answer

How to use GCC 5.1 and OpenMP to offload work to Xeon Phi

Background We have been trying unsuccessfully to use the new GCC 5.1 release to offload OpenMP blocks to the Intel MIC (i.e. the Xeon Phi). Following the GCC Offloading page, we've put together the build.sh script to build the "accel" target…
grumpy_robot
  • 141
  • 1
  • 6
7
votes
1 answer

What exactly do the rx-vlan-offload and tx-vlan-offload ethtool options do?

The ethtool manpage only gives a nebulous explanation: rxvlan on|off Specifies whether RX VLAN acceleration should be enabled txvlan on|off Specifies whether TX VLAN acceleration should be enabled What exactly do the…
Christian
  • 1,499
  • 2
  • 12
  • 28
6
votes
1 answer

OpenMP runtime does not sees my GPU devices

I am trying to do some OpenMP offloading to the GPU on my local machine which is employed with a GTX 1060 graphic card. All of my CUDA and Cublas examples run just fine. However, when I tried to run some OpenMP offloading it simply does not work. In…
Addman
  • 341
  • 1
  • 5
  • 13
6
votes
3 answers

How do I use the GPU available with OpenMP?

I am trying to get some code to run on the GPU using OpenMP, but I am not succeeding. In my code, I am performing a matrix multiplication using for loops: once using OpenMP pragma tags and once without. (This is so that I can compare the execution…
Josiah
  • 63
  • 1
  • 5
5
votes
1 answer

nvptx gcc (9.0.0/trunk) for openmp 4.5 off-loading to (gpu) device cannot find libgomp.spec

I've been trying to install the OpenMP 4.5 off-loading to Nvidia GPU version of gcc for a while and so far no success, although I'm getting closer. This time, I followed this script, where I have made two changes: First I specified the trunk version…
Nigel Overmars
  • 213
  • 2
  • 11
5
votes
0 answers

How can I check if offloading to AMD gpu is working, using OpenMP

I am trying to use OpenMP to offload to an AMD GPU, I have read in the OpenMP 4.5 specification that target device represents the device onto which code and data may be offloaded, but I cannot tell if the offloading has been successful, or if it has…
Hamza
  • 61
  • 9
5
votes
3 answers

Force Grails/Weblogic To Only Redirect Using HTTPS protocol

I'm using Grails (2.2.2) on a project and my application issues undesirable http redirects instead of https redirects. We currently have an F5 load balancer in front of Oracle Weblogic. The F5 is offloading our SSL from Weblogic. The F5 only accepts…
legomania
  • 51
  • 4
4
votes
4 answers

how much work should we do in the database?

how much work should we do in the database? Ok I'm really confused as to exactly how much "work" should be done IN the database, and how much work had to be done instead at the application level? I mean I'm not talking about obvious stuff like we…
Timothy
  • 59
  • 1
4
votes
0 answers

Gcc offload compilation options

I'm trying to build the simplest OpenMP or OpenACC C++ program with GPU offload using gcc-10, CUDA 11 on Ubuntu 18.04 and this CMakeLists.txt file (or OpenMP version): cmake_minimum_required(VERSION 3.18) project(hello VERSION…
Paul Jurczak
  • 7,008
  • 3
  • 47
  • 72
4
votes
2 answers

clang compiler being able to offload OpenMP region to GPU

I read that clang compiler can offload OpenMP regions to GPUs. However, I am confused on the way to compile the code with clang. The clang version that is installed in our cluster is 3.9.0 (tags/RELEASE_390/final 288133). The code I want to offload…
armando
  • 1,360
  • 2
  • 13
  • 30
4
votes
1 answer

How to configure GCC for OpenMP 4.5 offloading to Nvidia PTX GPGPUs

With gcc 7.1 released, we can now configure gcc for openmp 4.5, offloading to Nvidia PTX GPGPUs. That's what they say in the release note (approximately). So my question is, is there any special flags to activate this configuration when compiling…
chedy najjar
  • 631
  • 7
  • 19
3
votes
1 answer

OpenMP offloading on GPU, 'simd' specificities

I was wondering how to interpret the following OpenMP constructs: #pragma omp target teams distribute parallel for for(int i = 0; i < N; ++i) { // compute } #pragma omp target teams distribute parallel for simd for(int i = 0; i < N; ++i) { …
Etienne M
  • 604
  • 3
  • 11
3
votes
0 answers

OpenMP offloading data race

I'm currently working on a project to invert matrices on the GPU using OpenMP. However, when normalizing a row of a matrix I have a data race. The code looks like this: #pragma omp target data map(tofrom: matrix[0:dim*dim], iden[0:dim*dim])…
DoodleSchrank
  • 43
  • 1
  • 6