Questions tagged [kepler]

A family of NVIDIA GPUs that can be used for graphics or compute purposes

Kepler, named after the famous scientist Johannes Kepler, is a name referring to a family of NVIDIA GPUs which appear in the GeForce (graphics), Quadro (professional graphics) and Tesla (compute) product families.

Kepler GPUs offer higher performance and other new features over previous NVIDIA GPU families (such as Fermi), including compute features such as Hyper-Q and Dynamic Parallelism. The Kepler instruction set offers new instructions not found in previous families, such as funnel shift. Current Kepler GPUs have a compute capability of 3.0 or 3.5.

NVIDIA Home page

Wikipedia article link
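
As a quick illustration of the Dynamic Parallelism feature mentioned above: on devices of compute capability 3.5 and higher, a kernel can launch child kernels directly from the device. A minimal sketch, assuming compilation with `nvcc -arch=sm_35 -rdc=true -lcudadevrt`; the kernel names and launch sizes are illustrative only:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Child kernel: launched from the device, not from the host.
__global__ void child(int parentBlock)
{
    printf("child thread %d launched by parent block %d\n", threadIdx.x, parentBlock);
}

// Parent kernel: each block launches its own child grid from one thread.
__global__ void parent()
{
    if (threadIdx.x == 0) {
        child<<<1, 4>>>(blockIdx.x);
        // Device-side synchronization on the child grid; supported by the
        // CUDA toolkits contemporary with Kepler.
        cudaDeviceSynchronize();
    }
}

int main()
{
    parent<<<2, 32>>>();
    cudaDeviceSynchronize();
    return 0;
}
```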

65 questions
12
votes
1 answer

How do I use Nvidia Multi-process Service (MPS) to run multiple non-MPI CUDA applications?

Can I run non-MPI CUDA applications concurrently on NVIDIA Kepler GPUs with MPS? I'd like to do this because my applications cannot fully utilize the GPU, so I want them to co-run together. Is there any code example to do this?
dalibocai
  • 2,289
  • 5
  • 29
  • 45
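
For background on what an answer typically involves: MPS runs as a separate control daemon, and ordinary (non-MPI) CUDA processes then attach to it transparently, so several small applications can share one Kepler GPU. A minimal sketch of such a client, assuming the standard `nvidia-cuda-mps-control` daemon; the kernel and sizes are placeholders for an application that underutilizes the GPU on its own:

```cuda
// Start the MPS control daemon once, outside this program, e.g.:
//   export CUDA_VISIBLE_DEVICES=0
//   nvidia-cuda-mps-control -d            # start the daemon
// then run several instances of this (non-MPI) binary concurrently;
//   echo quit | nvidia-cuda-mps-control   # stop the daemon when done
#include <cstdio>
#include <cuda_runtime.h>

// Deliberately small kernel: a handful of blocks cannot fill a Kepler GPU,
// which is exactly the situation where sharing the device via MPS helps.
__global__ void smallKernel(float *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] = data[i] * 2.0f + 1.0f;
}

int main()
{
    const int n = 1 << 16;
    float *d;
    cudaMalloc(&d, n * sizeof(float));
    cudaMemset(d, 0, n * sizeof(float));

    for (int iter = 0; iter < 1000; ++iter)
        smallKernel<<<(n + 255) / 256, 256>>>(d, n);

    cudaDeviceSynchronize();
    cudaFree(d);
    printf("done\n");
    return 0;
}
```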
8
votes
2 answers

Load/Store Units (LD/ST) and Special Function Units (SFUs) for the Kepler architecture

In the Kepler architecture whitepaper, NVIDIA states that there are 32 Special Function Units (SFUs) and 32 Load/Store Units (LD/ST) on an SMX. The SFUs are for "fast approximate transcendental operations". Unfortunately, I don't understand what this…
user2267896
  • 173
  • 2
  • 9
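
For context, the "fast approximate transcendental operations" handled by the SFUs correspond to CUDA's single-precision intrinsics such as `__sinf` and `__expf` (also selected globally by `-use_fast_math`). A minimal sketch contrasting them with the accurate library calls; kernel names and sizes are illustrative:

```cuda
#include <cuda_runtime.h>

// Accurate path: sinf/expf expand to longer instruction sequences on the
// regular cores.
__global__ void accurate(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = sinf(in[i]) + expf(in[i]);
}

// Fast path: __sinf/__expf map to the approximate hardware units (SFUs),
// trading a few bits of precision for far fewer instructions.
__global__ void approximate(const float *in, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = __sinf(in[i]) + __expf(in[i]);
}

int main()
{
    const int n = 1 << 20;
    float *in, *out;
    cudaMalloc(&in,  n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));
    accurate<<<(n + 255) / 256, 256>>>(in, out, n);
    approximate<<<(n + 255) / 256, 256>>>(in, out, n);
    cudaDeviceSynchronize();
    cudaFree(in); cudaFree(out);
    return 0;
}
```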
4
votes
1 answer

Are Kepler CC 3.0 GPUs not only a pipelined architecture, but also superscalar?

The CUDA 6.5 documentation states (http://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#ixzz3PIXMTktb, 5.2.3. Multiprocessor Level): ... 8L for devices of compute capability 3.x since a multiprocessor issues a pair of…
Alex
  • 12,578
  • 15
  • 99
  • 195
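
As a rough illustration of what the dual-issue ("superscalar") capability in that passage relies on: each thread has to expose independent instructions for the two dispatch units of a scheduler to pair. A minimal, purely illustrative sketch with two independent arithmetic chains (not taken from the question):

```cuda
#include <cuda_runtime.h>

__global__ void ilp2(const float *in, float *out, int n)
{
    int i = (blockIdx.x * blockDim.x + threadIdx.x) * 2;
    if (i + 1 < n) {
        // a and b do not depend on each other, so the two FMA chains
        // can be dual-issued / overlapped by the warp schedulers.
        float a = in[i];
        float b = in[i + 1];
        for (int k = 0; k < 64; ++k) {
            a = a * 1.0001f + 0.5f;
            b = b * 1.0002f + 0.25f;
        }
        out[i]     = a;
        out[i + 1] = b;
    }
}

int main()
{
    const int n = 1 << 20;
    float *in, *out;
    cudaMalloc(&in,  n * sizeof(float));
    cudaMalloc(&out, n * sizeof(float));
    ilp2<<<(n / 2 + 255) / 256, 256>>>(in, out, n);
    cudaDeviceSynchronize();
    cudaFree(in); cudaFree(out);
    return 0;
}
```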
4
votes
2 answers

Memory coalescing in global writes

In CUDA devices, is coalescing in global memory writes as important as coalescing in global memory reads? If so, how can that be explained? Also, are there differences between early generations of CUDA devices and the most recent ones regarding this…
Farzad
  • 3,288
  • 2
  • 29
  • 53
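
For illustration, write coalescing matters for the same reason read coalescing does: a warp's 32 stores either fall into a few wide memory transactions or are scattered across many. A minimal sketch comparing a coalesced write pattern with a deliberately strided one; names and sizes are made up:

```cuda
#include <cuda_runtime.h>

// Coalesced: consecutive threads of a warp write consecutive addresses,
// so each warp's stores need only a small number of memory transactions.
__global__ void coalescedWrite(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = 1.0f;
}

// Strided: consecutive threads write addresses 32 elements apart, which
// spreads each warp's stores over many separate transactions.
__global__ void stridedWrite(float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int j = (i * 32) % n;          // deliberately bad access pattern
    if (i < n) out[j] = 1.0f;
}

int main()
{
    const int n = 1 << 22;
    float *out;
    cudaMalloc(&out, n * sizeof(float));
    coalescedWrite<<<(n + 255) / 256, 256>>>(out, n);
    stridedWrite<<<(n + 255) / 256, 256>>>(out, n);
    cudaDeviceSynchronize();
    cudaFree(out);
    return 0;
}
```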
3
votes
1 answer

"Global Load Efficiency" over 100%

I have a CUDA program in which threads of a block read elements of a long array in several iterations and memory accesses are almost fully coalesced. When I profile, Global Load Efficiency is over 100% (between 119% and 187% depending on the input).…
Farzad
  • 3,288
  • 2
  • 29
  • 53
3
votes
1 answer

Why does GPU initialization take so long on the Kepler architecture, and how can this be fixed?

When running my application, the very first cudaMalloc takes 40 seconds, which is due to the initialization of the GPU. When I build in debug mode this reduces to 5 seconds, and when I run the same code on a Fermi device it takes far less than a…
ikku100
  • 809
  • 1
  • 7
  • 16
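
A common way to separate that one-time cost from real allocations is to force CUDA context creation explicitly at startup (and, on Linux Tesla boards, to keep the driver loaded with persistence mode, `nvidia-smi -pm 1`). A minimal sketch of the warm-up idea; the timing code is only illustrative:

```cuda
#include <cstdio>
#include <chrono>
#include <cuda_runtime.h>

int main()
{
    using clk = std::chrono::steady_clock;

    // Force context creation up front; cudaFree(0) is a common idiom for this.
    auto t0 = clk::now();
    cudaSetDevice(0);
    cudaFree(0);
    auto t1 = clk::now();
    printf("context init: %.1f ms\n",
           std::chrono::duration<double, std::milli>(t1 - t0).count());

    // Subsequent allocations no longer pay the initialization cost.
    float *d;
    auto t2 = clk::now();
    cudaMalloc(&d, 1 << 20);
    auto t3 = clk::now();
    printf("first cudaMalloc after init: %.1f ms\n",
           std::chrono::duration<double, std::milli>(t3 - t2).count());

    cudaFree(d);
    return 0;
}
```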
2
votes
2 answers

Unable to Install Kepler React - TypeScript

I installed Kepler.gl like this: npm i kepler.gl and it got added to my package.json: "kepler.gl": "^2.1.2". However, if I try to import: import keplerGlReducer from "kepler.gl/reducers"; I get an error that Could not find a declaration file for module…
user12541823
2
votes
1 answer

How to create GeoJSON files and visualise them

How can I create GeoJSON files on a Mac? I tried touch new.geojson and then copied the data into the file, but I don't think that gives me the correct file type, since I am not able to load the file in Kepler.gl. I tried to upload this data but it…
user13101751
2
votes
1 answer

Coalesced access vs broadcast access to a global memory location on GPU

I have an application where I need to broadcast a single (non-constant, just plain old data) value in global memory to all threads. The threads only need to read the value, not write to it. I cannot explicitly tell the application to use the…
Michael Carilli
  • 371
  • 2
  • 12
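
For context, on Kepler devices of compute capability 3.5 and higher, a read-only global value can also be fetched through the read-only data cache with `__ldg`, which serves the everyone-reads-one-address case as a broadcast. A minimal sketch (whether this beats a plain load depends on the hardware, so treat it as an illustration only):

```cuda
#include <cuda_runtime.h>

// Every thread reads the same single value from global memory.
__global__ void broadcastScale(const float * __restrict__ scale,
                               float *data, int n)
{
    // __ldg routes the load through the read-only cache (cc 3.5+); a warp
    // in which all threads hit the same address is served as a broadcast.
    float s = __ldg(scale);
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= s;
}

int main()
{
    const int n = 1 << 20;
    float *scale, *data;
    cudaMalloc(&scale, sizeof(float));
    cudaMalloc(&data, n * sizeof(float));
    float s = 2.0f;
    cudaMemcpy(scale, &s, sizeof(float), cudaMemcpyHostToDevice);
    broadcastScale<<<(n + 255) / 256, 256>>>(scale, data, n);
    cudaDeviceSynchronize();
    cudaFree(scale); cudaFree(data);
    return 0;
}
```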
2
votes
1 answer

Do I really need MPS when running multiple MPI ranks on a single GPU, or is Kepler's Hyper-Q by itself enough?

Basically I would like to run multiple MPI ranks on a single GPU (NVIDIA K20c), and I am aware of the existence of MPS and Kepler's Hyper-Q. However, my question is: is Hyper-Q by itself enough for my needs, or do I have to use MPS? According to the above…
rsm
  • 103
  • 1
  • 6
2
votes
1 answer

Why does the GK110 have 192 cores, and 4 warps?

I wanted to get a feel for Kepler's architecture, but it doesn't make sense to me. If a warp is 32 threads, and 4 of them get scheduled/executed, that would mean 128 cores are in use and 64 are left idle. In the whitepaper it said something about…
user3831417
2
votes
1 answer

Concurrent, unique kernels on the same multiprocessor?

Is it possible, using streams, to have multiple unique kernels on the same streaming multiprocessor in Kepler 3.5 GPUs? I.e. run 30 kernels of size <<<1,1024>>> at the same time on a Kepler GPU with 15 SMs?
Jordan
  • 305
  • 3
  • 13
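
For reference, concurrent kernels are expressed through streams; whether two of those single-block kernels actually end up on the same SMX is decided by the hardware scheduler and is not something the API lets you pin down. A minimal sketch launching 30 single-block kernels into separate streams (names and the busy-work loop are illustrative):

```cuda
#include <cuda_runtime.h>

__global__ void busyKernel(float *out, int id)
{
    float v = (float)threadIdx.x;
    for (int k = 0; k < 10000; ++k)   // keep the block busy for a while
        v = v * 1.0001f + (float)id;
    out[id * blockDim.x + threadIdx.x] = v;
}

int main()
{
    const int numKernels = 30;
    const int threads = 1024;

    float *out;
    cudaMalloc(&out, numKernels * threads * sizeof(float));

    cudaStream_t streams[numKernels];
    for (int i = 0; i < numKernels; ++i)
        cudaStreamCreate(&streams[i]);

    // Each launch is one block of 1024 threads in its own stream, so the
    // hardware is free to run them concurrently across (and on) the SMXs.
    for (int i = 0; i < numKernels; ++i)
        busyKernel<<<1, threads, 0, streams[i]>>>(out, i);

    cudaDeviceSynchronize();
    for (int i = 0; i < numKernels; ++i)
        cudaStreamDestroy(streams[i]);
    cudaFree(out);
    return 0;
}
```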
2
votes
1 answer

Calculating the x, y, z position of solar bodies using Ruby

I am using the equations and data for Mars here http://ssd.jpl.nasa.gov/txt/aprx_pos_planets.pdf and the solution for the eccentric anomaly in Kepler's equation given here at the top of page…
humanbeing
  • 1,617
  • 3
  • 17
  • 30
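
For reference, the eccentric-anomaly step that question refers to is Kepler's equation, which is usually solved with a few Newton iterations starting from the mean anomaly (standard background, not taken from the linked document):

$$
M = E - e \sin E, \qquad
E_0 = M, \qquad
E_{k+1} = E_k - \frac{E_k - e \sin E_k - M}{1 - e \cos E_k}
$$

For planetary eccentricities a handful of iterations is normally enough for the correction term to fall below any practical tolerance.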
2
votes
3 answers

Kepler CUDA dynamic parallelism and thread divergence

There is very little information on Kepler's dynamic parallelism. From the description of this new technology, does it mean the issue of thread control-flow divergence within a warp is solved? It allows recursion and launching kernels from the device…
HooYao
  • 554
  • 5
  • 19
1
vote
0 answers

What's wrong with the kepler.gl reducer?

I'm a junior developer. I tried to use kepler.gl, but I get this error: Module not found: Error: Can't resolve 'kepler.gl/reducers'. The file includes the following code: import keplerGlReducer from "kepler.gl/reducers"; import {createStore, applyMiddleware} from…