
Since CUDA 7.5/8.0 and devices with Pascal GPUs, CUDA supports the half precision (FP16) datatype out of the box. Additionally, many of the BLAS calls inside CUBLAS support half precision types, e.g. the GEMM operation available as cublasHgemm. My problem is that the host does not support half precision types. Is there an already implemented solution like cublasSetMatrix which does the conversion during the upload to the device? Or is it necessary to create a tricky implementation by composing a float upload with a CUDA kernel doing the truncation to half?

M.K. aka Grisu

1 Answer

There is no function currently provided by the CUDA toolkit which converts float quantities to half quantities in the process of copying data from host to device.

It is possible to convert from float to half either in host code or device code. There would be advantages and disadvantages to doing it in either place.
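As a minimal sketch of the device-side approach (not part of the original answer; the helper names here are hypothetical), you could copy the float data up unchanged and run a small kernel that narrows it to FP16 with __float2half from cuda_fp16.h:

```cuda
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Convert n floats to half on the device.
__global__ void float2half_kernel(const float *in, __half *out, size_t n)
{
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = __float2half(in[i]);   // narrow FP32 -> FP16
}

// Hypothetical helper: upload n floats and convert them to half on the device.
// Error checking omitted for brevity.
void uploadAsHalf(const float *h_src, __half *d_dst, size_t n)
{
    float *d_tmp;
    cudaMalloc(&d_tmp, n * sizeof(float));
    cudaMemcpy(d_tmp, h_src, n * sizeof(float), cudaMemcpyHostToDevice);

    int block = 256;
    int grid  = (int)((n + block - 1) / block);
    float2half_kernel<<<grid, block>>>(d_tmp, d_dst, n);

    cudaFree(d_tmp);
}
```

Doing the conversion on the device keeps the host code free of half types, at the cost of transferring twice the data over PCIe; converting on the host halves the transfer size but burns CPU time.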

Furthermore, there is a cublas<t>gemmEx function available that may be of interest, which can have differing datatypes for input and output (and computation).
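For example, cublasSgemmEx (one of the gemmEx variants; this sketch is an assumption about how you might use it here, not taken from the original answer) computes in FP32 while the A, B and C matrices can be stored in FP16:

```cuda
#include <cublas_v2.h>
#include <cuda_fp16.h>

// GEMM with FP16 storage and FP32 accumulation, column-major, no transposes.
// Error checking omitted for brevity.
void gemm_fp16_storage(cublasHandle_t handle,
                       int m, int n, int k,
                       const __half *dA, const __half *dB, __half *dC)
{
    const float alpha = 1.0f, beta = 0.0f;
    cublasSgemmEx(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                  m, n, k,
                  &alpha,
                  dA, CUDA_R_16F, m,
                  dB, CUDA_R_16F, k,
                  &beta,
                  dC, CUDA_R_16F, m);
}
```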

Some other half precision resources that may be of interest:

Robert Crovella
  • Christian Rau has developed a very high quality host IEEE 754 half precision library that might be useful here – talonmies Mar 30 '17 at 16:17