I'm writing some platform-specific optimizations. I know I could parse the vendor string in the host code and pass the result to the kernel using the -D option, but it would be more convenient to detect the vendor in the kernel directly, without host involvement (that way, kernels can be optimized even without access to the host source code).
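For reference, the host-side route would look roughly like the sketch below. The helper name `vendor_to_define` and the `-DINTEL` macro are hypothetical, and the substrings it matches are assumptions about what the drivers typically report for `CL_DEVICE_VENDOR`, so they would need checking against real devices.

```c
#include <string.h>

/* Hypothetical helper: maps a vendor string, as returned by
 * clGetDeviceInfo(..., CL_DEVICE_VENDOR, ...), to a build option.
 * The matched substrings are assumptions about typical driver output. */
static const char *vendor_to_define(const char *vendor)
{
    if (strstr(vendor, "NVIDIA"))
        return "-DNVIDIA";
    if (strstr(vendor, "Advanced Micro Devices") || strstr(vendor, "AMD"))
        return "-DAMD";
    if (strstr(vendor, "Intel"))
        return "-DINTEL";
    return ""; /* unknown vendor: add no define */
}
```

The returned string would then be passed as the `options` argument of `clBuildProgram`, which is exactly the host involvement I would like to avoid.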
So far, I have come up with the following:
#ifdef __NV_CL_C_VERSION
/**
* @def NVIDIA
* @brief defined when compiling on NVIDIA GPUs
*/
#define NVIDIA
#endif // __NV_CL_C_VERSION
#if defined(__WinterPark__) || defined(__BeaverCreek__) || defined(__Turks__) || \
defined(__Caicos__) || defined(__Tahiti__) || defined(__Pitcairn__) || \
defined(__Capeverde__) || defined(__Cayman__) || defined(__Barts__) || \
defined(__Cypress__) || defined(__Juniper__) || defined(__Redwood__) || \
defined(__Cedar__) || defined(__ATI_RV770__) || defined(__ATI_RV730__) || \
defined(__ATI_RV710__) || defined(__Loveland__) || defined(__GPU__) || \
defined(__Hawaii__)
/**
* @def AMD
* @brief defined when compiling on AMD GPUs
* @note This list was originally found at https://github.com/magnumripper/JohnTheRipper/wiki/Predefined-macros-in-OpenCL-(standard-and-proprietary) and copied shamelessly. It is most definitely incomplete and contains the troubling __GPU__.
* @note AMD also defines __CPU__ when compiling for CL_DEVICE_TYPE_CPU.
*/
#define AMD
#endif // ...
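A sketch of how these macros might then be consumed further down in the kernel source. `TILE_SIZE` and all three values are hypothetical placeholders for illustration, not measured optima.

```c
/* Hypothetical tuning knob selected by the vendor macros defined above.
 * The numbers are placeholders, not benchmarked values. */
#if defined(NVIDIA)
#define TILE_SIZE 32
#elif defined(AMD)
#define TILE_SIZE 64
#else
#define TILE_SIZE 16 /* conservative fallback for unrecognized vendors */
#endif
```

The fallback branch matters because the detection lists are incomplete, so an unrecognized device should still compile with a safe default.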
Any additions or corrections? Does anyone know what Intel defines?