I have an array of float (fp32) values and want to convert it to arrays of fp16 and bfloat16, and vice versa (converting the 16-bit floats back to fp32). Are there any vectorized (SIMD) instructions that allow me to do that?
Also, if I'd like to perform math operations directly on 16-bit floats, does the hardware support that? For example, computing the dot product (or the sum or difference) of two arrays of fp16 or bfloat16 values.
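
To make the conversions concrete, here is a minimal scalar sketch of the semantics I have in mind, written in Python with only the standard library. The function names are hypothetical, and the rounding choice for bfloat16 (round-to-nearest-even) is my assumption; a given hardware instruction may round differently or simply truncate. It relies on bfloat16 being the upper 16 bits of an IEEE fp32 value and on `struct`'s `e` format for IEEE half precision.

```python
import struct

def fp32_to_bf16_bits(x: float) -> int:
    """Return the 16-bit bfloat16 pattern for x (assumed round-to-nearest-even)."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # NaN: keep the sign/exponent and force a quiet-NaN mantissa bit.
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return ((bits >> 16) | 0x0040) & 0xFFFF
    # Add 0x7FFF plus the lowest kept bit so that ties round to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    return ((bits + rounding_bias) >> 16) & 0xFFFF

def bf16_bits_to_fp32(h: int) -> float:
    """Widen a bfloat16 pattern to fp32 by appending 16 zero bits (exact)."""
    return struct.unpack("<f", struct.pack("<I", (h & 0xFFFF) << 16))[0]

def fp32_to_fp16_bits(x: float) -> int:
    """Return the IEEE half-precision pattern for x.

    struct raises OverflowError if x is too large for fp16.
    """
    return struct.unpack("<H", struct.pack("<e", x))[0]

def fp16_bits_to_fp32(h: int) -> float:
    """Widen an IEEE half-precision pattern to a float (exact)."""
    return struct.unpack("<e", struct.pack("<H", h & 0xFFFF))[0]

if __name__ == "__main__":
    values = [1.0, -2.0, 0.5, 3.140625]
    bf16 = [fp32_to_bf16_bits(v) for v in values]
    fp16 = [fp32_to_fp16_bits(v) for v in values]
    print([hex(b) for b in bf16])
    print([hex(h) for h in fp16])
    print([bf16_bits_to_fp32(b) for b in bf16])
    print([fp16_bits_to_fp32(h) for h in fp16])
```

What I am looking for is a way to do the equivalent of these per-element loops over whole arrays with vector instructions.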