Mixed-Precision Programming with CUDA 8 | NVIDIA Technical Blog
Update, March 25, 2019: The latest Volta and Turing GPUs now incorporate Tensor Cores, which accelerate certain types of FP16 matrix math. This enables faster and easier mixed-precision computation…