The Compute Workload Analysis section displays the utilization of the different compute pipelines. I know that in a modern GPU the integer and floating-point pipelines are separate hardware units and can execute in parallel. For the other pipelines, however, it is not clear which hardware unit each one represents. I also couldn't find any documentation online explaining the pipeline abbreviations or how to interpret them.
My questions are:
1) What are the full names of ADU, CBU, TEX, and XU? How do they map to the hardware?
2) Which of the pipelines use the same hardware unit (e.g., FP16, FMA, and FP64 all using the floating-point unit)?
3) A warp scheduler in a modern GPU can schedule 2 instructions per cycle (using different pipelines). Which pipelines can be used at the same time (e.g., FMA-ALU, FMA-SFU, ALU-Tensor, etc.)?
P.S.: I am adding a screenshot for those who are not familiar with Nsight Compute.