
I want to train a CNN on mammography (greyscale) DICOM images that have a 14-bit value range per pixel. To train the CNN I want to reduce the images to 8-bit. I tried to do that by simply computing:

scaled_img = ((img_arr / 2**14) * 255).astype(np.uint8)  # cast so the result is actually 8-bit

However, the pixel distribution of an example input image is far from uniform, as visible in the image below:

[Image: histogram of pixel values for an example input image]

So with the linear transformation above I lose a lot of information: the breast in the output image ends up almost completely white, with little or no gradation visible.

I'm looking for a smart way to reduce the bit-depth but keep most of the information.

What I tried so far: I clip all pixels greater than the 95th percentile to the 95th percentile, and then divide by the 95th percentile instead of 2**14.
This cuts off rare grayscale values but increases the contrast of the 8-bit image, because a smaller value range is mapped onto 256 levels. A plain reduction maps 16384 values (14 bit) to 256 (8 bit), i.e. 64 input grayscale values collapse into 1 output value; with the 95th percentile (~6000) only ~23 input values collapse into 1 output value.
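The clipping approach described above can be sketched as follows (a minimal sketch assuming the DICOM pixel data is already loaded into a NumPy array; the function name `to_uint8_percentile` is just illustrative):

```python
import numpy as np

def to_uint8_percentile(img_arr, percentile=95):
    """Clip values above the given percentile, then rescale to 8-bit."""
    cutoff = np.percentile(img_arr, percentile)
    clipped = np.minimum(img_arr, cutoff)   # cap rare bright values at the cutoff
    scaled = (clipped / cutoff) * 255       # map [0, cutoff] onto [0, 255]
    return scaled.astype(np.uint8)

# Example with synthetic 14-bit data:
img = np.random.randint(0, 2**14, size=(64, 64))
out = to_uint8_percentile(img)
```

Since the percentile is computed per image, the same function can be applied to a whole batch without hand-tuning a cutoff for each image.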

My approach is tedious and also loses information. I'm looking for an algorithm that can easily be applied on many images. I did find this, but it only works with 12-bit images.

  • What speaks against handling the full grayscale range of 14 bit in your CNN? – Markus Sabin Sep 19 '22 at 08:23
  • I have to use pretrained CNNs as I don't have enough training images, and afaik there are no pretrained CNNs with 14-bit support. But if we scale the pixel values between 0 and 1, which is done anyway, it would not matter, right? But it would affect the computational cost, right? – Kaschi14 Sep 19 '22 at 08:38
  • Scaling the full range between 0 and 1 would at least preserve the full grayscale range without any loss. Of course, there may be a performance impact in using floating point numbers rather than integers. Nevertheless, it makes me (being a developer and a potential patient at the same time) a bit nervous to train a CNN with data that is different from what has been acquired while I was exposed to x-rays – Markus Sabin Sep 19 '22 at 09:32
  • 2
    DNN inputs are always floats (in theory), and they're always scaled to approach a normal distribution of (0,1), to keep within the "interesting" range of activation functions. no matter the bit depth, if you view the entire range, you'll not see small graduations _yourself_. you as a human use *windowing* to reveal specific information. -- investigate "hdr tone mapping", that's a locally adaptive, nonlinear operation, and it's probably easy to explain to humans because they know what it looks like. that WILL destroy absolute values though. – Christoph Rackwitz Sep 19 '22 at 11:29
  • if you need to maintain an absolute scale AND convey the full resolution of your data, you could look into encodings ("embeddings"?) that give the network all those 14 bits somewhat literally. I think a sin/cos encoding at various scales is popular. that works similar to binary numbers, but less "hard". think of it like "grey code", but smooth instead of on/off. similar idea to https://en.wikipedia.org/wiki/Neural_coding#Position_coding but more efficient – Christoph Rackwitz Sep 19 '22 at 11:30

0 Answers