-3

I have 40 DICOM images and 40 PNG masks for a fully convolutional network. They are stored in my Google Drive, and the notebook can see them: the print(os.listdir(...)) calls in the first block of code below list the names of all 80 files.

I have also globbed the DICOM and PNG files into img_path and mask_path, each of length 40, in the second block of code below.

I am now trying to resize all of the images to 256 x 256 before feeding them into a U-Net-like architecture for segmentation. However, I cannot load them with nib.load(), because it cannot work out the file type of the DCM and PNG files. For the PNG files the current code does not raise an error, but it still produces an empty list, as the last block of code shows.

I assume that once the first couple of lines inside the for loop in the third block of code are fixed, pre-processing will be complete and I can move on to the U-Net implementation.

I have the current pydicom installed in the Colab notebook and tried it in place of the nib.load() call, which produced the same error.


#import data as data
import pydicom
from PIL import Image
import numpy as np
import glob
import imageio
print(os.listdir("/content/drive/My Drive/Images"))
print(os.listdir("/content/drive/My Drive/Masks"))

pixel_data = []
images = glob.glob("/content/drive/My Drive/Images/IMG*.dcm");
for image in images:
    dataset = pydicom.dcmread(image)
    pixel_data.append(dataset.pixel_array)
#print(len(images))
#print(pixel_data)

pixel_data1 = []  # <-- this section is the trouble area
masks = glob.glob("content/drive/My Drive/Masks/IMG*.png");
for mask in masks:
    dataset1 = imageio.imread(mask)
    pixel_data1.append(dataset1.pixel_array)
print(len(masks))
print(pixel_data1)

['IMG-0004-00040.dcm', 'IMG-0002-00018.dcm', 'IMG-0046-00034.dcm', 'IMG-0043-00014.dcm', 'IMG-0064-00016.dcm',....] ['IMG-0004-00040.png', 'IMG-0002-00018.png', 'IMG-0046-00034.png', 'IMG-0043-00014.png', 'IMG-0064-00016.png',....]

0    <-- outputs of the trouble area

[]

import glob
img_path = glob.glob("/content/drive/My Drive/Images/IMG*.dcm")
mask_path = glob.glob("/content/drive/My Drive/Masks/IMG*.png")
print(len(img_path))
print(len(mask_path))

40

40

images=[]
a=[]
for a in pixel_data:
    a=resize(a,(a.shape[0],256,256))
    a=a[:,:,:]
    for j in range(a.shape[0]):
        images.append((a[j,:,:]))

No output, this section works fine.

images=np.asarray(images)
print(len(images))

10880

masks=[]  # <-- the other trouble area
b=[]
for b in masks:
    b=resize(b,(b.shape[0],256,256))
    b=b[:,:,:]
    for j in range(b.shape[0]):
        masks.append((b[j,:,:]))

No output, trying to solve the problem of how to fix this section.

masks=np.asarray(masks)  # fix the above section and this should have no issues
print(len(masks))

[]

Amit Joshi

2 Answers

1

You are trying to load the DICOM files again using nib.load, which does not work, as you already found out:

for name in img_path:
    a=nib.load(name)  # does not work with DICOM files
    a=a.get_data()
    a=resize(a,(a.shape[0],256,256))

You already have the data from the DICOM files in the pixel_data list, so you should use these:

for a in pixel_data:
    a=resize(a,(a.shape[0],256,256))  # or something similar, depending on the shape of pixel_data
    ...
MrBean Bremen
  • Worked! Got an array of size 10880 on the np.asarray(images) call. How might I go about performing the same procedure on the PNG files, considering I cannot use pydicom to append the pixel data in the for loop contained within the first block of code? Would cv2 work in a similar fashion with the for loop and then I can do another for loop with the resize() call? –  Apr 27 '20 at 19:18
  • Once again, you need a library that can load PNG files. I'm not familiar with nibabel, but it seems that it can only load NIFTI files, so this will not work. You may use Pillow, check out the answers to [this question](https://stackoverflow.com/questions/31386096/importing-png-files-into-numpy). – MrBean Bremen Apr 27 '20 at 19:45
  • Tried using a for loop, in similar vein of the one in the first block of code, using imageio.imread() to create a dataset from the glob.glob of all the PNG files and then append said dataset to an empty pixel data array created before the for loop. End up with zero elements for the glob call and an empty array for the pixel_data1. Looks like the problem is with the glob call but it will not recognize the directory with the imageio.imread() and cv2.imread() calls. –  Apr 27 '20 at 21:37
  • In your code above you showed that the glob call yielded 40 paths, so I don't understand what you are saying here. You did `for path in mask_paths`, right? Can you show your code? – MrBean Bremen Apr 28 '20 at 04:35
  • So, again, you iterate over the filenames instead of over the pixel data. You have the pixel data already collected in `pixel_data1`, so just do the same as in the first case (i.e. iterate over the _pixel data_, not over file names again). Ok, sleep time... – MrBean Bremen Apr 28 '20 at 19:52
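
The Pillow route suggested in the comments above could look roughly like the following. This is only a sketch under assumptions: Pillow and NumPy are installed, the masks are single-channel 2D images, and the helper name `load_masks`, the temporary directory and the synthetic masks are hypothetical stand-ins for the Drive folder. Note also that the mask pattern in the question's first block starts with `content/...` without a leading slash, unlike the pattern that found 40 files, so glob resolves it relative to the working directory; the helper raises instead of silently returning an empty result.

```python
import glob
import os
import tempfile

import numpy as np
from PIL import Image


def load_masks(pattern, size=(256, 256)):
    """Load every PNG matching `pattern`, resize it, and stack the results."""
    paths = sorted(glob.glob(pattern))
    if not paths:
        # An empty match usually means a wrong pattern (e.g. a missing leading "/").
        raise FileNotFoundError(f"no files match {pattern!r}")
    resized = []
    for path in paths:
        with Image.open(path) as img:
            # Assumption: masks are single-channel. Nearest-neighbour
            # resampling avoids blending label values at the edges.
            small = img.convert("L").resize(size, Image.NEAREST)
            resized.append(np.asarray(small))
    return np.stack(resized)


# Demo on three synthetic masks written to a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(3):
        arr = np.zeros((64, 48), dtype=np.uint8)
        arr[10:30, 5:25] = 255
        Image.fromarray(arr).save(os.path.join(tmp, f"IMG-{i:04d}.png"))
    mask_array = load_masks(os.path.join(tmp, "IMG*.png"))

print(mask_array.shape)
```

In the notebook, the pattern passed to `load_masks` would be the one that already returned 40 paths for `mask_path`.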
0

Your last loop, `for mask in masks:`, is never executed because two lines above it you set `masks = []`.

It looks like it should be `for mask in mask_path:`, since `mask_path` is the list of mask file names.
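
The effect can be reproduced without any image files. A minimal sketch with hypothetical names (`mask_path` stands in for the globbed file list, `resized_masks` for the output list): once `masks` is rebound to an empty list, looping over it does nothing, whereas looping over the source list and collecting into a differently named list visits every entry.

```python
# Stand-in for the 40 globbed PNG paths (hypothetical file names).
mask_path = [f"IMG-{i:04d}.png" for i in range(40)]

# Pattern from the question: the name is rebound to [] and then iterated.
masks = list(mask_path)
masks = []
visited_buggy = 0
for b in masks:  # iterates over the empty list, so the body never runs
    visited_buggy += 1

# Fix: iterate over the source list and collect into a separate list.
resized_masks = []
for path in mask_path:
    resized_masks.append(path)  # placeholder for the real load + resize step

print(visited_buggy, len(resized_masks))
```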

Dave Chen