
I'm trying to convert the images stored in .mat files to .jpg files. The files were downloaded from this site: BrainTumorDataset. Every file in the directory is a .mat file, and I want to convert all of them to .jpg format with Python for a project (Brain Tumor Classification using a Deep Neural Net) based on a CNN. I searched Google but only found topics on how to load a .mat file in Python, which didn't help. I also found an answer on Stack Overflow, but it didn't work with this dataset, and it only covers loading a .mat image in Python, whereas I want to convert the .mat images to .jpg format.

coding_ninza

3 Answers


I managed to convert one image; use a loop to convert all of them.

Please read the comments.

import matplotlib.pyplot as plt
import numpy as np
import h5py
from PIL import Image

#reading v 7.3 mat file in python
#https://stackoverflow.com/questions/17316880/reading-v-7-3-mat-file-in-python

filepath = '1.mat'
f = h5py.File(filepath, 'r') #Open mat file for reading

#In MATLAB the data is arranged as follows:
#cjdata is a MATLAB struct
#cjdata.image is a matrix of type int16

#Before update: read only image data.   
####################################################################
#Read cjdata struct, get image member and convert numpy ndarray of type float
#image = np.array(f['cjdata'].get('image')).astype(np.float64) #In MATLAB: image = cjdata.image
#f.close()
####################################################################

#Update: Read all elements of cjdata struct
####################################################################
#Read cjdata struct
cjdata = f['cjdata'] #<HDF5 group "/cjdata" (5 members)>

# In MATLAB cjdata = 
# struct with fields:
#   label: 1
#   PID: '100360'
#   image: [512×512 int16]
#   tumorBorder: [38×1 double]
#   tumorMask: [512×512 logical]

#get image member and convert numpy ndarray of type float
image = np.array(cjdata.get('image')).astype(np.float64) #In MATLAB: image = cjdata.image

label = cjdata.get('label')[0,0] #Use [0,0] indexing in order to convert label to scalar

PID = cjdata.get('PID') # <HDF5 dataset "PID": shape (6, 1), type "<u2">
PID = ''.join(chr(int(c)) for c in np.array(PID).flatten()) #Flatten the (6, 1) array of character codes, then convert to string https://stackoverflow.com/questions/12036304/loading-hdf5-matlab-strings-into-python

tumorBorder = np.array(cjdata.get('tumorBorder'))[0] #Use [0] indexing - convert from 2D array to 1D array.

tumorMask = np.array(cjdata.get('tumorMask'))

f.close()
####################################################################

#Convert image to uint8 (before saving as jpeg - jpeg doesn't support int16 format).
#Use simple linear conversion: subtract minimum, and divide by range.
#Note: the conversion is not optimal - you should find a better way.
#Multiply by 255 to set values in uint8 range [0, 255], and convert to type uint8.
hi = np.max(image)
lo = np.min(image)
image = (((image - lo)/(hi-lo))*255).astype(np.uint8)

#Save as jpeg
#https://stackoverflow.com/questions/902761/saving-a-numpy-array-as-an-image
im = Image.fromarray(image)
im.save("1.jpg")

#Display image for testing
imgplot = plt.imshow(image)
plt.show()
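The code comments above say the min/max conversion is not optimal. One possible alternative (my suggestion, not part of the original answer) is to clip to percentiles before scaling, so that a few very bright or very dark pixels do not compress the rest of the range. The helper name `to_uint8` and the 1/99 percentile defaults are assumptions chosen for illustration; the sketch also guards against a constant image, where `hi - lo` would be zero.

```python
import numpy as np


def to_uint8(image, low_pct=1.0, high_pct=99.0):
    """Scale an array to uint8 after clipping to the given percentiles.

    Hypothetical helper: the name and the percentile defaults are
    illustrative assumptions, not part of the original answer.
    """
    image = np.asarray(image, dtype=np.float64)
    lo, hi = np.percentile(image, [low_pct, high_pct])
    if hi <= lo:
        # Constant (or nearly constant) image: avoid dividing by zero.
        return np.zeros(image.shape, dtype=np.uint8)
    # Values below lo map to 0, values above hi map to 255.
    scaled = np.clip((image - lo) / (hi - lo), 0.0, 1.0)
    return (scaled * 255).round().astype(np.uint8)


# Synthetic stand-in for cjdata.image, used only to exercise the helper.
ramp = np.arange(1000, dtype=np.int16).reshape(10, 100)
ramp_u8 = to_uint8(ramp)
flat_u8 = to_uint8(np.full((4, 4), 7, dtype=np.int16))
```

The result can be passed to `Image.fromarray` in place of the min/max-scaled array. Whether clipping is appropriate depends on the images, so treat the percentiles as tunable.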

Note:
Each mat file contains a struct named cjdata.
Fields of cjdata struct:

cjdata = 

struct with fields:

      label: 1
        PID: '100360'
      image: [512×512 int16]
tumorBorder: [38×1 double]
  tumorMask: [512×512 logical]

When converting images to JPEG, you are losing information...
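To illustrate the loss, here is a minimal sketch on synthetic data (the ramp is an assumption standing in for `cjdata.image`, not real dataset content). It counts how many distinct intensities survive the 8-bit scaling, and shows one lossless option: keeping the original dtype and writing a NumPy `.npy` file. The in-memory buffer is only there to keep the example self-contained; a file name such as "1.npy" should work the same way.

```python
import io

import numpy as np

# Synthetic stand-in for cjdata.image: 4001 distinct 16-bit intensities.
image = np.arange(0, 4001, dtype=np.int16).reshape(1, -1)

# Lossy path, as in the answer: linear scaling to 8 bits.
lo, hi = image.min(), image.max()
as_uint8 = ((image.astype(np.float64) - lo) / (hi - lo) * 255).astype(np.uint8)

# Lossless path: store the array unchanged in .npy format.
buffer = io.BytesIO()  # in-memory file; a path on disk can be used instead
np.save(buffer, image)
buffer.seek(0)
restored = np.load(buffer)

distinct_before = len(np.unique(image))
distinct_after = len(np.unique(as_uint8))
is_lossless = np.array_equal(image, restored)
print(distinct_before, distinct_after, is_lossless)
```

An 8-bit image can hold at most 256 levels, so many neighbouring intensities are merged before JPEG compression is even applied. A `.npy` file is not viewable as a picture, but it can be loaded directly for training.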

Rotem
  • it's working but is there any way to retain the information of the image and another problem is plt.show() is showing nothing except "
    " no image is showing otherwise you code works fine to convert the data and save as jpg
    – coding_ninza Dec 06 '19 at 14:59
  • and please also add a code to access all the elements of 'cjdata' – coding_ninza Dec 06 '19 at 15:06
  • add the code as second part to access all the element of 'cjdata' because when I do "[key for key in f.keys()]" it only shows ['cjdata'] but I can't access the elements – coding_ninza Dec 06 '19 at 15:07
  • I updated my code to read all elements of `cjdata`. `f['cjdata']` is an HDF5 group, and you can access the group members using `get`. The solution got a little messy, because in MATLAB scalar is stored as 1x1 matrix, and a string is stored as character array... As to figure size, it's just due to plotting parameters. – Rotem Dec 06 '19 at 22:32
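As a small illustration of the last comment: a MATLAB scalar comes back as a 1x1 array and a MATLAB string as a column of character codes. The sketch below uses hand-built NumPy arrays that mimic what `cjdata.get('label')` and `cjdata.get('PID')` return for the file shown above (shape (6, 1), type uint16 for PID); the arrays and the helper name `decode_matlab_string` are assumptions for illustration, not data read from the dataset.

```python
import numpy as np


def decode_matlab_string(codes) -> str:
    """Join an array of character codes into a Python string.

    Hypothetical helper; flattening first means a (6, 1) array is
    handled one code at a time.
    """
    return ''.join(chr(int(c)) for c in np.asarray(codes).flatten())


# Mimics cjdata.get('PID'): one character code per row.
pid_codes = np.array([[49], [48], [48], [51], [54], [48]], dtype=np.uint16)
# Mimics cjdata.get('label'): a scalar stored as a 1x1 matrix.
label_array = np.array([[1.0]])

pid = decode_matlab_string(pid_codes)
label = int(label_array[0, 0])
print(pid, label)
```

With a real file, `list(f['cjdata'].keys())` should list the member names of the group, which is the step that `f.keys()` alone does not show.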

Here is how you can use a loop to convert all images.

import os
from glob import glob
from os import path

import h5py
import numpy as np
from PIL import Image

# Folder containing the .mat files (here: the folder this script lives in).
# Note: __file__ is not defined in a notebook; set dir_path by hand there.
dir_path = path.dirname(path.abspath(__file__))
# Converted images are written here; the folder is created if it is missing.
out_dir = path.join(dir_path, "png_images")
os.makedirs(out_dir, exist_ok=True)

found_files = sorted(glob(path.join(dir_path, "*.mat")))
total_files = 0


def convert_to_png(file: str, number: int) -> bool:
    """Convert one .mat file to PNG. Return True if a new file was written."""
    png = path.join(out_dir, path.splitext(path.basename(file))[0] + ".png")
    # Check the output file, not the input: the .mat file always exists.
    if path.exists(png):
        print(png, "already exists\nSkipping...")
        return False
    with h5py.File(file, 'r') as h5_file:
        cjdata = h5_file['cjdata']
        image = np.array(cjdata.get('image')).astype(np.float64)
        # The remaining fields are read for completeness; they are not saved.
        label = cjdata.get('label')[0, 0]
        PID = ''.join(chr(int(c)) for c in np.array(cjdata.get('PID')).flatten())
        tumorBorder = np.array(cjdata.get('tumorBorder'))[0]
        tumorMask = np.array(cjdata.get('tumorMask'))
    # Linear scaling to the uint8 range [0, 255].
    hi = np.max(image)
    lo = np.min(image)
    image = (((image - lo) / (hi - lo)) * 255).astype(np.uint8)
    Image.fromarray(image).save(png)
    print("saving", png, "File No: ", number)
    return True


for file in found_files:
    if "cvind.mat" in file:
        continue
    if convert_to_png(file, total_files + 1):
        total_files += 1
print("Finished converting all files: ", total_files)
Simas Joneliunas
  • I try ur code in my colab notebook. It gives me the msg - "Finished converting all files: 0". I don't know why it is not working. Here is my colab notebook link: 'https://colab.research.google.com/drive/1l2kvysdmc_jdvxfX4Yi7tk-8HpNuAUoW?usp=sharing' – Samrat Alam Jan 17 '21 at 06:04

Here is MATLAB code that can convert all images in a folder to a different format:

% Define the source and destination folders
src_folder = 'src';
dst_folder = 'dst';

% Get a list of all image files in the source folder
files = dir(fullfile(src_folder, '*.mat'));

% Loop through each file
for i = 1:length(files)
    % Load the .mat file
    load(fullfile(src_folder, files(i).name));

    % Convert the data to uint8
    example_matrix = im2uint8(example_matrix);

    % Construct the destination file name
    [~, name, ~] = fileparts(files(i).name);
    dst_file = fullfile(dst_folder, [name '.png']);

    % Try to save the image
    try
        imwrite(example_matrix, dst_file);
        disp(['Image ' name ' saved successfully']);
    catch
        disp(['Error saving image ' name]);
    end
end

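In some cases this produces an error about example_matrix. load only creates the variables that are actually stored in the .mat file, so example_matrix is undefined unless the file contains a variable with that name (the files in this dataset contain the struct cjdata instead).

The im2uint8 conversion itself can be checked on a small test matrix:

 % Build a 5x5 uint16 test matrix
 I = reshape(uint16(linspace(0,65535,25)),[5 5]);

 % Convert the data to uint8
 example_matrix = im2uint8(I);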

This code defines the source folder (src_folder) and the destination folder (dst_folder). Then, it uses the dir function to get a list of all .mat files in the source folder.

The code loops through each file, loads the .mat file, converts the data to uint8, and constructs the destination file name. Finally, it tries to save the image using the imwrite function. If the image is saved successfully, it displays a message indicating the image was saved successfully. If an error occurs, it displays an error message.

Note that you should replace "src" and "dst" with the actual names of your source and destination folders, respectively.

M Azam Khan