I have written Python code to convert DICOM (.dcm) data into a CSV file. However, if I run the code more than once on my database directory, files are lost/deleted. I looked in the Recycle Bin but could not find the deleted data, and I don't understand what went wrong.

Is there anything wrong with my code? Any suggestions are highly appreciated.

Here is my code:

import xlsxwriter
import os.path
import sys
import dicom
import xlrd
import csv


root = input("Enter Directory Name: ")
#path = os.path.join(root, "targetdirectory")
i=1

for path, subdirs, files in os.walk(root):
    for name in files:

        os.rename(os.path.join(path, name), os.path.join(path,'MR000'+ str(i)+'.dcm'))
        i=i+1


dcm_files = []
for path, dirs, files in os.walk(root):
    for names in files:
        if names.endswith(".dcm"):
            dcm_files.append(os.path.join(path, names))

print (dcm_files)

with open('junk/test_0.csv', 'w', newline='') as csvfile:
        spamwriter = csv.writer(csvfile, delimiter=',',
                                quotechar='|',
                                quoting=csv.QUOTE_MINIMAL)

        spamwriter.writerow(["Folder Name","File Name", "PatientName",
                             "PatientID", "PatientBirthDate","SliceThickness","Rows"])

        for dcm_file in dcm_files:
            ds = dicom.read_file(dcm_file)
            fileName = dcm_file.split("/")
            spamwriter.writerow([fileName[1],fileName[2], 
                                 ds.get("PatientName", "None"), 
                                 ds.get("PatientID", "None"), 
                                 ds.get("PatientBirthDate", "None"), 
                                 ds.get("SliceThickness", "None"),
                                 ds.get("Rows", "None")])
Asif
  • What happens if you run the code once? Is there anything left? Note that you are renaming your files, so they 'disappear'... – joaquin Jan 20 '18 at 18:40
  • If I run the code once, it just renames the files from their original names and nothing is lost. If I run it a second time, I start losing files. – Asif Jan 20 '18 at 18:49
  • What do you mean by 'I start losing files'? How many do you lose, how many are left, and how are their names changed? – joaquin Jan 20 '18 at 20:33
  • If the initial database is about 60 MB, it shrinks to around 40 MB after the second run, but I don't see a pattern in which files are lost. File names are not changed, but files are lost; for example, if I have MR0001-10, then files such as 2, 5 and 6 are probably lost. – Asif Jan 20 '18 at 20:52

1 Answer

You have something like the following scenario:

After the 1st run, you end up with the files MR0001.dcm, MR0002.dcm, MR0003.dcm, ... On the 2nd run, the following renames happen:

os.rename('some_file',  'MR0001.dcm')
os.rename('MR0001.dcm', 'MR0002.dcm')
os.rename('MR0002.dcm', 'MR0003.dcm')
os.rename('MR0003.dcm', 'MR0004.dcm')
...

So in the end only one file, 'MR0004.dcm', is left: each rename overwrites the file that is renamed next.
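The chain above can be reproduced in a throwaway directory. This is only a sketch: `rename_in_order` and `demo_overwrite_chain` are hypothetical helpers, the file order is fixed by hand (a real `os.walk` gives no such guarantee), and `os.replace` is used so that the overwrite happens on every platform.

```python
import os
import tempfile


def rename_in_order(directory, ordered_names):
    """Mimic the question's rename loop, but with an explicit file order."""
    for i, name in enumerate(ordered_names, start=1):
        src = os.path.join(directory, name)
        dst = os.path.join(directory, 'MR000' + str(i) + '.dcm')
        # os.replace overwrites an existing destination file.
        os.replace(src, dst)


def demo_overwrite_chain():
    """Build the scenario from the answer and return what survives."""
    with tempfile.TemporaryDirectory() as tmp:
        # Assumed order: a new file first, then the results of the 1st run.
        names = ['some_file', 'MR0001.dcm', 'MR0002.dcm', 'MR0003.dcm']
        for name in names:
            with open(os.path.join(tmp, name), 'w') as f:
                f.write('contents of ' + name)
        rename_in_order(tmp, names)
        survivors = sorted(os.listdir(tmp))
        with open(os.path.join(tmp, survivors[0])) as f:
            return survivors, f.read()


survivors, content = demo_overwrite_chain()
print(survivors, content)
```

Four files go in and one comes out, holding the data of the file that was renamed first.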

Add the following line just below the rename call:

print( os.path.join(path, name), '-->', os.path.join(path,'MR000'+ str(i)+'.dcm'))

Then you will see exactly which files are renamed.
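One possible way to make the renaming step safe to run repeatedly is to skip files that already follow the naming scheme and to never pick a target name that is already taken. The sketch below is not the asker's code: `rename_safely` and `PATTERN` are hypothetical names, and it assumes, like the original loop, that every file under the root should be renamed.

```python
import os
import re
import tempfile

# Names produced by an earlier run, e.g. MR0001.dcm (assumed naming scheme).
PATTERN = re.compile(r'^MR(\d+)\.dcm$')


def rename_safely(root):
    """Rename files to MR000<i>.dcm without ever overwriting a file."""
    renamed = []
    i = 1
    for path, subdirs, files in os.walk(root):
        existing = set(files)
        for name in sorted(files):
            if PATTERN.match(name):
                continue  # already renamed by an earlier run
            # Advance to the next number whose name is unused in this folder.
            while 'MR000' + str(i) + '.dcm' in existing:
                i += 1
            target = 'MR000' + str(i) + '.dcm'
            os.rename(os.path.join(path, name), os.path.join(path, target))
            existing.discard(name)
            existing.add(target)
            renamed.append((name, target))
            i += 1
    return renamed


# Small demonstration in a temporary directory: run the renaming twice.
with tempfile.TemporaryDirectory() as tmp:
    for name in ('a.dcm', 'b.dcm'):
        with open(os.path.join(tmp, name), 'w') as f:
            f.write(name)
    first_run = rename_safely(tmp)
    second_run = rename_safely(tmp)
    remaining = sorted(os.listdir(tmp))

print(first_run, second_run, remaining)
```

The second run has nothing left to rename, so no file can be overwritten.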

Bartłomiej
  • Did you reproduce this scenario in practice? I ran the code and the files do not disappear. If a new file (some_file) is created between runs, you get a FileExistsError because you cannot create MR0001.dcm if it already exists. – joaquin Jan 20 '18 at 20:45
  • In that case I used os.replace instead of os.rename. On Linux the code runs with os.rename, but it raises an error in PyCharm on Windows. I used os.replace when running on Windows, and files disappear if you run it more than 3 or 4 times. – Asif Jan 20 '18 at 20:56
  • @joaquin, I have just repeated it ([pastebin](https://pastebin.com/c4JGUy82)), and I realized that the order of the files in the directory doesn't have to be alphabetical. Still, the files are overwritten without any trouble. – Bartłomiej Jan 20 '18 at 21:09
  • @Bartłomiej Is that on Linux? I see that `os.replace` on Windows removes everything but one file on the second run; `os.rename` does not. – joaquin Jan 20 '18 at 21:14
  • That's right: Linux, Python 3.5.2. By the way, I wasn't aware of the differences between Windows and Linux before. See [this answer](https://stackoverflow.com/a/35202735/7708542). – Bartłomiej Jan 20 '18 at 21:28
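The platform difference discussed in these comments can be checked with a small experiment. This is a sketch with a hypothetical helper, `try_move`; it relies on the documented behaviour that `os.rename` raises `FileExistsError` on Windows when the destination exists, while on Unix it silently replaces an existing destination file, and that `os.replace` overwrites on both.

```python
import os
import tempfile


def try_move(move, tmp):
    """Create src and dst files, apply move(src, dst), report what happened."""
    src = os.path.join(tmp, 'src.dcm')
    dst = os.path.join(tmp, 'dst.dcm')
    for file_path, text in ((src, 'source'), (dst, 'destination')):
        with open(file_path, 'w') as f:
            f.write(text)
    try:
        move(src, dst)
    except FileExistsError:
        return 'FileExistsError'  # os.rename on Windows refuses to overwrite
    with open(dst) as f:
        return 'overwritten' if f.read() == 'source' else 'unchanged'


with tempfile.TemporaryDirectory() as tmp:
    rename_result = try_move(os.rename, tmp)
with tempfile.TemporaryDirectory() as tmp:
    replace_result = try_move(os.replace, tmp)

# rename_result depends on the platform; replace_result should not.
print(rename_result, replace_result)
```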