
I am trying to read a DICOM file with Python 3 and the pydicom library. For some DICOM files I can't read the data correctly, and I get an error when I print the result of pydicom.dcmread.

However, the same code works with Python 2. I compared the meta information with that of other DICOM files that can be processed and didn't find any difference between them.

import pydicom

ds = pydicom.dcmread("xxxxx.dicom")
print(ds)

Traceback (most recent call last):
  File "generate_train_data.py", line 387, in <module>
    tf.app.run()
  File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 125, in run
    _sys.exit(main(argv))
  File "generate_train_data.py", line 371, in main
    create_ann()
  File "generate_train_data.py", line 368, in create_ann
    ds_ann_dir, case_name, merge_channel=False)
  File "generate_train_data.py", line 290, in process_dcm_set
    all_dcms, dcm_truth_infos = convert_dicoms(dcm_list, zs)
  File "generate_train_data.py", line 179, in convert_dicoms
    instance_num, pixel_spacing, img_np = extract_info(dcm_path)
  File "generate_train_data.py", line 147, in extract_info
    print(ds)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 2277-2279: ordinal not in range(128)

Has anyone come across the same problem?

Tom
  • 1
    Can you share dicom file you have troubles with? – Alderven Feb 01 '19 at 09:51
  • 1
    https://stackoverflow.com/questions/9942594/unicodeencodeerror-ascii-codec-cant-encode-character-u-xa0-in-position-20 might be of help – g_uint Feb 21 '19 at 07:06
  • 1
    Possible duplicate of [UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 20: ordinal not in range(128)](https://stackoverflow.com/questions/9942594/unicodeencodeerror-ascii-codec-cant-encode-character-u-xa0-in-position-20) – g_uint Feb 21 '19 at 07:07
  • 1
    @Alderven I'm sorry for the late reply. Since the data is sensitive, I can't share it here. – Tom Feb 22 '19 at 04:48

2 Answers


I believe the cause is that Python cannot work out how to print a string containing a non-ASCII character because of the locale setting. For me it only happened in Python 3 running on CentOS 7.6 Linux, printing to a terminal window on macOS. You can run the locale command to see the current settings. Mine started out with everything set to "C". After I set the LANG environment variable to en_US.UTF-8, printing worked. In csh this is done using

setenv LANG en_US.UTF-8

In bash use:

export LANG=en_US.UTF-8
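
To confirm from inside Python that the locale is what breaks printing, something like the following sketch may help. It reports the encodings Python picked up and prints text with unencodable characters escaped instead of raising. The helper names describe_output_encoding and safe_print are made up for this illustration and are not part of pydicom.

```python
import locale
import sys


def describe_output_encoding():
    """Return the encodings Python will use when printing text."""
    return {
        "stdout": sys.stdout.encoding,
        "preferred": locale.getpreferredencoding(False),
    }


def safe_print(text, stream=None):
    """Write text to stream, escaping characters its encoding cannot represent.

    Hypothetical helper: under an ASCII ("C") locale, 'µ' is written as
    a backslash escape instead of raising UnicodeEncodeError.
    """
    if stream is None:
        stream = sys.stdout
    encoding = getattr(stream, "encoding", None) or "ascii"
    # Round-trip through the stream's encoding with backslashreplace.
    escaped = text.encode(encoding, errors="backslashreplace").decode(encoding)
    stream.write(escaped + "\n")


if __name__ == "__main__":
    print(describe_output_encoding())
    safe_print("µ-map")  # printed as-is under UTF-8, escaped under ASCII
```

If changing LANG is not an option, setting the PYTHONIOENCODING environment variable to utf-8 before starting Python is another way to force the output encoding; on Python 3.7 and later, sys.stdout.reconfigure(encoding="utf-8") does the same from within the script.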

My problem resulted from having 'µ' in the Series Description element. The file was an attenuation map from a SPECT reconstruction on a Siemens scanner. I used the following Python code to help figure out the problem.

#! /usr/bin/env python3
import pydicom as dicom
from sys import exit, argv

def myprint(ds, indent=0):
    """Go through all items in the dataset and print them with custom format
    Modelled after Dataset._pretty_str()
    """
    dont_print = ['Pixel Data', 'File Meta Information Version']

    indent_string = "   " * indent
    next_indent_string = "   " * (indent + 1)

    for data_element in ds:
        if data_element.VR == "SQ":   # a sequence
            print(indent_string, data_element.name)
            for sequence_item in data_element.value:
                myprint(sequence_item, indent + 1)
                print(next_indent_string + "---------")
        else:
            if data_element.name in dont_print:
                print("""<item not printed -- in the "don't print" list>""")
            else:
                repr_value = repr(data_element.value)
                if len(repr_value) > 50:
                    repr_value = repr_value[:50] + "..."
                try:
                    print("{0:s} {1:s} = {2:s}".format(indent_string,
                                                   data_element.name,
                                                   repr_value))
                except Exception:  # e.g. UnicodeEncodeError under an ASCII locale
                    print(data_element.name, '****Error printing value')

for f in argv[1:]:
    ds = dicom.dcmread(f)
    myprint(ds, indent=1)

This is based on the myprint function from [1].

The code tries to print out all the data items. It catches exceptions and prints "****Error printing value" when there is an error.
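
Instead of relying on print failing, the same traversal can look for non-ASCII text directly. The following is only a sketch: find_non_ascii is a hypothetical helper, it assumes each element exposes name, VR and value the way pydicom data elements do in the code above, and it only inspects plain str values (person names and multi-valued elements would need extra handling).

```python
def find_non_ascii(dataset, path=""):
    """Yield (path, value) for str values containing non-ASCII characters.

    `dataset` is any iterable of objects with .name, .VR and .value;
    sequences (VR "SQ") are searched recursively.
    """
    for elem in dataset:
        label = path + elem.name
        if elem.VR == "SQ":
            for i, item in enumerate(elem.value):
                # Python 3.3+ syntax; recurse into each sequence item
                yield from find_non_ascii(item, "{}[{}]/".format(label, i))
        elif isinstance(elem.value, str) and any(ord(ch) > 127 for ch in elem.value):
            yield label, elem.value


# Assumed usage with pydicom (not executed here):
#   ds = pydicom.dcmread(path)
#   for where, value in find_non_ascii(ds):
#       print(where, ascii(value))
```

Printing with ascii(value) keeps the report itself safe under an ASCII locale.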

ericfrey

Can you give an example of such a DICOM file? The pydicom example works perfectly for me with Python 3.7:

import matplotlib.pyplot as plt
import pydicom
from pydicom.data import get_testdata_files
filename = get_testdata_files("CT_small.dcm")[0]
ds = pydicom.dcmread(filename)
plt.imshow(ds.pixel_array, cmap=plt.cm.bone)

It also works with the sample DICOM files from the Medical Image Samples.

Noam Peled
  • 1
    Thanks! I'm sorry for the late reply. Since the data is sensitive, I can't share it here. Actually, I ran the same source code on the same data. On one machine Python 2 worked but Python 3 didn't; on another machine Python 3 worked but Python 2 didn't. I wonder whether the cause is the local environment settings. – Tom Feb 22 '19 at 04:57