• Encoding of VectorGridData (OF VR)

    From Cristiano@21:1/5 to All on Thu May 20 04:16:03 2021
    Hi all,

    I'm trying to get to grips with the encoding required for VectorGridData:

    http://dicom.nema.org/medical/dicom/current/output/chtml/part03/sect_C.20.3.html#sect_C.20.3.1.3

    The OF VR requires "A stream of 32-bit IEEE 754:1985 floating point words".

    As far as I can tell, the standard does not specify an encoding for this data, so I wasn't sure which encoding to use when trying to decode VectorGridData.

    Luckily I have a couple of Deformable DICOM Registration Objects (DROs) at my disposal. I'm assuming that they have been properly encoded as they seem to have been generated by a popular system (MIM Software).

    I initially tried using the chardet package to determine the encoding:

    chardet.detect(vgd1)['encoding']

    where

    vgd1 = Dro1.DeformableRegistrationSequence[1]\
    .DeformableRegistrationGridSequence[0]\
    .VectorGridData

    and

    Dro1 = pydicom.dcmread('filepath_of_DRO.dcm')

    but that returned "None". Luckily I was more successful with the json package:

    json.detect_encoding(vgd1)

    which returned 'utf-8'. But when I tried to decode it using either:

    vgd1.decode('UTF-8')

    or

    str(vgd1, 'UTF-8')

    I get errors for both DROs. For vgd1:

    print(vgd1[0:20], '\n')
    print(vgd1.decode('UTF-8'))

    b"(RSA\xf0\x19CAx\xd4\x97\xc1\xd0~'A8\x84AA"
    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf0 in position 4: invalid continuation byte

    and vgd2:

    print(vgd2[0:20], '\n')
    print(vgd2.decode('UTF-8'))

    UnicodeDecodeError: 'utf-8' codec can't decode byte 0xaa in position 4: invalid start byte

    Can someone please help point me in the right direction?

    --- SoupGate-Win32 v1.05
    * Origin: fsxNet Usenet Gateway (21:1/5)
  • From David Gobbi@21:1/5 to All on Thu May 20 08:17:00 2021
    When pydicom returns the VectorGridData as a "bytes" object, those bytes are just the raw floating-point data. Asking the json module to figure out the text encoding isn't going to help, because these bytes don't store text, and the answer that json gives will be meaningless.
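
    As a rough illustration, the 20 bytes printed in the original post can be unpacked with the standard struct module. This is only a sketch: it assumes little-endian byte order (the usual case for DICOM files, but not confirmed for this DRO), and the names sample and floats are made up for the example.

```python
import struct

# First 20 bytes of vgd1, copied from the original post.
sample = b"(RSA\xf0\x19CAx\xd4\x97\xc1\xd0~'A8\x84AA"

# Assumption: little-endian byte order, 4 bytes per 32-bit float.
count = len(sample) // 4
floats = struct.unpack(f"<{count}f", sample)

# sample[4] is 0xf0, the byte named in the UnicodeDecodeError: it is part
# of the second float, not a broken text character.
print(hex(sample[4]), floats)
```

    Under that assumption the five values come out finite, with magnitudes between roughly 10 and 20, which is at least plausible for displacement components.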

    The reason pydicom returns "bytes" is because the "bytes" type is the most widely compatible buffer type available in Python. In Python, a buffer simply represents a chunk of the computer's memory (search for Python Buffer Protocol for more information).
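
    To make the buffer idea concrete, here is a small sketch using only the standard library. The bytes are a synthetic stand-in for VectorGridData, packed in the machine's native byte order because memoryview.cast() always uses native order.

```python
import struct

# Synthetic stand-in for VectorGridData: three floats in native byte order.
raw = struct.pack("=3f", 1.0, 2.5, -4.0)

# A bytes object exposes the buffer protocol, so memoryview can reinterpret
# the same memory as 32-bit floats without copying it.
view = memoryview(raw).cast("f")
values = view.tolist()
print(view.itemsize, values)
```

    The same 12 bytes are seen as three 4-byte items; nothing is decoded as text at any point.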

    Are you familiar with numpy? In numpy, you can create an array from a buffer (with the understanding that your "bytes" object is a buffer that stores floats from the DICOM file):

    vgd1_array = numpy.frombuffer(vgd1, dtype='f')
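
    A slightly fuller sketch of the same idea follows. It rests on assumptions that should be checked against the standard and your files: the bytes are little-endian, each grid point stores three components (dx, dy, dz), and the x index varies fastest, then y, then z. The helper decode_vector_grid and its grid_dims argument (columns, rows, planes, as in GridDimensions) are hypothetical names, not part of pydicom or numpy.

```python
import numpy as np

def decode_vector_grid(raw, grid_dims=None):
    """Interpret OF bytes as float32 (dx, dy, dz) triplets (hypothetical helper)."""
    # '<f4' pins the dtype to little-endian 32-bit floats instead of relying
    # on the native byte order implied by dtype='f'.
    values = np.frombuffer(raw, dtype="<f4")
    if values.size % 3:
        raise ValueError("expected a whole number of 3-component vectors")
    vectors = values.reshape(-1, 3)
    if grid_dims is not None:
        cols, rows, planes = grid_dims
        # Assumed ordering: x fastest, then y, then z.
        vectors = vectors.reshape(planes, rows, cols, 3)
    return vectors

# Synthetic stand-in for VectorGridData: a grid of 2 x 1 x 1 points.
raw = np.array([[1.0, 2.0, 3.0], [-0.5, 0.0, 0.25]], dtype="<f4").tobytes()
vectors = decode_vector_grid(raw, grid_dims=(2, 1, 1))
print(vectors.shape)
```

    Note that numpy.frombuffer returns a read-only view of the bytes object; call .copy() on the result if the array needs to be modified.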

    All in all, this has very little to do with DICOM itself; it's really a question about pydicom.

  • From Cristiano@21:1/5 to david...@gmail.com on Thu May 20 13:53:25 2021
    Thanks David. Yes, I had come across numpy.frombuffer() (and ndarray.tobytes() to go the other way) but had failed to get sensible results because I misunderstood what I was actually doing.

    By the way, it seems that the array package can also do this: array.array('f', bytes_object), but I use numpy a fair bit so it makes sense to stick with it.
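
    For reference, a minimal sketch of the array-module route with a round trip back to bytes. The helper floats_from_bytes is a made-up name, the input is synthetic, and little-endian data is assumed; since array.array always uses the machine's native byte order, the helper swaps bytes on a big-endian host.

```python
import struct
import sys
from array import array

def floats_from_bytes(raw, little_endian=True):
    """Read 32-bit floats from bytes using only the standard library."""
    values = array("f", raw)  # interprets the bytes in native byte order
    if little_endian != (sys.byteorder == "little"):
        values.byteswap()     # correct the order when host and data differ
    return values

# Synthetic stand-in for VectorGridData, packed as little-endian floats.
raw = struct.pack("<3f", 1.5, -2.0, 0.125)
values = floats_from_bytes(raw)
print(list(values))

# Going the other way: tobytes() writes native-order bytes.
round_trip = values.tobytes()
```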

