Monday, August 5, 2019

DICOM Cyber security threats: Myths and Truths.

A report by Cylera Labs identified a potential cyber security threat in DICOM files that are exchanged on media such as CD, DVD or flash, through email, or through DICOM web service communications (DICOMweb).

The threat was taken seriously enough by the DICOM committee that it issued an FAQ document to address this potential issue. This threat exploits the additional header that is created for media, email and web exchange. Before discussing the potential threat and what to do about it, let’s first discuss what this header looks like and how it is used.

Media exchange files start with an additional header, aka the File Meta header. The complete file consists of:
1.  A 128-byte preamble
2.  The characters DICM to identify that what follows is encoded in DICOM format
3.  Additional information that is needed to process the file, such as the file type, encoding (transfer syntax), who created this file, etc.
4.  The regular DICOM file.



This additional information (3) is encoded as standard DICOM tags, i.e. Group 0002 encoding. After the Group 0002 encoding, the actual DICOM file starts, i.e. the content that would normally be exchanged using the DICOM communication protocol. This encapsulation is commonly referred to as “part10” encoding because it is defined in part 10 of the DICOM standard.
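To make this layout concrete, here is a minimal Python sketch that splits a part-10 byte string into its preamble, its Group 0002 block and the remaining DICOM content. It assumes the 128-byte preamble defined in part 10 and uses the File Meta Information Group Length element (0002,0000) to find where Group 0002 ends. The function name `split_part10` is my own, hypothetical choice, and a real importer would more likely rely on a full DICOM toolkit.

```python
import struct

PREAMBLE_LEN = 128   # preamble size defined by DICOM part 10
MAGIC = b"DICM"      # 4-byte identifier that follows the preamble


def split_part10(data: bytes):
    """Split part-10 bytes into (preamble, group_0002, data_set).

    Hypothetical helper for illustration only; it does not validate the
    individual Group 0002 elements.
    """
    pos = PREAMBLE_LEN + len(MAGIC)
    if data[PREAMBLE_LEN:pos] != MAGIC:
        raise ValueError("not a part-10 file: DICM identifier missing")

    # Group 0002 is always encoded as Explicit VR Little Endian.
    # Its first element is (0002,0000), VR "UL", with a 4-byte value that
    # holds the number of bytes remaining in the group.
    if len(data) < pos + 12:
        raise ValueError("file too short to hold the group length element")
    group, element, vr, length = struct.unpack_from("<HH2sH", data, pos)
    if (group, element, vr, length) != (0x0002, 0x0000, b"UL", 4):
        raise ValueError("expected (0002,0000) as first meta element")
    (group_len,) = struct.unpack_from("<I", data, pos + 8)

    meta_end = pos + 12 + group_len
    if meta_end > len(data):
        raise ValueError("group length points past end of file")
    return data[:PREAMBLE_LEN], data[pos:meta_end], data[meta_end:]
```

An importer that strips the meta-header would keep only the third returned value and discard the preamble and Group 0002 block.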

The potential cyber security threat as mentioned in the article involves the 128-byte preamble, as there are no real rules about what it might contain and how it is formatted. The definition of this area is that it is for Application Profile or implementation-specified use. The initial use was for early ultrasound readers, but more recently it is generally used for TIFF file encoding so that a file can have a “dual personality”, i.e. it can be decoded by a TIFF reader as well as a DICOM reader. The DICOM reader will simply skip the preamble and process the rest of the file as usual. In case of a TIFF encoding, the preamble will start with the TIFF identifier, i.e. 4 bytes that contain “MM\x00\x2a” or “II\x2a\x00”, followed by additional instructions to decode the file structure. This application seems to have some traction with pathology vendors, who are very slow to implement the DICOM whole slide image file set, as described by David Clunie in a recent article, and it could potentially be used by researchers. If not used by a specific implementation, all bytes in this preamble shall be set to 00H as can be seen in the figure.
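The three cases described above (unused, TIFF, anything else) are easy to tell apart in code. The following Python sketch classifies a preamble accordingly; the function name and the returned labels are hypothetical, and it only looks at the first four bytes for the TIFF identifier rather than validating the TIFF structure behind it.

```python
PREAMBLE_LEN = 128  # preamble size defined by DICOM part 10

# Big-endian ("MM") and little-endian ("II") TIFF identifiers
TIFF_SIGNATURES = (b"MM\x00\x2a", b"II\x2a\x00")


def classify_preamble(preamble: bytes) -> str:
    """Return "unused", "tiff" or "other" for a part-10 preamble.

    Hypothetical helper: "other" only means the bytes are neither all
    zero nor TIFF-like; it says nothing about whether they are harmful.
    """
    if len(preamble) != PREAMBLE_LEN:
        raise ValueError("preamble must be exactly 128 bytes")
    if preamble == bytes(PREAMBLE_LEN):
        return "unused"          # all bytes set to 00H
    if preamble[:4] in TIFF_SIGNATURES:
        return "tiff"            # possible dual-personality file
    return "other"               # implementation-specific or suspicious
```

Anything reported as "other" deserves a closer look, since that is where embedded executable content would show up.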

The definition of this preamble was identified as a “fundamental flaw in the DICOM design” in the Cylera article mentioned earlier. This assertion was made because attackers could embed executable code within this area, which would allow them to distribute malware and even execute multi-stage attacks.

In my opinion, this “flaw” is overrated. First of all, the preamble was designed with a specific purpose in mind, namely allowing multiple applications to access and process the files, and, if not used for that purpose, it is required to be set to zeros. Furthermore, a typical DICOM CD/DVD reader would import the DICOM file, strip off the complete meta-header (preamble, DICM identifier and Group 0002), potentially coerce patient demographics and study information such as the accession number, and import it into the PACS.

If, for whatever reason, the import software wants to copy the DICOM file as-is, i.e. including the meta-header, it could check for the presence of non-zero bytes in the preamble and, if found, either reject or quarantine the file or overwrite the preamble with zeros. The latter would impact potential “dual-personality” files, but the software could check for the presence of the TIFF header and act accordingly by making an exception for those very limited use cases (how many people are using pathology and/or research applications today?). Last but not least, don’t forget that we are only discussing a potential flaw with DICOM part-10 files that are limited to exchange media, which means that there is nothing to fear for the regular DICOM exchange between your modalities, PACS and view stations, as these files don’t have the meta-header.
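As a rough illustration of such an import check, the Python sketch below accepts files with an all-zero preamble, optionally lets TIFF dual-personality files through, and otherwise either flags the file for quarantine or overwrites the preamble with zeros. The function name, the `on_suspect` and `allow_tiff` parameters and the returned action labels are all hypothetical; what to do with a quarantined file is left to the importing application.

```python
PREAMBLE_LEN = 128  # preamble size defined by DICOM part 10
TIFF_SIGNATURES = (b"MM\x00\x2a", b"II\x2a\x00")


def screen_part10(data: bytes, on_suspect: str = "quarantine",
                  allow_tiff: bool = True):
    """Return (action, data) for a part-10 file about to be copied as-is.

    action is "accept", "quarantine" or "zeroed". Hypothetical policy
    helper: it inspects only the preamble, not the rest of the file.
    """
    if on_suspect not in ("quarantine", "zero"):
        raise ValueError("on_suspect must be 'quarantine' or 'zero'")
    if data[PREAMBLE_LEN:PREAMBLE_LEN + 4] != b"DICM":
        raise ValueError("not a part-10 file: DICM identifier missing")

    preamble = data[:PREAMBLE_LEN]
    if not any(preamble):
        return "accept", data                 # all zeros: nothing to do
    if allow_tiff and preamble[:4] in TIFF_SIGNATURES:
        return "accept", data                 # dual-personality exception
    if on_suspect == "zero":
        # Overwrite the preamble, keep DICM and everything after it.
        return "zeroed", bytes(PREAMBLE_LEN) + data[PREAMBLE_LEN:]
    return "quarantine", data
```

Setting `allow_tiff=False` gives the strictest behavior, which may be a reasonable default for sites that never receive pathology or research files.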

But, to be honest, anything in a file which is for “implementation-specific” use, or is proprietary, is potentially subject to misuse. There are Z-segments defined in HL7, private tags in DICOM and even a “raw data” file storage in DICOM that can contain anything imaginable. These additional structures were not design flaws but rather were defined for very specific business reasons. The good news is that HL7 FHIR will do away with Z-segments, as they are replaced with strictly defined extensions governed by conformance rules, but in the meantime we will be dealing with proprietary extensions for many years. Consequently, you had better know where your messages originate and whether the originator has its cyber security measures in place.

In conclusion, the possibility of embedding malware in the DICOM preamble is limited to media exchange files only. Such malware, if present, is easily detectable, and in almost every case the preamble is stripped off anyway before these files are imported. There are definitely vulnerabilities with any “implementation specific” or proprietary additions to standard file formats. Knowing the originator of your files and messages is important. If there is any suspicion, run a virus scanner, have the application strip off and/or replace any proprietary information, and never ever run an executable that could be embedded within these files.