Monday, August 5, 2019

DICOM Cyber security threats: Myths and Truths.

A report by Cylera Labs identified a potential cyber security threat in DICOM files that are exchanged on media such as CD, DVD or flash drives, or through email, as well as through DICOM web service communications (DICOMWeb).

The threat was taken seriously enough by the DICOM committee that it issued a FAQ document to address this potential issue. The threat exploits the additional header that is created for media, email and web exchange. Before discussing the potential threat and what to do about it, let’s first discuss what this header looks like and how it is used.

Media exchange files have an additional header, aka the File Meta header, which consists of:
1. A 128 byte preamble
2. The characters DICM to identify that the following is encoded in DICOM format
3. Additional information that is needed to process the file, such as the file type, encoding (transfer syntax), who created this file, etc.
4. The regular DICOM file.

This additional information (3) is encoded as standard DICOM tags, i.e. Group 0002 encoding. After the Group 0002 encoding, the actual DICOM file which normally would be exchanged using the DICOM communication protocol will start. This encapsulation is commonly referred to as “part10” encoding because it is defined in part 10 of the DICOM standard. 
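The layout described above can be sketched in a few lines of Python. This is a minimal illustration, not a full DICOM parser: it assumes the fixed 128 byte preamble and the Explicit VR Little Endian encoding that part 10 prescribes for Group 0002. The names `read_file_meta` and `build_sample` are hypothetical helpers made up for this example.

```python
import struct

PREAMBLE_LEN = 128   # fixed preamble length in a part-10 file
MAGIC = b"DICM"      # marker that follows the preamble
# Value representations that use a 4-byte length field in explicit VR encoding
LONG_VRS = {b"OB", b"OW", b"OF", b"SQ", b"UT", b"UN"}


def read_file_meta(data: bytes) -> dict:
    """Split part-10 bytes into preamble, Group 0002 elements and dataset offset."""
    if data[PREAMBLE_LEN:PREAMBLE_LEN + 4] != MAGIC:
        raise ValueError("not a part-10 file: 'DICM' marker missing")
    pos = PREAMBLE_LEN + 4
    meta = {}
    while pos + 8 <= len(data):
        group, element = struct.unpack_from("<HH", data, pos)
        if group != 0x0002:          # first non-0002 tag: the regular DICOM data starts here
            break
        vr = data[pos + 4:pos + 6]
        if vr in LONG_VRS:           # 2 reserved bytes, then a 4-byte length
            (length,) = struct.unpack_from("<I", data, pos + 8)
            value_start = pos + 12
        else:                        # 2-byte length directly after the VR
            (length,) = struct.unpack_from("<H", data, pos + 6)
            value_start = pos + 8
        meta[(group, element)] = data[value_start:value_start + length]
        pos = value_start + length
    return {"preamble": data[:PREAMBLE_LEN], "meta": meta, "dataset_offset": pos}


def build_sample() -> bytes:
    """Build a tiny synthetic part-10 byte string (illustration only, not a valid image)."""
    def short_element(group, element, vr, value):
        if len(value) % 2:           # DICOM values are padded to even length
            value += b"\x00"
        return struct.pack("<HH", group, element) + vr + struct.pack("<H", len(value)) + value

    transfer_syntax = short_element(0x0002, 0x0010, b"UI", b"1.2.840.10008.1.2.1")
    group_length = short_element(0x0002, 0x0000, b"UL", struct.pack("<I", len(transfer_syntax)))
    dataset = short_element(0x0008, 0x0060, b"CS", b"CT")   # stands in for the regular DICOM file
    return bytes(PREAMBLE_LEN) + MAGIC + group_length + transfer_syntax + dataset


parsed = read_file_meta(build_sample())
print(parsed["meta"][(0x0002, 0x0010)].rstrip(b"\x00").decode())  # the transfer syntax UID
print(parsed["dataset_offset"])                                   # where the regular data begins
```

An importer that strips the meta-header would keep only the bytes from `dataset_offset` onward.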

The potential cyber security threat as mentioned in the article involves the 128 byte preamble, as there are no real rules about what it might contain and how it is formatted. The definition of this area is that it is for Application Profile or implementation specified use. The initial use was for early ultrasound readers, but more recently it is generally used for TIFF file encoding so that a file can have a “dual personality,” i.e. it can be decoded by a TIFF reader as well as a DICOM reader. The DICOM reader will simply skip the preamble and process the remainder of the file. In case of a TIFF encoding, the preamble will start with the TIFF identifier, i.e. 4 bytes that contain “MM\x00\x2a” or “II\x2a\x00”, followed by additional instructions to decode the file structure. This application seems to have some traction with pathology vendors, who are very slow in implementing the DICOM whole slide image file set as described by David Clunie in a recent article, and it could potentially be used by researchers. If not used by a specific implementation, all bytes in this preamble shall be set to 00H as can be seen in the figure.

The definition of this preamble was identified as a “fundamental flaw in the DICOM design” in the Cylera article mentioned earlier. This assertion was made due to the fact that attackers could embed executable code within this area. This would allow attackers to distribute malware and even execute multi-stage attacks.

In my opinion, this “flaw” is overrated. First of all, the preamble was designed with a specific purpose in mind, allowing multiple applications to access and process the files, and, if not used accordingly, it is required to be set to zeros. Furthermore, a typical DICOM CD/DVD reader would import the DICOM file, strip off the complete meta-header (preamble, DICM identifier and Group 0002), potentially coerce patient demographics and study information such as the accession number, and import it into the PACS.

If, for whatever reason, the import software would want to copy the DICOM file as-is, i.e. including the meta-header, it could check for the presence of non-zero bytes in the preamble, and if found, either reject or quarantine the file or overwrite the preamble with zeros. The latter would impact potential “dual-personality” files, but the software could check for the presence of the TIFF header and act accordingly by making an exception for those very limited use cases (how many people are using pathology and/or research applications today?). Last but not least, don’t forget that we are only discussing a potential flaw with DICOM part-10 files that are limited to exchange media, which means that there is nothing to fear for the regular DICOM exchange between your modalities, PACS and view stations, as these exchanges don’t include the meta-header.
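The check described above could look roughly like the following Python sketch. It assumes a 128 byte preamble followed by the DICM marker. The function names and the labels "clean", "tiff" and "suspicious" are made up for this illustration. A TIFF identifier only shows that the preamble claims to be TIFF, not that it is harmless, so a real importer might still route such files through a virus scanner.

```python
PREAMBLE_LEN = 128
TIFF_IDENTIFIERS = (b"MM\x00\x2a", b"II\x2a\x00")  # big-endian and little-endian TIFF


def classify_preamble(data: bytes) -> str:
    """Label a part-10 byte string by what its preamble contains."""
    if data[PREAMBLE_LEN:PREAMBLE_LEN + 4] != b"DICM":
        return "not-part10"
    preamble = data[:PREAMBLE_LEN]
    if preamble == bytes(PREAMBLE_LEN):     # all zeros, as required when unused
        return "clean"
    if preamble[:4] in TIFF_IDENTIFIERS:    # possible dual-personality file
        return "tiff"
    return "suspicious"                     # candidate for reject or quarantine


def zero_preamble_if_suspicious(data: bytes) -> bytes:
    """Overwrite a suspicious preamble with zeros and leave all other files untouched."""
    if classify_preamble(data) == "suspicious":
        return bytes(PREAMBLE_LEN) + data[PREAMBLE_LEN:]
    return data


# Synthetic examples: only the first 132 bytes matter for this check
clean = bytes(128) + b"DICM" + b"payload"
windows_exe_like = b"MZ" + bytes(126) + b"DICM" + b"payload"  # "MZ" starts a Windows executable
dual_personality = b"II\x2a\x00" + bytes(124) + b"DICM" + b"payload"

for sample in (clean, windows_exe_like, dual_personality):
    print(classify_preamble(sample))
```

Whether to reject, quarantine or overwrite is a policy decision. The sketch only shows that the detection itself is inexpensive.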

But, to be honest, anything in a file that is for “implementation-specific” use or is proprietary is potentially subject to misuse. There are Z-segments defined in HL7, private tags in DICOM and even a “raw data” file storage in DICOM that can contain anything imaginable. These additional structures were not design flaws but rather were defined for very specific business reasons. The good news is that HL7 FHIR will do away with Z-segments, replacing them with strictly defined extensions governed by conformance rules, but in the meantime we will be dealing with proprietary extensions for many years. Consequently, you had better know where your messages originate and whether the originator has its cyber security measures in place.

In conclusion, the possibility of embedding malware in the DICOM preamble is limited to media exchange files only. Such content, if present, is easily detectable, and in almost every case the preamble is stripped off anyway before these files are imported. There are definitely vulnerabilities with any “implementation specific” or proprietary additions to standard file formats. Knowing the originator of your files and messages is important. If there is any suspicion, run a virus scanner, have the application strip off and/or replace any proprietary information, and never ever run an executable that could be embedded within these files.

Is it an Image or a Document? Discussing the “grey area” of overlap between images and documents.

There is a major increase in images to be managed by enterprise imaging systems. It is critical to decide on how to format the images and documents (DICOM or native?) and where to manage them (EMR, PACS/VNA, document management system, other?). Below are some thoughts and recommendations you might consider.

Digital medical imaging used to be confined to radiology and cardiology, and on a smaller scale to oncology. Images were created, managed and archived within these departments. If you wanted to see them you would need to access the image management system (PACS) for that department.
Over the past decade, new image sources started to appear: surgeons took images through a scope during surgery, gastroenterologists recorded videos of endoscopic procedures, ophthalmologists recorded retinal images, and pathologists began using digital pathology imaging. Point of care (POC) ultrasound also began to be used increasingly, and now there are intelligent scanning probes available that can connect to a smart phone or tablet.

As the sources of imaging grow, the volume of imaging is growing exponentially. Talking with informaticists at major hospitals, it seems there are new image sources every week, whether it is in the ER where people are taking pictures for wound care or during surgery to assist anesthesiologists.
Good examples of the type of imaging that typically takes place outside the traditional radiology and cardiology domain can be seen in a recent webcast on encounter-based imaging workflow. In his presentation, Ken Persons from the Mayo Clinic explains that they have literally hundreds of alternate imaging devices that create tens of thousands of images per month that need to be archived and managed.

Departments that never recorded images before are now doing so, for example physical therapy recording videos of changes in gait after back surgery. In addition to this avalanche of images generated by healthcare practitioners, soon there will be images taken by patients themselves that need to be kept, e.g. of a scar after surgery once they have been sent home. This will replace in-person follow-up exams, which will save time and effort. Managing these images has become a major challenge and has shifted from departmental systems to enterprise image management systems, i.e. from PACS to VNA’s.

How is non-image data managed? Textual data such as patient demographics, orders, results and billing information is exchanged, while connecting 100+ computer systems in a typical mid-size hospital, through interface engines. Over the past 5-10 years, Hospital Information Systems (HIS) and departmental systems dedicated to radiology (RIS), cardiology (CIS) and other departments, are being replaced by Electronic Medical Record systems (EMRs) and information is accessed in a patient-centric manner.

A physician now has a single log-on to the EMR portal and can access all the clinical text-based information as well as images. Textual information can be stored and managed by an EMR, e.g. for a lab result as discrete information in its database, or linked to as a document, e.g. a scanned lab report or a PDF document. In addition to these documents being managed in the EMR, they can also be managed and stored in a separate document management system with an API to the EMR for retrieval.

There is no single solution for the problem of where to manage (i.e. index and archive) diagnostic radiology reports. Their formats vary widely as discussed in a related post discussing report exchange on CD’s. In addition to standardized formats such as DICOM SR’s and Secondary capture, additional formats appeared including XML, RTF, TXT and native PDF’s. Not only do the diagnostic report formats differ, but also where they are managed. The reports could have been stored in departmental systems (RIS) or in some cases by a broker. A case in point is the AGFA (initially MITRA) broker (now called Connectivity Manager) that functions as a Modality Worklist provider, and in many institutions also is used to store reports. In addition, reports could reside temporarily in the Voice Recognition System, with another copy in the RIS, EMR and PACS. This causes issues with ensuring amendments and changes to these documents stay in sync at various locations.

Before the universal EMR access, many radiology departments would scan in old reports so they could be seen on the radiology workstation, in addition to scanning patient waivers and other related information into their PACS. This is still widely practiced, witnessed by the proliferation of paper scanners in those departments. These documents are converted to DICOM screen-saves (Secondary Capture), or, if you are lucky, as DICOM encapsulated PDF’s which are much smaller in file size than the Secondary Captures. With regard to MPEG’s, for example swallow studies, a common practice is to create so-called Multiframe Secondary Capture DICOM images. All of this DICOM “encapsulation” is done to manage these objects easily within the PACS, which provides convenient access for a radiologist.

The discussion about images and documents raises the question of what the difference is between an image and a document. The answer also determines whether the “object” is accessed from an image management system (PACS/VNA), which implies that it is in a DICOM format, or from a document management system (a true document management system, or RIS, EMR), which assumes either an XDS document format (using the defined XDS metadata) or some other semi-proprietary indexing and retrieval system. Note that there are several VNA’s that manage non-DICOM objects, but for the purpose of this discussion, it is assumed that a PACS/VNA manages “DICOM-only” objects.
In most cases, the difference between images and documents is obvious, for example, most people agree that a chest X-ray is a typical example of an image, and a PDF file is a clear example of a document, but what about a JPEG picture taken by a phone in the ER, or an MPEG video clip of a swallow study? A document management system can manage this, or, alternatively, we can “encapsulate” it in a DICOM wrapper and make it an image similar to an X-ray, with the same metadata, being managed by a PACS system.

What about an EKG? One could export the data as a PDF file, making it a document or alternatively maintain the original source data for each channel and store it in a DICOM wrapper so it can be replayed back in a DICOM EKG viewer. By the way, one can also encapsulate a PDF in a DICOM wrapper, which is called an “encapsulated PDF” and manage it in a PACS. Lastly, one could take diagnostic radiology reports and encapsulate them as a DICOM Structured report and do the same for a HL7 version 3 CDA document, e.g. a discharge report, and encapsulate it in a DICOM wrapper and store it in the PACS.

All of this shows that there is a grey area of overlap between images and documents: many documents and other objects could either be considered images (or, to use a better term, DICOM objects) and managed by the PACS, or be considered documents and managed by a document management system. Imagine you were to implement an enterprise image management and document management system: what would your choices be with regard to these overlapping objects?
Here are my recommendations:
1. Keep PDF’s as native PDF documents, UNLESS they are part of the same imaging study. For example, if you have an ophthalmology study that includes several retinal images and the same study also creates PDF’s, it would be easier to keep them together, which means encapsulating the PDF as a DICOM object. But if you have a PDF without any corresponding images, for example from a bone densitometry device, I suggest storing it as a PDF.
2.  Use the native format as much as possible:
a. There is no reason to encapsulate a CDA in a DICOM or even a FHIR document object. Conversions often cause loss of information and are often not reversible. Keep them as CDA’s.
b. Manage JPEG’s and MPEG’s (and others, e.g. TIFF etc.)  as “documents.” As a matter of fact, by using the XDS meta-data set to manage these you are better off because you also are able to manage information that is critical in an enterprise environment such as “specialty” and “department,” which would not be available in the DICOM metadata.
c. Use DICOM encoded EKG’s instead of the PDF screenshots.
d. Stay away from DICOM Secondary Capture if there is original data available. Remember that those are “screenshots” with limited information. Specifically, don’t use the screen-captured dose information from CT’s but rather the full fidelity DICOM Structured Reports, which have many more details.
3. Stop scanning documents into the PACS/VNA as DICOM Secondary Capture and/or PDF’s. They don’t belong there; they should be in the EMR and/or document system.
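To make these recommendations concrete, they can be written down as a small decision function. This is only a sketch of the rules listed above. The function name, the object kinds and the returned labels are made up for illustration and do not correspond to any product or standard API.

```python
def recommend_storage(kind: str, in_imaging_study: bool = False,
                      original_data_available: bool = False) -> tuple:
    """Return a (format, system) pair that follows the recommendations above."""
    kind = kind.lower()
    if kind == "pdf":
        # Recommendation 1: native PDF unless it belongs to an imaging study
        if in_imaging_study:
            return ("DICOM encapsulated PDF", "PACS/VNA")
        return ("native PDF", "document system")
    if kind == "cda":
        # Recommendation 2a: no conversion, keep the CDA as it is
        return ("native CDA", "document system")
    if kind in {"jpeg", "mpeg", "tiff"}:
        # Recommendation 2b: manage as documents using XDS metadata
        return (f"native {kind.upper()} with XDS metadata", "document system")
    if kind == "ekg":
        # Recommendation 2c: DICOM encoded EKG rather than a PDF screenshot
        return ("DICOM encoded EKG", "PACS/VNA")
    if kind == "screen capture":
        # Recommendation 2d: prefer original data over Secondary Capture
        if original_data_available:
            return ("original data, e.g. DICOM Structured Report", "PACS/VNA")
        return ("DICOM Secondary Capture", "PACS/VNA")
    if kind == "scanned paper":
        # Recommendation 3: scanned documents do not belong in the PACS/VNA
        return ("scanned document", "EMR or document system")
    raise ValueError(f"no recommendation for object kind: {kind!r}")


print(recommend_storage("pdf", in_imaging_study=True))
print(recommend_storage("jpeg"))
print(recommend_storage("screen capture", original_data_available=True))
```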

An EMR is very well suited to provide a longitudinal record of a patient; however, none of the EMR’s I know of will store images. Images are typically accessed by a link from the EMR to a PACS/VNA so that they can be viewed in the same window as the patient record on a computer or mobile device. In contrast, documents are often stored in the EMR, but these are typically indexed in a rudimentary manner, and most users hate to go through the many documents that might be attached to a patient record to find the one that has the information they are looking for. A better solution for document access is to have a separate enterprise document management system, which should be able to do a better job managing these.

Some VNA’s are also capable of managing documents in addition to images, preferably using the XDS infra-structure. As a matter of fact, if you are NOT using the XDS standard, but a semi-proprietary interface instead to store JPEG’s, MPEG’s and all types of other documents, you might have a major issue as you will be locked into a particular vendor with potential future data migration issues.

Also, be aware of the differences between XDS implementations. The initial XDS profile definitions were based on SOAP messaging and document encapsulation, while the latest versions include web services, i.e. DICOMWeb-RS for images and FHIR for documents. Web services allow images or documents to be accessed through a URL. Accessing information through web services is how pretty much all popular web-based information delivery happens today, e.g. using Facebook, Amazon, and many others. It is very efficient and relatively easy to implement.

Modern healthcare architecture is moving towards deconstructing the traditional EMR/PACS/RIS silos to allow for distributed or cloud-based image and information management systems. From the perspective of the user, who accesses the information through some kind of computer-based portal or mobile device, it does not really matter where the information is stored, as long as there is a standard “connection” or interface that allows access to either an image or document using web services.

Right now is the perfect time to revisit your current architecture and reconsider how and where you manage and archive images and documents. Many hospitals have multiple copies of these objects stored in a format that does not make sense at locations that were dictated by having easy access to the data without considering whether they really belonged there. Instead of cluttering the current systems, especially when planning for the next generation of systems that are going to be FHIR and DICOMWeb enabled, it is important to index and manage your images and documents at the location where they belong in a format that makes sense.

Thursday, August 1, 2019

SIIM19 part 2: Standards update.

As the representatives for the various standards committees (DICOM, FHIR, IHE) reiterated during the recent 2019 SIIM conference in Denver, there are several new interoperability standards available that could make your life easier, but if the user community does not ask for them in their RFP’s and during regular vendor discussions, there is no incentive for these to be implemented.
Obviously, if you don’t know what to ask for, it gets difficult, therefore here is a synopsis of the new DICOM standards developments covered during the SIIM19 conference:

1.       Multi-energy CT imaging – CT scanners are getting equipped to acquire images using different X-ray energy spectra, which then are processed, subtracted, etc. to provide a different clinical perspective. When the initial CT DICOM metadata was defined in the early 1990’s, there were no multi-spectral CT scanners available or even thought of, therefore, to encode this with the “old” CT data requires a lot of customization and proprietary encoding, hence the need for a new series of objects. 

Remember that it is not only the acquisition devices that have to support this new standard, which seems to be the least of the worries given the experience with adopting recent new DICOM objects. More importantly, the PACS/VNA back-ends and especially the PACS and enterprise viewers will need to support it as well. There are 4 additional “families” of CT objects defined, i.e. for image encoding, material quantification, labeling and visualization.

2.       Contrast administration – Most US institutions have implemented an X-ray radiation dose recording and management system, motivated by the US federal requirements to put the dose information in each CT radiology report. The next area for potential legislative requirements and implementation is the contrast administration and corresponding management as contrast can also be detrimental to a person’s body.

The DICOM contrast agent administration reporting capability will facilitate this. The implementation is very similar to the dose reporting, i.e. it will be recorded in a dedicated Structured Report which provides details about the contrast which was programmed at the injector device and what is actually delivered.

3.       3-D printing – The RSNA hosted a big pavilion showing 3-D models and applications, initially for surgery planning, but eventually for implants. This is a new, upcoming area, and its management is currently shared between surgery and radiology. There is a need to retain and archive these 3-D “print files” and also for standard interfaces to the various 3-D printers. The DICOM standard added an encapsulation of these print files, called STL (an abbreviation of "stereolithography"). STL is a file format native to the stereolithography CAD software created by 3D Systems and is also supported by many other software packages; it is widely used for rapid prototyping, 3D printing and computer-aided manufacturing. The 3-D model usage codes defined by DICOM include those used for:
a.       Educational purposes, such as training, patient education, etc.
b.       Tool fabrication for medical procedures such as radiation shields, drilling guides, etc.
c.       External prosthetics
d.       Whole or partial implants
e.       Surgery simulation
f.        Procedure planning
g.       Diagnostics
h.       Quality Control

4.       DICOMWeb – DICOMWeb provides a protocol alternative to the traditional DICOM protocol that is very effective in exchanging information using webservices and therefore is more suitable for mobile applications than the “traditional” DICOM protocol. There are equivalent services of the traditional DICOM Store, Move, and Find by using STOW, WADO and QIDO as well as the capability for bulk transfer (pixel data only) and metadata (header data) only. The webservices have been re-documented by cleaning up the existing documentation. In addition, a new enhancement has been defined to exchange thumbnails, so now instead of selecting the first image of a series as a source for the thumbnail, one can select an image that is representative of a series of images.
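As a rough illustration of how the web services map onto the traditional ones, the sketch below builds the HTTP method and URL for each request type. The base URL `https://pacs.example.org/dicomweb` and the UIDs are hypothetical placeholders, and the helper names are made up for this example. The resource paths follow the general DICOMWeb pattern of `/studies/{study}/series/{series}/instances/{instance}`. No request is actually sent.

```python
from urllib.parse import urlencode

BASE = "https://pacs.example.org/dicomweb"  # hypothetical server address


def qido_find_studies(base: str, **filters) -> tuple:
    """QIDO: the web equivalent of a DICOM Find, searching studies by attribute."""
    return ("GET", f"{base}/studies?{urlencode(filters)}")


def wado_retrieve_instance(base: str, study: str, series: str, instance: str) -> tuple:
    """WADO: the web equivalent of a DICOM Move, retrieving one instance."""
    return ("GET", f"{base}/studies/{study}/series/{series}/instances/{instance}")


def wado_retrieve_metadata(base: str, study: str) -> tuple:
    """WADO metadata: header data only, without the pixel data."""
    return ("GET", f"{base}/studies/{study}/metadata")


def wado_retrieve_thumbnail(base: str, study: str, series: str) -> tuple:
    """Thumbnail: a representative image for a series."""
    return ("GET", f"{base}/studies/{study}/series/{series}/thumbnail")


def stow_store(base: str) -> tuple:
    """STOW: the web equivalent of a DICOM Store, posting instances to the server."""
    return ("POST", f"{base}/studies")


# Placeholder identifiers, not real UIDs
study, series, instance = "1.2.3", "1.2.3.4", "1.2.3.4.5"
for request in (
    qido_find_studies(BASE, PatientID="12345"),
    wado_retrieve_instance(BASE, study, series, instance),
    wado_retrieve_metadata(BASE, study),
    wado_retrieve_thumbnail(BASE, study, series),
    stow_store(BASE),
):
    print(*request)
```

Because every request is just a URL plus an HTTP method, a mobile or browser-based viewer can use its normal HTTP stack instead of implementing the traditional DICOM protocol.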

5.       Security – Cybersecurity is a big issue because of a recent publication about the possibility that the metadata of images stored on a CD could contain malicious data. The Security Working Group, together with the MITA cybersecurity people, has issued a publication about this issue with precautions (see press release). The metadata, specifically the preamble, could contain an executable; therefore, one is encouraged to use a virus scan and also disable running any executables from the media.

6.       Consistent protocols – for use by XA and MR are important in case a radiologist wants to compare a study with previous ones and also to compare studies that were created in different organizations. A DICOM extension allows for storing these protocols so they can be reused.

7.       Artificial Intelligence (AI) – is getting a lot of attention. Guidelines on how to include AI annotations and how to incorporate these into the workflow are defined. Assuming that the annotations are encoded in a DICOM Structured Report, there is a JSON representation of the DICOM SR defined.

8.       Dermatology – revitalized to address dermoscopy, which uses surface microscopy to evaluate skin lesions and can be used for early detection of skin cancer. It is an extension to the regular photography file definitions with new codes that are added.

9.       Ultrasound – has been revitalized to come up with a proposal to track transducers. This is somewhat of a challenge as not all of these probes are “intelligent” and can exchange a unique identifier. It is important to track transducers for infection control.

As mentioned earlier, if the user community does not request these new features, there is little chance that they will be implemented in a timely fashion by the manufacturers. A rule of thumb that I recommend is that one includes in the RFP an automatic upgrade for all new DICOM features within a reasonable time (e.g. 3 years) unless federal and/or state requirements require it to be sooner such as is the case for dose reporting (and might be for contrast administration).