Sunday, July 29, 2012

How Would You Store 1 Billion Images?

The Mayo Clinic in Rochester, MN had 1.3 billion images in its enterprise archive as reported by Ken Persons during the SIIM conference in June. Each day approximately another million images are added. They have gone through 13 data migrations so far, and are doing another three right now covering 27 departments.
Even though 99 percent of hospitals today are not at this level of digitization and image production, it makes sense to look at institutions like the Mayo Clinic to find out what they have learned from running an archiving system on this scale. The time will come, even if it is 5 or 10 years from now, when many institutions will face similar challenges. Just wait until pathology begins to convert to digital archiving: a typical department handling 30,000 procedures could easily create 100 Tbytes/year.
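To put those numbers in perspective, here is a back-of-the-envelope calculation written as a small Python sketch. It assumes decimal units (1 Tbyte = 10^12 bytes), that the 30,000 pathology procedures are an annual figure, and that the intake stays flat at one million images a day. The function names are my own and purely illustrative.

```python
# Rough arithmetic on the figures quoted above.
# Assumptions (not from the talk): decimal units, a constant daily intake,
# and 30,000 procedures counted per year.

TB = 10**12  # bytes in a decimal terabyte
GB = 10**9   # bytes in a decimal gigabyte


def bytes_per_procedure(annual_bytes: float, procedures_per_year: int) -> float:
    """Average storage consumed by one procedure."""
    return annual_bytes / procedures_per_year


def days_to_add(images: int, images_per_day: int) -> float:
    """Days needed to add `images` at a constant daily rate."""
    return images / images_per_day


# Digital pathology: 100 Tbytes/year spread over 30,000 procedures.
pathology_gb = bytes_per_procedure(100 * TB, 30_000) / GB

# Enterprise archive: how long until another 1.3 billion images arrive?
doubling_days = days_to_add(1_300_000_000, 1_000_000)

print(f"{pathology_gb:.1f} GB per pathology procedure")
print(f"{doubling_days:.0f} days ({doubling_days / 365:.1f} years) "
      "to add another 1.3 billion images")
```

Under those assumptions, a single pathology procedure averages a few gigabytes, and an archive of this size doubles in under four years even if the daily intake never grows.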
Managing this amount of data and this many migrations is only feasible with a Vendor Neutral Archive, or VNA. The folks at Mayo hate the term VNA as much as anyone else, which is why they talk about an “enterprise archive”: there is no commonly agreed-upon functionality for a VNA, even though I tried to define such functionality in this white paper (see link).
One of the major challenges Ken reported at that meeting is ensuring that a new PACS is made “aware” of the historical data in the enterprise archive so that priors are pre-fetched as needed. There are several ways to accomplish this. The first is a “brute force” method, which requires either all of the data, or the images from a specified number of months, to be pushed to the PACS to be re-archived and re-indexed. This is clearly unacceptable and defeats the main purpose of having a VNA.
Another option is a one-time PACS database update with all of the available exam content. This is basically a migration of the database only, leaving the archived images in their enterprise location. A third option is for the PACS to query the enterprise archive to discover any relevant studies. The fourth, or “order driven,” option is to pre-fetch as needed based on order information. In every case, migrating the study description is critical, because it is what allows the relevant priors to be identified and retrieved. If the retrieval performance is acceptable and the retrieval runs in the background, I would guess that the “query method” is preferable, followed by the “order driven” method.
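To make the role of the study description concrete, here is a minimal Python sketch of how relevant priors might be selected once the PACS has a list of archived studies for a patient. This is not how Mayo or any particular vendor does it. The `Study` record, the matching rule (same patient, same modality, at least one shared word in the description) and the `max_priors` limit are all hypothetical. In a real deployment the study list would come from a query against the enterprise archive, for example through DICOM query/retrieve, rather than from an in-memory list.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Study:
    """Hypothetical, simplified study record."""
    patient_id: str
    modality: str
    description: str
    study_date: date


def is_relevant(order: Study, prior: Study) -> bool:
    """Illustrative rule: same patient and modality, performed earlier,
    and sharing a description word other than the modality itself."""
    if prior.patient_id != order.patient_id or prior.modality != order.modality:
        return False
    if prior.study_date >= order.study_date:
        return False
    ignore = {order.modality.lower()}
    order_words = set(order.description.lower().split()) - ignore
    prior_words = set(prior.description.lower().split()) - ignore
    return bool(order_words & prior_words)


def select_priors(order: Study, archive: list[Study], max_priors: int = 3) -> list[Study]:
    """Return the most recent relevant priors, newest first."""
    matches = [s for s in archive if is_relevant(order, s)]
    matches.sort(key=lambda s: s.study_date, reverse=True)
    return matches[:max_priors]


# Made-up example data, only to exercise the function.
archive = [
    Study("P1", "CT", "CT CHEST W CONTRAST", date(2010, 3, 1)),
    Study("P1", "CT", "CT CHEST WO CONTRAST", date(2011, 6, 15)),
    Study("P1", "CT", "CT ABDOMEN", date(2011, 8, 9)),
    Study("P1", "MR", "MR BRAIN", date(2011, 9, 2)),
    Study("P2", "CT", "CT CHEST", date(2012, 1, 20)),
]
order = Study("P1", "CT", "CT CHEST", date(2012, 7, 29))

priors = select_priors(order, archive)
for p in priors:
    print(p.study_date, p.description)
```

The point of the sketch is that without a usable study description, the only safe choice is to fetch everything for the patient, which is exactly what pre-fetching is meant to avoid.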
One of the major discoveries the folks at the Mayo Clinic made is that there are a lot of pictures, i.e. conventional photographs, as well as videos, made for all kinds of clinical purposes, ranging from documenting the gait of people who have trouble walking to documenting skin lesions. The challenge is to archive all these clips and photos, which are typically stored on CDs and DVDs or scattered across various computers and laptops, and which should be part of the electronic health record as well. I would assume that if you walk around the different departments in your institution, you too will find a lot of those types of images.
One observation I made when talking with the Mayo folks is that they don’t use a commercial viewer to access the images in their enterprise archive. They have their own viewer, and even though they benchmark it every couple of years against the available commercial viewers, it appears that they can’t buy what is needed to satisfy their physicians with regard to functionality. It is true that their viewer is not just a radiology imaging viewer; rather, it is capable of displaying all of the various image types in their enterprise archive. I would argue that it does not take a lot of effort to create such a viewer. I would encourage vendors, however, to find out what is needed to satisfy the Mayo Clinic folks. Not only would it result in a customer licensing tens of thousands of your viewers, but it would also provide the capabilities that will very likely be needed by every other customer whose imaging archive begins to grow to the scale of the Mayo Clinic’s.
In conclusion, it makes sense to find out how large institutions such as the Mayo Clinic are dealing with the exponential increase in image production, and how they accommodate all the different specialties and departments in their enterprise archive, so that you are prepared when your institution begins going through the same growth process.