Abstract:

The production of medical imaging data has grown tremendously in recent decades. Nowadays, even small institutions produce a considerable number of studies. Furthermore, the general trend in new imaging modalities is to produce more data per examination. As a result, the design and implementation of tomorrow’s storage and communication systems must deal with big data issues. Research on technologies to cope with these issues in large-scale medical imaging environments is still in its early stages, mostly because it is difficult to implement and validate new technological approaches in real environments without interfering with clinical practice. Therefore, it is crucial to create test bed environments for research purposes. This study proposes a methodology for creating simulated medical imaging repositories, based on the indexing of model datasets, the extraction of patterns, and the modelling of study production. The system creates a model from a representative time window of a real-world repository and expands it according to ongoing research needs. In addition, the solution provides distinct approaches to reducing the size of the generated datasets. The proposed system has already been used by other research projects in validation processes that assess the performance and scalability of the developed systems.