Simulated 3D Ultrasound LV Cardiac Images for Active Shape Model Training

Constantine Butakoff, Simone Balocco, Sebastian Ordas, Alejandro F. Frangi
Computational Imaging Lab, Department of Technology, University Pompeu Fabra, pg. de Circumavallacio 8, 08003 Barcelona, Spain.

ABSTRACT
In this paper a study of 3D ultrasound cardiac segmentation using Active Shape Models (ASM) is presented. The proposed approach combines a point distribution model constructed from a large set of high-resolution MRI scans with an appearance model obtained from simulated 3D ultrasound images. Usually the appearance model is learnt from a set of landmarked images. The significant level of noise, the low resolution of 3D ultrasound (3D US) images and the frequent failure to capture the complete wall of the left ventricle (LV) make automatic or manual landmarking difficult. One possible solution is to use artificially simulated 3D US images, since the generated images match the shape in question exactly. In this way, by varying the simulation parameters and generating the corresponding images, it is possible to obtain a training set in which every image matches its shape exactly. In this work the simulation of ultrasound images is performed by a convolutional approach. The evaluation of segmentation accuracy is performed on both simulated and in vivo images. The results obtained on 567 simulated images had an average error of 1.9 mm (1.73 ± 0.05 mm for epicardium and 2 ± 0.07 mm for endocardium, with 95% confidence), with a voxel size of 1.1 × 1.1 × 0.7 mm. The error on 20 in vivo datasets was 3.5 mm (3.44 ± 0.4 mm for epicardium and 3.73 ± 0.4 mm for endocardium). In most images the model was able to approximate the borders of the myocardium even when the latter was indistinguishable from the surrounding tissues.

Keywords: active shape model, ultrasound segmentation, automatic landmarking
Further author information: (Send correspondence to A.F.F.)
C.B.: E-mail: [email protected], Telephone: +34 935 42 1364
S.B.: E-mail: [email protected]
S.O.: E-mail: [email protected]
A.F.F.: E-mail: [email protected], Telephone: +34 935 42 1451

1. INTRODUCTION
Ultrasound is known to be the fastest, least expensive and least invasive screening modality for imaging the heart. Because of the 3D structure and deformation of the heart muscle during the cardiac cycle, analysis of irregularly shaped cardiac chambers or description of valve morphology using 2D images is inherently limited. Developments in 3D echocardiography started in the late 1980s. During the last two decades it evolved from free-hand scanning, replaced later by mechanical scanning of several planes using a 2D transducer, to 3D matrix phased-array transducers that are able to acquire the whole 3D volume in real time. The appearance of the new modality brought new challenges and the need for new analysis tools, many of which rely on correct myocardium segmentation. However, the quality of the data is not yet sufficient (as compared, for example, to 2D US) due to the frequency-dependent speckle noise, attenuation and motion artifacts and, most of all, due to the poor spatial resolution of the hardware. As a matter of fact, the suboptimal quality forced many studies to reject up to one third of the data.1, 2 Nevertheless, new US segmentation approaches keep appearing in the literature.

Let us briefly overview the most recent works dealing with 3D US segmentation. An automatic endocardium segmentation and tracking method based on a fuzzy model was proposed by Sanchez-Ortiz et al.3 in 2002. The algorithm integrates a spatio-temporal model into fuzzy clustering to identify the endocardial surface. The tracking results on nine good-quality images were reported to be similar to manually traced contours. In the same year a different approach that uses a catalog of manually delineated shapes was proposed by Song et al.4 The authors opted for a statistical shape model based on convex combinations of surface models from a catalog. The latter consisted of 86 manually delineated shapes. Initialized by seven points (apex, two valves, and four points on the endocardial boundary), the model is adapted to an image using pixel appearance and pixel class prediction probabilities, computed off-line by a generalized expectation maximization algorithm. Each image consists of five views obtained by a rotating 2D US transducer. The results on 20 training and 25 testing images showed an average error for epicardium and endocardium segmentation of approximately 3 mm.

In another work from 2005, by Kuo et al.,5 a deformable cylinder was used to approximate the endocardial cavity. The goal of the authors was primarily to evaluate the advantages of 3D US imaging over 2D, so the cylinder was fit to images based on operator-specified points. The same year a level-set based segmentation was proposed by Angelini et al.6 The evaluation was performed on 64 × 64 × 512 volumes acquired by a matrix phased-array transducer from ten patients. When segmenting the endocardium, the authors faced difficulties with correct estimation of the endocardial volume, possibly due to data truncation (when the LV does not fit into the field of view (FOV) of the transducer), but the ejection fraction (EF) measurements were reported to lie within inter- and intra-observer variability (the latter is compared to a number of other studies).

Finally, we would like to mention a work by Hong et al.7 from 2006. Their approach somewhat resembles that of Song et al.4 in that it also uses a set of prototype shapes. The resulting shape is a Nadaraya-Watson kernel-weighted average of the prototypes. The authors propose to use 2D Haar-like features to detect the myocardium and require manual annotation in the four-chamber (A4C) view to reduce the search space of the optimization algorithm.
The testing was performed on two sets of 160 × 144 × 208 volumes acquired by a matrix phased-array transducer; the sets contained 44 end-diastolic (ED) and 40 end-systolic (ES) volumes. The reported errors with respect to manual delineations were on average about 3 voxels, although voxel dimensions were not specified.

Considering the above-mentioned methods, one can note that they either use a predefined surface model which is matched to a 3D volume or derive the surface from the segmentation. The benefit of using a predefined surface model is that it simplifies establishing links with the cardiac anatomy and makes it easy to correlate data between different studies or patients. It is also easy to subdivide such a model into the 16 segments defined by the American Heart Association. This makes it possible to better relate the results of the algorithm in question to the other algorithms implemented in current echocardiographic systems. From the algorithmic point of view, imposing shape regularity constraints on the predefined shape model, instantiated to the data, makes it possible to robustly recover the correct shape even in areas of ill-defined borders. Another interesting problem not addressed by most papers is the segmentation of the epicardium. Being able to segment both epi- and endocardium could provide an interesting insight into myocardial deformation and wall thickening, which is already being measured in 2D US.

Apart from that, we would like to note that it is nearly impossible to compare the performance of the different algorithms proposed in the literature. The 3D US technology is evolving very quickly in an attempt to provide higher spatial and temporal resolution, and there is strong competition between US equipment manufacturers, who keep releasing increasingly powerful ultrasound transducers.
In this article we consider LV segmentation by a 3D ASM.8, 9 In a previous study we constructed an LV statistical shape model from 90 high-quality MRI studies including common pathologies.10, 11 A characteristic property of the ASM is that it usually has to be adapted to the specific imaging modality. So if we are to apply the ASM to US images of a given echograph, we must train it on US images, ideally acquired by the same echograph. Generally, such training involves manual contour delineation, but in 3D US this is nearly impossible due to the poor quality of the data. Moreover, it is a time-consuming and impractical way to obtain a representative training set. The alternative we consider is the construction of a training set from simulated 3D US volumes. Being able to generate any LV shape using the shape model of the ASM, we can generate the corresponding 3D volume such that the modeled cardiac walls lie in the image exactly where expected. For the simulations we use an established convolution method12 extended to 3D.
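To make the idea of "generating any LV shape using the shape model" concrete, the following is a minimal sketch of how a point distribution model is commonly instantiated: the mean shape plus a weighted sum of principal modes, with weights expressed in standard deviations. The function name `instantiate_shape` and the two-landmark toy data are hypothetical and serve only as illustration; they are not taken from the model used in this work.

```python
import math


def instantiate_shape(mean_shape, modes, eigenvalues, weights_sd):
    """Return mean + sum_k w_k * sqrt(lambda_k) * mode_k.

    mean_shape  -- flat list of landmark coordinates (x1, y1, z1, x2, ...)
    modes       -- list of unit-length principal modes, same length as mean_shape
    eigenvalues -- variance explained by each mode
    weights_sd  -- mode weights expressed in standard deviations
    """
    shape = list(mean_shape)
    for weight, mode, eigenvalue in zip(weights_sd, modes, eigenvalues):
        scale = weight * math.sqrt(eigenvalue)
        for i, component in enumerate(mode):
            shape[i] += scale * component
    return shape


# Hypothetical toy model: two landmarks and a single mode that moves both
# landmarks along z (illustrative numbers only).
mean_shape = [0.0, 0.0, 10.0, 1.0, 0.0, 10.0]
modes = [[0.0, 0.0, 1.0 / math.sqrt(2.0), 0.0, 0.0, 1.0 / math.sqrt(2.0)]]
eigenvalues = [4.0]

# Deform along the first mode from -3 to +3 SD in seven steps.
weights = [-3.0 + k for k in range(7)]
shapes = [instantiate_shape(mean_shape, modes, eigenvalues, [w]) for w in weights]
```

Each generated shape can then be rasterized into an echogenicity volume, so the landmark positions in the simulated image are known by construction.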
2. SIMULATING THE US DATA
In this study we followed the convolutional approach, which ignores the geometry of the transducer.12–14 In contrast to the Field II simulator developed by Jensen et al.,15 ignoring the geometry speeds up the simulations, but at the cost of realism of the simulated images. For the simulation it is assumed that the imaging system has a linear, space-invariant point spread function (PSF) and that the transducer is linear.

Figure 1. A sample slice of simulated volume: LV surface model (a), corresponding echogenicity model (b) and the result of convolution (c) with overlaid LV shape.

Let $t(x, y, z)$ be an echogenicity model (Fig. 1b) of the object being imaged (Fig. 1a), where $x$, $y$ and $z$ are the lateral, elevation and axial coordinates. First, subresolution variations in object impedance are introduced by means of Gaussian white noise $G(\sigma_n; x, y, z)$ with zero mean and variance $\sigma_n$, which modulates the model multiplicatively:

$$T(x, y, z) = t(x, y, z) \cdot G(\sigma_n; x, y, z) \tag{1}$$

The 3D ultrasonic echo dataset $V(x, y, z)$ can then be obtained by a convolution

$$V(x, y, z) = h(x, y, z) * T(x, y, z) \tag{2}$$

$$h(x, y, z) = h_1(x, \sigma_x) \cdot h_1(y, \sigma_y) \cdot h_2(z, \sigma_z) \tag{3}$$

where

$$h_1(u, \sigma_u) = \exp\left[-u^2 / \left(2\sigma_u^2\right)\right] \tag{4}$$

$$h_2(v, \sigma_v) = \sin\left(2\pi f_0 v / c\right) \exp\left[-v^2 / \left(2\sigma_v^2\right)\right] \tag{5}$$

$c$ is the speed of sound in the tissue (usually 1540 m/s) and $f_0$ is the center frequency of the transducer. The image of the envelope-detected amplitude, $A(x, y, z)$ (shown in Fig. 1c), is given by

$$A(x, y, z) = \left| V(x, y, z) + i\hat{V}(x, y, z) \right| \tag{6}$$

where $\hat{V}(x, y, z)$ is the Hilbert transform of $V(x, y, z)$.
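The simulation pipeline of Eqs. (1), (2), (5) and (6) can be illustrated on a single axial scan line. The sketch below uses only the Python standard library and is not the implementation used in this work: the helper names, the sample spacing, the kernel width, the toy echogenicity profile and the interpretation of the PSF width in metres are all assumptions made for illustration. The envelope is computed from the analytic signal, built with a naive discrete Fourier transform, which is one common discrete realization of the Hilbert transform in Eq. (6).

```python
import cmath
import math
import random

SPEED_OF_SOUND = 1540.0  # c in m/s
CENTER_FREQ = 3.0e6      # f0 in Hz


def h1(u, sigma):
    """Gaussian PSF factor, Eq. (4)."""
    return math.exp(-u * u / (2.0 * sigma * sigma))


def h2(v, sigma, f0=CENTER_FREQ, c=SPEED_OF_SOUND):
    """Axial PSF factor, Eq. (5): sinusoid under a Gaussian window."""
    return math.sin(2.0 * math.pi * f0 * v / c) * h1(v, sigma)


def convolve_same(signal, kernel):
    """1D convolution returning len(signal) samples (odd-length kernel)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            k = i - (j - half)
            if 0 <= k < len(signal):
                acc += w * signal[k]
        out.append(acc)
    return out


def envelope(signal):
    """|V + i*Hilbert(V)| via the analytic signal (naive O(n^2) DFT)."""
    n = len(signal)
    spectrum = [sum(signal[k] * cmath.exp(-2j * math.pi * m * k / n)
                    for k in range(n)) for m in range(n)]
    for m in range(1, n):
        if n % 2 == 0 and m == n // 2:
            continue                      # keep the Nyquist bin unchanged
        spectrum[m] *= 2.0 if m < (n + 1) // 2 else 0.0
    analytic = [sum(spectrum[m] * cmath.exp(2j * math.pi * m * k / n)
                    for m in range(n)) / n for k in range(n)]
    return [abs(a) for a in analytic]


def simulate_scan_line(echogenicity, sigma_n, sigma_z, spacing, half_width, rng):
    """Apply Eq. (1), an axial-only version of Eq. (2), then Eq. (6)."""
    noise_sd = math.sqrt(sigma_n)         # sigma_n is treated as a variance
    noisy = [t * rng.gauss(0.0, noise_sd) for t in echogenicity]
    kernel = [h2(j * spacing, sigma_z)
              for j in range(-half_width, half_width + 1)]
    return envelope(convolve_same(noisy, kernel))


# Toy scan line (assumed values): cavity, myocardium, surrounding tissue.
line = [40.0] * 40 + [150.0] * 20 + [50.0] * 40
amplitude = simulate_scan_line(line, sigma_n=1.5, sigma_z=0.3e-3,
                               spacing=0.05e-3, half_width=20,
                               rng=random.Random(0))
```

A full 3D simulation would additionally convolve with the Gaussian factors of Eq. (4) along the lateral and elevation directions; because the PSF of Eq. (3) is separable, the three 1D convolutions can be applied one after another.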
3. EXPERIMENTAL SETUP
The training and testing sets of simulated images contained 567 volumes each. The voxel size was 1.1 × 1.1 × 0.7 mm and the size of each volume was 160 × 144 × 208 voxels. The volumes were generated by changing the following parameters:
1. Deforming the shape along the first principal mode of variation from −3 to 3 standard deviations (SD) in seven steps.
2. Changing the amplitudes of the echogenicity model. The tissue surrounding the LV was assumed uniform and was allowed to change between 20 and 70 (on a scale of 256 gray-level values). The amplitudes for the endocardial cavity were taken equal to those of the tissue surrounding the LV minus 10. Myocardial amplitudes were varied from 100 to 200, since the myocardium usually appears brighter than the surrounding tissues. All the amplitudes were varied in three steps.
3. Changing the variance of the PSF in the axial direction, σz, from 0.3 m² to 0.7 m² in three steps.
4. Changing the variance of the PSF in the lateral and elevation directions, σx and σy, simultaneously from 0.5 m² to 1 m² in three steps.

Figure 2. Simulation examples corresponding to different tissue amplitudes, varying from low contrast image in (a) to high contrast one in (c).

The above values for the PSF parameters were established experimentally to make the simulated images as similar as possible to those obtained by the 3D matrix phased-array transducer. The shape was deformed only along the first principal component to reduce the number of simulations while still simulating the most significant shape variations. With the same goal of reducing the number of simulations, the amplitudes of the surrounding tissues and the endocardial cavity were set simultaneously, with the latter being slightly smaller (as they typically appear in images). The remaining parameters were fixed to the following values: f0 = 3 MHz, c = 1540 m/s and the variance of the Gaussian noise σn = 1.5 m². Examples of images corresponding to different parameters can be seen in Fig. 2.

The 20 in vivo studies were acquired by a Philips Sonos 7500 system (Philips, Andover, USA) with a x4 matrix phased-array transducer. Each of these datasets was manually segmented by placing landmarks on both endo- and epicardium. The size of the voxel was around 1.1 × 1.1 × 0.7 mm. The size of each volume was 160 × 144 × 208 voxels.

The segmentation was performed by the ASM approach borrowed from Ordas et al.,8 where one appearance model is constructed for each LV sector. The utilized model, shown in Fig. 1a, has 33 sectors as defined by the American Heart Association (16 on the endocardial surface and 17 on the epicardial surface). The segmentation was performed at two resolution levels (original size and eight times reduced), with nine samples of the normalized image gradient per profile in the appearance model and 15 iterations per resolution. The model was initialized by 4 points: one at the apex and three on the endocardial border of the basal slice (to specify the correct scale and orientation of the shape).
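The parameter sweep described in this section can be enumerated as a Cartesian product, which also shows where the figure of 567 volumes comes from: 7 shape steps × 3 surrounding-tissue amplitudes × 3 myocardial amplitudes × 3 axial PSF settings × 3 lateral/elevation PSF settings. The sketch below assumes evenly spaced steps between the stated end points (the text gives only the ranges and the number of steps), and the dictionary keys are hypothetical names chosen for illustration.

```python
import itertools


def steps(low, high, count):
    """Return `count` evenly spaced values from low to high inclusive."""
    return [low + (high - low) * k / (count - 1) for k in range(count)]


shape_sd = steps(-3.0, 3.0, 7)            # first mode of variation, in SD
tissue_amp = steps(20.0, 70.0, 3)         # tissue surrounding the LV
myocardium_amp = steps(100.0, 200.0, 3)   # myocardium
sigma_z = steps(0.3, 0.7, 3)              # axial PSF parameter
sigma_xy = steps(0.5, 1.0, 3)             # lateral and elevation, varied together

grid = [
    {
        "shape_sd": b,
        "tissue": t,
        "cavity": t - 10.0,               # cavity is tied to the tissue amplitude
        "myocardium": m,
        "sigma_z": sz,
        "sigma_x": sxy,
        "sigma_y": sxy,
    }
    for b, t, m, sz, sxy in itertools.product(
        shape_sd, tissue_amp, myocardium_amp, sigma_z, sigma_xy)
]

print(len(grid))  # 567
```

Tying the cavity amplitude to the surrounding tissue and varying σx and σy together is what keeps the sweep at 567 combinations instead of several thousand.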
4. RESULTS AND DISCUSSION
The point-to-surface error was used to evaluate the accuracy of segmentation. It is computed by averaging the distances of each vertex of one surface to the closest point on the other surface. The root mean square error (RMSE) is computed according to

$$\mathrm{RMSE} = \frac{1}{2}\left(\sqrt{\frac{1}{N}\sum_{i=1}^{N} d_i^2} + \sqrt{\frac{1}{N}\sum_{i=1}^{N} \bar{d}_i^2}\right) \tag{7}$$

where $N$ is the number of vertices, $d_i$ are the distances from the vertices of the first surface to the second surface and $\bar{d}_i$ are the distances from the vertices of the second surface to the first.

Figure 3. One of the images corresponding to the shape model deformed to 3 standard deviations along the first principal component and the corresponding segmentation. RMSE = 4.5 mm.

Figure 4. Error distributions in the simulated test set.

We start the evaluation by estimating the segmentation accuracy on simulated US images. A number of sample segmentation results are presented in Fig. 5. In general, the mean RMSE on simulated images with 95% confidence level was 1.9 ± 0.05 mm with a standard deviation of 0.8 mm (1.73 ± 0.05 mm for epicardium and 2 ± 0.07 mm for endocardium), which is comparable to the voxel dimensions. The histogram of errors is shown in Fig. 4. Only 68 images had an error larger than 3 mm; 90% of these corresponded to the shape model deformed along the first principal component to 3 standard deviations and the remaining 10% corresponded to the shape model deformed to 2 standard deviations. As can be seen from Fig. 3, in the case of our shape model these deformations represent very strong shrinking of the heart in the axial direction. The image displayed in Fig. 3 corresponds to the largest segmentation error of 4.5 mm.

The evaluation on in vivo data was less accurate, and the quality of the data caused certain problems described below. The average error for in vivo segmentation was 3.59 ± 0.3 mm with a standard deviation of 0.9 mm (3.44 ± 0.4 mm for epicardium and 3.73 ± 0.4 mm for endocardium). These errors are similar to those of Song et al.4 and Hong et al.7 (assuming that their voxel dimensions are similar to ours, as is the case with the volume size). Most of the imprecision was concentrated in the middle and apical slices. Some of the common problems are illustrated in Fig. 6. Only the first two cases, (a) and (b), come from the testing set; case (c) corresponds to a patient with asynchrony whose image could not be manually landmarked because most of the myocardium was indistinguishable from the surrounding tissues. All three images (and all the testing images) suffer from holes in the myocardium. Image (b) has a reverberation artifact on the lateral wall, which confused the model, and in image (c) almost all of the myocardium blends in with the surrounding tissues. Also, in most of the images a part of the lateral wall was left outside due to the limited field of view (FOV).

It is worth mentioning that in many cases the ASM, based only on a piece of myocardium in the basal slice and somewhat more of it on the septal, anterior and posterior walls, was able to approximate the missing myocardium. Another problem, which seems to be more algorithm-specific, is the inability to segment the septal wall correctly in some images, as in Fig. 6a. We tried different initializations, but the model converged to the same result. We think this is due to the pyramidal structure of the US data, which (on purpose) is not taken into account in our simulations, and we are currently working on incorporating that knowledge into the appearance model of the ASM.
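The error measure of Eq. (7) can be sketched as follows, under two stated assumptions: the factor 1/2 is read as averaging the two directional root-mean-square terms, and the distance from a vertex to the other surface is approximated by the distance to its nearest vertex rather than to the nearest point on a triangle. The function names are hypothetical, and each direction is normalized by its own number of vertices, which coincides with Eq. (7) when both surfaces have N vertices.

```python
import math


def directed_rms(source, target):
    """RMS of distances from each source vertex to its nearest target vertex."""
    total = 0.0
    for p in source:
        total += min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in target)
    return math.sqrt(total / len(source))


def symmetric_rmse(first, second):
    """Average of the two directed RMS distances (symmetric reading of Eq. 7)."""
    return 0.5 * (directed_rms(first, second) + directed_rms(second, first))


# Toy example: two parallel two-vertex "surfaces" one unit apart.
surface_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
surface_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
error = symmetric_rmse(surface_a, surface_b)
```

Using both directions makes the measure symmetric, so a segmentation that covers only part of the reference surface is still penalized by the reference vertices that have no nearby counterpart.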
Figure 5. Latero-septal and antero-posterior views of three segmented simulated images.
Figure 6. Latero-septal (top) and antero-posterior (bottom) views of three segmented in vivo images, and some of the problems: (a) incorrect septum segmentation; (b) misguided segmentation due to a reverberation on the lateral wall; (c) myocardium indistinguishable from the surrounding tissues.
5. CONCLUSIONS
In this article we presented preliminary results of training an ASM on simulated 3D US images. The results demonstrate the feasibility of the approach, although certain problems still have to be solved. We proposed to construct the appearance model for the ASM using ultrasound images simulated by a convolutional approach, which does not include geometrical information about the transducer. The segmentation of 567 simulated volumes produced an average point-to-surface error of 1.9 mm (1.73 ± 0.05 mm for epicardium and 2 ± 0.07 mm for endocardium), which is comparable to the voxel size (1.1 × 1.1 × 0.7 mm). The segmentation of 20 in vivo datasets was less precise, especially in the middle and apical parts, where the myocardium was either badly defined or indistinguishable from the surrounding tissues. The average error was 3.5 mm (3.44 ± 0.4 mm for epicardium and 3.73 ± 0.4 mm for endocardium), which is consistent with other US segmentation studies. We believe that these results can be made more accurate, and we are working on improving the ASM appearance model as well as on simulating more realistic US images.
ACKNOWLEDGMENTS This work was partially supported by MEC TEC2006-03617, ISCIII FIS2004/40676, and CDTI CENIT-CDTEAM grants. The work of AFF is supported by the Spanish Ministry of Education & Science under a Ramon y Cajal Research Fellowship. The CILab is part of the ISCIII CIBER-BBN (CB06/01/0061).
NOMENCLATURE
∗: Convolution operator
σn: Variance of G (σn ; x, y, z)
c: The speed of sound in the tissue, typically 1540 m/s
f0: The center frequency of a transducer
G (σn ; x, y, z): Gaussian white noise with zero mean and variance σn
h (x, y, z): Ultrasound point spread function
i: Imaginary unit
t (x, y, z): Echogenicity model of the object being imaged
V (x, y, z): The 3D ultrasonic echo dataset
V̂ (x, y, z): Hilbert transform of V (x, y, z)
ASM: Active Shape Model
FOV: Field of View
LV: Left Ventricle
PSF: Point Spread Function
RMSE: Root mean square error
SD: Standard deviation
US: Ultrasound
REFERENCES
1. N. P. Nikitin, C. Constantin, P. Loh, J. Ghosh, E. I. Lukaschuk, A. Bennett, S. Hurren, F. Alamgir, A. L. Clark, and J. G. Cleland, “New generation 3-dimensional echocardiography for left ventricular volumetric and functional measurements: comparison with cardiac magnetic resonance,” Eur J Echocardiogr 7(5), pp. 365–372, 2006.
2. N. G. Bellenger, M. I. Burgess, S. G. Ray, A. Lahiri, A. J. S. Coats, J. G. F. Cleland, and D. J. Pennell, “Comparison of left ventricular ejection fraction and volumes in heart failure by echocardiography, radionuclide ventriculography and cardiovascular magnetic resonance. Are they interchangeable?,” Eur Heart J 21, pp. 1387–1396, 2000.
3. G. I. Sanchez-Ortiz, G. J. T. Wright, N. Clarke, J. Declerck, A. P. Banning, and J. A. Noble, “Automated 3-D echocardiography analysis compared with manual delineations and SPECT MUGA,” IEEE Transactions on Medical Imaging 21(9), pp. 1069–1076, 2002.
4. M. Song, R. M. Haralick, F. H. Sheehan, and R. K. Johnson, “Integrated surface model optimization for freehand three-dimensional echocardiography,” IEEE Transactions on Medical Imaging 21(9), pp. 1077–1090, 2002.
5. J. Kuo, B. Z. Atkins, K. A. Hutcheson, and O. T. von Ramm, “Left ventricular wall motion analysis using real-time three-dimensional ultrasound,” Ultrasound Med Biol 31(2), pp. 203–211, 2005.
6. E. D. Angelini, S. Homma, G. Pearson, J. W. Holmes, and A. F. Laine, “Segmentation of real-time three-dimensional ultrasound for quantification of ventricular function: A clinical study on right and left ventricles,” Ultrasound in Medicine & Biology 31(9), pp. 1143–1158, 2005.
7. W. Hong, B. Georgescu, X. S. Zhou, S. Krishnan, Y. Ma, and D. Comaniciu, Lecture Notes in Computer Science, vol. 3954, ch. Database-Guided Simultaneous Multi-slice 3D Segmentation for Volumetric Data, pp. 397–409. Springer, 2006.
8. S. Ordas and A. F. Frangi, “Automatic quantitative analysis of myocardial wall motion and thickening from long- and short-axis cine MRI studies,” in Proceedings of the 27th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS), 2005.
9. B. P. F. Lelieveldt, A. F. Frangi, S. C. Mitchell, H. C. VanAssen, S. Ordas, J. H. C. Reiber, and M. Sonka, Mathematical Models of Computer Vision: The Handbook, ch. 3D Active Shape and Appearance Models in medical image analysis, pp. 471–487. Springer, 2005.
10. A. F. Frangi, D. Rueckert, J. A. Schnabel, and W. J. Niessen, “Automatic construction of multiple-object three-dimensional statistical shape models: Application to cardiac modeling,” IEEE Trans Med Imaging 21(9), pp. 1151–1166, 2002.
11. S. Ordas, L. Boisrobert, M. Bossa, M. Laucelli, M. Huguet, S. Olmos, and A. F. Frangi, “Grid-enabled automatic construction of a two-chamber cardiac PDM from a large database of dynamic 3D shapes,” in IEEE International Symposium of Biomedical Imaging, pp. 416–419, 2004.
12. Y. Yu and S. T. Acton, “Speckle reducing anisotropic diffusion,” IEEE Transactions on Image Processing 11(11), pp. 1260–1270, 2002.
13. P. Abbott and M. Braun, “Simulation of ultrasound image data by a quadrature method,” in Proc. Eng. Phys. Sci. Med. Health Conf., 209, 1996.
14. J. C. Bamber and R. J. Dickinson, “Ultrasonic B-scanning: A computer simulation,” Phys. Med. Biol. 25, pp. 463–479, 1980.
15. J. A. Jensen, “Field: A program for simulating ultrasound systems,” Medical & Biological Engineering & Computing, Supplement 1, Part 1 34, pp. 351–353, 1996.