Performance of 3D integral imaging with position uncertainty

Behnoosh Tavakoli,1 Mehdi Danesh Panah,1 Bahram Javidi,1 and Edward Watson2

1Dept. of Electrical and Computer Engineering, U-2157, University of Connecticut, Storrs, CT, USA 06269-2157
2U.S. Air Force Research Laboratory, Wright-Patterson Air Force Base, OH 45433
[email protected]; [email protected]

Abstract: We present theoretical and simulation results on the analysis of the Synthetic Aperture Integral Imaging (SAII) technique and its sensitivity to pickup position uncertainty. SAII is a passive three dimensional imaging technique based on multiple image acquisitions with different perspectives of the scene under incoherent or natural illumination. In practical SAII applications, there is always an uncertainty associated with the position at which each sensor captures its elemental image. We present a theoretical analysis that quantifies the image degradation in terms of the Mean Square Error (MSE) metric. Simulation results are also presented to identify the parameters affecting the reconstruction degradation and to confirm the analysis. We show that in SAII with a given uncertainty in the sensor locations, the high spatial frequency content of the 3D reconstructed images is degraded the most. We also show an inverse relationship between the reconstruction distance and the degradation metric. To the best of our knowledge, this is the first time that the effects of sensor position uncertainty on 3D computational reconstruction in synthetic aperture integral imaging systems have been quantitatively analyzed.

©2007 Optical Society of America

OCIS codes: (100.3010) Image reconstruction techniques; (110.6880) Three-dimensional image acquisition; (100.6890) Three-dimensional image processing; (999.9999) Synthetic aperture imaging.

References and Links
1. S. A. Benton, ed., Selected Papers on Three-Dimensional Displays (SPIE Optical Engineering Press, Bellingham, WA, 2001).
2. B. Javidi and F. Okano, eds., Three Dimensional Television, Video, and Display Technologies (Springer, Berlin, 2002).
3. T. Okoshi, Three-dimensional Imaging Techniques (Academic Press, New York, 1976).
4. B. Javidi, S.-H. Hong, and O. Matoba, "Multi dimensional optical sensors and imaging systems," Appl. Opt. 45, 2986-2994 (2006).
5. M. G. Lippmann, "Epreuves reversibles donnant la sensation du relief," J. Phys. 7, 821-825 (1908).
6. Y. A. Dudnikov, "On the design of a scheme for producing integral photographs by a combination method," Sov. J. Opt. Technol. 41, 426-429 (1974).
7. H. E. Ives, "Optical properties of a Lippmann lenticuled sheet," J. Opt. Soc. Am. 21, 171-176 (1931).
8. P. Sokolov, "Autostereoscopy and Integral Photography by Professor Lippmann's Method," (Moscow State Univ. Press, Moscow, Russia, 1911).
9. F. Okano, H. Hoshino, J. Arai, and I. Yuyama, "Real time pickup method for a three-dimensional image based on integral photography," Appl. Opt. 36, 1598-1603 (1997).
10. A. Stern and B. Javidi, "Three-dimensional image sensing, visualization, and processing using integral imaging," Proc. IEEE 94, 591-607 (2006).
11. B. Wilburn, N. Joshi, V. Vaish, A. Barth, A. Adams, M. Horowitz, and M. Levoy, "High performance imaging using large camera arrays," Proc. of the ACM 24, 765-776 (2005).
12. J. S. Jang and B. Javidi, "Three-dimensional synthetic aperture integral imaging," Opt. Lett. 27, 1144-1146 (2002), http://www.opticsinfobase.org/abstract.cfm?URI=ol-27-13-1144.
13. B. Burckhardt, "Optimum parameters and resolution limitation of integral photography," J. Opt. Soc. Am. 58, 71-76 (1968).
14. H. Hoshino, F. Okano, H. Isono, and I. Yuyama, "Analysis of resolution limitation of integral photography," J. Opt. Soc. Am. A 15, 2059-2065 (1998).


15. S.-H. Hong, J.-S. Jang, and B. Javidi, "Three-dimensional volumetric object reconstruction using computational integral imaging," Opt. Express 12, 483-491 (2004).
16. A. Stern and B. Javidi, "3-D computational synthetic aperture integral imaging (COMPSAII)," Opt. Express 11, 2446-2451 (2003).
17. Y. Igarishi, H. Murata, and M. Ueda, "3D display system using a computer-generated integral photograph," Jpn. J. Appl. Phys. 17, 1683-1684 (1978).
18. L. Erdmann and K. J. Gabriel, "High resolution digital photography by use of a scanning microlens array," Appl. Opt. 40, 5592-5599 (2001).
19. S. Kishk and B. Javidi, "Improved resolution 3D object sensing and recognition using time multiplexed computational integral imaging," Opt. Express 11, 3528-3541 (2003).
20. R. Martínez-Cuenca, G. Saavedra, M. Martinez-Corral, and B. Javidi, "Enhanced depth of field integral imaging with sensor resolution constraints," Opt. Express 12, 5237-5242 (2004).
21. J.-S. Jang and B. Javidi, "Improved viewing resolution of three-dimensional integral imaging by use of nonstationary micro-optics," Opt. Lett. 27, 324-326 (2002).
22. J. Hong, J.-H. Park, S. Jung, and B. Lee, "Depth-enhanced integral imaging by use of optical path control," Opt. Lett. 29, 1790-1792 (2004), http://www.opticsinfobase.org/abstract.cfm?URI=ol-29-15-1790.
23. S.-W. Min, J. Kim, and B. Lee, "Wide-viewing projection-type integral imaging system with an embossed screen," Opt. Lett. 29, 2420-2422 (2004).
24. M. Martínez-Corral, B. Javidi, R. Martínez-Cuenca, and G. Saavedra, "Integral imaging with improved depth of field by use of amplitude modulated microlens array," Appl. Opt. 43, 5806-5813 (2004).
25. Y. S. Hwang, S.-H. Hong, and B. Javidi, "Free View 3-D Visualization of Occluded Objects by Using Computational Synthetic Aperture Integral Imaging," J. Display Technol. 3, 64-70 (2007), http://www.opticsinfobase.org/abstract.cfm?URI=jdt-3-1-64.
26. S. Yeom, B. Javidi, and E. Watson, "Photon counting passive 3D image sensing for automatic target recognition," Opt. Express 13, 9310-9330 (2005).
27. Y. Frauel and B. Javidi, "Digital three-dimensional image correlation by use of computer-reconstructed integral imaging," Appl. Opt. 41, 5488-5496 (2002).
28. J. Arai, M. Okui, M. Kobayashi, and F. Okano, "Geometrical effects of positional errors in integral photography," J. Opt. Soc. Am. A 21, 951-958 (2004), http://www.opticsinfobase.org/abstract.cfm?URI=josaa-21-6-951.
29. N. Mukhopadhyay, Probability and Statistical Inference (Marcel Dekker, Inc., New York, 2000).

1. Introduction

There has been increasing interest in three dimensional (3D) sensing and imaging systems among researchers in recent years, and new venues have emerged in which 3D imaging has proven to be useful [1-4]. Among the different techniques of 3D imaging and display, Integral Imaging (II) is based on the idea of integral photography [5-9], in which multiple 2D intensity images are acquired from different locations, each with a different perspective of a 3D scene. This is in contrast with holographic techniques, which are based on complete wavefront recording and reconstruction [2,3]. In II, perspective views can be generated by a microlens array or, for applications requiring longer range or better 3D reconstruction resolution, by translating an imaging device over a synthetic aperture (SAII) [10-14]. The captured 2D images, known as elemental images, convey both directional and intensity information and can be used for optical or computational visualization of the scene data [16-20]. There have been efforts to improve the resolution, viewing angle, depth of focus and eye accommodation of the Integral Imaging technique in both the pickup and the display stages. On the pickup side, methods such as time multiplexing computational integral imaging [19] and Synthetic Aperture Integral Imaging (SAII) [11,12] have been proposed to increase the resolution of optical or computational reconstructions. In addition, techniques such as nonstationary micro optics [21], optical path control [22], use of an embossed screen [23] or amplitude modulation of individual lenslets [24] have been investigated to improve resolution or depth of focus.
In computational reconstruction of the scene, the captured 2D elemental images are back projected onto the object plane at the desired distance from the pickup plane, i.e. the reconstruction plane, taking into account the position from which each was taken. By doing this, the light cones originally emanating from that specific reconstruction plane again overlap on the same spot, thus creating a focused image. However, for light cones emitted from any other plane, the overlap does not occur on the same spot, thus creating a blurred image. In fact, the further the origin of the light rays from the reconstruction plane, the more defocused they appear after reconstruction. It is also possible to reconstruct a 3D scene from different viewing angles or behind an occlusion [25].


Computational reconstruction of elemental images has proved to be promising in a variety of applications including 3D object recognition, occlusion removal and multiple viewing point generation [25-27].
In practical II systems, the position of the imaging sensor is known only to within a degree of uncertainty. This raises the question of the sensitivity of image reconstruction to the extent of positioning uncertainty in the pickup stage. In this paper we present a mathematical analysis of the sensitivity of the 3D computational reconstruction quality of the Synthetic Aperture Integral Imaging (SAII) technique to pickup position errors. The problem of position errors in lenslet based integral imaging was considered previously in [28]. The assumption of the analysis in [28] is that position changes appear as deterministic (known) shifts in the position of each lenslet during the optical reconstruction, while the elemental images are kept in their original location as captured by the lenslets and sensor (camera). Their results indicate that the misalignment between sensor and lenslet produces an image split which becomes more severe with increasing reconstruction distance. Also, a shift appears between the intended reconstruction plane and the plane of least confusion. Our problem statement in this paper is very different from that of reference [28]. We consider stochastic (unknown) position errors for the sensor during the capture stage, which is relevant for a sensor placed on a moving platform. In this case, the lens and the sensor are aligned while the position of the whole capture package (elemental image) is uncertain. We investigate the degradation of 3D computational reconstruction with statistical metrics. Due to the differences between the problem statement of this paper and that of reference [28], the results and conclusions that we derive here for a moving platform sensor with unknown (stochastic) position errors are in complete contrast to those of reference [28], which considers deterministic (known) position changes while the elemental images are kept in their original location.
In section 2 we give a brief explanation of the concepts of integral imaging and computational reconstruction. Section 3 describes SAII and examines its vulnerability to pickup position error, while in section 4 we present the general mathematical analysis of SAII computational reconstruction sensitivity as well as a Monte Carlo simulation for a point source imaged via SAII. Results of experiments with 3D objects are presented in section 5 and we conclude in section 6.

2. Computational reconstruction in integral imaging

The pickup process of synthetic aperture integral imaging is illustrated in Fig. 1.

Fig. 1. Pickup process of three dimensional synthetic aperture integral imaging. The camera at each location captures the scene from a unique perspective, which is later used for 3D computational reconstruction.

One possible approach to 3D II reconstruction with a computer is to simulate the pinhole array through which the elemental images were taken. Since each elemental image conveys a unique perspective of the scene, directional information is recorded in an integral image in addition to the 2D intensity information. The depth information in particular can


be extracted from the relative shift of an object between the elemental images. Therefore, a 2D scene can be reconstructed at a particular distance by properly propagating the rays back from their respective pickup locations. The collection of 2D scenes reconstructed at all distances then gives the full 3D scene [15]. Alternatively, one can shrink the spatial grid on which the elemental images are taken, proportional to the desired reconstruction distance, while keeping the size of the elemental images constant. In effect, the pickup grid separation is divided by the magnification factor (z0/g), where z0 is the distance between the desired plane of reconstruction and the sensor along the optical axis, and g denotes the distance of the image plane from each lens [see Fig. 2]. This approach is more appropriate for SAII, in which each elemental image is captured on a full sensor and has a better resolution compared to lenslet based elemental images. An additional merit of this approach on the computational side is that there is no need to handle magnified images (very large matrices) in the reconstruction process. This greatly reduces the reconstruction time, required resources and computational burden. The magnification factor M is given by z0/g. The original elemental image Okl(x,y) is flipped and shifted according to the following expression:

I_{kl}(x, y, z_0) = O_{kl}\left( -x + \left(1 + \frac{1}{M}\right) S_x k,\; -y + \left(1 + \frac{1}{M}\right) S_y l \right),     (1)

in which Sx and Sy denote the separation of the sensors in the x and y directions at the pickup plane, respectively, whereas the subscripts k and l signify the location of elemental image Okl in the pickup grid. The final reconstruction plane consists of the partial overlap of the flipped and shifted elemental images as:

I(x, y, z_0) = \sum_{k=1}^{K} \sum_{l=1}^{L} I_{kl}(x, y, z_0) / R^2(x, y),     (2)

where K and L denote the number of elemental images acquired in the x and y directions; also, R compensates for the intensity variation due to the different distances from the object plane to elemental image Okl on the sensor and is given by [see Fig. 2]:

R^2(x, y) = \left[ (z_0 + g)^2 + (Mx - S_x k)^2 + (My - S_y l)^2 \right] \left( \frac{1}{M} + 1 \right)^2.     (3)

Fig. 2. Schematic of the II reconstruction process (left) and the arrangement of the elemental images (right). The labels show the pickup pitches Sx and Sy, the sensor dimensions Nx and Ny, the lens positions (kSx, lSy), the image-plane distance g, the reconstruction point (Mx, My) at distance z0, and the magnification M = z0/g.

In essence, Eq. (3) is the square of the distance of each pixel in the elemental image from the corresponding pixel in the reconstruction plane. However, for most cases of interest, such as long range 3D imaging, Eq. (3) is dominated by the term (z0+g)^2. This is due to the fact that


for reconstruction we always have x ≤ S_x k / M + N_x / 2, or equivalently Mx − S_x k ≤ M N_x / 2, where N_x is the sensor dimension in the x direction. So, if the viewing field is smaller than the reconstruction distance, M N_x < z_0, then the term (z_0+g)^2 dominates in Eq. (3). This condition is equivalent to having an imaging sensor (CCD) which is small in dimensions compared to the effective focal length of the imaging optics. Note that in computational reconstruction the adjacent flipped and shifted elemental images overlap such that for objects close to the reconstruction plane the overlap of all elemental images is aligned, whereas for objects located away from the reconstruction plane the overlap is out of alignment, resulting in a blurred reconstruction. Thus, with computational reconstruction one is able to get an in-focus image of an object at the correct reconstruction distance, while the rest of the scene appears out of focus.
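As an illustration of the back-projection described by Eqs. (1)-(2), the following Python/NumPy sketch (our own, not part of the original work) superimposes flipped and shifted elemental images on a reconstruction canvas. The array layout, the integer-pixel rounding of the shifts, and the normalization by the overlap count (justified when R^2 is dominated by the constant term (z0+g)^2, as discussed above) are simplifying assumptions.

import numpy as np

def reconstruct_plane(elemental, Sx, Sy, pixel_pitch, g, z0):
    """Back-project elemental images onto the plane at distance z0 (Eqs. (1)-(2)).

    elemental   : array of shape (K, L, H, W) holding the K x L elemental images.
    Sx, Sy      : pickup pitch in the x and y directions (same units as pixel_pitch).
    pixel_pitch : sensor pixel size.
    g, z0       : lens-to-sensor distance and reconstruction distance (same units).
    """
    K, L, H, W = elemental.shape
    M = z0 / g                                     # magnification factor M = z0/g
    # Shift per grid index, (1 + 1/M) * pitch, expressed in pixels.
    sx = (1.0 + 1.0 / M) * Sx / pixel_pitch
    sy = (1.0 + 1.0 / M) * Sy / pixel_pitch
    out = np.zeros((H + int(np.ceil(sy * (L - 1))), W + int(np.ceil(sx * (K - 1)))))
    overlap = np.zeros_like(out)
    for k in range(K):
        for l in range(L):
            img = elemental[k, l, ::-1, ::-1]      # flip (the -x, -y arguments in Eq. (1))
            dx = int(round(sx * k))                # shift of the k-th image column
            dy = int(round(sy * l))                # shift of the l-th image row
            out[dy:dy + H, dx:dx + W] += img
            overlap[dy:dy + H, dx:dx + W] += 1.0
    # R^2 in Eq. (2) is dominated by the constant (z0 + g)^2 for long-range scenes,
    # so we simply normalize by the number of superimposed images at each pixel.
    return out / np.maximum(overlap, 1.0)

Repeating the call for a range of z0 values produces the focal stack of 2D reconstructions described in the text.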

3. Synthetic Aperture Integral Imaging (SAII)

Conventional II systems use an array of small lenses mounted on a planar surface (lenslet array) to capture elemental images on a single opto-electronic sensor. Each lens creates a unique perspective view of the scene at its image plane. As long as the elemental images do not overlap in the image plane of the lenslet array, one can capture all elemental images on a CCD at once. This technique has the merits of simplicity and speed. However, for objects close to the lenslet array, the elemental images may overlap in the image plane, which requires one to use additional optics to project the separated elemental images onto a sensor. Also, the small aperture of the lenslets creates low resolution or aberrated elemental images. In addition, the pixels of the imaging device have to be divided between all elemental images, which leads to a low number of pixels per elemental image.
Synthetic Aperture Integral Imaging (SAII) is a method which can potentially resolve some of the problems associated with conventional II. In this technique, each perspective view is acquired separately by a 2D imaging sensor located at the respective location on the pickup aperture. Thus, this method involves mechanical translation of one or more imaging devices on a grid and capturing elemental images at certain positions. In this way, high resolution perspective views are captured over the synthetic aperture which the imaging device scans. Since capturing the perspective views requires multiple acquisitions, SAII in this form is not suitable for imaging dynamic objects whose movements are faster than the time required for a complete aperture scan. However, [11] proposes a solution to this problem by introducing an array of imaging devices (cameras) on a grid. In addition, since the elemental images can be captured with well corrected optics on a large optoelectronic sensor, the resolution and aberration of each elemental image can be improved dramatically compared to lenslet based II.
Almost all synthetic aperture techniques involve information collection from different spatial locations. Thus, a practical concern in such methods is the uncertainty associated with the accuracy of spatial localization of the information collecting apparatus and its (usually) negative effect on the final processed data. However, the sensitivity of different synthetic aperture methods to such errors depends on the parameters of the system and thus is highly application specific. Along the same lines, the sensitivity of SAII to uncertainty in pickup positions is a natural question that is of prominent importance.

4. Sensitivity analysis of SAII

In order to analyze the effect of sensor positioning uncertainty at the pickup stage on the quality of the 3D reconstructed images, a random position error, Δp, is introduced in the position from which each elemental image of an accurate SAII pickup is reconstructed. In other words, the ith elemental image, Oi(p), is shifted from its original location by Δp, where p=(x,y) denotes the position variable in the sensor plane. To quantify the image degradation in 3D computational reconstruction due to dislocation errors in the pickup stage of SAII, we use the well known Mean Square Error (MSE) metric.


This metric provides an estimate of the average error per pixel at the reconstruction plane that should be expected for a specific probability distribution of sensor positioning errors. For ease of presentation we initially assume that the positioning errors for all elemental images lie along the same direction, for instance x. Since we eventually want to analyze random displacements which have independent components in the x and y directions, extension of the results to the more realistic case of two dimensional position errors is straightforward. To avoid unnecessary complications in the equations, we use a single index rather than two indices for each elemental image. However, we do not change the arrangement of the elemental images from their 2D grid. Thus, the reconstruction process still remains a 2D mixing that takes into account the position of each elemental image on the 2D pickup grid. According to Eq. (2), the reconstruction with dislocated elemental images can be written as:

I_e(x, y, z) = \sum_{i=1}^{K \times K} I_i\!\left(x + \frac{\Delta p_i}{M}, y\right) / R^2,     (4)

where K denotes the number of elemental images taken in each direction on the synthetic aperture, Δp_i denotes the sensor location error associated with the ith sensor during pickup, and the magnification factor M is given by z/g. We define the difference between Eqs. (2) and (4) as the error metric:

err(\mathbf{p}, z) = I(\mathbf{p}, z) - I_e(\mathbf{p}, z).     (5)

Since we assume that a stochastic process governs the dislocation of the individual sensors, we need to study the expected value of the error associated with a specific location error probability distribution. The average error resulting from different realizations of the location error distribution can be written as:

E\left\{ |err(x, y, z)|^2 \right\} = \frac{1}{R^4} E\left\{ \left| \sum_{i=1}^{K \times K} \left[ I_i(x, y) - I_i\!\left(x + \frac{\Delta p_i}{M}, y\right) \right] \right|^2 \right\},     (6)

in which E(.) is the expectation operator; R is a function of system parameters and is dominated by (z+g) in Eq. (3). We proceed with the analysis in the Fourier domain where the stochastic spatial shifts of elemental images appear as phase terms. Thus err(p,z) can be written as:

Err(\mathbf{f}, z) = F.T.\{err(\mathbf{p}, z)\} = \frac{1}{R^2} \sum_{i=1}^{K \times K} \left[ \tilde{I}_i - \tilde{I}_i e^{-j f_x \frac{\Delta p_i}{M}} \right] = \frac{1}{R^2} \sum_{i=1}^{K \times K} \left[ \tilde{I}_i(f_x, f_y) - \tilde{I}_i(f_x, f_y) e^{-j f_x \frac{\Delta p_i}{M}} \right].     (7)

The symbol ~ denotes the Fourier transform; also, f=(f_x, f_y) stands for the pair of spatial frequencies in the x and y directions, such that \tilde{I} = \tilde{I}(f_x, f_y) = F.T.\{I(\mathbf{p})\}. Note that at this stage we assume that the random displacements are all in the x direction, thus the shift appears as a phase term exp(−j f_x Δp_i / M) in Eq. (7) incorporating only f_x. Henceforth, we use the term error spectrum for |Err(f,z)|^2. We show in appendix A how one obtains the following expression for the expected value of the error spectrum:

E\left\{ |Err(\mathbf{f}, z)|^2 \right\} = \frac{1}{R^4} \left[ \sum_{i}^{K \times K} (2 - \gamma - \gamma^*) |\tilde{I}_i|^2 + |1 - \gamma|^2 \sum_{i \neq j}^{K \times K} \sum_{j \neq i}^{K \times K} \tilde{I}_i \tilde{I}_j^* \right],     (8)



for which γ = E{exp(−j f_x Δp / M)} is the moment generating function of the random variable Δp and can be derived for all basic probability distributions, e.g. Gaussian, uniform, Laplacian, etc. [29]. Hereafter, we assume that the camera location error follows a zero mean


Gaussian distribution with variance σ^2, i.e. N(0, σ^2). However, it is straightforward to assume other distributions. For a Gaussian random variable X~N(μ, σ^2) the moment generating function is M_X(t) = E{e^{tX}} = exp(μt + σ^2 t^2 / 2) [29]; thus for Δp~N(0, σ^2), γ becomes a real number, γ = E{exp(−j f_x Δp / M)} = exp(−f_x^2 σ^2 / 2M^2). Essentially this parameter depends on the reconstruction distance and on the random behavior of the dislocation errors, and it can be explicitly calculated for any given error distribution. Equation (8) can be further simplified to:

E\left\{ |Err(\mathbf{f}, z)|^2 \right\} = \frac{2(1 - \gamma)}{R^4} \sum_{i}^{K \times K} |\tilde{I}_i|^2 + \frac{(1 - \gamma)^2}{R^4} \sum_{i \neq j}^{K \times K} \sum_{j \neq i}^{K \times K} \tilde{I}_i \tilde{I}_j^*.     (9)
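As a quick numerical sanity check (our own illustration, not part of the paper), the Gaussian form of γ used above can be verified by sampling position errors and averaging the phase term; the parameter values below are arbitrary examples.

import numpy as np

rng = np.random.default_rng(0)
sigma, g, z = 1.0, 20.0, 300.0                       # mm; example values only
M = z / g                                            # magnification factor
fx = 2.0                                             # one spatial frequency (rad/mm), arbitrary

dp = rng.normal(0.0, sigma, size=200_000)            # Gaussian sensor position errors
gamma_sampled = np.mean(np.exp(-1j * fx * dp / M))   # E{exp(-j fx Δp / M)}
gamma_analytic = np.exp(-fx**2 * sigma**2 / (2.0 * M**2))
print(gamma_sampled.real, gamma_analytic)            # should agree to sampling accuracy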

For the general case when the dislocation has both x and y components, Eq. (9) can be extended to the following:



E\left\{ |Err(\mathbf{f}, z)|^2 \right\} = \frac{2(1 - \gamma^2)}{R^4} \sum_{i}^{K \times K} |\tilde{I}_i|^2 + \frac{(1 - \gamma^2)^2}{R^4} \sum_{i \neq j}^{K \times K} \sum_{j \neq i}^{K \times K} \tilde{I}_i \tilde{I}_j^*.     (10)

The total expected error is the area under the error spectrum over all spatial frequencies. According to Parseval's theorem we have:

\int |err(\mathbf{p}, z)|^2 \, d\mathbf{p} = \int |Err(\mathbf{f}, z)|^2 \, d\mathbf{f},     (11)

and since the expectation and integration operations can be exchanged, one has the following relationship for the MSE at distance z in the reconstruction plane:

MSE(z) = \int E\left\{ |err(\mathbf{p}, z)|^2 \right\} d\mathbf{p} = \int E\left\{ |Err(\mathbf{f}, z)|^2 \right\} d\mathbf{f}
       = \frac{1}{(z + g)^4} \left\{ \int \left[ 2(1 - \gamma^2) \sum_{i}^{K \times K} |\tilde{I}_i|^2 \right] d\mathbf{f} + \int \left[ (1 - \gamma^2)^2 \sum_{i \neq j}^{K \times K} \sum_{j \neq i}^{K \times K} \tilde{I}_i \tilde{I}_j^* \right] d\mathbf{f} \right\}.     (12)



In the first term on the right hand side of Eq. (12), (1 − γ^2) acts as a weight for the energy spectrum of the shifted elemental images. As discussed earlier, in the case of a Gaussian positioning error, (1 − γ^2) = [1 − exp(−f_x^2 σ^2 / M^2)], which is a high pass function. This means that the higher spectral components of the elemental images contribute more significantly to the MSE of each plane compared to the energy contained in the low spatial frequencies. In addition, at larger reconstruction distances, the stop band of this filter becomes wider and thus diminishes more of the spectral components of the elemental images, consequently reducing the resulting MSE. The bandwidth of this high pass filter depends solely on the ratio of the standard deviation of the positioning error probability distribution, σ, to the magnification factor M. In particular, for a Gaussian sensor positioning error, i.e. Δp~N(0, σ^2), the Full Width at Half Maximum (FWHM) is given by:

FWHM = 2M\sqrt{\ln 2} / \sigma = 1.66\, z / (g\sigma).     (13)
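For concreteness, the short sketch below (our own illustration) evaluates the high pass weight (1 − γ^2) and the FWHM of Eq. (13) for parameters close to those of the experiments reported later (g = 2 cm, z between 24 cm and 40 cm); the choice σ = 1 mm and the frequency samples are assumptions.

import numpy as np

sigma = 1.0                                    # std of the positioning error, mm (assumed)
g = 20.0                                       # lens-to-sensor distance, mm
for z in (240.0, 400.0):                       # reconstruction distances, mm
    M = z / g                                  # magnification factor
    fwhm = 2.0 * M * np.sqrt(np.log(2.0)) / sigma        # Eq. (13)
    fx = np.linspace(0.0, 60.0, 7)                        # spatial frequencies (rad/mm)
    weight = 1.0 - np.exp(-fx**2 * sigma**2 / M**2)       # high pass weight (1 - gamma^2)
    print(z, round(fwhm, 1), np.round(weight, 3))

The printed weights confirm that low spatial frequencies are suppressed, and that the suppressed band widens (and the weights shrink) as z, and hence M, increases.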

The inverse relationship of FWHM with σ is confirmed by the fact that with better positioning accuracy (small σ), more spatial frequencies are going to be suppressed in Eq. (12) and thus the MSE would be reduced. This is the primary reason that we can expect to see more degradation near the edges of the objects in focus at the reconstruction plane and less degrading effect in areas containing low spatial frequencies or objects that are out of focus. Also, FWHM increases with reconstruction distance z, which means one should expect less degradation when reconstructing far objects. This important outcome has been verified and demonstrated in the experimental results section of this paper. Likewise, the second term in the right hand side of Eq. (12) also filters out the low spatial frequencies from the cross correlation of two distinct flipped and shifted elemental images. This has the same effect as


described for the first term, with the only difference that (1 − γ^2)^2 has a larger FWHM compared to (1 − γ^2) in the first term and thus contributes less to the total MSE.
In order to understand how the uncertainty in the location of each sensor in the pickup process affects the 3D computational reconstruction statistically, we use the Monte Carlo simulation method to estimate the deviation between the ideal 3D reconstructions and the ones obtained with randomly dislocated sensors. We simulate integral imaging of a point source from a rectangular pickup grid using geometrical optics; however, for clarity we first carry out the analysis and simulation for a one dimensional linear array with K elemental images. Extension to 2D pickup arrays is straightforward. Without loss of generality, we assume the point source to be located at distance z from the pickup plane on the optical axis of the first elemental image, i.e. δ(p); the distance between the image plane and the lens is g [see Fig. 2]; also, the camera is linearly translated with pitch Sp. Thus, according to Eq. (1), the kth flipped and shifted elemental image is:

I_k(p) = \delta\!\left(-p + k S_p (1 + g/z)\right),     (14)

where p is the position variable. According to Eq. (9) we have the following expression for the error spectrum:

E\left\{ |Err(f_x, z)|^2 \right\} = \frac{2(1 - \gamma)}{(z + g)^4} K + \frac{2(1 - \gamma)^2}{(z + g)^4} \sum_{k \neq l}^{K} \sum_{l \neq k}^{K} e^{-j f_x S_p (1 + g/z) k} \, e^{+j f_x S_p (1 + g/z) l}.     (15)

Equation (15) can be rewritten as:

E\left\{ |Err(f_x, z)|^2 \right\} = \frac{2\gamma(1 - \gamma)}{(z + g)^4} K + \frac{2(1 - \gamma)^2}{(z + g)^4} \frac{\sin^2\!\left(K f_x S_p (1 + g/z)\right)}{\sin^2\!\left(f_x S_p (1 + g/z)\right)}.     (16)

In particular, when the distribution governing the displacement of the sensors is Δp~N(0, σ^2), one has γ = exp(−f_x^2 σ^2 g^2 / 2z^2) and Eq. (16) can be rearranged to:

E\left\{ |Err(f_x, z)|^2 \right\} = \frac{2\left(1 - e^{-f_x^2 \sigma^2 g^2 / 2z^2}\right)}{(z + g)^4} \left[ K e^{-f_x^2 \sigma^2 g^2 / 2z^2} + \left(1 - e^{-f_x^2 \sigma^2 g^2 / 2z^2}\right) \frac{\sin^2\!\left(K f_x S_p (1 + g/z)\right)}{\sin^2\!\left(f_x S_p (1 + g/z)\right)} \right].     (17)

Note that the total MSE expected from such a random displacement of sensors is the total energy of the error spectrum, as in Eq. (12). We perform a Monte Carlo simulation to statistically compute the MSE for the case of imaging a point source with a one dimensional linear array in which the position of the sensor is perturbed about a perfect linear grid of pinholes according to the random variable Δp~N(0,1). The elemental images of the point source located at distance z are calculated through geometrical optics. The simulation is repeated for 1000 trials with different random positions. In each trial, the set of elemental images is used to reconstruct the point source at its respective distance and the result is compared with the reconstruction obtained using correctly positioned sensors. In each trial, the MSE is computed according to Eqs. (6) and (11), and the 1000 MSEs for each reconstruction distance are averaged. The total MSE computed from the Monte Carlo simulation is compared to the MSE calculated using Eq. (17) with the same parameters, K=16, Sp=2.5mm and Δp~N(0,1), for the point source located at distances from z=24cm to z=40cm. Figure 3 shows the agreement between the simulation results and the mathematical analysis.
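A compact frequency-domain version of this comparison can be sketched as follows (our own illustration, not the authors' code). The per-trial error spectrum follows Eq. (7), the per-trial MSE is obtained through Eq. (11), and the expected spectrum is evaluated directly from Eq. (9) using the point-source spectra of Eq. (14); Eq. (17) is the closed form of the same quantity. The finite frequency window and its sampling are arbitrary choices applied identically to both curves.

import numpy as np

rng = np.random.default_rng(1)
K, Sp, g, sigma = 16, 2.5, 20.0, 1.0             # pitch Sp and g in mm, sigma = 1 mm
fx = np.linspace(-10.0, 10.0, 4001)              # spatial-frequency window (rad/mm), assumed
df = fx[1] - fx[0]

for z in (240.0, 320.0, 400.0):                  # reconstruction distances, mm (24-40 cm)
    M = z / g
    k = np.arange(K)
    # Spectra of the flipped and shifted point-source elemental images, Eq. (14).
    I_tilde = np.exp(-1j * np.outer(fx, k * Sp * (1.0 + g / z)))        # shape (F, K)

    # Expected spectrum from Eq. (9) with the Gaussian gamma.
    gamma = np.exp(-fx**2 * sigma**2 / (2.0 * M**2))
    cross = np.abs(I_tilde.sum(axis=1))**2 - K          # sum over i != j of Ii * conj(Ij)
    spec = (2.0 * (1.0 - gamma) * K + (1.0 - gamma)**2 * cross) / (z + g)**4
    mse_analytic = np.sum(spec) * df

    # Monte Carlo: average the per-trial error energy over random dislocations.
    trials, mse_mc = 1000, 0.0
    for _ in range(trials):
        dp = rng.normal(0.0, sigma, size=K)
        err = (I_tilde * (1.0 - np.exp(-1j * np.outer(fx, dp / M)))).sum(axis=1) / (z + g)**2
        mse_mc += np.sum(np.abs(err)**2) * df
    print(z, mse_analytic, mse_mc / trials)       # the two values should agree closely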


Fig. 3. Comparison of the MSE results of the Monte Carlo simulation and Eq. (17) for a point source located at z=24cm to z=40cm (M=12 to M=20) at g=2cm.

For the two dimensional Monte Carlo simulation, the same process is repeated with a 16×16 rectangular grid with pitch Sx=Sy=2.5mm, and the position displacements follow Δp~N(0, 1). Figure 4 shows the combination of the reconstruction results from all 1000 trials at 24cm and 40cm.

Fig. 4. Result of the Monte Carlo simulation for the point source reconstruction from its dislocated elemental images. The point source is located at (a) z=40cm (spread of about 36 pixels) and (b) z=24cm (spread of about 66 pixels). For zero position errors, the plots would have been a single point.

The simulation results in Fig. 4 approximate the average impulse response at the corresponding reconstruction distance. If the position errors were zero, these plots would be just a single point at the origin. The bell shaped impulse response in Fig. 4 resembles the standard blurring caused by the optics and the detector's finite dimensions and has a similar effect on the reconstructed images. The blurring associated with the Airy disk of the imaging optics is given in radians by α_diff = 1.21λ/d, in which d is the radius of the input aperture and λ is the wavelength of the illumination. On the other hand, the blurring associated with the pixelation of the sensor, in radians, is α_pxl = c/f, where c and f are the sensor pixel size and the focal length of the imaging optics, respectively. The expression for blurring due to diffraction yields 17µm and 29µm for object distances of 24cm and 40cm, respectively, at λ=600nm and d=10mm. For a pixel size of c=10µm and a lens focal length of f=2cm, the blurring due to pixelation is 120µm and 200µm at the same object distances. When positioning error is introduced, the resulting blurring radii


can be measured as 2.5mm and 1.4mm by projecting the half maximum radii of the bell shapes in Fig. 4 to their respective reconstruction distances of 24cm and 40cm. It should be noted that these results are for the case in which a 40% Gaussian pitch error is introduced in the position of the sensors. This comparison suggests that improvements in positioning systems are more effective in reducing the total MSE of the reconstructed images than improvements related to the MTF of the imaging optics or the sensor pixelation. As mentioned earlier, the exact impulse response depends on the plane at which the elemental images are reconstructed. Figure 4 shows that for a point source located 24cm from the sensor, the impulse response is wider than that for a point source at 40cm, i.e. the blurring effect of sensor position uncertainty is more degrading for objects closer to the pickup plane.

5. Experimental results

We present the results of SAII experiments to demonstrate the quantitative and qualitative degradation of computational 3D reconstructions caused by introducing sensor position uncertainty in the pickup process, and also to verify the mathematical analysis performed earlier. Figure 5(a) shows a 2D image of the 3D scene used in the experiments. The scene is composed of two toy cars and a model helicopter located at different distances from the sensor. The depth of the scene is 16cm, with the closest and farthest objects located 24cm and 40cm away from the center of the pickup grid. The scene is illuminated with diffused incoherent light.

Fig. 5. (a) A 2D image of the 3D scene; (b) a subset of the elemental images of the 3D scene in (a).

The experiment is performed by moving a digital camera transversally on an x-y grid with a pitch of 5mm in both the x and y directions. At each node, an elemental image of the scene is captured. The imaging sensor is 22.7×15.6mm and has a 10µm pixel pitch. The effective focal length of the camera lens is about 20mm, and the elemental images are captured on a planar 16×16 grid. A subset of the elemental images can be seen in Fig. 5(b), each conveying different perspective information. The dimensions of the cars are about 5×2.5×2cm, whereas the helicopter is about 9×2.5×2cm in size. In Fig. 6 we show the 3D reconstruction of the scene at three different distances corresponding to the objects in Fig. 5(a), according to Eq. (2). As is clear, at each distance one of the objects is in focus while the others appear washed out.
Monte Carlo simulation is used to study the degradation effect of sensor position uncertainty during the pickup process on the reconstructed images. Let the actual position of the camera at each node of the planar pickup grid be {Sxk, Syl}, about which the sensor position is measured with error (ΔPx, ΔPy), which we model as two independent random variables, ΔPx,y~N(0,σ^2). We use a rectangular grid with equal spacing in the x and y directions, i.e. a pitch of Sx,y = Sp = 5mm. Thus, the measured position of the sensor capturing elemental image Okl is represented by {Sp(k + ΔPx/Sp), Sp(l + ΔPy/Sp)}, where k,l = 0,1,2,…,15.
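A one-realization sketch of such a distorted grid is given below (our own illustration); the 30% value is one of the pitch-error levels examined below, interpreted as σ/Sp.

import numpy as np

rng = np.random.default_rng(2)
Sp = 5.0                                  # pickup pitch, mm
pitch_error = 0.30                        # sigma / Sp, i.e. a "30% pitch error"
sigma = pitch_error * Sp                  # std of the position error, mm

k, l = np.meshgrid(np.arange(16), np.arange(16), indexing="ij")
dPx = rng.normal(0.0, sigma, size=k.shape)        # x-position error for each node
dPy = rng.normal(0.0, sigma, size=k.shape)        # y-position error for each node
x_measured = Sp * (k + dPx / Sp)                  # = Sp*k + dPx
y_measured = Sp * (l + dPy / Sp)                  # = Sp*l + dPy

Feeding these distorted positions into the reconstruction of Eq. (2), and comparing with the reconstruction that uses the correct grid, yields one MSE sample; repeating this many times gives the statistics reported below.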


Fig. 6. 3D scene reconstruction at distances (a) z=24 cm, (b) z=30 cm and (c) z=36 cm.

Since ΔPx,y follows a N(0,σ^2) distribution, we define the fraction 100σ/Sp to be the pitch error percentage. Note that ΔPx,y/Sp represents a normalized positioning error metric. We perform computational reconstruction with Eq. (2) to reconstruct a plane of the 3D scene at a specific distance z by utilizing the distorted camera positions. In order to quantify the degradation due to the dislocation of the cameras, the reconstruction results are compared with the ones obtained using the correct positions on the equally spaced grid. The Mean Square Error metric is computed using Eq. (6) to measure the difference between these images quantitatively. Figure 7 shows the results of reconstruction using the correct and the distorted positions, respectively, at z=24cm with 30% pitch error.

Fig. 7. Reconstruction at z=24 cm using (a) the original camera positions, and (b) distorted camera positions with 30% pitch error.

The dislocated position of the camera is a random vector from which we draw 500 samples and utilize them in Eq. (2) to computationally reconstruct one plane of the 3D scene located at distance z. As a result, we have 500 reconstructed images of that plane, for which we calculate the MSE [Eq. (6)] with respect to the corresponding reconstruction with correct positions. This simulation is done for distances from 24cm to 40cm, which corresponds to magnifications, M, from 12 to 20. The pitch error was chosen to be 30% in both directions, i.e. σ/Sp = 0.3. Since ΔPx,y is a Gaussian random variable, such an error means that 70% of the time ΔPx,y remains less than 0.3 of the pitch (0.3Sp). Figure 8(a) shows the box-and-whisker diagram of the MSEs for z=24cm (M=12) to z=40cm (M=20), illustrating the statistical properties of the MSEs at each specific distance z=z0 (M=M0). The blue box spans the upper and lower quartiles of the MSE, and the dotted blue line extends to the smallest and largest computed MSEs. The average of these 500 MSEs, computed from the Monte Carlo simulation at each plane, is shown with the solid red line in Fig. 8(a). This average, for each particular plane of the scene, is a reasonable estimate of the error one can expect due to a 30% camera positioning error. Figure 8(a) also illustrates the decreasing trend of the average MSE with increasing magnification. Assuming that g is constant, an increase in magnification is equivalent to an increase in the reconstruction distance. Note that the variance of the error at each plane decreases as the reconstruction distance increases, and its rate of decrease is greater than that of the average MSE. This behavior can be explained using Eq. (12), which shows that the MSE is inversely related to z.


Fig. 8. (a) Box-and-whisker diagram of the MSEs for z=24cm (M=12) to z=40cm (M=20) when the pitch error is 30%; the vertical axis is the Mean Square Error and the horizontal axis is the magnification (M=z/g). (b) Mean of the MSE corresponding to 10%, 20%, 30% and 50% pitch errors for distances from z=24cm to z=40cm, i.e. M=12 to M=20; the axes are the magnification (M=z/g) and the pitch error σ/Sp.

The simulation done for the 30% pitch error is repeated for 10%, 20% and 50% pitch errors. For each of these pitch errors, a random position is generated by adding a Gaussian random variable with the appropriate variance to the original position. We repeat the process 500 times, and each time the MSE between the degraded and original images is computed. As a result, for each pitch error we have 500 MSEs at various planes of the scene. The average MSE of the 500 trials is computed for each plane at the different pitch errors, and the combined result is shown in Fig. 8(b). As discussed earlier in section 4, the contribution of all spatial frequencies to the total MSE is not the same. To illustrate this fact, the total MSE in three different ranges, i.e. 35cm-38cm, 29cm-32cm and 24cm-27cm for the helicopter, the yellow car and the green car, respectively, is computed


while the pitch error is 50%. The results are presented in Fig. 9, which shows that the in-focus part of the image, containing more detail and higher spatial frequencies, is distorted more significantly by the position error of the camera. Clearly, the maximum error occurs around edges. This observation confirms the result of the mathematical analysis in Eq. (12), which suggests that the term (1−γ^2) emphasizes the contributions of the higher spatial frequencies of the elemental images to the MSE, while suppressing the contributions of the low spatial frequencies. Figure 9 also verifies the fact that the error is larger for objects closer to the pickup plane (sensor).

Fig. 9. Total error for the range of distances from (a) z=35cm to 38cm, (b) z=29cm to 32cm and (c) z=24cm to 27cm.

6. Conclusion

In this paper, we have presented an analysis of Synthetic Aperture Integral Imaging and the sensitivity of its 3D reconstruction to uncertainty in the positioning of the sensor. The Mean Square Error (MSE) is used as a metric to assess the degradation of the 3D reconstructions. Both experimental and theoretical results suggest that for a given positioning uncertainty, the MSE decreases with increasing reconstruction distance. In addition, it has been shown that not all parts of a reconstruction plane contribute equally to the total MSE; rather, it is the objects close to the reconstruction distance that constitute the major portion of the MSE. Through mathematical analysis, it has also been shown that the low spatial frequency components in the energy spectrum of the elemental images contribute less to the resultant MSE, whereas the high spectral components are the more significant cause of degradation for a fixed positioning error distribution. From this study one can optimize the parameters associated with the hardware of a SAII system in order to achieve a tolerable degradation in the presence of sensor positioning error.
We wish to thank Dr. Seok Y. Hwang for his assistance in the optical experiments.

Appendix A

We have the following expression for the error spectrum:

|Err(\mathbf{f}, z)|^2 = \frac{1}{R^4} \left| \sum_{i}^{K \times K} \tilde{I}_i \left(1 - e^{-j f_x \frac{\Delta p_i}{M}}\right) \right|^2 = \frac{1}{R^4} \left[ \sum_{i}^{K \times K} \sum_{j}^{K \times K} \tilde{I}_i \tilde{I}_j^* \left(1 - e^{-j f_x \frac{\Delta p_i}{M}}\right)\left(1 - e^{+j f_x \frac{\Delta p_j}{M}}\right) \right].     (A.1)

Note that in Eq. (A.1), only Δp has a random nature. Thus, the expected value of the error spectrum depends on the behavior of this variable, i.e. the distribution governing the spatial dislocation of the sensors during pickup. Since expectation is a linear operation, one can break down the error expectation as follows:


E\left\{ |Err(\mathbf{f}, z)|^2 \right\} = \frac{1}{R^4} \sum_{i}^{K \times K} \sum_{j}^{K \times K} \tilde{I}_i \tilde{I}_j^* E\left\{ \left(1 - e^{-j f_x \frac{\Delta p_i}{M}}\right)\left(1 - e^{+j f_x \frac{\Delta p_j}{M}}\right) \right\}
= \frac{1}{R^4} \left[ \sum_{i}^{K \times K} \tilde{I}_i \tilde{I}_i^* E\left\{ 2 - e^{-j f_x \frac{\Delta p_i}{M}} - e^{+j f_x \frac{\Delta p_i}{M}} \right\} \right] + \frac{1}{R^4} \left[ \sum_{i \neq j}^{K \times K} \sum_{j \neq i}^{K \times K} \tilde{I}_i \tilde{I}_j^* E\left\{ \left(1 - e^{-j f_x \frac{\Delta p_i}{M}}\right)\left(1 - e^{+j f_x \frac{\Delta p_j}{M}}\right) \right\} \right].     (A.2)

Now, let γ = E{exp(−j f_x Δp / M)}, which is the moment generating function of the random variable Δp [29]. Eq. (A.2) reduces to:

E\left\{ |Err(\mathbf{f}, z)|^2 \right\} = \frac{1}{R^4} \left[ \sum_{i}^{K \times K} \tilde{I}_i \tilde{I}_i^* (2 - \gamma - \gamma^*) \right] + \frac{1}{R^4} \left[ \sum_{i \neq j}^{K \times K} \sum_{j \neq i}^{K \times K} \tilde{I}_i \tilde{I}_j^* E\left\{ \left(1 - e^{-j f_x \frac{\Delta p_i}{M}}\right)\left(1 - e^{+j f_x \frac{\Delta p_j}{M}}\right) \right\} \right].     (A.3)

Note that Δp_i denotes the sensor location error associated with the ith sensor, which is a random variable; the expected value of a random variable, or of a function of a random variable, is a constant and can therefore be taken outside of the summation. In addition, the sensor location errors are assumed to be independent; thus we have:

E\left\{ \left(1 - e^{-j f_x \frac{\Delta p_i}{M}}\right)\left(1 - e^{+j f_x \frac{\Delta p_j}{M}}\right) \right\} = E\left\{ 1 - e^{-j f_x \frac{\Delta p_i}{M}} \right\} E\left\{ 1 - e^{+j f_x \frac{\Delta p_j}{M}} \right\} = (1 - \gamma)(1 - \gamma^*) = |1 - \gamma|^2.     (A.4)

Consequently, the error spectrum expectation can be written as in Eq. (8).
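A quick numerical check of Eq. (A.4) (our own illustration): for independent zero-mean Gaussian errors, the sampled cross expectation factorizes and equals |1 − γ|^2; the parameter values below are arbitrary examples.

import numpy as np

rng = np.random.default_rng(3)
sigma, M, fx = 1.0, 15.0, 10.0                   # example values (assumptions)
n = 500_000
dpi = rng.normal(0.0, sigma, size=n)             # error of sensor i
dpj = rng.normal(0.0, sigma, size=n)             # independent error of sensor j

lhs = np.mean((1 - np.exp(-1j * fx * dpi / M)) * (1 - np.exp(+1j * fx * dpj / M)))
gamma = np.exp(-fx**2 * sigma**2 / (2.0 * M**2)) # E{exp(-j fx Δp / M)} for N(0, sigma^2)
print(lhs.real, abs(1 - gamma)**2)               # should agree to sampling accuracy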

