
Pattern Recognition 74 (2018) 305–316


Pattern Recognition journal homepage: www.elsevier.com/locate/patcog

Gaussian field consensus: A robust nonparametric matching method for outlier rejection

Gang Wang a,b,∗, Yufei Chen c, Xiangwei Zheng d

a Institute of Data Science and Statistics, Shanghai University of Finance and Economics, Shanghai 200433, China
b School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai 200433, China
c College of Electronic and Information Engineering, Tongji University, Shanghai 201804, China
d School of Information Science and Engineering, Shandong Normal University, Jinan 250014, China

Article info

Article history: Received 18 May 2017; Revised 27 July 2017; Accepted 18 September 2017; Available online 20 September 2017

Keywords: Gaussian field; Outlier rejection; Mismatch removal; Point matching; Sparse approximation

Abstract

In this paper, we propose a robust method, called Gaussian Field Consensus (GFC), for rejecting outliers from a given set of putative point correspondences. Finding correct correspondences (inliers) is a key component of many computer vision and pattern recognition tasks, and the goal of outlier (mismatch) rejection is to fit the transformation function that maps one feature point set to another. GFC takes as input a putative correspondence set contaminated by many outliers, and its main task is to separate the underlying true correspondences from the outliers. We formulate this problem with a Gaussian field nonparametric matching model, which is based on an exponential distance loss and the kernel method in a reproducing kernel Hilbert space. Next, we introduce a locally linear constraint, based on regularization theory, to preserve the topological structure of the feature points. Moreover, a sparse approximation is used to reduce the search space so that a large number of points can be handled efficiently. Finally, we test GFC on several real image datasets in the presence of outliers; the experimental results show that the proposed method outperforms current state-of-the-art methods in most test scenarios.

© 2017 Elsevier Ltd. All rights reserved.

1. Introduction

The problem of recovering the underlying correspondences (i.e., inliers) between two or more images of the same 3D scene is a critical component of many computer vision and pattern recognition tasks, and a prerequisite for applications including image stitching [1], tracking [2], registration [3], object retrieval [4], structure from motion (SFM) [5], and stereo matching [6]. However, outliers make these problems challenging. For example, in retinal image registration [7], to help doctors diagnose a pathological area in a large view, it is necessary to first remove outliers from the putative correspondences and then register the images accurately. The putative matches contain many outliers, and the underlying correspondences are identified after rejecting the false matches. Here the image features are extracted by a feature detector (e.g., a corner detector) and represented by a local descriptor such as SIFT [8], SURF [9], or Shape Context [10], and the initial matching method is Best Bin First (BBF) [11].

∗ Corresponding author. E-mail address: [email protected] (G. Wang).

https://doi.org/10.1016/j.patcog.2017.09.029 0031-3203/© 2017 Elsevier Ltd. All rights reserved.

A practical outlier rejection method in computer vision should have the following properties: 1) it establishes reliable point correspondences between two or more images under a constraint that regularizes the ill-posed matching problem (e.g., a descriptor similarity constraint, where putative correspondences are matched by similar descriptors, or a geometric constraint, which requires the initial matches to satisfy a rigid, affine, or even non-rigid transformation); 2) it can fit the underlying correspondences by modeling a transformation that maps one feature point set to another; 3) it is robust to false matches caused by imperfect key-point detection and matching, especially for multi-view, wide-baseline, and deformed images; and 4) it has tractable computational complexity when facing a large number of feature points. A well-known two-step strategy is widely used for the feature matching problem, and typical two-step methods such as RANSAC [12] and MLESAC [13,14] are applicable to outlier rejection. In the first step, according to the degrees of freedom (DoF) of the given parametric model, a minimal subset of putative correspondences, assumed to be outlier-free, is drawn to estimate the model parameters, e.g., 3 pairs of feature points for an affine model. The second step is designed to verify


the computed parametric model on the whole sample set. The two steps alternate until a given termination condition is met. These resampling methods are easy to implement and popular in many applications; however, they suffer from efficiency and robustness problems when facing nonparametric transformation models (with high DoF) or when the number of false matches in the putative correspondences becomes large.

In this work, we present a robust nonparametric matching method, Gaussian Field Consensus (GFC), for outlier rejection to address these limitations. The proposed GFC is based on the Gaussian fields criterion [15], in which a Gaussian mixture distance is used to measure rigid and affine models from feature attributes (e.g., edge or shape). GFC extends the Gaussian fields criterion to a nonparametric model and simplifies the Gaussian mixture distance to a single Gaussian model, similar to vector field learning with an exponential loss function. The motivation is that the Gaussian distance penalty can be regarded approximately as a truncated form of the ℓ2 loss penalty, so the exponential loss model can handle a large number of outliers while paying a low penalty [16]. More precisely, we first formulate the outlier rejection problem as data fitting with the modified Gaussian fields criterion. Inspired by Yuille and Grzywacz's motion field theory [17,18], the nonparametric transformation is modeled as a displacement function that drives the key-points of the first frame towards those of the second frame step by step. By the Representer Theorem and the kernel method in a reproducing kernel Hilbert space (RKHS), the high-DoF transformation has an explicit expression as a linear combination of kernels.

To avoid overfitting and make the matching problem well defined, inspired by locally linear embedding in nonlinear dimensionality reduction [19], we introduce a locally linear constraint that preserves the topological structure of local neighborhoods in a low-dimensional manifold. Moreover, we use a sparse approximation strategy to pursue a suboptimal solution of the nonparametric transformation function in an RKHS, which balances efficiency and accuracy when handling a large number of putative correspondences. Extensive experiments on several 2D real image datasets show that the proposed GFC is robust to outliers, even when the outlier ratio reaches 70%, and outperforms several state-of-the-art outlier rejection methods in most test scenarios.

In our previous related work, we used the graph-Laplacian regularized context-aware Gaussian fields criterion for non-rigid registration [20]; here we simplify and extend that registration model to remove outliers. We also formulated the point matching problem as learning a coherent vector field using a mixture model of Gaussian and uniform distributions [21], where manifold regularization is introduced to preserve the local neighborhood structure and the optimal mapping function is obtained by solving a weighted Laplacian regularized least squares (LapRLS) problem; here we use only a single Gaussian model and the quasi-Newton method to solve the optimization problem.

The main contributions of this paper are as follows. First, we simplify the context-aware Gaussian fields registration model for outlier rejection by replacing the mixture of Gaussian distances with a single Gaussian model. Second, a locally linear constraint, related to manifold regularization, is introduced to preserve local neighborhood structures and is used as a constraint term in the objective function. Third, whereas our previous work used a low-rank approximation [20] to speed up the computation, here we introduce a sparse approximation to express the transformation, which reduces the search space and makes the computational complexity approximately linear.

The rest of the paper is organized as follows. Section 2 reviews related work on outlier rejection. Section 3 presents the proposed Gaussian field consensus.

Section 4 details the algorithm implementation. The experimental setup, results, and comparative studies are reported in Section 5, followed by concluding remarks in Section 6.

2. Related work

Many methods exist for outlier rejection in computer vision and pattern recognition, particularly for image stitching [22,23] and registration (retinal images [24,25], remote sensing images [26]). They aim to remove the false matches from the putative correspondences, i.e., to recover the correct correspondences. Here, we briefly review outlier rejection methods following the pipeline shown in Fig. 1.

Commonly, a robust feature matching framework consists of four parts: detecting key-points, representing local features, initial matching, and removing outliers. In initial matching, the putative correspondences are constructed by comparing local descriptors, which is more efficient than building an intractable affinity matrix. Popular local feature descriptors include SIFT [8], SURF [9], PIIFD [27], and shape context [10]. The initial matching algorithm BBF (Best Bin First) [11] efficiently finds an approximate solution to the nearest neighbor search problem in high-dimensional spaces. In practice, putative correspondences are always contaminated by false matches (outliers) due to imperfect feature extraction, and many outlier rejection methods have been proposed to address this problem.

Least-Median of Squares (LMedS) [28] and the M-estimator [29] are two robust regression methods from the statistics literature. The goal of this type of method is to change the error loss criterion to reduce the undue influence of outliers. Although LMedS is robust and handles outliers easily, it suffers from high computational complexity, especially when facing a large number of putative correspondences. The M-estimator estimates an indicator variable that indicates whether a datum is a false or a true correspondence.

The two-step strategy is widely used in resampling methods, which are hypothesize-and-verify methods. The most popular algorithm of this kind is RANSAC (RANdom SAmple Consensus) [12]; its main idea is to repeatedly generate a hypothetical model from a minimal subset of putative correspondences, assumed to be outlier-free, and then verify each model on the whole set to select the best one. However, it is limited when facing nonparametric models. To overcome this limitation, many RANSAC variants have been developed, such as maximum likelihood estimation sample consensus (MLESAC) [13], which uses the likelihood to evaluate hypotheses; progressive sample consensus (PROSAC) [30], whose samples are drawn from progressively larger sets of top-ranked correspondences; and deformation RANSAC [31], which assumes that the distribution of true correspondences usually resembles a low-dimensional affine subspace. Sunglok et al. [32] reviewed and evaluated the performance of the RANSAC family.

Li and Hu [33] introduce the concept of a correspondence function and a learning algorithm that uses support vector regression (SVR) to identify the underlying correspondences and reject outliers; although it outperforms RANSAC, its robustness degrades when facing a large number of outliers. Zhao et al. [34,35] presented the vector field consensus (VFC) method based on robust vector field learning, where the learning problem is formulated as a mixture model. VFC can handle a large percentage of outliers, but its main shortcoming is low computational efficiency. Subsequently, Ma et al. [36] used a sparse approximation strategy to overcome this limitation and presented SparseVFC, which reduces the computational complexity to approximately linear. Motion modeling methods can obtain feature correspondences robustly with bilateral functions [37] or grid-based motion statistics [38].
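The hypothesize-and-verify loop used by RANSAC-type methods, as described earlier in this section, can be sketched as follows for an affine model. This is a minimal illustration and not the implementation evaluated in the paper: the function names, the residual tolerance `tol`, and the default trial count are assumptions of this sketch.

```python
import numpy as np


def fit_affine(src, dst):
    """Least-squares affine map: dst ~ [src | 1] @ P, with P of shape (3, 2)."""
    A = np.hstack([src, np.ones((len(src), 1))])
    P, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return P


def ransac_affine(X, Y, n_trials=1000, tol=0.05, seed=0):
    """Return a boolean inlier mask for putative 2D matches (X[i], Y[i]).

    tol is a hypothetical residual threshold in the units of the data.
    """
    rng = np.random.default_rng(seed)
    Xh = np.hstack([X, np.ones((len(X), 1))])
    best = np.zeros(len(X), dtype=bool)
    for _ in range(n_trials):
        # Step 1: hypothesize a model from a minimal subset (3 pairs for affine).
        idx = rng.choice(len(X), size=3, replace=False)
        P = fit_affine(X[idx], Y[idx])
        # Step 2: verify the model on the whole putative set.
        inliers = np.linalg.norm(Xh @ P - Y, axis=1) < tol
        if inliers.sum() > best.sum():
            best = inliers
    return best
```

The largest consensus set found over all trials is returned; a nonparametric deformation cannot be hypothesized from such a small minimal subset, which is the limitation the text refers to.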


Fig. 1. The pipeline of robust feature matching. The input is a pair of images (the 'Model' and the 'Scene' image); the output is the set of identified true correspondences (i.e., inliers).

In iterative point matching methods [39], true correspondences are identified by checking nearest neighbors. Furthermore, from the perspective of motion coherence, the model feature point set is mapped to the scene point set by a set of smooth mapping functions. Based on the motion field coherence theory (MCT), many related methods have been proposed, including coherent point drift (CPD) [40], which models one point set as a GMM and treats the other as data, with Gaussian and uniform distributions for inliers and outliers in the mixture model, respectively; the Gaussian mixture model (GMM) method [41], which leverages the closed-form expression for the ℓ2 distance between Gaussian mixtures; the mixture of asymmetric Gaussian model [42,43], which introduces the asymmetric Gaussian to represent the point set; and robust L2E matching [44], which introduces the L2E estimator for non-rigid transformation estimation. In these methods, the nonparametric transformation is parameterized by radial basis functions (RBF), such as the thin-plate spline (TPS) and the Gaussian RBF (GRBF). Finally, outliers are rejected after learning a coherent motion field from the putative correspondences.

3. Gaussian field consensus

3.1. Notations

A bold capital letter such as $\mathbf{X}$ denotes a matrix; $x_n$ denotes the $n$th row of $\mathbf{X}$, and $x_{nm}$ denotes the scalar in the $n$th row and $m$th column of $\mathbf{X}$. $\mathbf{1}_{m \times n}$ denotes an $m \times n$ matrix of all ones, and $\mathbf{0}_{m \times n}$ an $m \times n$ matrix of all zeros. $\mathbf{I}_{n \times n} \in \mathbb{R}^{n \times n}$ denotes the identity matrix. $\|\cdot\|$ denotes the $\ell_2$-norm. $\mathrm{trace}(\mathbf{X})$ denotes the trace of $\mathbf{X}$, and $\det(\mathbf{X})$ the determinant of a square matrix $\mathbf{X}$. $\mathrm{diag}(x)$ is a diagonal matrix whose diagonal elements are $x$. $\mathbf{X} \circ \mathbf{Y}$ is the Hadamard product and $\mathbf{X} \otimes \mathbf{Y}$ the Kronecker product of matrices.
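As a small illustration of the two matrix products in this notation, the following NumPy snippet evaluates a Hadamard and a Kronecker product on toy matrices. The arrays and variable names are made up for the example; the last line shows that a Kronecker product with an all-ones row replicates a length-N column of weights across d columns, a pattern that is convenient when per-point weights multiply an N × d matrix.

```python
import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0]])
Y = np.array([[0.5, 0.0], [2.0, 1.0]])

hadamard = X * Y          # Hadamard product: element-wise multiplication
kron = np.kron(X, Y)      # Kronecker product: 4 x 4 block matrix [[x11*Y, x12*Y], [x21*Y, x22*Y]]

p = np.array([0.9, 0.1, 0.5])                   # one weight per point (N = 3)
p_rep = np.kron(p[:, None], np.ones((1, 2)))    # N x d matrix, row n filled with p[n]
```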

3.2. Problem formulation

Given a putative correspondence set $S = \{(x_n, y_n)\}_{n=1}^{N}$, let the model feature set be $X = \{x_1, \ldots, x_N\} \in \mathbb{R}^{N \times d}$ and the scene point set be $Y = \{y_1, \ldots, y_N\} \in \mathbb{R}^{N \times d}$, where $d$ is the data dimension. The goal of feature matching is to fit the transformation function $f$, reject the outliers $S_{ou}$, and identify the underlying correspondences $S_{in}$ using a given residual error threshold $\delta$. Based on the Gaussian fields criterion, consider first the registration case without putative correspondences; the mixture of Gaussian distances criterion is then

$$E(A, b) = \sum_{n=1}^{N} \sum_{m=1}^{N} \exp\left(-\frac{\|y_m - (A x_n + b)\|^2}{\sigma^2}\right) \tag{1}$$

where the transformation $f(x_n) = A x_n + b$ is for rigid and affine registration. The Gaussian fields criterion differs from the $\ell_2$ least-squares loss $E_{\ell_2}(A, b) = \sum_{n=1}^{N} \sum_{m=1}^{N} \|y_m - (A x_n + b)\|^2$: it is a straightforward sum of exponential distances between $X$ and $Y$ and behaves like a truncated form of the $\ell_2$ loss; hence it is robust to outliers, which do not incur a high penalty [16]. However, criterion (1) cannot handle complex degraded cases, e.g., non-rigid deformation. Following the formulation of vector field learning with a given putative correspondence set $S$, we simplify the Gaussian fields criterion for outlier rejection and extend it to a nonparametric transformation model. Rewriting (1) gives the Gaussian field consensus:

$$E(\theta) = -\sum_{n=1}^{N} \exp\left(-\frac{\|y_n - (x_n + f(x_n, \theta))\|^2}{\sigma^2}\right) + \lambda\,\Omega(\theta) \tag{2}$$

where $\theta$ denotes the parameters of the non-rigid transformation $f$, which is often unknown and challenging to model. The regularization term $\Omega(\theta)$ is used to avoid overfitting and make the Gaussian field consensus well-posed, and $\lambda > 0$ is a weight that controls the tradeoff between the empirical risk and the regularization term. Here $f$ is a displacement function based on the motion field theory and is updated in each iteration. The optimal displacement is obtained at convergence, i.e., $\theta^{*} = \arg\min_{\theta} E(\theta)$. Note that the exponential distance is measured only for points with the same index $n$; with this simplified criterion, when the data points in $X$ and $Y$ are not aligned, the feature correspondence can be recovered by alternating the initial matching and the GFC fitting. The underlying true correspondences are identified by a threshold $\delta$: the Gaussian field consensus outputs the true correspondences from the putative set according to the following criterion:

$$S_{in} = \{(x_i, y_i) : p_i \geq \delta,\ i \in [1, N]\} \tag{3}$$

$$p_i = \exp\left(-\frac{\|y_i - (x_i + f(x_i, \theta))\|^2}{\bar{\sigma}^2}\right) \tag{4}$$

where $p_i$ denotes the Gaussian distance loss between the observed data $y_i$ and the transformed $x_i$, and $\bar{\sigma}$ denotes the optimal parameter corresponding to $\theta$. The recovered correspondence set $S_{in}$ is the so-called consensus set in RANSAC [12], so we call the proposed method Gaussian Field Consensus (GFC).

3.3. Locally linear constraint

The matching problem is ill-posed, especially for nonparametric models, so we use regularization theory to make it well defined. Inspired by locally linear embedding (LLE), which preserves the local neighboring structure in a low-dimensional manifold [19], we describe the local structure by a linear combination of neighboring points (Fig. 2). Each point $x_m$ is represented by a linear combination of its neighboring points $\mathcal{N}_{x_m}$:

$$x_m = \sum_{x_n \in \mathcal{N}_{x_m}} W_{mn} x_n = \sum_{n=1}^{N} W_{mn} x_n \tag{5}$$

where the weight matrix $W \in \mathbb{R}^{N \times N}$ holds the linear combination coefficients for each point in the model point set $X$, and the $m$th row of $W$ holds the neighboring weights of $x_m$. The constraints on $W$ ($\sum_{n=1}^{N} W_{mn} = 1, \forall m \in [1, N]$, and $W_{mn} = 0$ if $x_n \notin \mathcal{N}_{x_m}$) guarantee that the local geometric structure of each point $x_m$ is represented only by its neighbors. The weight matrix $W$ is obtained by minimizing the reconstruction error:

$$\arg\min_{W} \varepsilon(W) = \arg\min_{W} \sum_{m=1}^{N} \Big\| x_m - \sum_{n=1}^{N} W_{mn} x_n \Big\|^2 \quad \text{subject to} \quad \sum_{n=1}^{N} W_{mn} = 1,\ \forall m \in [1, N], \qquad W_{mn} = 0 \ \text{if}\ x_n \notin \mathcal{N}_{x_m} \tag{6}$$

where the optimal weights $W_{mn}$ are obtained by solving a least squares problem [45]. The reconstruction error after an affine transformation is

$$\Omega(A) = \sum_{m=1}^{N} \Big\| A x_m - \sum_{n=1}^{N} W_{mn} A x_n \Big\|^2 = \big\| ([X|1] - W[X|1]) A^{\top} \big\|_F^2 \tag{7}$$

where $[X|1]$ denotes the homogeneous coordinates, and $\|\cdot\|_F$ denotes the Frobenius norm. Similarly, the locally linear constraint for the nonparametric transformation model is

$$\begin{aligned} \Omega(\theta) &= \sum_{m=1}^{N} \Big\| f(x_m, \theta) - \sum_{n=1}^{N} W_{mn} f(x_n, \theta) \Big\|^2 \\ &\doteq \sum_{m=1}^{N} \|f(x_m, \theta)\|^2 - \sum_{m=1}^{N} \sum_{n=1}^{N} W_{mn} f(x_m, \theta)^{\top} f(x_n, \theta) \\ &= f(X, \theta)^{\top} I f(X, \theta) - f(X, \theta)^{\top} W f(X, \theta) \\ &= f(X, \theta)^{\top} (I - W) f(X, \theta) \end{aligned} \tag{8}$$

where $f(X, \theta) = [f(x_1, \theta), \ldots, f(x_N, \theta)]^{\top}$ and $I \in \mathbb{R}^{N \times N}$ is the identity matrix. Here we assume that each feature point and its neighbors lie on or close to a locally linear patch, owing to the uniformly distributed image feature points and the low degree of deformation. Substituting (8) into (2), the objective function can be rewritten as

$$E(\theta) = -\sum_{m=1}^{N} \exp\left(-\frac{\|y_m - (x_m + f(x_m, \theta))\|^2}{\sigma^2}\right) + \lambda \sum_{m=1}^{N} \Big\| f(x_m, \theta) - \sum_{n=1}^{N} W_{mn} f(x_n, \theta) \Big\|^2 \tag{9}$$

Finally, the local neighboring structure can be preserved after each transformation by using this local geometric constraint.

Fig. 2. Steps of the locally linear constraint: 1) Assign neighbors to each point $x_m$ by using the K nearest neighbors. 2) Compute the weights $W_{m\cdot}$ that best linearly reconstruct $x_m$ from its neighbors $\{x_g, x_h, x_i, x_j, x_k, x_l\}$. The bottom figure shows the local geometry of $x_m$ after the transformation $f$, where $x'_m = f(x_m, \theta)$.

3.4. Nonparametric estimation of transformation

To estimate the nonparametric displacement function $f(x, \theta)$, we use the kernel method in a reproducing kernel Hilbert space (see Definitions 1 and 2) to represent this model.

Definition 1. A Hilbert space generalizes the notion of Euclidean space and extends the methods of vector algebra and calculus from low-dimensional spaces to high-dimensional spaces. Hilbert spaces are complete metric spaces with respect to the distance function induced by the inner product.

Definition 2. A reproducing kernel Hilbert space (RKHS) is a Hilbert space of functions in which point evaluation is a continuous linear functional. If two functions $f$ and $g$ in the RKHS are close in norm, i.e., $\|f - g\|$ is small, then $|f(x) - g(x)|$ is also small for all $x$.

An RKHS defines a corresponding reproducing kernel and, conversely, a reproducing kernel defines a unique RKHS. Thus, we define a standard Mercer kernel $K : X \times X \to \mathbb{R}^{d \times d}$ with an associated RKHS $\mathcal{H}_K$ with norm $\|\cdot\|_K$, and obtain the function expression from the following Representer Theorem (the proof is given in Appendix A).

Theorem 1. The minimization of the objective function (9) has a unique solution, given by

$$f(x_m) = \sum_{n=1}^{N} K(x_m, x_n)\,\alpha_n \tag{10}$$

The objective function (9) can be rewritten in the following matrix form, with the exponential applied element-wise:

$$E(\alpha) = -\mathbf{1}^{\top} \exp\left(-\frac{\mathrm{diag}(V V^{\top})}{\sigma^2}\right) + \lambda\,\mathrm{trace}\big(\alpha^{\top} K (I - W) K \alpha\big) \tag{11}$$

where $V = Y - (X + K\alpha)$, $\alpha = (\alpha_1, \ldots, \alpha_N)^{\top}$ with $\alpha_i \in \mathbb{R}^{d}$, and the kernel-valued matrix (i.e., Gram matrix) $K$ of the vectors $x_1, \ldots, x_N$ is the symmetric matrix with entries $K_{mn} = K(x_m, x_n) = \exp(-\beta \|x_m - x_n\|^2)$. Note that either the Gaussian kernel or the thin plate spline (TPS) can be used in GFC. The Gaussian fields criterion is differentiable and not convex in the neighborhood of the optimal transformation position [20]. Thus, the numerical optimization problem can be solved with a gradient-based quasi-Newton method combined with a deterministic annealing strategy (with annealing ratio $\eta$). The quasi-Newton method requires the derivative of the objective function (11) with respect to the coefficient $\alpha$:

$$\frac{\partial E(\alpha)}{\partial \alpha} = -\frac{2 K \big(V \circ (P \otimes \mathbf{1})\big)}{\sigma^2} + 2 \lambda K (I - W) K \alpha \tag{12}$$

where $P = \exp\left(-\mathrm{diag}(V V^{\top}) / \sigma^2\right)$ and $\mathbf{1}$ is a vector of ones of size $1 \times d$. The optimal transformation $f = K\alpha$ is obtained after convergence.

3.5. Sparse approximation for GFC

Based on the Representer Theorem, the optimal displacement function $f$ is obtained by minimizing the objective function (11) in an RKHS $\mathcal{H}_K$. The number of coefficients $\alpha$ grows linearly with the size of the putative correspondence set, so for large $N$, constructing the Gram matrix $K$ and solving for $\alpha$ require much computation time. We therefore use a sparse approximation strategy to reduce the computational cost of GFC. Unlike the low-rank kernel-valued matrix approximation [20,52], which must first construct the Gram matrix and then extract several eigenvectors by eigenvalue decomposition (EVD), the sparse approximation searches for a suboptimal solution $f^{\dagger}$ in an RKHS $\mathcal{H}_M$ with $M \ll N$, i.e., it selects sparse control points $\tilde{x}_m$ to span a new search space. The suboptimal solution is thus defined with far fewer basis functions:

$$f^{\dagger}(x) = \sum_{m=1}^{M} K(x, \tilde{x}_m)\,\tilde{\alpha}_m \tag{13}$$

The computational cost of the proposed GFC largely depends on the number of control points $\tilde{x}_m$, which drives the numerical optimization. With the same regularization strength, a dense spacing of control points allows more localized and flexible non-rigid deformations than a sparse spacing. To select the control points $\tilde{x}_m$, a sparse random basis representation is widely used in the literature [36,46]. Although this strategy is simple and efficient, in practice the solution (i.e., the identified true correspondences) may be unstable. In this paper, we choose control points by uniform sampling, which yields a stable solution. It is worth noting that the control point set $\{\tilde{x}_m\}_{m=1}^{M}$ need not be a subset of the input data $X$. Another issue is how to determine the number of control points. Normally, we set $M \ll N$, $M = \sqrt{N}$, and give a discussion in Section 4. Using the sparse approximation, the locally linear constraint term in (11) can be rewritten as

$$\tilde{\Omega}(\tilde{\alpha}) = f^{\dagger}(\tilde{x})^{\top} (I - W) f^{\dagger}(\tilde{x}) = \tilde{\alpha}^{\top} \tilde{K}^{\top} (I - W) \tilde{K} \tilde{\alpha} \tag{14}$$

where the entries of $\tilde{K}$ are $K(\tilde{x}_i, \tilde{x}_j)$, $i, j \in [1, M]$.
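The sparse representation can be illustrated with a short NumPy sketch: choose about √N control points, build the N × M kernel matrix between data and control points, and evaluate the displacement as a kernel expansion. The regular-grid placement over the bounding box is only one possible reading of "uniform sampling" and is an assumption of this sketch, as are the function names; the kernel is the Gaussian kernel with the β used elsewhere in the paper.

```python
import numpy as np


def gaussian_kernel(A, B, beta=1.5):
    """Matrix of exp(-beta * ||a_i - b_j||^2) for rows a_i of A and b_j of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-beta * sq)


def grid_control_points(X, M):
    """About M control points on a regular grid over the bounding box of 2D data X.

    Assumed placement scheme; the points are generally not a subset of X.
    """
    g = int(np.ceil(np.sqrt(M)))                      # grid resolution per axis
    xs = np.linspace(X[:, 0].min(), X[:, 0].max(), g)
    ys = np.linspace(X[:, 1].min(), X[:, 1].max(), g)
    gx, gy = np.meshgrid(xs, ys)
    return np.column_stack([gx.ravel(), gy.ravel()])


def sparse_displacement(X, ctrl, alpha_tilde, beta=1.5):
    """Evaluate sum_m K(x_i, ctrl_m) * alpha_tilde_m for every row x_i of X."""
    U = gaussian_kernel(X, ctrl, beta)                # N x M, instead of the N x N Gram matrix
    return U @ alpha_tilde                            # N x d displacements
```

With N = 400 points and M = √N = 20 requested control points, the grid has 5 × 5 = 25 nodes, so the optimizer works with a 25 × d coefficient matrix rather than a 400 × d one.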


Substituting the uniform sparse approximation (13) and the constraint term (14) into the objective function (11), we obtain the suboptimal solution $f^{\dagger}$ by minimizing the sparse GFC objective:

$$E(\tilde{\alpha}) = -\mathbf{1}^{\top} \exp\left(-\frac{\mathrm{diag}\big((Y - (X + U\tilde{\alpha}))(Y - (X + U\tilde{\alpha}))^{\top}\big)}{\sigma^2}\right) + \lambda\,\tilde{\Omega}(\tilde{\alpha}) \tag{15}$$

where $U$ is an $N \times M$ matrix with elements $K(x_i, \tilde{x}_j)$. The derivative of the sparse objective function (15) with respect to the coefficient $\tilde{\alpha}$ is

$$\frac{\partial E(\tilde{\alpha})}{\partial \tilde{\alpha}} = -\frac{2 U^{\top} \big(\tilde{V} \circ (\tilde{P} \otimes \mathbf{1})\big)}{\sigma^2} + 2 \lambda \tilde{K} (I - W) \tilde{K} \tilde{\alpha} \tag{16}$$

where $\tilde{V} = Y - (X + U\tilde{\alpha})$ and $\tilde{P} = \exp\left(-\mathrm{diag}(\tilde{V} \tilde{V}^{\top}) / \sigma^2\right)$; the suboptimal solution is then $f^{\dagger} = U\tilde{\alpha}$. The proposed Gaussian field consensus (GFC) with locally linear constraint is outlined in Algorithm 1, and its fast implementation by sparse approximation is outlined in Algorithm 2.

Algorithm 1: Gaussian field consensus (GFC).
Input: The putative correspondence set S
Output: The true correspondences S_in
Initialize: σ, β, λ, δ, η, K, and α = 0;
repeat
    Search the K nearest neighbors for each point in X;
    Calculate the reconstruction weight W by Eq. (6);
    Construct the kernel-valued matrix K with the Gaussian kernel K(x_m, x_n) = exp(−β‖x_m − x_n‖²);
    Compute the coefficient α by Eqs. (11) and (12);
    Update the feature points by X̄ ← X + Kα;
    Anneal the Gaussian bandwidth by σ̄ ← η × σ;
until E(α) converges;
Determine S_in by Eqs. (3) and (4);
return S_in.

4. Algorithm analysis

4.1. Computational complexity analysis

The computational complexity of GFC consists of two parts: 1) searching the K nearest neighbors requires O(K log N + N log N) and calculating the reconstruction weight W requires O(K³N); 2) estimating the nonparametric transformation requires O(N³). The total computational complexity of Algorithm 1 is therefore approximately O(N log N + K³N + N³). With the uniform sparse approximation, the complexity of nonparametric transformation estimation reduces to O(M²N) since M ≪ N, hence

Algorithm 2 Sparse approximation for GFC.
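The loop of Algorithm 1 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' Matlab implementation: SciPy's L-BFGS-B is used as a stand-in quasi-Newton solver, the kernel matrix is built once from the original X and the coefficients are warm-started across annealing steps (X is not updated in place), the regularizer is the squared-norm form ‖(I − W)Kα‖² from the first line of (8), which is positive semidefinite, and the LLE weights use a small ridge term `reg`. The function names and the defaults `eta=0.95`, `n_anneal=30` are hypothetical choices within the ranges reported in Section 4.2.

```python
import numpy as np
from scipy.optimize import minimize


def lle_weights(X, k=5, reg=1e-3):
    """Reconstruction weights of Eq. (6): each row sums to 1 and is supported on k neighbors."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)                      # a point is not its own neighbor
    W = np.zeros((n, n))
    for m in range(n):
        idx = np.argsort(d2[m])[:k]
        Z = X[idx] - X[m]
        C = Z @ Z.T
        C += (reg * np.trace(C) + 1e-12) * np.eye(k)  # assumed ridge term; C is singular when k > d
        w = np.linalg.solve(C, np.ones(k))
        W[m, idx] = w / w.sum()
    return W


def gfc(X, Y, beta=1.5, lam=3.0, k=5, delta=0.5, eta=0.95, n_anneal=30):
    """Return (inlier mask, displacement K @ alpha) for putative matches (X[i], Y[i])."""
    n, d = X.shape
    L = np.eye(n) - lle_weights(X, k)
    K = np.exp(-beta * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
    sigma = np.linalg.det(X.T @ X / n) ** (1.0 / (2 * d))
    alpha = np.zeros((n, d))

    def energy(a_flat, s):
        F = K @ a_flat.reshape(n, d)
        V = Y - (X + F)
        p = np.exp(-(V ** 2).sum(1) / s ** 2)
        LF = L @ F
        E = -p.sum() + lam * (LF ** 2).sum()
        # Chain rule: data term gives -2 K^T (V * p) / s^2, regularizer gives 2 lam K^T L^T L F.
        G = -2.0 / s ** 2 * (K.T @ (V * p[:, None])) + 2.0 * lam * (K.T @ (L.T @ LF))
        return E, G.ravel()

    for _ in range(n_anneal):
        res = minimize(energy, alpha.ravel(), args=(sigma,), jac=True,
                       method="L-BFGS-B", options={"maxiter": 50})
        alpha = res.x.reshape(n, d)
        sigma_used = sigma                            # bandwidth of the last solved problem
        sigma *= eta                                  # deterministic annealing
    F = K @ alpha
    p = np.exp(-((Y - (X + F)) ** 2).sum(1) / sigma_used ** 2)
    return p >= delta, F
```

A sparse variant in the spirit of Algorithm 2 would replace the N × N matrix `K` by the N × M matrix between data and control points and optimize an M × d coefficient matrix instead.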


Fig. 3. Average matching accuracies of GFC on the Oxford affine covariant regions datasets. The degradation cases (1v2, 1v3, 1v4, 1v5, 1v6) and outlier ratios (from 0.1 to 0.7) are tested. Higher accuracy indicates better performance. (Best viewed in color.)

the total time complexity of Algorithm 2 becomes O(N log N + K³N + M²N). Thus, the proposed algorithm can be applied to large-scale putative correspondence sets.

4.2. Implementation and parameters setting

In the optimization, we use deterministic annealing on the scale parameter σ to help the algorithm escape local minima. More precisely, σ is initialized with a large value to capture the global rigid structure and is then reduced with a given annealing rate η, σ ← η × σ, to capture the local non-rigid structure. The annealing process should be slow enough for GFC to be robust. The gradual reduction of σ leads to a coarse-to-fine matching strategy. The number of sparse control points $\{\tilde{x}_m\}_{m=1}^{M}$ mainly affects the computational time, and choosing the number and positions of the control points is difficult. Here, we determine the control points by uniform random sampling. The parameter M can be seen as a tradeoff between matching accuracy and time complexity. Since $M \ll N$, we set $M = \sqrt{N}$, and even $M = \sqrt[3]{N}$ for a large-scale correspondence set.

We empirically set $\sigma = \det(X^{\top} X / N)^{\frac{1}{2d}}$, β = 1.5, λ = 3, K = 5, δ = 0.5 and η ∈ [0.9, 0.98] throughout this paper. Moreover, we set $N_i = 30$ in order to reach a good stable local minimum.

5. Experiments

The proposed GFC has been implemented in Matlab and all the experiments are performed on an Intel Core i7 CPU 2.5GHz with 16GB RAM.

5.1. Experiment setup

Data. To evaluate our GFC algorithm, we design a set of experiments on outlier rejection for real images: 1) The Affine Covariant Regions Datasets (ACRD¹) [49] consist of 8 scenes (6 cases in each scene): bark (varying zoom & rotation), bikes (varying blur), boat (varying zoom & rotation), graf (varying viewpoint), leuven (varying light), trees (varying blur), ubc (varying JPEG compression), and wall (varying viewpoint). 2) Wide Baseline Images (WBI) [48] contains 6 pairs of images with a large viewpoint variation, as shown in Fig. 6. 3) The WILLOW Object Class Dataset (WILLOW²) [50] contains 5 sets of real images with manually labeled ground-truth landmarks (10 points), and we use it to analyze the convergence qualitatively.

¹ http://www.robots.ox.ac.uk/∼vgg/data/data-aff.html.

Comparison. For a comprehensive comparison, the proposed GFC is compared with several state-of-the-art methods, and extensive experimental results are presented to demonstrate the superiority of GFC on the task of outlier rejection. Seven approaches are included in our comparative study: Random Sample Consensus (RANSAC) [12], Identifying Correspondence Function [33] based on support vector regression (SVR), deformation RANSAC (DefRANSAC) [31], GMM [41], CPD [40], L2E [47], and VFC [34]. RANSAC samples three pairs of points to generate an outlier-free subset for affine transformation estimation, and runs 1000 times to verify the estimated model. SVR uses support vector regression to define an identifying correspondence function, and then rejects mismatches by testing whether they are consistent with the learned function. DefRANSAC is a simple RANSAC-driven deformable registration technique which assumes that the distribution of true correspondences resembles a low-dimensional affine subspace, so that outliers can be identified accurately. GMM, CPD, L2E, and VFC align the model features onto the scene features with a non-rigid transformation model; outliers are then rejected with a nearest distance threshold τ, and we set τ = 0.5 for them throughout our paper. All methods are implemented in Matlab and tested in the same environment.

Pipeline. Following the robust feature matching pipeline in Fig. 1, we use SURF [9] to extract key-points and the local image descriptor PIIFD [27] to generate the putative correspondences under the descriptor constraint; BBF [11] is then used to match the model features and the scene features.
Note that the open source VLFEAT toolbox [51] provides implementations of SIFT, BBF, and RANSAC, and that the percentage of outliers in the putative correspondence set can be varied by tuning the distance ratio threshold (set to 1.5 throughout the experiments). For the robustness analysis, we add uniformly distributed outliers to the true matches and thereby obtain datasets with different outlier ratios.

² http://www.di.ens.fr/willow/research/graphlearning

Fig. 4. Average matching accuracies by GFC and other methods on the Oxford affine covariant regions datasets with 70% outliers in putative correspondence set.

Fig. 5. Average precision, recall and the F-measure by GFC and other methods on the Oxford affine covariant regions datasets with 70% outliers in putative correspondence set.

Evaluation criteria. We use four criteria, namely precision, recall, F1-measure, and accuracy, to evaluate the matching results quantitatively:

\[
\text{precision} = \frac{tp}{tp + fp}, \qquad
\text{recall} = \frac{tp}{tp + fn}, \qquad
\text{accuracy} = \frac{tp + tn}{tp + tn + fp + fn}, \qquad
F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}},
\]

where tp, fp, tn, and fn denote the numbers of true positives, false positives, true negatives, and false negatives, respectively. All four criteria lie in [0, 1], and higher is better.
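As an illustration of the four criteria, the following Python sketch computes them from sets of correspondence indices. The function name `matching_metrics`, the set-based interface, and the convention of returning 0 when a denominator vanishes are our own assumptions for this example; they are not taken from the Matlab implementation used in the experiments.

```python
def matching_metrics(predicted_inliers, true_inliers, num_putative):
    """Precision, recall, F1 and accuracy for an outlier-rejection result.

    predicted_inliers, true_inliers: iterables of correspondence indices
    in range(num_putative); every index not listed counts as an outlier.
    num_putative is assumed to be positive.
    """
    predicted = set(predicted_inliers)
    truth = set(true_inliers)

    tp = len(predicted & truth)          # kept and truly correct
    fp = len(predicted - truth)          # kept but actually a mismatch
    fn = len(truth - predicted)          # correct but rejected
    tn = num_putative - tp - fp - fn     # mismatch and rejected

    # Assumed convention: a ratio with zero denominator is reported as 0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / num_putative
    return precision, recall, f1, accuracy
```

For instance, with 10 putative correspondences, ground-truth inliers {0, 1, 2, 4} and predicted inliers {0, 1, 2, 3}, we have tp = 3, fp = 1, fn = 1 and tn = 5, so precision, recall and F1 are all 0.75 and the accuracy is 0.8.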

5.2. Results on real images

5.2.1. Results on affine covariant regions datasets

The Oxford affine covariant regions datasets are used to test the outlier rejection performance of the proposed GFC with sparse approximation. The true correspondences are identified by the GFC with a given threshold δ, i.e.,

\[
S_{in} = \left\{ n : \exp\!\left( -\frac{\left\| y_n - \left( x_n + f^{\dagger}(x_n) \right) \right\|^{2}}{\sigma^{2}} \right) \ge \delta \right\}.
\]
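The thresholding rule above can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions: the function name `select_inliers` and its arguments are hypothetical, `displacement` is assumed to hold the estimated displacement of every point in X (one row per point), and the σ² scaling of the exponent follows our reading of the rule above.

```python
import numpy as np

def select_inliers(X, Y, displacement, sigma, delta=0.5):
    """Return the indices n whose Gaussian confidence reaches delta.

    X, Y:          (N, d) arrays of putative correspondences x_n -> y_n
    displacement:  (N, d) array, assumed estimated displacement at each x_n
    sigma:         scale of the Gaussian penalty (assumed exp(-r^2 / sigma^2))
    delta:         confidence threshold
    """
    residual = Y - (X + displacement)          # row n is y_n - (x_n + f(x_n))
    sq_dist = np.sum(residual ** 2, axis=1)    # squared residual norm per point
    confidence = np.exp(-sq_dist / sigma ** 2)
    return np.flatnonzero(confidence >= delta)
```

Under the assumed scaling, δ = 0.5 keeps exactly the correspondences whose residual norm is at most σ√(ln 2) ≈ 0.83σ, which gives a geometric reading of the threshold.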

Since the ground truth S_gt is available, we can quantitatively evaluate the matching performance by comparing S_in with S_gt. In each scene, there are 5 pairs of images: 1v2, 1v3, 1v4, 1v5, 1v6. Note that the degree of degradation increases from 1v2 to 1v6, which makes the matching progressively harder.

Table 1. Average run time (in seconds) of the methods on the Oxford affine covariant regions datasets with 70% outliers.

Method           Bark   Bikes  Boat   Graf   Leuven  Trees  ubc    Wall
RANSAC [12]      1.62   1.69   1.54   1.34   1.48    1.47   1.45   1.41
DefRANSAC [31]   0.02   0.01   0.01   0.01   0.01    0.02   0.02   0.02
SVR [33]         0.12   0.21   0.15   0.13   0.21    0.20   0.19   0.12
CPD [40]         1.15   1.23   1.09   0.77   1.24    1.24   1.24   1.07
GMM [41]         4.35   4.77   4.19   2.88   4.38    4.58   4.56   4.02
L2E [47]         0.80   0.48   0.74   0.74   0.54    0.50   0.60   0.71
VFC [34]         0.04   0.01   0.01   0.01   0.01    0.01   0.01   0.01
GFC              0.33   0.16   0.27   0.21   0.14    0.15   0.17   0.20

Fig. 3 shows the matching results of GFC on 1400 pairs of putative correspondence sets; the average accuracies over 5 random trials are plotted against degradation and outlier ratio. The GFC algorithm performs well, particularly in the outlier test, where the accuracies of GFC with the nonparametric transformation are close to one (mostly above 0.95). The lowest accuracies are concentrated around '1v6' and an outlier ratio of 0.7, since a large degree of degradation combined with a large number of outliers makes outlier rejection challenging.

Fig. 6. Wide baseline images which are captured by Tinne Tuytelaars and Luc Van Gool [48].

Fig. 7. Average matching accuracies by GFC and other methods on the wide baseline images with different outlier ratios (from 0.05 to 0.25).

Fig. 4 shows the average accuracies of GFC and the other methods on the test set with 70% outliers. GFC attains the highest accuracies (close to one) in most cases. In contrast, the accuracies of the standard RANSAC with the affine model (implemented in the VLFEAT toolbox) and of SVR are below 0.6; in other words, they are not robust to a high percentage of outliers. Among the registration-based methods, CPD and GMM perform better than RANSAC and SVR, and CPD outperforms GMM because it models the outliers explicitly in the mixture model, whereas GMM is sensitive to outliers. The state-of-the-art outlier rejection methods DefRANSAC, L2E, and VFC perform very well under 70% outliers: on the one hand, they use a non-rigid transformation to solve the correspondence problem; on the other hand, a transformation estimator such as L2E is robust to outliers. Nevertheless, our proposed sparse GFC with the estimated suboptimal displacement function f† performs at least as well as DefRANSAC, L2E, and VFC in most test scenarios: the number of test cases in which the accuracy of GFC is greater than or equal to that of DefRANSAC, L2E, and VFC is 38, 39, and 33, respectively. This is because the robust Gaussian field behaves like a truncated ℓ2 loss and therefore does not pay much penalty for a large number of outliers.

Fig. 5 shows the precision, recall, and F-measure of GFC and the other methods, where each bar is averaged over {1v2, ..., 1v6} in each scene. The F-measure combines precision and recall; here we use the F1 measure, the harmonic mean of precision and recall, which reaches its best value at 1 and its worst at 0. We can see that GFC gives better precision, recall, and F1 measure than the other methods, and is at least as accurate as L2E and VFC.

Fig. 8. Average precision, recall and the F-measure by GFC and other methods on the wide baseline images with 25% outliers.

Fig. 9. Matching results by GFC on WILLOW Object Class Dataset with 50% outliers. From top to bottom: the putative correspondences (initialization), the identified true correspondences after 1 iteration, and the identified true correspondences after 15 iterations. The yellow lines denote true correspondences, the red lines denote outliers, and the blue lines denote unidentified true correspondences. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Table 1 lists the average run time of the evaluated methods on ACRD. The GFC algorithm with the uniform sparse approximation is computationally efficient; we set M = 15 in all the tests. Note that the putative correspondence sets contain 468 correspondences on average, of which 327 are outliers. In the outlier test, the GFC algorithm with the nonparametric transformation model is slightly slower than DefRANSAC, SVR, and VFC, but faster than the registration-based CPD, GMM, and L2E. Thus, the GFC algorithm can be applied to a large number of points with a small value of M. It is worth noting that RANSAC would normally run in a fraction of a second, but we run it 1000 times to verify the estimated model in this experiment.

5.2.2. Results on wide baseline images

Wide baseline images show views of scenes taken from substantially different viewpoints, as shown in Fig. 6. Note the large changes in scale in some parts of the images, the serious occlusions, and the extreme foreshortening [48]. In this experiment, features are extracted with the SURF Matlab toolbox, and we use the PIIFD descriptor to represent each feature; in practice, this gives more reliable correspondences than SIFT in the VLFEAT toolbox. The ground truth is constructed manually: we carefully pick the true matches one by one in the putative set.

Fig. 7 shows the average accuracies over 5 random trials on the wide baseline images, with outlier ratios from 0.05 to 0.25. Here, the original RANSAC (with affine transformation) is unable to handle the wide baseline cases, while RANSAC with a deformation transformation and the other comparison methods obtain good results. Compared with the three state-of-the-art methods DefRANSAC, L2E, and VFC, the GFC algorithm gives better accuracies (close to one) in most cases (the number of test cases in which the accuracy of GFC is greater than or equal to that of DefRANSAC, L2E, and VFC is 29, 27, and 19, respectively).
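The "greater than or equal to" tallies reported above are simple per-test-case comparisons of two accuracy lists. A minimal Python sketch of such a tally is given below; the function name and the example values are hypothetical and are not results from our experiments.

```python
def count_at_least_as_accurate(reference, competitor):
    """Count test cases where the reference accuracy is >= the competitor's.

    reference, competitor: equal-length sequences holding one accuracy
    value per test case, listed in the same test-case order.
    """
    if len(reference) != len(competitor):
        raise ValueError("both methods need one accuracy per test case")
    return sum(1 for r, c in zip(reference, competitor) if r >= c)


# Illustrative values only (not taken from the paper's figures).
reference_acc = [0.98, 0.95, 0.90]
competitor_acc = [0.97, 0.95, 0.93]
wins_or_ties = count_at_least_as_accurate(reference_acc, competitor_acc)
```

Because ties are counted in favour of the reference method, the tallies for different competitors should be read together with the accuracy plots rather than as strict win counts.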

Fig. 8 shows that the GFC algorithm is at least as good as DefRANSAC, L2E, and VFC in precision, recall, and F1 measure. Because the number of putative correspondences is smaller than in the Oxford datasets and the outlier set is small, CPD and GMM also perform well.

5.2.3. Results on WILLOW object class dataset

In the WILLOW object class dataset, although the feature points have a similar shape, there are large changes in viewpoint and scale, as well as non-rigid deformation in the area of the object. In this experiment, we only show the results after iterations 1 and 15 for qualitative analysis. For the robustness test, 10 outliers are added to the putative correspondence set; the initialization is shown in Fig. 9. After the first iteration, almost all true correspondences are identified by GFC (all of them on 'face' and 'winebottle'), and all outliers have been removed on all test images. After 15 iterations, GFC identifies all true correspondences without any outliers.

5.2.4. Choice of sparse parameter: M

Given a set of putative correspondences, we test how the sparse parameter of GFC affects the suboptimal solution of the feature matching problem. Compared with the original GFC without sparse approximation, the sparse GFC can obtain a suboptimal solution with an appropriate sparse parameter M. Here, we use the registration error and the run time to assess M. Fig. 10 shows that the sparse parameter M plays the role of a tradeoff between accuracy and run time.

Fig. 10. Choice of sparse parameter: M. The left image denotes the run times by GFC with different sets of M, and the right image denotes the alignment error (in pixel) by GFC with different sets of M after several iterations.

5.3. Results on 3D point set

From the problem formulation, GFC extends readily to 3D or higher-dimensional data. To evaluate the performance of the proposed GFC, we select a pair of 3D point sets captured from the surface of a person in two poses. Large scale changes and deformation exist in the data. In this experiment, we select 100 correspondences as the ground truth and add 100 outliers (outlier ratio 50%) and 200 outliers (outlier ratio 66.7%) to the putative correspondence set, respectively. The results of GFC are shown in Fig. 11. Note that the true correspondences spread over the whole body, including the head, hands, feet, and torso. Fig. 11(c) and (d) show the correspondences identified by GFC; the (precision, recall) pairs are (94%, 100%) and (86%, 100%), respectively, where several false positives remain because of the complex pose deformation. When we add more outliers, e.g., 300 outliers (outlier ratio 75%), the precision and recall become (73%, 88%).

Fig. 11. Results by GFC on 3D data points. (a) 3D point set. (b) An example of the putative correspondence set with 100 outliers. (c) Result by GFC when adding 100 outliers. (d) Result by GFC when adding 200 outliers. Note that yellow lines denote the true correspondences, while red lines denote the outliers. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

6. Conclusion

In this work, we present a robust nonparametric matching method for outlier rejection, namely Gaussian Field Consensus (GFC). The proposed GFC is based on the Gaussian fields criterion, in which a Gaussian mixture distance is used to estimate parametric models (e.g., rigid and affine) from feature attributes; GFC extends it to a nonparametric model and simplifies the Gaussian mixture distance to a single Gaussian model, similar to vector field learning with an exponential loss function. The motivation is that the Gaussian distance penalty can be approximately regarded as a truncated form of the ℓ2 loss penalty, so the exponential loss model can handle a large number of outliers without paying much penalty. More precisely:

1) we simplify the context-aware Gaussian fields registration model for outlier rejection by reducing the mixture of Gaussian distances to a single Gaussian model;
2) we introduce a locally linear constraint, related to manifold regularization, to preserve local neighborhood structures, and use it as a constraint term in the objective function;
3) we introduce a sparse approximation to reduce the computational complexity.

Finally, extensive experimental results on 2D real images and a 3D point set demonstrate that GFC outperforms several state-of-the-art methods and is robust to outliers.

Acknowledgment

This work was supported by the National Natural Science Foundation of China under Grant Nos. 61703260 and 61573235, and the Fundamental Research Funds for the Central Universities.

Appendix A. Representer theorem and its proof

Let $\mathcal{X}$ be a nonempty set and $K$ a positive-definite real-valued kernel on $\mathcal{X} \times \mathcal{X}$ with corresponding RKHS $\mathcal{H}_K$. Given a training set $\{(x_n, y_n)\}_{n=1}^{N} \subset \mathcal{X} \times \mathbb{R}$, a strictly monotonically increasing real-valued function $g : [0, \infty) \to \mathbb{R}$, and an arbitrary empirical risk function $E : (\mathcal{X} \times \mathbb{R}^2)^N \to \mathbb{R} \cup \{\infty\}$, then for any $f \in \mathcal{H}_K$ satisfying

\[
f = \arg\min_{f \in \mathcal{H}_K} \sum_{n=1}^{N} E\big(x_n, y_n, f(x_n)\big) + g\big(\|f\|\big), \tag{A.1}
\]

$f$ admits a representation of the form

\[
f(\cdot) = \sum_{n=1}^{N} K(\cdot, x_n)\, \alpha_n, \tag{A.2}
\]

where $\alpha_n \in \mathbb{R}$ for all $1 \le n \le N$.

Proof: First, we define a mapping $\varphi : \mathcal{X} \to \mathbb{R}^{\mathcal{X}}$ by $\varphi(x) = K(\cdot, x)$. Since $K$ is a reproducing kernel, $\varphi(x)(x') = K(x', x) = \langle \varphi(x'), \varphi(x) \rangle$, where $\langle \cdot, \cdot \rangle$ is the inner product on $\mathcal{H}_K$. Given any $x_1, \ldots, x_N$, one can use orthogonal projection to decompose any $f \in \mathcal{H}_K$ into a sum of two functions, one lying in $\operatorname{span}\{\varphi(x_1), \ldots, \varphi(x_N)\}$ and the other lying in its orthogonal complement: $f = \sum_{m=1}^{N} \varphi(x_m)\alpha_m + \upsilon$, where $\langle \upsilon, \varphi(x_m) \rangle = 0$ for all $m$. For any training point $x_n$, we have

\[
f(x_n) = \Big\langle \sum_{m=1}^{N} \varphi(x_m)\alpha_m + \upsilon,\; \varphi(x_n) \Big\rangle
       = \sum_{m=1}^{N} \alpha_m \big\langle \varphi(x_m), \varphi(x_n) \big\rangle, \tag{A.3}
\]

which is independent of $\upsilon$. Hence the value of the empirical risk term $E$ is also independent of $\upsilon$, and setting $\upsilon = 0$ does not affect $E$, while it strictly decreases the regularization term whenever $\upsilon \neq 0$: since $\upsilon$ is orthogonal to $\sum_{m=1}^{N} K(\cdot, x_m)\alpha_m$, we have $\|f\|^2 = \big\|\sum_{m=1}^{N} K(\cdot, x_m)\alpha_m\big\|^2 + \|\upsilon\|^2$, and $g$ is strictly monotonically increasing. Consequently, any minimizer must have $\upsilon = 0$, and we obtain the desired result (A.2).

References

[1] J. Jia, C.K. Tang, Image stitching using structure deformation, IEEE Trans. Pattern Anal. Mach. Intell. 30 (4) (2008) 617–631.
[2] R. Kasturi, D. Goldgof, P. Soundararajan, V. Manohar, J. Garofolo, R. Bowers, M. Boonstra, V. Korzhova, J. Zhang, Framework for performance evaluation of face, text, and vehicle detection and tracking in video: data, metrics, and protocol, IEEE Trans. Pattern Anal. Mach. Intell. 31 (2) (2009) 319–336.
[3] Y. Zheng, D. Doermann, Robust point matching for nonrigid shapes by preserving local neighborhood structures, IEEE Trans. Pattern Anal. Mach. Intell. 28 (4) (2006) 643–649.
[4] J. Sivic, A. Zisserman, Efficient visual search of videos cast as text retrieval, IEEE Trans. Pattern Anal. Mach. Intell. 31 (4) (2009) 591–606.
[5] O. Chum, J. Matas, Optimal randomized ransac, IEEE Trans. Pattern Anal. Mach. Intell. 30 (8) (2007) 1472–1482.
[6] H. Hirschmuller, Stereo processing by semiglobal matching and mutual information, IEEE Trans. Pattern Anal. Mach. Intell. 30 (2) (2007) 328–341.

[7] A. Can, C.V. Stewart, B. Roysam, H.L. Tanenbaum, A feature-based, robust, hierarchical algorithm for registering pairs of images of the curved human retina, IEEE Trans. Pattern Anal. Mach. Intell. 24 (3) (2002) 347–364.
[8] D.G. Lowe, Distinctive image features from scale-invariant keypoints, Int. J. Comput. Vis. 60 (2) (2004) 91–110.
[9] H. Bay, T. Tuytelaars, L.V. Gool, Surf: speeded up robust features, Comput. Vis. Image Understand. 110 (3) (2006) 404–417.
[10] S. Belongie, J. Malik, J. Puzicha, Shape matching and object recognition using shape contexts, IEEE Trans. Pattern Anal. Mach. Intell. 24 (4) (2002) 509–522.
[11] J.S. Beis, D.G. Lowe, Shape indexing using approximate nearest-neighbour search in high-dimensional spaces, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 1997, pp. 1000–1006.
[12] M.A. Fischler, R.C. Bolles, Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Commun. ACM 24 (6) (1981) 381–395.
[13] P.H.S. Torr, A. Zisserman, Mlesac: a new robust estimator with application to estimating image geometry, Comput. Vis. Image Understand. 78 (1) (2000) 138–156.
[14] B.J. Tordoff, D.W. Murray, Guided-mlesac: faster image transform estimation by using matching priors, IEEE Trans. Pattern Anal. Mach. Intell. 27 (10) (2005) 1523–1535.
[15] F. Boughorbel, M. Mercimek, A. Koschan, M. Abidi, A new method for the registration of three-dimensional point-sets: the gaussian fields framework, Image Vis. Comput. 28 (1) (2010) 124–137.
[16] J. Ma, J. Zhao, Y. Ma, J. Tian, Non-rigid visible and infrared face registration via regularized gaussian fields criterion, Pattern Recognit. 48 (3) (2015) 772–784.
[17] A.L. Yuille, N.M. Grzywacz, The motion coherence theory, in: IEEE 2nd International Conference on Computer Vision (ICCV), IEEE, 1988, pp. 344–353.
[18] A.L. Yuille, N.M. Grzywacz, A mathematical analysis of the motion coherence theory, Int. J. Comput. Vis. 3 (2) (1989) 155–175.
[19] S.T. Roweis, L.K. Saul, Nonlinear dimensionality reduction by locally linear embedding, Science 290 (5500) (2000) 2323–2326.
[20] G. Wang, Z. Wang, Y. Chen, Q. Zhou, W. Zhao, Context-aware gaussian fields for non-rigid point set registration, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2016, pp. 5811–5819.
[21] G. Wang, Z. Wang, Y. Chen, X. Liu, Y. Ren, L. Peng, Learning coherent vector fields for robust point matching under manifold regularization, Neurocomputing 216 (2016) 393–401.
[22] W.Y. Lin, S. Liu, Y. Matsushita, T.T. Ng, L.F. Cheong, Smoothly varying affine stitching, in: Computer Vision and Pattern Recognition, 2011, pp. 345–352.
[23] W.Y. Lin, M.M. Cheng, S. Zheng, J. Lu, N. Crook, Robust non-parametric data fitting for correspondence modeling, in: IEEE ICCV, 2013, pp. 2376–2383.
[24] G. Wang, Z. Wang, Y. Chen, W. Zhao, Robust point matching method for multimodal retinal image registration, Biomed. Sig. Process. Control 19 (2015) 68–76.
[25] G. Wang, Z. Wang, Y. Chen, Q. Zhou, W. Zhao, Removing mismatches for retinal image registration via multi-attribute-driven regularized mixture model, Inf. Sci. 372 (2016) 492–504.
[26] J. Ma, H. Zhou, J. Zhao, Y. Gao, J. Jiang, J. Tian, Robust feature matching for remote sensing image registration via locally linear transforming, IEEE Trans. Geosci. Remote Sens. 53 (12) (2015) 6469–6481.
[27] J. Chen, J. Tian, N. Lee, J. Zheng, R.T. Smith, A.F. Laine, A partial intensity invariant feature descriptor for multimodal retinal image registration, IEEE Trans. Biomed. Eng. 57 (7) (2010) 1707–1718.
[28] A. Basu, I.R. Harris, N.L. Hjort, M. Jones, Robust and efficient estimation by minimising a density power divergence, Biometrika 85 (3) (1998) 549–559.
[29] P.J. Huber, Robust Stat., 1981.
[30] O. Chum, J. Matas, Matching with prosac-progressive sample consensus, in: Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, vol. 1, IEEE, 2005, pp. 220–226.
[31] Q.-H. Tran, T.-J. Chin, G. Carneiro, M.S. Brown, D. Suter, In defence of ransac for outlier rejection in deformable registration, in: Computer Vision–ECCV 2012, Springer, 2012, pp. 274–287.
[32] C. Sunglok, K. Taemin, Y. Wonpil, Performance evaluation of ransac family, in: Proceedings of the British Machine Vision Conference (BMVC), 2009.
[33] X. Li, Z. Hu, Rejecting mismatches by correspondence function, Int. J. Comput. Vis. 89 (1) (2010) 1–17.
[34] J. Zhao, J. Ma, J. Tian, J. Ma, D. Zhang, A robust method for vector field learning with application to mismatch removing, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2011, pp. 2977–2984.
[35] J. Ma, J. Zhao, J. Tian, A.L. Yuille, Z. Tu, Robust point matching via vector field consensus, IEEE Trans. Image Process. 23 (4) (2014) 1706–1721.
[36] J. Ma, J. Zhao, J. Tian, X. Bai, Z. Tu, Regularized vector field learning with sparse approximation for mismatch removal, Pattern Recognit. 46 (12) (2013) 3519–3532.
[37] W.Y.D. Lin, M.M. Cheng, J. Lu, H. Yang, M.N. Do, P. Torr, Bilateral functions for global motion modeling, in: European Conference on Computer Vision, 2014, pp. 341–356.
[38] J.W. Bian, W.-Y. Lin, Y. Matsushita, S.-K. Yeung, T.-D. Nguyen, M.-M. Cheng, Gms: Grid-based motion statistics for fast, ultra-robust feature correspondence, Computer Vision and Pattern Recognition, 2017.
[39] P.J. Besl, N.D. McKay, A method for registration of 3-d shapes, IEEE Trans. Pattern Anal. Mach. Intell. 14 (2) (1992) 239–256.
[40] A. Myronenko, X. Song, Point set registration: coherent point drift, IEEE Trans. Pattern Anal. Mach. Intell. 32 (12) (2010) 2262–2275.
[41] B. Jian, B.C. Vemuri, Robust point set registration using gaussian mixture models, IEEE Trans. Pattern Anal. Mach. Intell. 33 (8) (2011) 1633–1645.
[42] G. Wang, Z. Wang, W. Zhao, Q. Zhou, Robust point matching using mixture of asymmetric Gaussians for nonrigid transformation, in: Computer Vision–ACCV 2014, Springer, 2015, pp. 433–444.
[43] G. Wang, Z. Wang, Y. Chen, W. Zhao, A robust non-rigid point set registration method based on asymmetric gaussian representation, Comput. Vis. Image Understand. 141 (2015) 67–80.
[44] J. Ma, W. Qiu, J. Zhao, Y. Ma, A.L. Yuille, Z. Tu, Robust l2e estimation of transformation for non-rigid registration, IEEE Trans. Sig. Process. 63 (5) (2015) 1115–1129.
[45] H. Li, X. Huang, L. He, Object matching using a locally affine invariant and linear programming techniques, IEEE Trans. Pattern Anal. Mach. Intell. 35 (2) (2013) 411–424.
[46] E.J. Candes, T. Tao, Near optimal signal recovery from random projections and universal encoding strategies, IEEE Trans. Inf. Theory 52 (12) (2007) 5406–5425.
[47] J. Ma, J. Zhao, J. Tian, Z. Tu, A.L. Yuille, Robust estimation of nonrigid transformation for point set registration, in: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2013, pp. 2147–2154.
[48] T. Tuytelaars, L. Van Gool, Matching widely separated views based on affine invariant regions, Int. J. Comput. Vis. 59 (1) (2004) 61–85.
[49] K. Mikolajczyk, T. Tuytelaars, C. Schmid, A. Zisserman, J. Matas, F. Schaffalitzky, T. Kadir, L. Van Gool, A comparison of affine region detectors, Int. J. Comput. Vis. 65 (1–2) (2005) 43–72.
[50] M. Cho, K. Alahari, J. Ponce, Learning graphs to match, in: IEEE International Conference on Computer Vision, 2013, pp. 25–32.
[51] A. Vedaldi, B. Fulkerson, Vlfeat: An open and portable library of computer vision algorithms, in: Proceedings of the International Conference on Multimedia, ACM, 2010, pp. 1469–1472.
[52] G. Wang, Q. Zhou, Y. Chen, Robust non-rigid point set registration using spatially constrained gaussian fields, IEEE Trans. Image Process. 26 (4) (2017) 1759–1769.
