Noname manuscript No.
(will be inserted by the editor)
A Versatile Framework for Shape Description Raif M. Rustamov
the date of receipt and acceptance should be inserted later
Abstract
We present a shape description framework
the shape retrieval procedures must be adjusted to the
that generates a multitude of shape descriptors through
classes of shapes arising in a particular application. This
a variety of design and continuum of parameter choices.
is usually achieved by implementing a number of shape
Our parameter is a surface mesh, referred to as tem-
descriptors, and choosing a shape descriptor or a com-
plate, which is supplied at run time, and allows gen-
bination of them that produces optimal results. This
erating dierent shape descriptors for the same model.
time consuming process would benet immensely from
Our framework extracts a numerical shape descriptor
a shape description framework that automatically gen-
by computing a selected function on the model mesh,
erates a variety of shape descriptors through a natural
mapping (transferring) it to the template, expanding
control parameter. Such a framework would obviate the
the mapped function in terms of a basis on the tem-
need to implement these descriptors one by one, and al-
plate, and collecting the expansion coecients into a
low evaluating a larger spectrum of shape descriptors
vector. We investigate possible design choices for the
for optimality.
steps in the framework, and introduce novel approaches
While there has been an explosion in the number
that provide further freedom in generating a multitude
of 3D shape descriptors proposed for shape retrieval,
of previously unknown descriptors. We show that our
the vast majority of them are rigid constructs that can-
approach is a generalization of the way some of the ex-
not be modied except for ne-tuning parameters such
isting numerical descriptors are dened, and that for
as histogram bin number/size, descriptor vector length,
appropriate template choices one is able to reproduce
and etc. The proposed classications of shape descrip-
some of the well-known descriptors. Finally, we show
tors such as appearing in surveys [17, 5, 35] provide in-
empirically that design and parameter choices have non-
clusive frameworks for shape descriptors that are too
trivial eects on the descriptor's performance, and that
general/theoretical to oer an implementable tool. The
better retrieval results can be obtained by combining
Gromov-Hausdor distance based approaches [25, 26, 3]
descriptors obtained via dierent templates.
allow generating dierent descriptors but only through
3D shape retrieval, numerical shape descriptors
a limited number of design choices. Shape descriptors based on Morse theory [30, 2] are parametric with respect to the choice of a surface function, yet feature
1 Introduction Content based 3D shape retrieval is a problem of great importance that has many applications in computer graphics, 3D face recognition, CAD, medical imaging, and protein bioinformatics. Due to the diversity of shapes,
Raif M. Rustamov, Drew University, 36 Madison Ave, Madison, NJ 07940, USA E-mail:
[email protected]
based shape descriptors [5] do not fall within this class. The goal of our work is to introduce an implementable shape description framework that generates a dierent feature based shape descriptor for each setting of a parameter. Our parameter is a surface mesh to which we refer as the template. To compute our shape descriptor we input two meshes: one, the model, is the shape for which the descriptor is computed, and the other, the template, is the parameter that allows generating dierent shape descriptors for the same model. Our de-
Model (S)
2
Signal Processing
Template (T)
Mapping
Feature function Fig. 1
Mapped feature function
= C1
+ C2
+
+ C3
+ C4
+ ...
Expansion in terms of basis and the extracted descriptor
Steps used in our framework to extract the shape descriptor of a model.
scriptor is extracted in three steps as illustrated in Fig-
for basis construction are novel in the context of shape
ure 1. We start by computing a selected function on the
retrieval. Finally, we theoretically and experimentally
model mesh, feature function, that captures a property
evaluate the eect of design choices and parameter set-
relevant to shape description (e.g. Gaussian curvature,
tings on the generated shape descriptors (Section 9).
distance to the origin). Next, we map (e.g using projection) the computed function to the template mesh. Finally, we expand the mapped function in terms of a
2 Previous Work
function basis on the template mesh, and collect the expansion coecients in a vector to obtain a concise
Content based 3D shape retrieval has drawn consid-
numerical shape descriptor of the model.
erable attention of researchers, and led to the intro-
This approach was chosen for ve reasons. First, our
duction of a large variety of shape descriptors; we re-
framework generates a multitude of shape descriptors
fer the reader to the survey papers [17, 5, 30, 35, 2]. It
through a continuum of parameter (template) and a va-
can be seen that the vast majority of the descriptors
riety of design (feature function, mapping method, basis
are rigid constructs that cannot be modied except for
generation procedure) choices. Second, our approach is
ne-tuning parameters such as histogram bin size, de-
a generalization of the way some of the existing nu-
scriptor vector length, etc.
merical descriptors are dened; in fact, for appropriate
Survey papers [17, 5, 35] also introduce inclusive clas-
design and parameter choices one is able to reproduce
sications of the existing descriptors. However, these
some of the well-known descriptors. Third, the use of
classications are made at an abstract level, and can-
templates bypasses the dicult problem of nding cor-
not be used to build a framework to produce dierent
respondences between basis functions coming from dif-
descriptors through dierent choices of a natural pa-
ferent surfaces. Fourth, templates constitute a natural
rameter.
parameter a shape already in the database can be
A concrete general framework for shape description
used as a template. Fifth, templates constitute a pow-
is provided Gromov-Hausdor distance based approaches
erful parameter with non-trivial eects on the generated
[25, 26, 3]. Here dierent shape descriptors can be ob-
shape descriptor, a fact we demonstrate experimentally.
tained by varying the metric on the surface, cf. [4],
The main contribution of this paper is the proposed
the base distance within the Gromov-Hausdor the-
framework for automatic generation of a large spectrum
ory [24], and representation of the descriptor (e.g. his-
of shape descriptors (Sections 3, 4). Within this gen-
togram [28], persistence diagram [6]). However, the known
eral framework, we consider design choices for feature
choices within this framework constitute a discrete set
function (Section 5), mapping method (Section 6), ba-
and are best described as design choices.
sis construction (Section 7), and the metric used for
A truly parametric and concrete general framework
descriptor comparison (Section 8). Our use of interpo-
for shape characterization is rendered by Morse theory.
lation for mapping, and diusion wavelets and PCA
A systematic survey of the relevant methods is given
3 in [30, 2]. Here dierent shape descriptors can be ob-
The second component is mapping the feature func-
tained by applying Morse theory to dierent functions
tion to, perhaps, a dierent domain. The feature func-
on the model, and through the dierent representations
tion is used to construct a new function dened on
(Reeb graphs, persistence homology). The main param-
some predetermined domain; we call this domain as the
eter in this case is the function which can be supplied
mapping domain, and the new function as the mapped
by the user at the runtime. In fact, the AIM@SHAPE
feature function. The existing approaches use spheres,
Geometric Searching Engine provides such a function-
planes, the 3D space (surface's bounding volume), the
ality. While Morse theory techniques do provide gen-
surface itself as mapping domains. The mapped feature
eral framework, they rely on dierent ways of describ-
function is usually constructed by some kind of projec-
ing shapes. As a result, the realm of shape descriptors
tion. When the mapping domain is the surface itself,
they capture is disjoint from our framework which is
the mapped feature function is just set equal to the
geared towards feature-based shape description. Note
feature function.
that choosing dierent functions on the model is among
The third component is signal processing to extract
the degrees of freedom provided in our framework as
a concise, noise-robust numerical shape descriptor from
well.
the mapped feature function. The type of processing
Note that in cases when existing shape descriptors
depends on the mapping domain: for sphere one may
are reproduced by our framework, the template is one of
use the Spherical Harmonic Transform, for a plane or
the simple geometries (plane, sphere), because previous
a box volume 2D or 3D Fourier transform, ball vol-
methods depend on the classical signal processing tools,
ume 3D Zernike Transform. Essentially, in all of these
which limits the possible choices to simple geometries.
cases the mapped feature function is expanded in a se-
It is through the use of non-classical signal processing
ries in terms of the relevant basis, and the expansion
tools (Laplace-Beltrami eigenbasis, diusion wavelets)
coecients are used as the shape descriptor.
that we are able to escape the conne of the simple
To provide an example, let us analyze an existing
geometries, and employ any surface as a template, pro-
descriptor in the described terms. Consider the descrip-
ducing previously unknown descriptors. However, even
tor of [32] where rays are shoot from the origin (cen-
within the realm of classical signal processing it is pos-
ter of mass of the mesh) to determine the distance to
sible to construct a shape description framework with a
the farthest intersection point with the mesh. The rays
template parameter, if one limits the templates to ellip-
are parametrized by the unit sphere, and as a result
soids. Indeed, in such a scenario ellipsoidal harmonics
one obtains a function dened on the sphere. Next, the
can be used for signal processing. Ellipsoidal harmonics
spherical harmonic transform is applied to this function
were employed in the work of Mademlis et. al. [23] to
to extract the numerical shape descriptor. In our terms,
extract shape descriptors similar to spherical harmonic
the feature function is the distance from the mesh point
descriptors of [21]. However, they only t a single el-
to the origin; the mapping domain is the unit sphere;
lipsoid to the shape, and do not exploit the degree of
mapped feature function is obtained by projecting the
freedom provided by the choice of ellipsoid as a param-
feature function onto the sphere, and if two mesh points
eter.
project to the same sphere point, one resolves the ambiguity by selecting the larger function value; nally,
3 Rationale Before describing the details of our framework, we set up the context by highlighting three components that frequently go into the construction of numerical shape descriptors. This discussion motivates our approach, and claries that it is a generalization of the way the
the signal processing component is performed using the spherical harmonic transform. Let us conclude by noting that our second and third components roughly correspond to the object abstraction and numerical transformation components of [5].
4 Approach
majority of existing numerical descriptors are dened. The rst component is the extraction of a surface
We will now describe our shape description framework
feature. By surface feature we mean any function on the
following the component delineation of Section 3. The
surface that captures a property relevant to shape de-
framework inputs two surface meshes, model and tem-
scription. Examples include the constant function (re-
plate, and computes a numerical shape descriptor for
striction of the surface's characteristic function to the
the model.
surface itself ), distance to the center of mass, curvature,
Models in the search database are assumed to be
components of the normal vector, and many others. We
given as triangle meshes; we will use the terms sur-
refer to the selected function as the feature function.
face, surface mesh, and mesh interchangeably. We let
4
S
denote any surface in the database. All models in
5 Feature Functions
the database are assumed to be normalized to achieve translation, rotation and scale invariance of the result-
A variety of possibilities exist as a choice for the sur-
ing shape descriptor. The template is another surface
face feature distance to the center of mass, curvature,
T
components of the normal vector, and many others. We
that is used as the mapping domain. The template
f,
where
f :S→R
can be one of the surfaces in the database, or any other
denote the feature function by
manifold surface mesh chosen arbitrarily. Note that the
real-valued function on the model surface. Let us men-
is a
role of the template is to provide a parameter through
tion that when interpolation is employed for mapping,
which the resulting shape descriptor for the model can
f
be controlled.
In fact, it restricts to a constant function on the surface,
cannot be the characteristic function of the surface.
and the interpolation schemes we consider extend the The shape descriptor is generated in three steps (Figure 1). First (extraction), the feature function where
f,
f : S → R is a real-valued function on the model
surface, is computed. Second (mapping), we construct a new function
fˆ : T → R,
the mapped feature function,
based on the given function
f : S → R.
fˆ = c1 φ1 + c2 φ2 + · · · cN φN , where {φi }N i=1 orthogonal basis for real-valued functions on T .
into a series
The extracted shape descriptor is dened as the vector descT (S)
= (c1 , c2 , ..., cm ),
where
making impossible extraction of any meaningful information. Linear functions of the
x, y, z
coordinates can-
not be used with barycentric interpolation for the same reason they extend to a linear function everywhere.
Third (signal
processing), the mapped feature function is expanded is an
constant function to a constant function everywhere,
m ≤ N.
One advantage of this approach is that the use of a template ascertains that the extracted expansion coefcients are in correspondence and their comparison is meaningful. Indeed, it may be tempting to altogether
6 Mapping Our goal is to construct the mapped feature function using the feature function of the model surface, namely dene a new function function
fˆ : T → R
based on the given
f : S → R. We consider three possible choices:
projection, Shepard interpolation, and barycentric interpolation.
bypass mapping to the template, and expand the feature function in terms of some basis on the original surface. However, this would have required nding the
6.1 Projection
correspondence between basis functions of dierent sur-
For projection, a correspondence between the model
faces, which is a dicult problem. For example, see
and template surface points is established, and the fea-
[18] for problems encountered when trying to nd cor-
ture function values are transferred to the template
respondences between eigenfunctions of the Laplace-
using this correspondence. Since the correspondence is
Beltrami operators on dierent surfaces.
generally not one-to-one, a procedure to resolve the am-
A second advantage of this approach originates from the use of signal processing. First, for shape description it is crucial to reduce the amount of noise so that only the essential properties of the shape are captured. Second, compact numerical shape descriptors are naturally provided by signal processing. Both goals are achieved by expanding a given function in a series and using the expansion coecients of the low-frequency components as the shape descriptor. Keeping only the low-frequency part not only results in a compact descriptor, but also is essentially equivalent to smoothing the function making the descriptor more robust to noise. To realize the described approach one needs to choose
biguity is needed. An advantage of projection is that almost all of the existing shape descriptors use projection for the mapping step, which corroborates its usefulness in the practice of shape description. However, projection also has important disadvantages. First, since the correspondence used for projection is generally not one-to-one, the resulting mapped feature function can be discontinuous. Therefore, signal processing will be subject to the Gibbs eect, rendering the low-frequency expansion coecients the ones used as the shape descriptor inadequate for representing the mapped function. Second, when dierent templates are used, we expect projection to lead to high redundancy, limiting the gains of concatenating descriptors
a feature function, decide how the mapping is done, and
obtained from dierent templates. Indeed, if there are
determine what basis to use for signal processing. Since
no overlaps, the value sets of the mapped feature func-
the selection of the feature function is ultimately appli-
tions on various templates will be the same, with only
cation driven, we will concentrate mostly on mapping
dierence being how the values are distributed over the
and signal processing.
template surfaces.
5
i = 1, ..., n
6.2 Shepard Interpolation
are called barycentric coordinates if they
satisfy the following three properties: partition of unity, Shepard interpolation is a zeroth order polynomial precision interpolation scheme that can be used to extend a function from a set of points to the entire ambient Euclidean space. Although we are given a function
f :S→R
dened on the mesh, we can sample points
from the mesh to obtain a function on a set of points.
n X
bi (p) = 1,
i=1 ane combination
n X
bi (p)vi = p,
We use Shepard interpolation to extend this function
i=1
to the template, and designate the obtained function
and Lagrange property
as
fˆ. Given a set of scattered points
f (pi )
pi
where the values
are known, the Shepard interpolant [33, 15] is de-
where
P
wi (p)f (pi ) iP i wi (p)
wi
Due to these properties, the barycentric coordinates can be used for interpolation: given function values
ned as follows
fˆ(p) =
bi (vj ) = δij .
the mesh vertices, one extends it to the whole space by setting for each
, p ∈ R3
are user assigned weights. The most com-
monly used weights are of the following form
−α
wi (p) = |p − pi |
fˆ(p) =
n X
p ∈ R3 ,
bi (p)f (vi ).
i=1 The barycentric interpolant has the rst order precision meaning that the linear functions of coordinates
,
are reproduced, i.e. if where
α
is a positive parameter. Shepard interpolant
has the zeroth order precision meaning that the constant functions are reproduced, i.e. if function, then
fˆ is
f (vi ) on
f
is a constant
also constant.
f
x, y, z fˆ is
is a linear function, then
also linear. In our experiments, we use mean-value coordinates [10, 11, 20] as barycentric coordinates, because they are fastest to evaluate. We compute the feature function
pi we uniformly
values at the vertices of the model mesh, and then
sample the area of the model surface, and evaluate the
evaluate the mapped feature function at the vertices
feature function at these points. Then, the mapped fea-
of the template mesh using mean-value interpolation.
ture function is evaluated at the vertices of the template
Our implementation of mean-value coordinates follows
In our experiments, to obtain points
mesh
T
using the formula for Shepard interpolation.
the pseudo-code presented in [20].
The advantages of mapping via Shepard interpola-
Mapping with mean-value interpolation has the fol-
tion include the following. First, the relative positions
lowing benets. First, the relative positions and dis-
and distances between the template and the model sur-
tances between the template and the model surface in-
face inuence the interpolant greatly, which leads to
uence the mean-value interpolant greatly, which leads
less redundancy between interpolants obtained for dif-
to less redundancy between interpolants obtained for
ferent templates. Second, the obtained mapped feature
dierent templates. Second, the mean-value interpolant
fˆ is continuous. Third, it is easy to implement,
is continuous, and so signal processing does not suer
function
from the Gibbs eect. Third, in principle, a mesh can be
and fast to compute.
reconstructed if the mean-value coordinates are given everywhere in the space (due to the Lagrange property, one can reconstruct the mesh vertices; the trian-
6.3 Barycentric Interpolation
gle faces can be reconstructed by nding the triplets of
Barycentric interpolation is a rst order polynomial precision interpolation scheme that can be used to extend a function from a mesh to the entire ambient Euclidean space. Using barycentric interpolation, we extend the given feature function
f : S → R
to the
template, and designate the obtained function on the template as
fˆ.
non-zero barycentric coordinates such that all remaining barycentric coordinates are zero); consequently, one may reasonably expect that interpolation based on meanvalue coordinates implicitly injects even more shape information into the mapped feature function.
7 Signal Processing
Barycentric interpolation is based on barycentric coordinates. For an arbitrary triangulated surface mesh with vertices
v1 , v2 , ..., vn ,
the functions
bi : R3 → R,
Our goal is to construct an orthonormal basis for real functions on the template surface
T.
This basis is used
6 to obtain a numerical shape descriptor by series ex-
given by a matrix, and its eigenvectors are used to ex-
fˆ : T → R.
pand a discrete function given on the mesh. For a mesh
pansion of the mapped feature function
g
We will consider three alternatives: Laplace-Beltrami
function
eigenbasis (also called manifold harmonics), diusion
proximated as
wavelets, and optimal compression basis via principal component analysis (PCA).
∆g(pi ) ≈
, the value of
∆g(pi )
i
at the vertex
is ap-
1 X mij (g(pj ) − g(pi )), ri
(2)
j∈N (i)
j adjari as the point-areas. For the Laplacian we use, mij 6= 0 only if i and j are adjacent, and the equality mij = mji holds for all i, j .
where the summation is over all vertex indices 7.1 Laplace-Beltrami Eigenbasis
cent to vertex
As has been corroborated in the literature, most recently in [36], Laplace-Beltrami eigenfunctions provide a natural basis for signal processing on closed manifold surfaces. For example, the spherical harmonics used so often in shape retrieval are nothing but the LaplaceBeltrami eigenfunctions of the sphere. For a closed compact manifold surface
T,
let
∆
de-
note its Laplace-Beltrami dierential operator. Consider the equation
∆φ = λφ. λ for which the equation has a nontrivial solu∆; the solution φ is called eigenfunction corresponding to λ. Note that λ = 0
i.
It is customary to refer to
Let us stack up the values of the function at mesh vertices,
g(pi ),
to construct the column-vector
g.
Now,
the above formula can be written as a matrix-vector multiplication
∆g ≈ Lg.
The involved matrix
L
the
discrete Laplacian has the entries as follows
P k mik /ri Lij = −mij /ri 0
if if
i = j, iand j adjacent,
otherwise.
A scalar
Now the Laplace-Beltrami eigenvalues/eigenfunctions
tion is called an eigenvalue of
are replaced by the eigenvalues
an
of the matrix
is always an eigenvalue the corresponding eigenfunctions are constant functions.
λi
and eigenvectors
vi
L.
Since the point-areas
ri
associated with mesh ver-
tices vary from vertex to vertex, the discrete Laplacian
The eigenvalues of the Laplace-Beltrami operator
matrix
L
is not symmetric. However, nding
L's
eigen-
are non-negative and constitute a discrete set; we can
values and eigenvectors in all cases can be reduced to
λ1 = 0 ≤ λ2 ≤ λ3 ≤ . . . ≤ λi ≤ . . .. The eigenfunction corresponding to λi , normalized to have the unit L2 norm will be denoted by φi .
a symmetric eigenvalue problem. In fact, consider the
Since the Laplace-Beltrami operator is Hermitian,
L is similar to the symmetric matrix 1 1 K = R− 2 M R− 2 . Thus, the eigenvalues and eigenvectors of L can be evaluated from those of K : if wi is an −1 eigenvector of K with eigenvalue λi , then vi = R 2 wi is the eigenvector of L with the same eigenvalue λi . Another crucial point is that the eigenvectors of L
put them into non-decreasing order:
the eigenfunctions corresponding to its dierent eigen-
hφi , φj i =
´
φ φ = 0. Thus, the T i j eigenfunctions of Laplace-Beltrami operator give an or-
values are orthogonal:
thogonal basis for the space of functions dened on the surface. As a result, a given function
g
on the surface
can be expanded in terms of the eigenfunctions
g = c1 φ1 + c2 φ2 + · · · ,
R
with
Rii = ri ;
denote by M the maMij = mij if i 6= j , L = R−1 M . Now we can
trix whose entries are given by and
Mii = lii ri .
Notice that
easily check that
are orthogonal. However, the orthogonality here is in (1)
ˆ
R-inner
(dot) product:
Finally, the series expansion of the mesh function
gφi .
which is represented by vector
T It is known that low-frequency eigenfunctions (small
terms of the
hvi , vj iR = vTi Rvj = δij .
where the coecients are given by
ci = hg, φi i =
diagonal matrix
λ)
display less oscillation on the surface. Thus, similarly to the standard Fourier analysis, truncating the series
g=
X
g,
ci vi , ci = hg, vi iR = gT Rvi .
i In our experiments we use the Laplacian of [27]:
expansion results in smoothing. Put dierently, the ex-
are the cotangent weights, and points areas
pansion coecients corresponding to the beginning of
in terms of vertex Voronoi areas.
the series capture the essential properties of the function; for a discussion see [1].
g,
is given as
ri
mij
are given
An advantage of Laplace-Beltrami eigenbasis is that almost all of the existing shape descriptors use it for sig-
In the discrete setting the Laplace-Beltrami oper-
nal processing on simple domains such as spheres and
ator of a manifold triangle mesh without boundary is
planes, which corroborates its usefulness in the practice
7 of shape description. Another advantage is that keeping
As a result, in order to nd the expansion coecients,
the low-frequency part not only results in a compact de-
one only needs to take the inner product of the mapped
scriptor, but also is essentially equivalent to smoothing
feature function with these basis vectors. As with ex-
making the descriptor more robust to noise.
pansion in terms of the Laplace-Beltrami eigenbasis one needs to truncate the obtained series expansion in a way that keeps smoother (less oscillating) components.
7.2 Diusion Wavelets
In the context of diusion wavelets, this corresponds to
Diusion wavelets, [9], provide multi-scale bases for signal processing in a variety of settings. The setting we will be using is that of scattered points in the space. The following is a very rough summary of the construction in [9], which avoids the intricate theoretical details and is geared towards implementation. From the template surface mesh
T
we uniformly
(with respect to the area) sample a set of points Clearly, any real-valued function on as a vector in
D
R|P | .
P
P.
can be recorded
trix that satises certain properties, most notably its eigenvalues are in the
[0, 1] interval. The diusion oper-
ator can be constructed as follows: for each point, one nds a xed number of nearest neighbors and assigns
exp(−t ∗ dist2 ), point in question, t is
the corresponding matrix entry to be
dist
is the distance to the
a parameter; next,
D
is obtained by normalizing this
matrix, see [8]. Diusion wavelet construction uses the dyadic powers of
D
to construct subspaces
Vi
and
Wi
R
of
|P |
together with their orthogonal bases; the subscript of a subspace is called its level. These subspaces satisfy
R|P | = W0 ⊕ V0 , V0 = W1 ⊕ V1 , V1 = W2 ⊕ V2 ,
and
so on. In fact, in the implementation we worked with,
W0 = {0},
and
V0 = R|P |
with delta-function basis.
To obtain the basis of subspace
Vi+1 , one i D2 on
dyadic power of diusion operator
Vi ,
acts by the the basis of
and applies some threshold and a special locality
preserving orthogonalization. An important property of these bases is that dierent levels represent dierent scales together they provide a multi-scale basis of
R
|P |
The main advantage of the diusion wavelet basis is its multi-scale nature, which can be benecial in cases where features at dierent scales need to be extracted and/or dierently weighted (cf. Section 8). Another advantage is that keeping the terms that belong to higher level subspaces not only results in a compact descriptor, but also is essentially equivalent to smoothing, which makes the descriptor more robust to noise.
Next, we need a diusion operator
(that linearly acts on these vectors) that is a ma-
where
keeping the terms that belong to higher level subspaces.
. For example, the basis vectors of
V0
(in our
case delta functions) represent the smallest scale possible their support is just one vertex. Meanwhile, the basis vectors of subspaces of high level have more global nature, because they are obtained by many repetitive applications of the diusion operator in a sense, the function values are diused throughout the point cloud. Fixing some maximum level
M
that one desires to
7.3 Optimal Basis via PCA This is a data-driven basis: we rst compute the mapped feature function
fˆ for a subset of models in the database,
and then extract the PCA basis for representing these functions. This basis is then used for all models both in the database and the query sets. We sample from the template surface mesh
T
uni-
formly (with respect to the area) a set of points and we evaluate the mapped feature functions on Let
{fˆ(k) : P → R}K k=1
P, P.
be the mapped feature func-
tions for a subset of models in the database; notice that
R|P | . Our goal is to prom |P | duce an orthonormal set of vectors {bi }i=1 in R such (k) K ˆ that the average squared distance of {f }k=1 from the m subspace spanned by {bi }i=1 is minimal. Note that this m means that {bi }i=1 is the optimal (in L2 sense) set of m vectors to expand fˆ(k) . these can be seen as vectors in
The problem of nding the optimal set of vectors as above can be solved using the principal component analysis (PCA) [19], which proceeds as follows. First,
a= ˆ(k) . Second, shift all the functions by this avf k=1 (k) erage, u = fˆ(k) − a. Now compute the P × P coPK (k) (k) variance matrix C , with Cij = k=1 ui uj . Finally, m the sought set {bi }i=1 is given by the eigenvectors of C corresponding to the largest m eigenvalues. Since the covariance matrix is symmetric, bi are orthogonal; one
compute the average function value at each point,
1 K
PK
only needs to normalize these vectors to obtain an orthonormal set.
work with, one can write
The main benet of the PCA basis is its optimality
R|P | = W0 ⊕ W1 ⊕ W2 ⊕ · · · ⊕ WM ⊕ VM .
(provided that the subset of models used to extract the
Now gathering the bases of the subspaces appearing in
PCA basis is representative of shapes in the applica-
this direct sum, one obtains a multi-scale orthogonal (in
tion area) in representing the mapped feature function
R|P | .
using a given number of coecients. However, unlike
the standard inner product) basis for the entire
8 the Laplace-Beltrami eigenbasis and diusion wavelets,
features at larger scales, whereas for negative
here we do not have a guarantee that truncating the
smaller features would be emphasized.
t,
the
series expansion is equivalent to smoothing, which can
With a careful choice of shape descriptor compari-
in principle result in descriptors that are sensitive to
son metric and templates one can add the elements of
noise.
rotational invariance to our framework as follows. Starting from a single template one generates a set of templates by rotating the original one to dierent angles
8 Choice of Metric
that uniformly sample the rotational degree of freedom
In this section we consider the problem of comparing shape descriptors of dierent shapes. First we justify the use of Euclidean distance between shape descrip-
g and g 0 on the P template surface T have expansions g = i ci φi and P g 0 = i c0i φi in terms of some orthonormal basis {φi }. The squared L2 norm of the dierence between g and g 0 can be evaluated as !2 ˆ ˆ X (g − g 0 )2 = (ci − c0i )φi = tor vectors. Indeed, let two functions
T
T
=
X
=
X
i
(ci −
−
c0j )
(ci − c0i )(cj − c0j )δij =
i,j
proach. Now, one can either use pairwise matching of feature vectors when comparing to another shape in the database, or one can aggregate (e.g. add up, or combine in some symmetric way) these feature vectors in order to get a single rotationally invariant feature vector.
9 Evaluation factors. First, it is important for a framework to be in-
φi φj =
clusive in the sense that there should be such settings of
T
i,j
in this set one computes the feature vector using our ap-
To evaluate the proposed framework we look for two
ˆ c0i )(cj
(cf. Ligtheld descriptor [7]). For each of the templates
X
(ci − c0i )2 ,
i
the parameter and design choices that some of the wellknown and successful existing shape descriptors are reproduced. Second, we need to show that the parameter
where we have used the orthonormality of the basis.
and design choices involved in the framework do indeed
m of these expansion coecients, descT (S1 ) = (c1 , c2 , ..., cm ) and descT (S2 ) = (c1 , c2 , ..., cm ), we see that Euclidean distance between descT (S1 ) and descT (S2 ) is an approximation of the L2 norm of the dierence between the mapped feature functions of models S1 and S2 .
produce variations in the retrieval results. We consider
There can be benets in using weighted Euclidean
In this subsection we will consider some of the well-
distances between shape descriptors. For example, in
known existing shape descriptors and show how they
case of Laplace-Beltrami eigenbasis, one can use the
can be obtained as instantiations of our framework; the
following weighted distance
aim is to establish inclusiveness. Since in all of the cases
m X
the signal processing is done using classical tools which
Since our shape descriptors are made of the initial
e−tλi (ci − c0i )2 ,
both of these factors in separate subsections.
9.1 Theoretical Evaluation
are equivalent to Laplace-Beltrami bases, we will not mention this component below.
i=1 is the eigenvalue corresponding to the eigen-
Ray based descriptor: This is the descriptor of Vrani¢
φi . This can be seen to be equivalent to smooth-
and Saupe [32] discussed at the end of Section 3. We
ing the dierence between the mapped feature functions
only note that this descriptor can be obtained in our
via the heat kernel. Such distances can be computed for
framework by the following settings. The feature func-
t and combined to give an overall
tion is distance to the center of mass; the template is
where
λi
function
a variety of settings of
the sphere; the mapping is projection.
distance. In case of diusion wavelet basis, weighted distances
Depth-buer descriptor: The depth-buer descrip-
can be used to emphasize/deemphasize features at dif-
tor of Heczko et al. [16] places a normalized mesh into a
ferent scales. For example, consider the distance of the
unit cube, and generates six gray-scale images on each
form
face of the cube by parallel projection. The grayness
m X
value is set based on the distance from the cube face
etli (ci − c0i )2 ,
to the model. Coecients obtained by applying the 2D Fourier transform to each of the six gray-scale images
i=1 where li is the level to which the basis function longs. When
t
φi
be-
are used as descriptors. In our terms, for each of the
is positive, this distance will emphasize
cube faces, the feature function is the distance from
9 mesh point to the face; template is a square, namely a cube face; mapping is done by projection (if two mesh points project to the same face point, one resolves the ambiguity by selecting the smaller function value). In the survey [5], depth-buer descriptor was fount to be the best overall feature vector among the set of implemented methods.
Lighteld descriptor:
The basic component of the
Lighteld descriptor of Chen et al. [7] is a set of silhouette images rendered via orthogonal projection onto planes tangent to the hemisphere at the vertices of a dodecahedron. These silhouettes are compactly encoded by 2D Fourier transform. During the retrieval the feature vectors for each plane are pairwisely matched to achieve rotational invariance. We can see that the feature vector corresponding to each of the planes can be obtained as an instantiation of our framework: the fea-
Template surfaces used in the study. Except for sphere, these were obtained by simplifying the benchmark models. Fig. 2
ture function is the constant function; the template is the plane (a square); the mapping is done via orthogo-
unit variance, and rotating using Voronoi area weighted
nal projection. The Lighteld descriptor was found to
PCA [29, 31]. The eectiveness of such normalization
give the best retrieval results by Shilane et al. [34] who
has been corroborated in the literature, and it is known
compared the performance of a variety of shape descrip-
that descriptors deriving their invariance from such nor-
tors on the Princeton Shape Benchmark.
malization can outperform descriptors that are intrin-
Spherical harmonics descriptor:
The rotation in-
sically invariant [5].
variant spherical harmonics descriptor of Kazhdan et
The templates used in the experiments, except the
al. [21] with slight modications can be obtained in our
sphere, were chosen randomly and are depicted in Fig-
framework by considering each of the concentric spheres
ure 2. These were obtained from models in the bench-
used separately. For each of the concentric spheres, the
mark by decimating using QSlim [12]. This was done
feature function is the constant function on the mesh;
to make the computation of shape descriptors less time
the sphere is the templates; mapping is done using in-
consuming.
terpolation based on Gaussian kernel, namely, the interpolant is obtained by determining the closest point on the surface, and multiplying the function value at that point by euclidean distance based Gaussian weight; the spherical harmonic transform, which is equivalent to Laplace-Beltrami basis on sphere, is used for signal processing a distinctive aspect being that the expansion coecients are combined in a way that makes the resulting descriptor rotation invariant.
We conduct a series of leave-one-out experiments: every model in the benchmark is queried against all other models. The ranked result lists generated by the queries are used to compute ve retrieval statistics: nearest neighbor (NN), rst tier (FT), second tier (ST), discounted cumulative gain (DCG), and average dynamic recall (ADR). In all of the experiments the feature function is the distance to the center of mass the origin:
|p|, p ∈ S . 9.2 Practical Evaluation
We use the descriptor vector size of
f (p) = m = 20
for compactness and the smoothing eect; similarity between descriptors is measured by Euclidean distance.
In this subsection we experimentally evaluate the eect
Eect of mapping: We rst compare the performance
of variations in choices involved in our framework on
of our shape descriptor when the same template (sphere)
the retrieval results. The aim is to establish that the
and basis (Laplace-Beltrami eigenbasis) are used with
proposed design and parameter degrees of freedom have
dierent mapping methods. Since the mapping domain
non-trivial practical eect.
is the sphere, the descriptor obtained using projection
For our experiments we use the Watertight Bench-
is equivalent to that of [32]. Shepard interpolation can
α, so we
mark [14] which consists of 400 closed surface models,
be controlled through the use of the parameter
divided into 20 equal object classes. Some of the classes
experimented with three choices as shown. The choice
include both dierent objects and articulations of the
was based on the fact that the long distance behavior
same object. All objects are normalized by shifting the
of mean-value interpolant is similar to that of Shepard
origin to the center of mass, uniform scaling to obtain
interpolant with
α = 2.
10 The statistics summarized in Table 1 show that barycen- also been processed by the normalization applied to the tric interpolation and projection show very similar re-
whole benchmark, so they span similar spatial regions.
trieval performances. On the other hand, Shepard inter-
We expect the descriptors to be more independent when
polation performs worst of all, yet Shepard interpola-
the templates are dierently posed.
tion is easiest to implement and fastest to run, so it can be the method of choice when eciency is paramount.
Mapping
NN
FT
ST
DCG
ADR
Projection Mean-value Shepard α = 1 Shepard α = 2 Shepard α = 3
76.8 73.5 61.0 71.3 72.0
31.4 32.3 25.0 26.9 27.3
45.6 45.2 39.3 41.8 40.7
68.3 68.0 61.5 64.8 65.0
55.5 55.7 46.3 50.4 50.8
Comparison of various mapping methods. All numbers are percentages. Table 1
Eect of basis: Table 2 compares the performance of
Template
NN
FT
ST
DCG
ADR
Bear Fish Hand Horse Pliers Sphere All combined
77.0 78.8 80.5 78.8 72.5 73.5 81.3
35.1 35.8 35.6 34.8 34.5 32.3 39.0
49.2 49.9 49.6 48.4 49.6 45.2 55.1
69.7 70.5 70.9 70.0 68.6 68.0 73.3
58.0 58.3 58.3 57.9 55.6 55.7 61.6
Table 3
Using variety of templates for shape retrieval.
Implementation: Since the same templates are used repeatedly with dierent models, we save time by pro-
our shape descriptor when the same template (sphere)
cessing every template mesh once (per each basis type)
and mapping method (mean value interpolation) are
to extract the rst
used with dierent function bases on the template. The
onal matrix
Laplace-Beltrami and diusion wavelet bases depend
sis) only these are needed to obtain the expansion
only on the template, and so can be directly computed.
coecients
However, the PCA basis requires a sample of represen-
have about 500 vertices/sampled points, the processing
tative shapes from the database. To evaluate the full
takes less than a few seconds. Storing
possible power of the PCA basis we included all the
non-zero entries) and the basis vectors consumes about
database models into our sample. In practice the PCA
500 × m + 500 = 10, 500
basis would be computed on a subset of models; thus,
R
ci .
m
basis vectors and also the diag-
(needed for Laplace-Beltrami eigenbaSince the template surfaces we consider
R
(approx. 500
oating numbers.
When computing the shape descriptor for the model
produce best nearest neighbor classication), and thus
S , we rst obtain the mapped feature function fˆ : T → R. We also compute the feature function of the template, fT . The function that is expanded in terms of the basis functions is g = fˆ/fT ; note that this
each of the three may be most appropriate for dierent
amounts to slightly modifying the process of mapping.
applications.
Since
we expect the PCA basis results to deteriorate in realistic scenarios. Note that these three choices of bases exhibit dierent performances (e.g. diusion wavelets
surface
m
basis vectors have already been computed and
stored, we only need to do the vector-matrix multiplica-
Basis Laplace-Beltrami Diusion Wavelets PCA basis Table 2
NN 73.5 75.0 74.8
FT 32.3 32.1 32.3
ST 45.2 44.8 45.7
DCG 68.0 67.8 68.5
ci , i = 1, ..., m,
ADR
tions to extract the needed coecients
55.7 55.3 56.1
which is done extremely fast. Finally, our feature vec-
Comparison of various signal processing methods.
Eect of template: Table 3 compares the performance
tor consists of m oating numbers, namely descT (S) = (c1 , c2 , ..., cm ). Notice that when S and T coincide, g = fˆ/fT = 1, and the feature vector is of the form descT (T ) = (a, 0, ..., 0). We scale all the feature vectors to achieve a = 1. This is done in order to make the descriptors obtained using dierent templates commensurable, and
of our shape descriptor when the same mapping method
avoid normalization problems when concatenating de-
(mean value interpolation) and basis (Laplace-Beltrami
scriptors coming from dierent templates.
eigenbasis) are used with dierent templates depicted
Our implementation was done in MATLAB, except
in Figure 1. The table also shows the retrieval results of
mean-value and Shepard interpolation part that was
combined descriptor obtained by simply concatenating
done in C++ and called within MATLAB as needed. To
the feature vectors for all the templates. The improved
give an idea about the timing: evaluating the shape de-
retrieval resulting from such combination, and Figure
scriptor based on mean-value interpolant takes on aver-
3 showing the average ADR per each class for dierent
age slightly less than 1 minute for each model/template
templates demonstrate that the descriptors correspond-
pair; most of this time is consumed by mean-value co-
ing to dierent templates can capture independent fea-
ordinate computation; simplifying the mesh prior to
tures of shapes. Let us note that the templates have
computation will help reduce this time. The descriptor
11
Fig. 3
The average ADR values for each class shown by template used.
based on Shepard interpolation is considerably faster
the produced retrieval results. We believe that there
and takes a few seconds to compute.
does not exist an ideal template that would be optimal for all kinds of shape repositories. In fact, the main exibility oered by our approach is that one can
10 Conclusion and Future Work This paper presents a versatile framework for shape description that can be controlled through continuum of parameter and a variety of design choices. Some of the existing shape descriptors t into this framework, and
choose optimal templates based on the shape database at hand protein shapes can benet from dierent templates than mechanical engineering parts. Designing approaches for selecting/generating these optimal templates for a given shape database is an interesting direction for future work.
a multitude of new descriptors can be obtained. Within
Finally, an attractive topic for future work is oered
our analysis of design choices, we introduce novel ap-
by the observation that mean-value interpolation per-
proaches to the mapping and basis construction steps.
forms considerably better than Shepard interpolation.
We also provide an empirical analysis of the eects that
It is natural to link this to the dierence in the order of
design and parameter choices have on the descriptor's
approximation provided by these interpolation schemes,
retrieval performance.
yet we believe this fact should be investigated further.
This work provides a small, rst step and therefore
Indeed, the expression for Shepard interpolation has
has limitations that suggest topics for future work. A
connections to physical force potentials, so the rami-
rst important topic for future work would be to inves-
cations of our observation to force eld based shape
tigate the possibility of constructing intrinsically rota-
descriptors [22, 13] can be fruitful.
tionally invariant shape descriptors using similar ideas. Another theme suitable for further investigation is the dependency between the nature of the template and
We thank Szymon Rusinkiewicz for the extremely useful Trimesh2 library. We are grateful to Mauro MagAcknowledgements
12 gioni for making publicly available an implementation of the diffusion wavelets. For the 3D models used in this paper, we thank Daniela Giorgi and AIM@SHAPE.
References 1. Mikhail Belkin, Partha Niyogi, and Vikas Sindhwani. Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. J. Mach. Learn. Res., 7:23992434, 2006. 2. S. Biasotti, L. De Floriani, B. Falcidieno, P. Frosini, D. Giorgi, C. Landi, L. Papaleo, and M. Spagnuolo. Describing shapes by geometrical-topological properties of real functions. ACM Comput. Surv., 40(4):187, 2008. 3. A. M. Bronstein, M. M. Bronstein, and R. Kimmel. Generalized multidimensional scaling: A framework for isometryinvariant partial surface matching. Proceedings of the National Academy of Science, 103:11681172, January 2006. 4. Alexander Bronstein, Michael Bronstein, Ron Kimmel, Mona Mahmoudi, and Guillermo Sapiro. A gromov-hausdor framework with diusion geometry for topologically-robust non-rigid shape matching. International Journal of Computer Vision. 5. Benjamin Bustos, Daniel A. Keim, Dietmar Saupe, Tobias Schreck, and Dejan V. Vrani¢. Feature-based similarity search in 3d object databases. ACM Comput. Surv., 37(4):345387, 2005. 6. F. Chazal, D. Cohen-Steiner, L. J. Guibas, F. Mémoli, and S. Y. Oudot. Gromov-hausdor stable signatures for shapes using persistence. Computer Graphics Forum (proc. SGP 2009), 2009. 7. Ding-Yun Chen, Xiao-Pei Tian, Yu-Te Shen, and Ming Ouhyoung. On visual similarity based 3d model retrieval. Comput. Graph. Forum, 22(3):223232, 2003. 8. R. R. Coifman, S. Lafon, A. B. Lee, M. Maggioni, B. Nadler, F. Warner, and S. W. Zucker. Geometric diusions as a tool for harmonic analysis and structure denition of data: Diusion maps. PNAS, 102(21):74267431, 2005. 9. Ronald R. Coifman and Mauro Maggioni. Diusion wavelets. Appl. Comput. Harmon. Anal., 21(1):5394, 2006. 10. Michael S. Floater. Mean value coordinates. Comput. Aided Geom. Des., 20(1):1927, 2003. 11. Michael S. Floater, Géza Kós, and Martin Reimers. Mean value coordinates in 3d. Comput. Aided Geom. Des., 22(7):623631, 2005. 12. Michael Garland and Paul S. Heckbert. Surface simplication using quadric error metrics. In SIGGRAPH '97: Proceedings of the 24th annual conference on Computer graphics and interactive techniques, pages 209216, 1997. 13. Xi Geng, Wenyu Liu, and Hairong Liu. Force eld based expression for 3d shape retrieval. In HCI (2), volume 4551 of Lecture Notes in Computer Science, pages 587596, 2007. 14. Daniela Giorgi, Silvia Biasotti, and Laura Paraboschi. Watertight models track. Technical Report IMATI-CNR-GE 9/07, September 2007. 15. William J. Gordon and James A. Wixom. Shepard's method of "metric interpolation" to bivariate and multivariate interpolation. Mathematics of Computation, Jan 1978. 16. M. Heczko, D. Keim, D. Saupe, and D. V. Vranic. Methods for similarity search of 3d objects (in german). DatenbankSpektrum Zeitschrift fï¾÷r Datenbanktechnologie, 2(2):54 63, 2002. 17. Natraj Iyer, Subramaniam Jayanti, Kuiyang Lou, Yagnanarayanan Kalyanaraman, and Karthik Ramani. Threedimensional shape searching: state-of-the-art review and future trends. Computer-Aided Design, 37(5):509530, 2005.
18. Varun Jain and Hao Zhang. Robust 3D shape correspondence in the spectral domain. In Shape Modeling International, 2006. 19. I. T. Jollie. Principal Component Analysis. Springer, second edition, October 2002. 20. Tao Ju, Scott Schaefer, and Joe Warren. Mean value coordinates for closed triangular meshes. In TOG(SIGGRAPH), pages 561566, 2005. 21. Michael Kazhdan, Thomas Funkhouser, and Szymon Rusinkiewicz. Rotation invariant spherical harmonic representation of 3D shape descriptors. In Symposium on Geometry Processing, 2003. 22. A. Mademlis, P. Daras, D. Tzovaras, and M.G. Strintzis. 3d object retrieval based on resulting elds. In Workshop on 3D object retrieval, April 2008. 23. A. Mademlis, P. Daras, D. Tzovaras, and M.G. Strintzis. Using ellipsoidal harmonics for 3d shape representation. In INTERMEDIA Workshop on Hypermedia 3D Internet, 2008. 24. Facundo Mémoli. On the use of Gromov-Hausdor Distances for Shape Comparison. pages 8190, Prague, Czech Republic, 2007. Eurographics Association. 25. Facundo Mémoli and Guillermo Sapiro. Comparing point clouds. In SGP '04: Proceedings of the 2004 Eurographics/ACM SIGGRAPH symposium on Geometry processing, pages 3240, New York, NY, USA, 2004. ACM. 26. Facundo Mémoli and Guillermo Sapiro. A theoretical and computational framework for isometry invariant recognition of point cloud data. Found. Comput. Math., 5(3):313347, 2005. 27. M. Meyer, M. Desbrun, P. Schröder, and A. Barr. Discrete dierential geometry operators for triangulated 2-manifolds. In Proceedings of Visual Mathematics, 2002. 28. R. Osada, T. Funkhouser, B. Chazelle, and D. Dobkin. Shape distributions. TOG, 21(4):807832, 2002. 29. Eric Paquet and Marc Rioux. Nefertiti: a query by content system for three-dimensional model and image databases management. Image Vision Comput., 17(2):157166, 1999. 30. P. Frosini D. Giorgi C. Landi S. Marini G. Patanè M. Spagnuolo S. Biasotti, B. Falcidieno. 3d shape description and matching based on properties of real functions. In Eurographics 2007 Tutorial Notes, pages 10251074. The Eurographics Association, 2007. 31. D. V. Vrani¢and D. Saupe. 3d model retrieval retrieval. In Proceedings of the Spring Conference on Computer Graphics and its Applications, pages 8993, 2000. 32. Dietmar Saupe and Dejan V. Vranic. 3d model retrieval with spherical harmonics and moments. In Proceedings of the 23rd DAGM-Symposium on Pattern Recognition, pages 392397, London, UK, 2001. Springer-Verlag. 33. Donald Shepard. A two-dimensional interpolation function for irregularly-spaced data. In Proceedings of the 1968 23rd ACM national conference, pages 517524, New York, NY, USA, 1968. ACM. 34. Philip Shilane, Patrick Min, Michael Kazhdan, and Thomas Funkhouser. The princeton shape benchmark. pages 167 178, Los Alamitos, CA, USA, 2004. IEEE Computer Society. 35. Johan Tangelder and Remco Veltkamp. A survey of content based 3d shape retrieval methods. Multimedia Tools and Applications, 39:441471, 2008. 36. Bruno Vallet and Bruno Lï¾÷vy. Manifold harmonics. Computer Graphics Forum (Proceedings Eurographics), 2008.