A Versatile Framework for Shape Description

Viewer
Transcript

Noname manuscript No.

(will be inserted by the editor)

A Versatile Framework for Shape Description Raif M. Rustamov

the date of receipt and acceptance should be inserted later

Abstract

We present a shape description framework

the shape retrieval procedures must be adjusted to the

that generates a multitude of shape descriptors through

classes of shapes arising in a particular application. This

a variety of design and continuum of parameter choices.

is usually achieved by implementing a number of shape

Our parameter is a surface mesh, referred to as tem-

descriptors, and choosing a shape descriptor or a com-

plate, which is supplied at run time, and allows gen-

bination of them that produces optimal results. This

erating dierent shape descriptors for the same model.

time consuming process would benet immensely from

Our framework extracts a numerical shape descriptor

a shape description framework that automatically gen-

by computing a selected function on the model mesh,

erates a variety of shape descriptors through a natural

mapping (transferring) it to the template, expanding

control parameter. Such a framework would obviate the

the mapped function in terms of a basis on the tem-

need to implement these descriptors one by one, and al-

plate, and collecting the expansion coecients into a

low evaluating a larger spectrum of shape descriptors

vector. We investigate possible design choices for the

for optimality.

steps in the framework, and introduce novel approaches

While there has been an explosion in the number

that provide further freedom in generating a multitude

of 3D shape descriptors proposed for shape retrieval,

of previously unknown descriptors. We show that our

the vast majority of them are rigid constructs that can-

approach is a generalization of the way some of the ex-

not be modied except for ne-tuning parameters such

isting numerical descriptors are dened, and that for

as histogram bin number/size, descriptor vector length,

appropriate template choices one is able to reproduce

and etc. The proposed classications of shape descrip-

some of the well-known descriptors. Finally, we show

tors such as appearing in surveys [17, 5, 35] provide in-

empirically that design and parameter choices have non-

clusive frameworks for shape descriptors that are too

trivial eects on the descriptor's performance, and that

general/theoretical to oer an implementable tool. The

better retrieval results can be obtained by combining

Gromov-Hausdor distance based approaches [25, 26, 3]

descriptors obtained via dierent templates.

allow generating dierent descriptors but only through

3D shape retrieval, numerical shape descriptors

a limited number of design choices. Shape descriptors based on Morse theory [30, 2] are parametric with respect to the choice of a surface function, yet feature

1 Introduction Content based 3D shape retrieval is a problem of great importance that has many applications in computer graphics, 3D face recognition, CAD, medical imaging, and protein bioinformatics. Due to the diversity of shapes,

Raif M. Rustamov, Drew University, 36 Madison Ave, Madison, NJ 07940, USA E-mail: [email protected]

based shape descriptors [5] do not fall within this class. The goal of our work is to introduce an implementable shape description framework that generates a dierent feature based shape descriptor for each setting of a parameter. Our parameter is a surface mesh to which we refer as the template. To compute our shape descriptor we input two meshes: one, the model, is the shape for which the descriptor is computed, and the other, the template, is the parameter that allows generating dierent shape descriptors for the same model. Our de-

Model (S)

2

Signal Processing

Template (T)

Mapping

Feature function Fig. 1

Mapped feature function

= C1

+ C2

+

+ C3

+ C4

+ ...

Expansion in terms of basis and the extracted descriptor

Steps used in our framework to extract the shape descriptor of a model.

scriptor is extracted in three steps as illustrated in Fig-

for basis construction are novel in the context of shape

ure 1. We start by computing a selected function on the

retrieval. Finally, we theoretically and experimentally

model mesh, feature function, that captures a property

evaluate the eect of design choices and parameter set-

relevant to shape description (e.g. Gaussian curvature,

tings on the generated shape descriptors (Section 9).

distance to the origin). Next, we map (e.g using projection) the computed function to the template mesh. Finally, we expand the mapped function in terms of a

2 Previous Work

function basis on the template mesh, and collect the expansion coecients in a vector to obtain a concise

Content based 3D shape retrieval has drawn consid-

numerical shape descriptor of the model.

erable attention of researchers, and led to the intro-

This approach was chosen for ve reasons. First, our

duction of a large variety of shape descriptors; we re-

framework generates a multitude of shape descriptors

fer the reader to the survey papers [17, 5, 30, 35, 2]. It

through a continuum of parameter (template) and a va-

can be seen that the vast majority of the descriptors

riety of design (feature function, mapping method, basis

are rigid constructs that cannot be modied except for

generation procedure) choices. Second, our approach is

ne-tuning parameters such as histogram bin size, de-

a generalization of the way some of the existing nu-

scriptor vector length, etc.

merical descriptors are dened; in fact, for appropriate

Survey papers [17, 5, 35] also introduce inclusive clas-

design and parameter choices one is able to reproduce

sications of the existing descriptors. However, these

some of the well-known descriptors. Third, the use of

classications are made at an abstract level, and can-

templates bypasses the dicult problem of nding cor-

not be used to build a framework to produce dierent

respondences between basis functions coming from dif-

descriptors through dierent choices of a natural pa-

ferent surfaces. Fourth, templates constitute a natural

rameter.

parameter a shape already in the database can be

A concrete general framework for shape description

used as a template. Fifth, templates constitute a pow-

is provided Gromov-Hausdor distance based approaches

erful parameter with non-trivial eects on the generated

[25, 26, 3]. Here dierent shape descriptors can be ob-

shape descriptor, a fact we demonstrate experimentally.

tained by varying the metric on the surface, cf. [4],

The main contribution of this paper is the proposed

the base distance within the Gromov-Hausdor the-

framework for automatic generation of a large spectrum

ory [24], and representation of the descriptor (e.g. his-

of shape descriptors (Sections 3, 4). Within this gen-

togram [28], persistence diagram [6]). However, the known

eral framework, we consider design choices for feature

choices within this framework constitute a discrete set

function (Section 5), mapping method (Section 6), ba-

and are best described as design choices.

sis construction (Section 7), and the metric used for

A truly parametric and concrete general framework

descriptor comparison (Section 8). Our use of interpo-

for shape characterization is rendered by Morse theory.

lation for mapping, and diusion wavelets and PCA

A systematic survey of the relevant methods is given

3 in [30, 2]. Here dierent shape descriptors can be ob-

The second component is mapping the feature func-

tained by applying Morse theory to dierent functions

tion to, perhaps, a dierent domain. The feature func-

on the model, and through the dierent representations

tion is used to construct a new function dened on

(Reeb graphs, persistence homology). The main param-

some predetermined domain; we call this domain as the

eter in this case is the function which can be supplied

mapping domain, and the new function as the mapped

by the user at the runtime. In fact, the AIM@SHAPE

feature function. The existing approaches use spheres,

Geometric Searching Engine provides such a function-

planes, the 3D space (surface's bounding volume), the

ality. While Morse theory techniques do provide gen-

surface itself as mapping domains. The mapped feature

eral framework, they rely on dierent ways of describ-

function is usually constructed by some kind of projec-

ing shapes. As a result, the realm of shape descriptors

tion. When the mapping domain is the surface itself,

they capture is disjoint from our framework which is

the mapped feature function is just set equal to the

geared towards feature-based shape description. Note

feature function.

that choosing dierent functions on the model is among

The third component is signal processing to extract

the degrees of freedom provided in our framework as

a concise, noise-robust numerical shape descriptor from

well.

the mapped feature function. The type of processing

Note that in cases when existing shape descriptors

depends on the mapping domain: for sphere one may

are reproduced by our framework, the template is one of

use the Spherical Harmonic Transform, for a plane or

the simple geometries (plane, sphere), because previous

a box volume 2D or 3D Fourier transform, ball vol-

methods depend on the classical signal processing tools,

ume 3D Zernike Transform. Essentially, in all of these

which limits the possible choices to simple geometries.

cases the mapped feature function is expanded in a se-

It is through the use of non-classical signal processing

ries in terms of the relevant basis, and the expansion

tools (Laplace-Beltrami eigenbasis, diusion wavelets)

coecients are used as the shape descriptor.

that we are able to escape the conne of the simple

To provide an example, let us analyze an existing

geometries, and employ any surface as a template, pro-

descriptor in the described terms. Consider the descrip-

ducing previously unknown descriptors. However, even

tor of [32] where rays are shoot from the origin (cen-

within the realm of classical signal processing it is pos-

ter of mass of the mesh) to determine the distance to

sible to construct a shape description framework with a

the farthest intersection point with the mesh. The rays

template parameter, if one limits the templates to ellip-

are parametrized by the unit sphere, and as a result

soids. Indeed, in such a scenario ellipsoidal harmonics

one obtains a function dened on the sphere. Next, the

can be used for signal processing. Ellipsoidal harmonics

spherical harmonic transform is applied to this function

were employed in the work of Mademlis et. al. [23] to

to extract the numerical shape descriptor. In our terms,

extract shape descriptors similar to spherical harmonic

the feature function is the distance from the mesh point

descriptors of [21]. However, they only t a single el-

to the origin; the mapping domain is the unit sphere;

lipsoid to the shape, and do not exploit the degree of

mapped feature function is obtained by projecting the

freedom provided by the choice of ellipsoid as a param-

feature function onto the sphere, and if two mesh points

eter.

project to the same sphere point, one resolves the ambiguity by selecting the larger function value; nally,

3 Rationale Before describing the details of our framework, we set up the context by highlighting three components that frequently go into the construction of numerical shape descriptors. This discussion motivates our approach, and claries that it is a generalization of the way the

the signal processing component is performed using the spherical harmonic transform. Let us conclude by noting that our second and third components roughly correspond to the object abstraction and numerical transformation components of [5].

4 Approach

majority of existing numerical descriptors are dened. The rst component is the extraction of a surface

We will now describe our shape description framework

feature. By surface feature we mean any function on the

following the component delineation of Section 3. The

surface that captures a property relevant to shape de-

framework inputs two surface meshes, model and tem-

scription. Examples include the constant function (re-

plate, and computes a numerical shape descriptor for

striction of the surface's characteristic function to the

the model.

surface itself ), distance to the center of mass, curvature,

Models in the search database are assumed to be

components of the normal vector, and many others. We

given as triangle meshes; we will use the terms sur-

refer to the selected function as the feature function.

face, surface mesh, and mesh interchangeably. We let

4

S

denote any surface in the database. All models in

5 Feature Functions

the database are assumed to be normalized to achieve translation, rotation and scale invariance of the result-

A variety of possibilities exist as a choice for the sur-

ing shape descriptor. The template is another surface

face feature distance to the center of mass, curvature,

T

components of the normal vector, and many others. We

that is used as the mapping domain. The template

f,

where

f :S→R

can be one of the surfaces in the database, or any other

denote the feature function by

manifold surface mesh chosen arbitrarily. Note that the

real-valued function on the model surface. Let us men-

is a

role of the template is to provide a parameter through

tion that when interpolation is employed for mapping,

which the resulting shape descriptor for the model can

f

be controlled.

In fact, it restricts to a constant function on the surface,

cannot be the characteristic function of the surface.

and the interpolation schemes we consider extend the The shape descriptor is generated in three steps (Figure 1). First (extraction), the feature function where

f,

f : S → R is a real-valued function on the model

surface, is computed. Second (mapping), we construct a new function

fˆ : T → R,

the mapped feature function,

based on the given function

f : S → R.

fˆ = c1 φ1 + c2 φ2 + · · · cN φN , where {φi }N i=1 orthogonal basis for real-valued functions on T .

into a series

The extracted shape descriptor is dened as the vector descT (S)

= (c1 , c2 , ..., cm ),

where

making impossible extraction of any meaningful information. Linear functions of the

x, y, z

coordinates can-

not be used with barycentric interpolation for the same reason they extend to a linear function everywhere.

Third (signal

processing), the mapped feature function is expanded is an

constant function to a constant function everywhere,

m ≤ N.

One advantage of this approach is that the use of a template ascertains that the extracted expansion coefcients are in correspondence and their comparison is meaningful. Indeed, it may be tempting to altogether

6 Mapping Our goal is to construct the mapped feature function using the feature function of the model surface, namely dene a new function function

fˆ : T → R

based on the given

f : S → R. We consider three possible choices:

projection, Shepard interpolation, and barycentric interpolation.

bypass mapping to the template, and expand the feature function in terms of some basis on the original surface. However, this would have required nding the

6.1 Projection

correspondence between basis functions of dierent sur-

For projection, a correspondence between the model

faces, which is a dicult problem. For example, see

and template surface points is established, and the fea-

[18] for problems encountered when trying to nd cor-

ture function values are transferred to the template

respondences between eigenfunctions of the Laplace-

using this correspondence. Since the correspondence is

Beltrami operators on dierent surfaces.

generally not one-to-one, a procedure to resolve the am-

A second advantage of this approach originates from the use of signal processing. First, for shape description it is crucial to reduce the amount of noise so that only the essential properties of the shape are captured. Second, compact numerical shape descriptors are naturally provided by signal processing. Both goals are achieved by expanding a given function in a series and using the expansion coecients of the low-frequency components as the shape descriptor. Keeping only the low-frequency part not only results in a compact descriptor, but also is essentially equivalent to smoothing the function making the descriptor more robust to noise. To realize the described approach one needs to choose

biguity is needed. An advantage of projection is that almost all of the existing shape descriptors use projection for the mapping step, which corroborates its usefulness in the practice of shape description. However, projection also has important disadvantages. First, since the correspondence used for projection is generally not one-to-one, the resulting mapped feature function can be discontinuous. Therefore, signal processing will be subject to the Gibbs eect, rendering the low-frequency expansion coecients the ones used as the shape descriptor inadequate for representing the mapped function. Second, when dierent templates are used, we expect projection to lead to high redundancy, limiting the gains of concatenating descriptors

a feature function, decide how the mapping is done, and

obtained from dierent templates. Indeed, if there are

determine what basis to use for signal processing. Since

no overlaps, the value sets of the mapped feature func-

the selection of the feature function is ultimately appli-

tions on various templates will be the same, with only

cation driven, we will concentrate mostly on mapping

dierence being how the values are distributed over the

and signal processing.

template surfaces.

5

i = 1, ..., n

6.2 Shepard Interpolation

are called barycentric coordinates if they

satisfy the following three properties: partition of unity, Shepard interpolation is a zeroth order polynomial precision interpolation scheme that can be used to extend a function from a set of points to the entire ambient Euclidean space. Although we are given a function

f :S→R

dened on the mesh, we can sample points

from the mesh to obtain a function on a set of points.

n X

bi (p) = 1,

i=1 ane combination

n X

bi (p)vi = p,

We use Shepard interpolation to extend this function

i=1

to the template, and designate the obtained function

and Lagrange property

as

fˆ. Given a set of scattered points

f (pi )

pi

where the values

are known, the Shepard interpolant [33, 15] is de-

where

P

wi (p)f (pi ) iP i wi (p)

wi

Due to these properties, the barycentric coordinates can be used for interpolation: given function values

ned as follows

fˆ(p) =

bi (vj ) = δij .

the mesh vertices, one extends it to the whole space by setting for each

, p ∈ R3

are user assigned weights. The most com-

monly used weights are of the following form

−α

wi (p) = |p − pi |

fˆ(p) =

n X

p ∈ R3 ,

bi (p)f (vi ).

i=1 The barycentric interpolant has the rst order precision meaning that the linear functions of coordinates

,

are reproduced, i.e. if where

α

is a positive parameter. Shepard interpolant

has the zeroth order precision meaning that the constant functions are reproduced, i.e. if function, then

fˆ is

f (vi ) on

f

is a constant

also constant.

f

x, y, z fˆ is

is a linear function, then

also linear. In our experiments, we use mean-value coordinates [10, 11, 20] as barycentric coordinates, because they are fastest to evaluate. We compute the feature function

pi we uniformly

values at the vertices of the model mesh, and then

sample the area of the model surface, and evaluate the

evaluate the mapped feature function at the vertices

feature function at these points. Then, the mapped fea-

of the template mesh using mean-value interpolation.

ture function is evaluated at the vertices of the template

Our implementation of mean-value coordinates follows

In our experiments, to obtain points

mesh

T

using the formula for Shepard interpolation.

the pseudo-code presented in [20].

The advantages of mapping via Shepard interpola-

Mapping with mean-value interpolation has the fol-

tion include the following. First, the relative positions

lowing benets. First, the relative positions and dis-

and distances between the template and the model sur-

tances between the template and the model surface in-

face inuence the interpolant greatly, which leads to

uence the mean-value interpolant greatly, which leads

less redundancy between interpolants obtained for dif-

to less redundancy between interpolants obtained for

ferent templates. Second, the obtained mapped feature

dierent templates. Second, the mean-value interpolant

fˆ is continuous. Third, it is easy to implement,

is continuous, and so signal processing does not suer

function

from the Gibbs eect. Third, in principle, a mesh can be

and fast to compute.

reconstructed if the mean-value coordinates are given everywhere in the space (due to the Lagrange property, one can reconstruct the mesh vertices; the trian-

6.3 Barycentric Interpolation

gle faces can be reconstructed by nding the triplets of

Barycentric interpolation is a rst order polynomial precision interpolation scheme that can be used to extend a function from a mesh to the entire ambient Euclidean space. Using barycentric interpolation, we extend the given feature function

f : S → R

to the

template, and designate the obtained function on the template as

fˆ.

non-zero barycentric coordinates such that all remaining barycentric coordinates are zero); consequently, one may reasonably expect that interpolation based on meanvalue coordinates implicitly injects even more shape information into the mapped feature function.

7 Signal Processing

Barycentric interpolation is based on barycentric coordinates. For an arbitrary triangulated surface mesh with vertices

v1 , v2 , ..., vn ,

the functions

bi : R3 → R,

Our goal is to construct an orthonormal basis for real functions on the template surface

T.

This basis is used

6 to obtain a numerical shape descriptor by series ex-

given by a matrix, and its eigenvectors are used to ex-

fˆ : T → R.

pand a discrete function given on the mesh. For a mesh

pansion of the mapped feature function

g

We will consider three alternatives: Laplace-Beltrami

function

eigenbasis (also called manifold harmonics), diusion

proximated as

wavelets, and optimal compression basis via principal component analysis (PCA).

∆g(pi ) ≈

, the value of

∆g(pi )

i

at the vertex

is ap-

1 X mij (g(pj ) − g(pi )), ri

(2)

j∈N (i)

j adjari as the point-areas. For the Laplacian we use, mij 6= 0 only if i and j are adjacent, and the equality mij = mji holds for all i, j .

where the summation is over all vertex indices 7.1 Laplace-Beltrami Eigenbasis

cent to vertex

As has been corroborated in the literature, most recently in [36], Laplace-Beltrami eigenfunctions provide a natural basis for signal processing on closed manifold surfaces. For example, the spherical harmonics used so often in shape retrieval are nothing but the LaplaceBeltrami eigenfunctions of the sphere. For a closed compact manifold surface

T,

let

∆

de-

note its Laplace-Beltrami dierential operator. Consider the equation

∆φ = λφ. λ for which the equation has a nontrivial solu∆; the solution φ is called eigenfunction corresponding to λ. Note that λ = 0

i.

It is customary to refer to

Let us stack up the values of the function at mesh vertices,

g(pi ),

to construct the column-vector

g.

Now,

the above formula can be written as a matrix-vector multiplication

∆g ≈ Lg.

The involved matrix

L

the

discrete Laplacian has the entries as follows

P  k mik /ri Lij = −mij /ri  0

if if

i = j, iand j adjacent,

otherwise.

A scalar

Now the Laplace-Beltrami eigenvalues/eigenfunctions

tion is called an eigenvalue of

are replaced by the eigenvalues

an

of the matrix

is always an eigenvalue the corresponding eigenfunctions are constant functions.

λi

and eigenvectors

vi

L.

Since the point-areas

ri

associated with mesh ver-

tices vary from vertex to vertex, the discrete Laplacian

The eigenvalues of the Laplace-Beltrami operator

matrix

L

is not symmetric. However, nding

L's

eigen-

are non-negative and constitute a discrete set; we can

values and eigenvectors in all cases can be reduced to

λ1 = 0 ≤ λ2 ≤ λ3 ≤ . . . ≤ λi ≤ . . .. The eigenfunction corresponding to λi , normalized to have the unit L2 norm will be denoted by φi .

a symmetric eigenvalue problem. In fact, consider the

Since the Laplace-Beltrami operator is Hermitian,

L is similar to the symmetric matrix 1 1 K = R− 2 M R− 2 . Thus, the eigenvalues and eigenvectors of L can be evaluated from those of K : if wi is an −1 eigenvector of K with eigenvalue λi , then vi = R 2 wi is the eigenvector of L with the same eigenvalue λi . Another crucial point is that the eigenvectors of L

put them into non-decreasing order:

the eigenfunctions corresponding to its dierent eigen-

hφi , φj i =

´

φ φ = 0. Thus, the T i j eigenfunctions of Laplace-Beltrami operator give an or-

values are orthogonal:

thogonal basis for the space of functions dened on the surface. As a result, a given function

g

on the surface

can be expanded in terms of the eigenfunctions

g = c1 φ1 + c2 φ2 + · · · ,

R

with

Rii = ri ;

denote by M the maMij = mij if i 6= j , L = R−1 M . Now we can

trix whose entries are given by and

Mii = lii ri .

Notice that

easily check that

are orthogonal. However, the orthogonality here is in (1)

ˆ

R-inner

(dot) product:

Finally, the series expansion of the mesh function

gφi .

which is represented by vector

T It is known that low-frequency eigenfunctions (small

terms of the

hvi , vj iR = vTi Rvj = δij .

where the coecients are given by

ci = hg, φi i =

diagonal matrix

λ)

display less oscillation on the surface. Thus, similarly to the standard Fourier analysis, truncating the series

g=

X

g,

ci vi , ci = hg, vi iR = gT Rvi .

i In our experiments we use the Laplacian of [27]:

expansion results in smoothing. Put dierently, the ex-

are the cotangent weights, and points areas

pansion coecients corresponding to the beginning of

in terms of vertex Voronoi areas.

the series capture the essential properties of the function; for a discussion see [1].

g,

is given as

ri

mij

are given

An advantage of Laplace-Beltrami eigenbasis is that almost all of the existing shape descriptors use it for sig-

In the discrete setting the Laplace-Beltrami oper-

nal processing on simple domains such as spheres and

ator of a manifold triangle mesh without boundary is

planes, which corroborates its usefulness in the practice

7 of shape description. Another advantage is that keeping

As a result, in order to nd the expansion coecients,

the low-frequency part not only results in a compact de-

one only needs to take the inner product of the mapped

scriptor, but also is essentially equivalent to smoothing

feature function with these basis vectors. As with ex-

making the descriptor more robust to noise.

pansion in terms of the Laplace-Beltrami eigenbasis one needs to truncate the obtained series expansion in a way that keeps smoother (less oscillating) components.

7.2 Diusion Wavelets

In the context of diusion wavelets, this corresponds to

Diusion wavelets, [9], provide multi-scale bases for signal processing in a variety of settings. The setting we will be using is that of scattered points in the space. The following is a very rough summary of the construction in [9], which avoids the intricate theoretical details and is geared towards implementation. From the template surface mesh

T

we uniformly

(with respect to the area) sample a set of points Clearly, any real-valued function on as a vector in

D

R|P | .

P

P.

can be recorded

trix that satises certain properties, most notably its eigenvalues are in the

[0, 1] interval. The diusion oper-

ator can be constructed as follows: for each point, one nds a xed number of nearest neighbors and assigns

exp(−t ∗ dist2 ), point in question, t is

the corresponding matrix entry to be

dist

is the distance to the

a parameter; next,

D

is obtained by normalizing this

matrix, see [8]. Diusion wavelet construction uses the dyadic powers of

D

to construct subspaces

Vi

and

Wi

R

of

|P |

together with their orthogonal bases; the subscript of a subspace is called its level. These subspaces satisfy

R|P | = W0 ⊕ V0 , V0 = W1 ⊕ V1 , V1 = W2 ⊕ V2 ,

and

so on. In fact, in the implementation we worked with,

W0 = {0},

and

V0 = R|P |

with delta-function basis.

To obtain the basis of subspace

Vi+1 , one i D2 on

dyadic power of diusion operator

Vi ,

acts by the the basis of

and applies some threshold and a special locality

preserving orthogonalization. An important property of these bases is that dierent levels represent dierent scales together they provide a multi-scale basis of

R

|P |

The main advantage of the diusion wavelet basis is its multi-scale nature, which can be benecial in cases where features at dierent scales need to be extracted and/or dierently weighted (cf. Section 8). Another advantage is that keeping the terms that belong to higher level subspaces not only results in a compact descriptor, but also is essentially equivalent to smoothing, which makes the descriptor more robust to noise.

Next, we need a diusion operator

(that linearly acts on these vectors) that is a ma-

where

keeping the terms that belong to higher level subspaces.

. For example, the basis vectors of

V0

(in our

case delta functions) represent the smallest scale possible their support is just one vertex. Meanwhile, the basis vectors of subspaces of high level have more global nature, because they are obtained by many repetitive applications of the diusion operator in a sense, the function values are diused throughout the point cloud. Fixing some maximum level

M

that one desires to

7.3 Optimal Basis via PCA This is a data-driven basis: we rst compute the mapped feature function

fˆ for a subset of models in the database,

and then extract the PCA basis for representing these functions. This basis is then used for all models both in the database and the query sets. We sample from the template surface mesh

T

uni-

formly (with respect to the area) a set of points and we evaluate the mapped feature functions on Let

{fˆ(k) : P → R}K k=1

P, P.

be the mapped feature func-

tions for a subset of models in the database; notice that

R|P | . Our goal is to prom |P | duce an orthonormal set of vectors {bi }i=1 in R such (k) K ˆ that the average squared distance of {f }k=1 from the m subspace spanned by {bi }i=1 is minimal. Note that this m means that {bi }i=1 is the optimal (in L2 sense) set of m vectors to expand fˆ(k) . these can be seen as vectors in

The problem of nding the optimal set of vectors as above can be solved using the principal component analysis (PCA) [19], which proceeds as follows. First,

a= ˆ(k) . Second, shift all the functions by this avf k=1 (k) erage, u = fˆ(k) − a. Now compute the P × P coPK (k) (k) variance matrix C , with Cij = k=1 ui uj . Finally, m the sought set {bi }i=1 is given by the eigenvectors of C corresponding to the largest m eigenvalues. Since the covariance matrix is symmetric, bi are orthogonal; one

compute the average function value at each point,

1 K

PK

only needs to normalize these vectors to obtain an orthonormal set.

work with, one can write

The main benet of the PCA basis is its optimality

R|P | = W0 ⊕ W1 ⊕ W2 ⊕ · · · ⊕ WM ⊕ VM .

(provided that the subset of models used to extract the

Now gathering the bases of the subspaces appearing in

PCA basis is representative of shapes in the applica-

this direct sum, one obtains a multi-scale orthogonal (in

tion area) in representing the mapped feature function

R|P | .

using a given number of coecients. However, unlike

the standard inner product) basis for the entire

8 the Laplace-Beltrami eigenbasis and diusion wavelets,

features at larger scales, whereas for negative

here we do not have a guarantee that truncating the

smaller features would be emphasized.

t,

the

series expansion is equivalent to smoothing, which can

With a careful choice of shape descriptor compari-

in principle result in descriptors that are sensitive to

son metric and templates one can add the elements of

noise.

rotational invariance to our framework as follows. Starting from a single template one generates a set of templates by rotating the original one to dierent angles

8 Choice of Metric

that uniformly sample the rotational degree of freedom

In this section we consider the problem of comparing shape descriptors of dierent shapes. First we justify the use of Euclidean distance between shape descrip-

g and g 0 on the P template surface T have expansions g = i ci φi and P g 0 = i c0i φi in terms of some orthonormal basis {φi }. The squared L2 norm of the dierence between g and g 0 can be evaluated as !2 ˆ ˆ X (g − g 0 )2 = (ci − c0i )φi = tor vectors. Indeed, let two functions

T

T

=

X

=

X

i

(ci −

−

c0j )

(ci − c0i )(cj − c0j )δij =

i,j

proach. Now, one can either use pairwise matching of feature vectors when comparing to another shape in the database, or one can aggregate (e.g. add up, or combine in some symmetric way) these feature vectors in order to get a single rotationally invariant feature vector.

9 Evaluation factors. First, it is important for a framework to be in-

φi φj =

clusive in the sense that there should be such settings of

T

i,j

in this set one computes the feature vector using our ap-

To evaluate the proposed framework we look for two

ˆ c0i )(cj

(cf. Ligtheld descriptor [7]). For each of the templates

X

(ci − c0i )2 ,

i

the parameter and design choices that some of the wellknown and successful existing shape descriptors are reproduced. Second, we need to show that the parameter

where we have used the orthonormality of the basis.

and design choices involved in the framework do indeed

m of these expansion coecients, descT (S1 ) = (c1 , c2 , ..., cm ) and descT (S2 ) = (c1 , c2 , ..., cm ), we see that Euclidean distance between descT (S1 ) and descT (S2 ) is an approximation of the L2 norm of the dierence between the mapped feature functions of models S1 and S2 .

produce variations in the retrieval results. We consider

There can be benets in using weighted Euclidean

In this subsection we will consider some of the well-

distances between shape descriptors. For example, in

known existing shape descriptors and show how they

case of Laplace-Beltrami eigenbasis, one can use the

can be obtained as instantiations of our framework; the

following weighted distance

aim is to establish inclusiveness. Since in all of the cases

m X

the signal processing is done using classical tools which

Since our shape descriptors are made of the initial

e−tλi (ci − c0i )2 ,

both of these factors in separate subsections.

9.1 Theoretical Evaluation

are equivalent to Laplace-Beltrami bases, we will not mention this component below.

i=1 is the eigenvalue corresponding to the eigen-

Ray based descriptor: This is the descriptor of Vrani¢

φi . This can be seen to be equivalent to smooth-

and Saupe [32] discussed at the end of Section 3. We

ing the dierence between the mapped feature functions

only note that this descriptor can be obtained in our

via the heat kernel. Such distances can be computed for

framework by the following settings. The feature func-

t and combined to give an overall

tion is distance to the center of mass; the template is

where

λi

function

a variety of settings of

the sphere; the mapping is projection.

distance. In case of diusion wavelet basis, weighted distances

Depth-buer descriptor: The depth-buer descrip-

can be used to emphasize/deemphasize features at dif-

tor of Heczko et al. [16] places a normalized mesh into a

ferent scales. For example, consider the distance of the

unit cube, and generates six gray-scale images on each

form

face of the cube by parallel projection. The grayness

m X

value is set based on the distance from the cube face

etli (ci − c0i )2 ,

to the model. Coecients obtained by applying the 2D Fourier transform to each of the six gray-scale images

i=1 where li is the level to which the basis function longs. When

t

φi

be-

are used as descriptors. In our terms, for each of the

is positive, this distance will emphasize

cube faces, the feature function is the distance from

9 mesh point to the face; template is a square, namely a cube face; mapping is done by projection (if two mesh points project to the same face point, one resolves the ambiguity by selecting the smaller function value). In the survey [5], depth-buer descriptor was fount to be the best overall feature vector among the set of implemented methods.

Lighteld descriptor:

The basic component of the

Lighteld descriptor of Chen et al. [7] is a set of silhouette images rendered via orthogonal projection onto planes tangent to the hemisphere at the vertices of a dodecahedron. These silhouettes are compactly encoded by 2D Fourier transform. During the retrieval the feature vectors for each plane are pairwisely matched to achieve rotational invariance. We can see that the feature vector corresponding to each of the planes can be obtained as an instantiation of our framework: the fea-

Template surfaces used in the study. Except for sphere, these were obtained by simplifying the benchmark models. Fig. 2

ture function is the constant function; the template is the plane (a square); the mapping is done via orthogo-

unit variance, and rotating using Voronoi area weighted

nal projection. The Lighteld descriptor was found to

PCA [29, 31]. The eectiveness of such normalization

give the best retrieval results by Shilane et al. [34] who

has been corroborated in the literature, and it is known

compared the performance of a variety of shape descrip-

that descriptors deriving their invariance from such nor-

tors on the Princeton Shape Benchmark.

malization can outperform descriptors that are intrin-

Spherical harmonics descriptor:

The rotation in-

sically invariant [5].

variant spherical harmonics descriptor of Kazhdan et

The templates used in the experiments, except the

al. [21] with slight modications can be obtained in our

sphere, were chosen randomly and are depicted in Fig-

framework by considering each of the concentric spheres

ure 2. These were obtained from models in the bench-

used separately. For each of the concentric spheres, the

mark by decimating using QSlim [12]. This was done

feature function is the constant function on the mesh;

to make the computation of shape descriptors less time

the sphere is the templates; mapping is done using in-

consuming.

terpolation based on Gaussian kernel, namely, the interpolant is obtained by determining the closest point on the surface, and multiplying the function value at that point by euclidean distance based Gaussian weight; the spherical harmonic transform, which is equivalent to Laplace-Beltrami basis on sphere, is used for signal processing a distinctive aspect being that the expansion coecients are combined in a way that makes the resulting descriptor rotation invariant.

We conduct a series of leave-one-out experiments: every model in the benchmark is queried against all other models. The ranked result lists generated by the queries are used to compute ve retrieval statistics: nearest neighbor (NN), rst tier (FT), second tier (ST), discounted cumulative gain (DCG), and average dynamic recall (ADR). In all of the experiments the feature function is the distance to the center of mass the origin:

|p|, p ∈ S . 9.2 Practical Evaluation

We use the descriptor vector size of

f (p) = m = 20

for compactness and the smoothing eect; similarity between descriptors is measured by Euclidean distance.

In this subsection we experimentally evaluate the eect

Eect of mapping: We rst compare the performance

of variations in choices involved in our framework on

of our shape descriptor when the same template (sphere)

the retrieval results. The aim is to establish that the

and basis (Laplace-Beltrami eigenbasis) are used with

proposed design and parameter degrees of freedom have

dierent mapping methods. Since the mapping domain

non-trivial practical eect.

is the sphere, the descriptor obtained using projection

For our experiments we use the Watertight Bench-

is equivalent to that of [32]. Shepard interpolation can

α, so we

mark [14] which consists of 400 closed surface models,

be controlled through the use of the parameter

divided into 20 equal object classes. Some of the classes

experimented with three choices as shown. The choice

include both dierent objects and articulations of the

was based on the fact that the long distance behavior

same object. All objects are normalized by shifting the

of mean-value interpolant is similar to that of Shepard

origin to the center of mass, uniform scaling to obtain

interpolant with

α = 2.

10 The statistics summarized in Table 1 show that barycen- also been processed by the normalization applied to the tric interpolation and projection show very similar re-

whole benchmark, so they span similar spatial regions.

trieval performances. On the other hand, Shepard inter-

We expect the descriptors to be more independent when

polation performs worst of all, yet Shepard interpola-

the templates are dierently posed.

tion is easiest to implement and fastest to run, so it can be the method of choice when eciency is paramount.

Mapping

NN

FT

ST

DCG

ADR

Projection Mean-value Shepard α = 1 Shepard α = 2 Shepard α = 3

76.8 73.5 61.0 71.3 72.0

31.4 32.3 25.0 26.9 27.3

45.6 45.2 39.3 41.8 40.7

68.3 68.0 61.5 64.8 65.0

55.5 55.7 46.3 50.4 50.8

Comparison of various mapping methods. All numbers are percentages. Table 1

Eect of basis: Table 2 compares the performance of

Template

NN

FT

ST

DCG

ADR

Bear Fish Hand Horse Pliers Sphere All combined

77.0 78.8 80.5 78.8 72.5 73.5 81.3

35.1 35.8 35.6 34.8 34.5 32.3 39.0

49.2 49.9 49.6 48.4 49.6 45.2 55.1

69.7 70.5 70.9 70.0 68.6 68.0 73.3

58.0 58.3 58.3 57.9 55.6 55.7 61.6

Table 3

Using variety of templates for shape retrieval.

Implementation: Since the same templates are used repeatedly with dierent models, we save time by pro-

our shape descriptor when the same template (sphere)

cessing every template mesh once (per each basis type)

and mapping method (mean value interpolation) are

to extract the rst

used with dierent function bases on the template. The

onal matrix

Laplace-Beltrami and diusion wavelet bases depend

sis) only these are needed to obtain the expansion

only on the template, and so can be directly computed.

coecients

However, the PCA basis requires a sample of represen-

have about 500 vertices/sampled points, the processing

tative shapes from the database. To evaluate the full

takes less than a few seconds. Storing

possible power of the PCA basis we included all the

non-zero entries) and the basis vectors consumes about

database models into our sample. In practice the PCA

500 × m + 500 = 10, 500

basis would be computed on a subset of models; thus,

R

ci .

m

basis vectors and also the diag-

(needed for Laplace-Beltrami eigenbaSince the template surfaces we consider

R

(approx. 500

oating numbers.

When computing the shape descriptor for the model

produce best nearest neighbor classication), and thus

S , we rst obtain the mapped feature function fˆ : T → R. We also compute the feature function of the template, fT . The function that is expanded in terms of the basis functions is g = fˆ/fT ; note that this

each of the three may be most appropriate for dierent

amounts to slightly modifying the process of mapping.

applications.

Since

we expect the PCA basis results to deteriorate in realistic scenarios. Note that these three choices of bases exhibit dierent performances (e.g. diusion wavelets

surface

m

basis vectors have already been computed and

stored, we only need to do the vector-matrix multiplica-

Basis Laplace-Beltrami Diusion Wavelets PCA basis Table 2

NN 73.5 75.0 74.8

FT 32.3 32.1 32.3

ST 45.2 44.8 45.7

DCG 68.0 67.8 68.5

ci , i = 1, ..., m,

ADR

tions to extract the needed coecients

55.7 55.3 56.1

which is done extremely fast. Finally, our feature vec-

Comparison of various signal processing methods.

Eect of template: Table 3 compares the performance

tor consists of m oating numbers, namely descT (S) = (c1 , c2 , ..., cm ). Notice that when S and T coincide, g = fˆ/fT = 1, and the feature vector is of the form descT (T ) = (a, 0, ..., 0). We scale all the feature vectors to achieve a = 1. This is done in order to make the descriptors obtained using dierent templates commensurable, and

of our shape descriptor when the same mapping method

avoid normalization problems when concatenating de-

(mean value interpolation) and basis (Laplace-Beltrami

scriptors coming from dierent templates.

eigenbasis) are used with dierent templates depicted

Our implementation was done in MATLAB, except

in Figure 1. The table also shows the retrieval results of

mean-value and Shepard interpolation part that was

combined descriptor obtained by simply concatenating

done in C++ and called within MATLAB as needed. To

the feature vectors for all the templates. The improved

give an idea about the timing: evaluating the shape de-

retrieval resulting from such combination, and Figure

scriptor based on mean-value interpolant takes on aver-

3 showing the average ADR per each class for dierent

age slightly less than 1 minute for each model/template

templates demonstrate that the descriptors correspond-

pair; most of this time is consumed by mean-value co-

ing to dierent templates can capture independent fea-

ordinate computation; simplifying the mesh prior to

tures of shapes. Let us note that the templates have

computation will help reduce this time. The descriptor

11

Fig. 3

The average ADR values for each class shown by template used.

based on Shepard interpolation is considerably faster

the produced retrieval results. We believe that there

and takes a few seconds to compute.

does not exist an ideal template that would be optimal for all kinds of shape repositories. In fact, the main exibility oered by our approach is that one can

10 Conclusion and Future Work This paper presents a versatile framework for shape description that can be controlled through continuum of parameter and a variety of design choices. Some of the existing shape descriptors t into this framework, and

choose optimal templates based on the shape database at hand protein shapes can benet from dierent templates than mechanical engineering parts. Designing approaches for selecting/generating these optimal templates for a given shape database is an interesting direction for future work.

a multitude of new descriptors can be obtained. Within

Finally, an attractive topic for future work is oered

our analysis of design choices, we introduce novel ap-

by the observation that mean-value interpolation per-

proaches to the mapping and basis construction steps.

forms considerably better than Shepard interpolation.

We also provide an empirical analysis of the eects that

It is natural to link this to the dierence in the order of

design and parameter choices have on the descriptor's

approximation provided by these interpolation schemes,

retrieval performance.

yet we believe this fact should be investigated further.

This work provides a small, rst step and therefore

Indeed, the expression for Shepard interpolation has

has limitations that suggest topics for future work. A

connections to physical force potentials, so the rami-

rst important topic for future work would be to inves-

cations of our observation to force eld based shape

tigate the possibility of constructing intrinsically rota-

descriptors [22, 13] can be fruitful.

tionally invariant shape descriptors using similar ideas. Another theme suitable for further investigation is the dependency between the nature of the template and

We thank Szymon Rusinkiewicz for the extremely useful Trimesh2 library. We are grateful to Mauro MagAcknowledgements

12 gioni for making publicly available an implementation of the diffusion wavelets. For the 3D models used in this paper, we thank Daniela Giorgi and AIM@SHAPE.

References 1. Mikhail Belkin, Partha Niyogi, and Vikas Sindhwani. Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. J. Mach. Learn. Res., 7:23992434, 2006. 2. S. Biasotti, L. De Floriani, B. Falcidieno, P. Frosini, D. Giorgi, C. Landi, L. Papaleo, and M. Spagnuolo. Describing shapes by geometrical-topological properties of real functions. ACM Comput. Surv., 40(4):187, 2008. 3. A. M. Bronstein, M. M. Bronstein, and R. Kimmel. Generalized multidimensional scaling: A framework for isometryinvariant partial surface matching. Proceedings of the National Academy of Science, 103:11681172, January 2006. 4. Alexander Bronstein, Michael Bronstein, Ron Kimmel, Mona Mahmoudi, and Guillermo Sapiro. A gromov-hausdor framework with diusion geometry for topologically-robust non-rigid shape matching. International Journal of Computer Vision. 5. Benjamin Bustos, Daniel A. Keim, Dietmar Saupe, Tobias Schreck, and Dejan V. Vrani¢. Feature-based similarity search in 3d object databases. ACM Comput. Surv., 37(4):345387, 2005. 6. F. Chazal, D. Cohen-Steiner, L. J. Guibas, F. Mémoli, and S. Y. Oudot. Gromov-hausdor stable signatures for shapes using persistence. Computer Graphics Forum (proc. SGP 2009), 2009. 7. Ding-Yun Chen, Xiao-Pei Tian, Yu-Te Shen, and Ming Ouhyoung. On visual similarity based 3d model retrieval. Comput. Graph. Forum, 22(3):223232, 2003. 8. R. R. Coifman, S. Lafon, A. B. Lee, M. Maggioni, B. Nadler, F. Warner, and S. W. Zucker. Geometric diusions as a tool for harmonic analysis and structure denition of data: Diusion maps. PNAS, 102(21):74267431, 2005. 9. Ronald R. Coifman and Mauro Maggioni. Diusion wavelets. Appl. Comput. Harmon. Anal., 21(1):5394, 2006. 10. Michael S. Floater. Mean value coordinates. Comput. Aided Geom. Des., 20(1):1927, 2003. 11. Michael S. Floater, Géza Kós, and Martin Reimers. Mean value coordinates in 3d. Comput. Aided Geom. Des., 22(7):623631, 2005. 12. Michael Garland and Paul S. Heckbert. Surface simplication using quadric error metrics. In SIGGRAPH '97: Proceedings of the 24th annual conference on Computer graphics and interactive techniques, pages 209216, 1997. 13. Xi Geng, Wenyu Liu, and Hairong Liu. Force eld based expression for 3d shape retrieval. In HCI (2), volume 4551 of Lecture Notes in Computer Science, pages 587596, 2007. 14. Daniela Giorgi, Silvia Biasotti, and Laura Paraboschi. Watertight models track. Technical Report IMATI-CNR-GE 9/07, September 2007. 15. William J. Gordon and James A. Wixom. Shepard's method of "metric interpolation" to bivariate and multivariate interpolation. Mathematics of Computation, Jan 1978. 16. M. Heczko, D. Keim, D. Saupe, and D. V. Vranic. Methods for similarity search of 3d objects (in german). DatenbankSpektrum Zeitschrift fï¾÷r Datenbanktechnologie, 2(2):54 63, 2002. 17. Natraj Iyer, Subramaniam Jayanti, Kuiyang Lou, Yagnanarayanan Kalyanaraman, and Karthik Ramani. Threedimensional shape searching: state-of-the-art review and future trends. Computer-Aided Design, 37(5):509530, 2005.

18. Varun Jain and Hao Zhang. Robust 3D shape correspondence in the spectral domain. In Shape Modeling International, 2006. 19. I. T. Jollie. Principal Component Analysis. Springer, second edition, October 2002. 20. Tao Ju, Scott Schaefer, and Joe Warren. Mean value coordinates for closed triangular meshes. In TOG(SIGGRAPH), pages 561566, 2005. 21. Michael Kazhdan, Thomas Funkhouser, and Szymon Rusinkiewicz. Rotation invariant spherical harmonic representation of 3D shape descriptors. In Symposium on Geometry Processing, 2003. 22. A. Mademlis, P. Daras, D. Tzovaras, and M.G. Strintzis. 3d object retrieval based on resulting elds. In Workshop on 3D object retrieval, April 2008. 23. A. Mademlis, P. Daras, D. Tzovaras, and M.G. Strintzis. Using ellipsoidal harmonics for 3d shape representation. In INTERMEDIA Workshop on Hypermedia 3D Internet, 2008. 24. Facundo Mémoli. On the use of Gromov-Hausdor Distances for Shape Comparison. pages 8190, Prague, Czech Republic, 2007. Eurographics Association. 25. Facundo Mémoli and Guillermo Sapiro. Comparing point clouds. In SGP '04: Proceedings of the 2004 Eurographics/ACM SIGGRAPH symposium on Geometry processing, pages 3240, New York, NY, USA, 2004. ACM. 26. Facundo Mémoli and Guillermo Sapiro. A theoretical and computational framework for isometry invariant recognition of point cloud data. Found. Comput. Math., 5(3):313347, 2005. 27. M. Meyer, M. Desbrun, P. Schröder, and A. Barr. Discrete dierential geometry operators for triangulated 2-manifolds. In Proceedings of Visual Mathematics, 2002. 28. R. Osada, T. Funkhouser, B. Chazelle, and D. Dobkin. Shape distributions. TOG, 21(4):807832, 2002. 29. Eric Paquet and Marc Rioux. Nefertiti: a query by content system for three-dimensional model and image databases management. Image Vision Comput., 17(2):157166, 1999. 30. P. Frosini D. Giorgi C. Landi S. Marini G. Patanè M. Spagnuolo S. Biasotti, B. Falcidieno. 3d shape description and matching based on properties of real functions. In Eurographics 2007 Tutorial Notes, pages 10251074. The Eurographics Association, 2007. 31. D. V. Vrani¢and D. Saupe. 3d model retrieval retrieval. In Proceedings of the Spring Conference on Computer Graphics and its Applications, pages 8993, 2000. 32. Dietmar Saupe and Dejan V. Vranic. 3d model retrieval with spherical harmonics and moments. In Proceedings of the 23rd DAGM-Symposium on Pattern Recognition, pages 392397, London, UK, 2001. Springer-Verlag. 33. Donald Shepard. A two-dimensional interpolation function for irregularly-spaced data. In Proceedings of the 1968 23rd ACM national conference, pages 517524, New York, NY, USA, 1968. ACM. 34. Philip Shilane, Patrick Min, Michael Kazhdan, and Thomas Funkhouser. The princeton shape benchmark. pages 167 178, Los Alamitos, CA, USA, 2004. IEEE Computer Society. 35. Johan Tangelder and Remco Veltkamp. A survey of content based 3d shape retrieval methods. Multimedia Tools and Applications, 39:441471, 2008. 36. Bruno Vallet and Bruno Lï¾÷vy. Manifold harmonics. Computer Graphics Forum (Proceedings Eurographics), 2008.

A Versatile Framework for Shape Description

function values are diffused throughout the point cloud. Fixing some maximum ..... TERMEDIA Workshop on Hypermedia 3D Internet, 2008. 24. Facundo MÃ©moli.

Download PDF

1012KB Sizes 3 Downloads 260 Views

Report

A Versatile Framework for Shape Description

Recommend Documents