Learning Features by Contrasting Natural Images with Noise

Michael Gutmann, University of Helsinki ([email protected])
Aapo Hyvärinen, University of Helsinki ([email protected])

ICANN 2009
Introduction
Natural images?

[Figure: a natural scene and natural image patches extracted from it]

■ Most probabilistic models are models of natural image patches.
■ This is difficult enough, as the data is high-dimensional.
■ Such models can serve as building blocks for models of entire scenes.
Why model natural images?
Possible motivations for building statistical models of natural images:
■ The model can serve as a prior in tasks that involve Bayesian inference (not the topic of this presentation).
■ The model can be used to generate artificial natural images (not the topic of this presentation).
■ There are connections to visual neuroscience (not the topic of this presentation).
■ Finding interesting features of natural images:
  ◆ What kind of features appear in natural images?
  ◆ What kind of structure is characteristic of natural images?
  ◆ How do natural images differ from other, artificial image data (noise)?
Natural images vs. noise
In what aspects do the two datasets differ from each other?

[Figure: samples of natural images vs. noise images]
Learning features by classification
Key idea:
■ Train a classifier to discriminate between natural images and some artificial noise.
■ To succeed in the discrimination task, the classifier must “discover structure” in the data, i.e. identify features of natural images.
Contrastive feature learning
The elements of contrastive feature learning
1. Classifier: assign C = 1 if the input x is a natural image, and C = 0 if the input is noise.
2. Estimation method: fit the parameters of the classifier to the data (supervised learning!).
3. Noise: the reference data that is used for comparison with the natural images.
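The reference data used later in the talk is noise that matches the second-order statistics of the natural images. A minimal NumPy sketch of that construction, assuming Gaussian noise and illustrative function names (not from the original code):

```python
import numpy as np

def make_reference_noise(patches, n_noise, rng=None):
    """Draw Gaussian noise samples with the same mean and covariance
    as the natural image patches (one patch per row of `patches`).

    Sketch of the reference-data construction described in the talk;
    the choice of a Gaussian is an assumption here.
    """
    rng = np.random.default_rng(rng)
    mean = patches.mean(axis=0)
    cov = np.cov(patches, rowvar=False)
    # Matching the covariance structure means the classifier cannot
    # discriminate using second-order statistics alone.
    return rng.multivariate_normal(mean, cov, size=n_noise)
```

Because the noise reproduces the covariance of the data, any structure the classifier finds must go beyond simple correlations between pixels.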
More on the classifier and its estimation

■ Use a classification approach based on nonlinear logistic regression:

    P(C = 1 | x) = 1 / (1 + exp(−y(x))),   y(x) = Σ_{m=1}^{M} g(w_m^T x + b_m) + γ

■ The parameters of the model are the features w_m, the bias terms b_m, the offset γ, as well as possibly the function g(u).
■ The parameters can be estimated by maximum (conditional) likelihood. This is the same as minimizing the cross-entropy error

    J = (1/T) Σ_{t=1}^{T} [ −C_t log P(C_t = 1 | x_t) − (1 − C_t) log(1 − P(C_t = 1 | x_t)) ]

■ Reference data: use noise with the same covariance structure as natural images.
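The discriminant and cross-entropy error above can be sketched in NumPy as follows; the function names are illustrative, the optimizer (e.g. gradient descent on J) is not shown, and g is taken here as tanh purely as a placeholder:

```python
import numpy as np

def discriminant(x, W, b, gamma, g=np.tanh):
    """y(x) = sum_m g(w_m^T x + b_m) + gamma, for a batch of inputs.
    x: (T, d) inputs; W: (M, d) feature vectors w_m; b: (M,) biases."""
    return g(x @ W.T + b).sum(axis=1) + gamma

def cross_entropy(x, C, W, b, gamma, g=np.tanh):
    """Cross-entropy error J minimized during estimation.
    C: (T,) labels, 1 = natural image, 0 = noise."""
    y = discriminant(x, W, b, gamma, g)
    p = 1.0 / (1.0 + np.exp(-y))   # P(C = 1 | x)
    eps = 1e-12                    # numerical safety, not in the slides
    return -np.mean(C * np.log(p + eps) + (1 - C) * np.log(1 - p + eps))
```

With all parameters at zero, y(x) = 0 and P(C = 1 | x) = 1/2, so J = log 2, the error of pure guessing; training pushes J below this baseline.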
Choice of the nonlinearity in the discriminant y(x)

[Figure: candidate nonlinearities g(u).
 Simple choices: Linear, Quadratic.
 Classical choices: Tanh, Logistic.
 "Next" simplest choices: Symmetric logistic, Linear threshold, Squared threshold.]

Parameterized nonlinearity:

    g(u) = α1 [max(0, u − β1)]^η1 + α2 [max(0, −(u − β2))]^η2
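A minimal sketch of the parameterized nonlinearity, which covers the simple shapes above as special cases (the function name and argument order are illustrative):

```python
import numpy as np

def g(u, alpha1, beta1, eta1, alpha2, beta2, eta2):
    """Parameterized nonlinearity from the slides:
    g(u) = a1 * max(0, u - b1)**e1 + a2 * max(0, -(u - b2))**e2."""
    u = np.asarray(u, dtype=float)
    return (alpha1 * np.maximum(0.0, u - beta1) ** eta1
            + alpha2 * np.maximum(0.0, -(u - beta2)) ** eta2)
```

For example, α1 = 1, α2 = −1, β1 = β2 = 0, η1 = η2 = 1 recovers g(u) = u (linear), while α1 = α2 = 1 with η1 = η2 = 2 recovers g(u) = u² (quadratic).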
Simulations
The questions addressed
1. Which nonlinearity g(u) gives the best performance?
2. What do the features w_m look like?
3. Which principle does the classifier use to solve the discrimination task?
Classification performance

[Figure: cross-entropy error and false-classification rate for the different
 nonlinearities (Tanh, Logistic, Symmetric logistic, Linear threshold,
 Squared threshold), and for the optimized nonlinearity compared with the
 symmetric logistic.]
The learned features

[Figure: learned features w_m for the symmetric logistic and for the
 optimized nonlinearity.]
Classification principle

The discriminant y(x) = Σ_{m=1}^{M} [g(w_m^T x) + γ/M] gives the rules:

    y(x) > 0 ⇒ x is a natural image.    y(x) < 0 ⇒ x is noise.

[Figure: per-feature contribution g(u) + γ/M for the symmetric logistic and
 for the optimized nonlinearity.]

Symmetric logistic: thresholding of each feature output.
Optimized nonlinearity: sign consistency across feature outputs.
Summary

The minimum to retain:
1. The talk was about learning features from data (here: natural images).
2. Features are learned by training a classifier to distinguish between the data and some artificial noise.
3. We used nonlinear logistic regression to do the classification.

Some more details:
1. Classification works by thresholding the outputs of Gabor-like feature detectors.
2. Optimizing the nonlinearity gives an asymmetric solution, and classification performance improves.
3. An alternative classification principle: the outputs of some Gabor-like feature detectors need to have the same sign for natural images.