Learning Features by Contrasting Natural Images with Noise

Michael Gutmann, University of Helsinki ([email protected])
Aapo Hyvärinen, University of Helsinki ([email protected])

ICANN 2009
Introduction
Natural images?

[Figure: a natural scene and natural image patches extracted from it]

■ Most probabilistic models are models of natural image patches.
■ This is difficult enough, as the data is high-dimensional.
■ Such models can serve as building blocks for models of entire scenes.
Why model natural images?
Possible motivations for building statistical models of natural images:
■ The model can serve as a prior in tasks that involve Bayesian inference (not the topic of this presentation).
■ The model can be used to generate artificial natural images (not the topic of this presentation).
■ There are connections to visual neuroscience (not the topic of this presentation).
■ Finding interesting features of natural images:
  ◆ What kind of features appear in natural images?
  ◆ What kind of structure is characteristic of natural images?
  ◆ How do natural images differ from other, artificial image data (noise)?
Natural images vs. noise
In what aspects do the two datasets differ from each other?

[Figure: samples of natural images vs. noise images]
Learning features by classification
Key idea:
■ Train a classifier to discriminate between natural images and some artificial noise.
■ To succeed in the discrimination task, the classifier must “discover structure” in the data, i.e. identify features of natural images.
Contrastive feature learning
The elements of contrastive feature learning
1. Classifier: assign C = 1 if the input x is a natural image, and C = 0 if the input is noise.
2. Estimation method: fit the parameters of the classifier to the data (supervised learning!).
3. Noise: the reference data that is used for comparison with the natural images.
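The reference data used later in the talk is noise that matches the second-order statistics of the natural images. A minimal NumPy sketch of that construction, assuming Gaussian noise and illustrative function names (not from the original code):

```python
import numpy as np

def make_reference_noise(patches, n_noise, rng=None):
    """Draw Gaussian noise samples with the same mean and covariance
    as the natural image patches (one patch per row of `patches`).

    Sketch of the reference-data construction described in the talk;
    the choice of a Gaussian is an assumption here.
    """
    rng = np.random.default_rng(rng)
    mean = patches.mean(axis=0)
    cov = np.cov(patches, rowvar=False)
    # Matching the covariance structure means the classifier cannot
    # discriminate using second-order statistics alone.
    return rng.multivariate_normal(mean, cov, size=n_noise)
```

Because the noise reproduces the covariance of the data, any structure the classifier finds must go beyond simple correlations between pixels.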
More on the classifier and its estimation

■ Use a classification approach based on nonlinear logistic regression:

    P(C = 1 | x) = 1 / (1 + exp(−y(x))),   y(x) = Σ_{m=1}^{M} g(w_m^T x + b_m) + γ

■ The parameters of the model are the features w_m, the bias terms b_m, the offset γ, as well as possibly the function g(u).
■ The parameters can be estimated by maximum (conditional) likelihood. This is the same as minimizing the cross-entropy error

    J = (1/T) Σ_{t=1}^{T} [ −C_t log P(C_t = 1 | x_t) − (1 − C_t) log(1 − P(C_t = 1 | x_t)) ]

■ Reference data: use noise with the same covariance structure as natural images.
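The discriminant and cross-entropy error above can be sketched in NumPy as follows; the function names are illustrative, the optimizer (e.g. gradient descent on J) is not shown, and g is taken here as tanh purely as a placeholder:

```python
import numpy as np

def discriminant(x, W, b, gamma, g=np.tanh):
    """y(x) = sum_m g(w_m^T x + b_m) + gamma, for a batch of inputs.
    x: (T, d) inputs; W: (M, d) feature vectors w_m; b: (M,) biases."""
    return g(x @ W.T + b).sum(axis=1) + gamma

def cross_entropy(x, C, W, b, gamma, g=np.tanh):
    """Cross-entropy error J minimized during estimation.
    C: (T,) labels, 1 = natural image, 0 = noise."""
    y = discriminant(x, W, b, gamma, g)
    p = 1.0 / (1.0 + np.exp(-y))   # P(C = 1 | x)
    eps = 1e-12                    # numerical safety, not in the slides
    return -np.mean(C * np.log(p + eps) + (1 - C) * np.log(1 - p + eps))
```

With all parameters at zero, y(x) = 0 and P(C = 1 | x) = 1/2, so J = log 2, the error of pure guessing; training pushes J below this baseline.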
Choice of the nonlinearity in the discriminant y(x)

[Figure: candidate nonlinearities g(u).
 Simple choices: Linear, Quadratic.
 Classical choices: Tanh, Logistic.
 "Next" simplest choices: Symmetric logistic, Linear threshold, Squared threshold.]

Parameterized nonlinearity:

    g(u) = α1 [max(0, u − β1)]^η1 + α2 [max(0, −(u − β2))]^η2
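A minimal sketch of the parameterized nonlinearity, which covers the simple shapes above as special cases (the function name and argument order are illustrative):

```python
import numpy as np

def g(u, alpha1, beta1, eta1, alpha2, beta2, eta2):
    """Parameterized nonlinearity from the slides:
    g(u) = a1 * max(0, u - b1)**e1 + a2 * max(0, -(u - b2))**e2."""
    u = np.asarray(u, dtype=float)
    return (alpha1 * np.maximum(0.0, u - beta1) ** eta1
            + alpha2 * np.maximum(0.0, -(u - beta2)) ** eta2)
```

For example, α1 = 1, α2 = −1, β1 = β2 = 0, η1 = η2 = 1 recovers g(u) = u (linear), while α1 = α2 = 1 with η1 = η2 = 2 recovers g(u) = u² (quadratic).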
Simulations
The questions addressed
1. Which nonlinearity g(u) gives the best performance?
2. What do the features w_m look like?
3. Which principle does the classifier use to solve the discrimination task?
Classification performance

[Figure: cross-entropy error and false-classification rate for the different
 nonlinearities (Tanh, Logistic, Symmetric logistic, Linear threshold,
 Squared threshold), and for the optimized nonlinearity compared with the
 symmetric logistic.]
The learned features

[Figure: learned features w_m for the symmetric logistic and for the
 optimized nonlinearity.]
Classification principle

The discriminant y(x) = Σ_{m=1}^{M} [g(w_m^T x) + γ/M] gives the rules:

    y(x) > 0 ⇒ x is a natural image.    y(x) < 0 ⇒ x is noise.

[Figure: per-feature contribution g(u) + γ/M for the symmetric logistic and
 for the optimized nonlinearity.]

Symmetric logistic: thresholding of each feature output.
Optimized nonlinearity: sign consistency across feature outputs.
Summary

The minimum to retain:
1. The talk was about learning features from data (here: natural images).
2. Features are learned by training a classifier to distinguish between the data and some artificial noise.
3. We used nonlinear logistic regression to do the classification.

Some more details:
1. Classification works by thresholding the outputs of Gabor-like feature detectors.
2. Optimizing the nonlinearity gives an asymmetric solution, and classification performance improves.
3. An alternative classification principle: the outputs of some Gabor-like feature detectors need to have the same sign for natural images.