NON-NEGATIVE MATRIX FACTORIZATION ON KERNELS
Daoqiang Zhang, Zhi-Hua Zhou, and Songcan Chen
Notes by Raul Torres for NMF Seminar
Outline
• Non-negative Matrix Factorization: An overview
• Kernel Non-negative Matrix Factorization
• Sub-pattern based KNMF
• Experiments
• Conclusions
Non-negative Matrix Factorization: An overview
• The key ingredient of NMF is the non-negativity constraint imposed on the matrix factors.
• Given a non-negative n × m matrix V, NMF finds two non-negative factors W and H such that V ≈ WH.
• Columns of W are called the NMF bases.
• Columns of H are their combining coefficients.
• The dimensions of W and H are n × r and r × m, respectively.
• The rank r of the factorization is usually chosen such that (n + m)r < nm, and hence dimensionality reduction is achieved.
NMF: An overview
Multiplicative update rules (for the Frobenius-norm objective):
  H_aj ← H_aj · (WᵀV)_aj / (WᵀWH)_aj
  W_ia ← W_ia · (VHᵀ)_ia / (WHHᵀ)_ia
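A minimal NumPy sketch of these standard multiplicative updates (Lee–Seung style), not the authors' code; the iteration count and the small eps guarding against division by zero are illustrative choices:

    import numpy as np

    def nmf(V, r, n_iter=200, eps=1e-9):
        """Factor a non-negative V (n x m) into W (n x r) and H (r x m)."""
        n, m = V.shape
        rng = np.random.default_rng(0)
        W = rng.random((n, r))
        H = rng.random((r, m))
        for _ in range(n_iter):
            # Multiplicative updates keep W and H non-negative by construction.
            H *= (W.T @ V) / (W.T @ W @ H + eps)
            W *= (V @ H.T) / (W @ H @ H.T + eps)
        return W, H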
Kernel Non-negative Matrix Factorization
Given m objects O1, O2, …, Om:
[Figure: the n × m data matrix V, with columns v1, v2, …, vm (one per object) and rows 1 … n (one per attribute).]
• Attribute values are represented as an n × m matrix V = [v1, v2, …, vm].
• Each column represents one of the m objects.
• Define a nonlinear map φ from the original input space to a higher- or infinite-dimensional feature space F.
KNMF
• As in NMF, KNMF finds two non-negative factors: φ(V) ≈ Wφ H.
• Wφ is the bases in feature space.
• H is its combining coefficients: each column is the dimension-reduced representation of one object.
• φ(V) is unknown, so it is impractical to factorize it directly.
• So:
• Kernel: k(x, y) = ⟨φ(x), φ(y)⟩ is a function in the input space and, at the same time, the inner product in the feature space through the kernel-induced nonlinear mapping φ.
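For instance, the Gaussian kernel used later in the experiments can be evaluated directly on the input columns without ever forming φ; a sketch, where the bandwidth sigma is an illustrative parameter:

    import numpy as np

    def gaussian_kernel(V, sigma=1.0):
        """K[i, j] = exp(-||v_i - v_j||^2 / (2 sigma^2)) for columns v_i of V."""
        sq = np.sum(V**2, axis=0)                      # squared column norms
        d2 = sq[:, None] + sq[None, :] - 2.0 * (V.T @ V)
        return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))

Note that every entry of this K lies in (0, 1], so the kernel matrix is non-negative even when V contains negative values.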
KNMF
• Multiplying both sides of φ(V) ≈ Wφ H on the left by φ(V)ᵀ gives
  φ(V)ᵀ φ(V) ≈ φ(V)ᵀ Wφ H, i.e. K ≈ Y H,
  where K = φ(V)ᵀ φ(V) is the m × m kernel matrix and Y = φ(V)ᵀ Wφ.
• KNMF thus reduces to performing NMF on the kernel matrix K.
• Wφ is the learned bases of φ(V); Y is the learned bases of K.
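Putting the pieces together, a minimal sketch of KNMF as NMF on the kernel matrix, reusing the gaussian_kernel and nmf sketches above (the function name knmf is mine, not the paper's):

    def knmf(V, r, sigma=1.0):
        """KNMF: factor the m x m kernel matrix K ~ Y H instead of phi(V)."""
        K = gaussian_kernel(V, sigma)   # m x m, non-negative by construction
        Y, H = nmf(K, r)                # Y: learned bases of K (m x r)
        return Y, H                     # each column of H: r-dim representation

Because only K enters the factorization, the same routine handles relational data where K is given directly and no attribute matrix V exists.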
Sub-pattern based KNMF (SpKNMF)
• Assume n is divisible by p.
• Reassemble the original n × m matrix V into an (n/p) × (mp) matrix U.
[Figure: reassembling V into U. Example with n = 21, m = 4, p = 3: each column vj of V (21 × 4) is split into p = 3 sub-patterns of length n/p = 7, giving U = [u1, u2, …, u12] of size n/p = 7 by m·p = 12.]
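A one-line NumPy sketch of this reassembly (an assumption on my part: the figure does not pin down the exact column ordering, so here the p sub-patterns of each vj become consecutive columns of U):

    def reassemble_V(V, p):
        """Split each column of V (n x m) into p sub-patterns of length n/p,
        placed as p consecutive columns of U (n/p x m*p)."""
        n, m = V.shape
        assert n % p == 0, "n must be divisible by p"
        return V.T.reshape(m * p, n // p).T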
SpKNMF
• Performing KNMF on U yields H = {hj} with dimension r × mp, where r is the number of reduced dimensions.
• Then reassemble the matrix H into an (rp) × m matrix R.

[Figure: reassembling H into R. Continuing the example (r = 5, p = 3, m = 4): the p = 3 columns of H (5 × 12) belonging to each object are stacked into one column of R = [R1, R2, R3, R4] of size r·p = 15 by m = 4.]
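The second reassembly can be sketched the same way, under the same ordering assumption as above (columns p·j … p·j + p − 1 of H hold the p sub-pattern codes of object j):

    def reassemble_H(H, p):
        """Stack the p sub-pattern codes of each object (p columns of H,
        each of length r) into one rp-dimensional column of R (rp x m)."""
        r, mp = H.shape
        m = mp // p
        return H.T.reshape(m, r * p).T

Each column of R is then the final rp-dimensional representation of one object.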
Experiments
• Configurations:
  ▫ Gaussian kernel with standard variance
  ▫ Nearest neighbor classifier (1-NN)
  ▫ UCI datasets: 10 independent runs, averaged; random 50% training / 50% testing splits
  ▫ Tested in different dimensions
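A sketch of this evaluation protocol using scikit-learn's 1-NN classifier (my assumption, as the slides name no implementation; X stands for the dimension-reduced representations, e.g. the columns of H or R transposed, and how test objects are mapped into that space is not shown on these slides):

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier

    def average_1nn_accuracy(X, y, n_runs=10):
        """Average 1-NN accuracy over n_runs random 50/50 splits."""
        accs = []
        for seed in range(n_runs):
            X_tr, X_te, y_tr, y_te = train_test_split(
                X, y, test_size=0.5, random_state=seed)
            clf = KNeighborsClassifier(n_neighbors=1).fit(X_tr, y_tr)
            accs.append(clf.score(X_te, y_te))
        return float(np.mean(accs))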
Experiments
[Figures: classification accuracy versus number of reduced dimensions on four UCI datasets: Ionosphere, Bupa, Glass, and PID.]
Experiments: FERET Face Database
• 400 gray-level frontal-view face images from 200 persons are used.
• 2 images per person.
• There are 71 females and 129 males.
Method    Accuracy
NMF       69.23%
KNMF      80.37%
SpKNMF    84.44%
Conclusions
• KNMF can:
  ▫ Extract more useful features hidden in the original data using a kernel-induced nonlinear mapping
  ▫ Deal with relational data, where only the relationships between objects are known
  ▫ Process data with negative values by using specific kernel functions (e.g., the Gaussian kernel)
• KNMF is more general than NMF.
• SpKNMF improves the performance of KNMF by performing KNMF on sub-patterns of the original data.
• Issues:
  ▫ Selection of kernels and their parameters
  ▫ Choosing the appropriate size for the reassembled matrix