NON-NEGATIVE MATRIX FACTORIZATION ON KERNELS Daoqiang Zhang, Zhi-Hua Zhou, and Songcan Chen
Notes by Raul Torres for NMF Seminar
Outline
• Non-negative Matrix Factorization: An overview
• Kernel Non-negative Matrix Factorization
• Sub-pattern based KNMF
• Experiments
• Conclusions
Non-negative Matrix Factorization: An overview
• The key ingredient of NMF is the non-negativity constraint imposed on the matrix factors: V ≈ WH, with V, W, H ≥ 0
• Columns of W are called the NMF bases
• Columns of H are their combining coefficients
• The dimensions of W and H are n × r and r × m, respectively
• The rank r of the factorization is usually chosen such that (n + m)r < nm, so that dimensionality reduction is achieved
NMF: An overview
• Multiplicative update rules (Lee & Seung):
  H_{aj} ← H_{aj} · (WᵀV)_{aj} / (WᵀWH)_{aj}
  W_{ia} ← W_{ia} · (VHᵀ)_{ia} / (WHHᵀ)_{ia}
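These updates can be sketched in a few lines of NumPy. This is a minimal illustration of the standard Lee–Seung multiplicative updates (not the authors' code); the small `eps` guard against division by zero and all sizes are illustrative choices:

```python
import numpy as np

def nmf(V, r, n_iter=200, eps=1e-9, seed=0):
    """Factor a nonnegative n x m matrix V into W (n x r) and H (r x m)
    using the Lee-Seung multiplicative updates for squared Frobenius error."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r)) + eps
    H = rng.random((r, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update coefficients
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update bases
    return W, H

V = np.random.default_rng(1).random((20, 15))  # nonnegative toy data
W, H = nmf(V, r=4)
err = np.linalg.norm(V - W @ H)  # reconstruction error after 200 updates
```

Because each update multiplies by a nonnegative ratio, W and H stay nonnegative throughout, which is exactly the constraint NMF enforces.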
Kernel Non-negative Matrix Factorization
• Given m objects O1, O2, …, Om
[Diagram: each object Oi is represented by an n-dimensional attribute vector vi]
• Attribute values are represented as an n × m matrix: V = [v1, v2, …, vm]
• Each column represents one of the m objects
• Define a nonlinear map φ from the original input space to a higher- or infinite-dimensional feature space F
KNMF
• As in NMF, KNMF finds two non-negative factors: φ(V) ≈ Wφ H
• Wφ is the bases in feature space
• H is its combining coefficients: each column is the dimension-reduced representation of one object
• φ(V) is unknown, so it is impractical to factorize it directly
• So: left-multiply by φ(V)ᵀ to obtain K = φ(V)ᵀφ(V) ≈ φ(V)ᵀWφ H = YH
• Kernel: K(x, y) = φ(x)ᵀφ(y) is a function in the input space and, at the same time, the inner product in the feature space induced by the nonlinear mapping φ
• Wφ is the learned bases of φ(V)
• Y is the learned bases of K
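A minimal sketch of the idea in NumPy. This is not the paper's exact algorithm: here the standard multiplicative updates are simply applied to the kernel matrix as K ≈ YH, which is possible because a Gaussian kernel matrix is nonnegative; `gaussian_kernel`, `knmf`, and all parameter values are illustrative assumptions:

```python
import numpy as np

def gaussian_kernel(X, sigma=1.0):
    """Gram matrix K[i, j] = exp(-||xi - xj||^2 / (2 sigma^2)) for columns of X."""
    sq = np.sum(X**2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2 * X.T @ X
    return np.exp(-np.maximum(d2, 0) / (2 * sigma**2))

def knmf(K, r, n_iter=200, eps=1e-9, seed=0):
    """Sketch of KNMF: factor the nonnegative m x m kernel matrix K into
    bases Y (m x r) and coefficients H (r x m) with multiplicative updates."""
    rng = np.random.default_rng(seed)
    m = K.shape[0]
    Y = rng.random((m, r)) + eps
    H = rng.random((r, m)) + eps
    for _ in range(n_iter):
        H *= (Y.T @ K) / (Y.T @ Y @ H + eps)
        Y *= (K @ H.T) / (Y @ H @ H.T + eps)
    return Y, H

X = np.random.default_rng(2).random((10, 30)) - 0.5  # entries of any sign are fine
K = gaussian_kernel(X)        # 30 x 30, strictly positive
Y, H = knmf(K, r=5)           # columns of H: reduced representations
```

Note that only K enters the factorization, which is why KNMF can handle relational data (only pairwise relationships known) and, via the Gaussian kernel, data with negative values.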
Sub-pattern based KNMF (SpKNMF)
• Assume n is divisible by p
• Cut each column of V into p sub-patterns of length n/p, and reassemble the original matrix V into an n/p × mp matrix U
[Diagram: V (n = 21, m = 4) is reassembled into U (n/p = 7, m·p = 12) with p = 3; each column vi of V is cut into p sub-patterns, which become the p consecutive columns u_{(i-1)p+1}, …, u_{ip} of U]
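The reassembly above can be sketched with NumPy indexing. The variable names and loop layout are illustrative assumptions; only the shapes n = 21, m = 4, p = 3 come from the slide:

```python
import numpy as np

# Reassemble V (n x m) into U (n/p x mp): each column vi is cut into
# p sub-patterns of length n/p, placed as p consecutive columns of U.
n, m, p = 21, 4, 3
V = np.arange(n * m).reshape(n, m, order="F")  # toy data, columns v1..v4

U = np.hstack([V[k * (n // p):(k + 1) * (n // p), i:i + 1]
               for i in range(m) for k in range(p)])
assert U.shape == (n // p, m * p)  # (7, 12)
```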
SpKNMF
• Applying KNMF to U yields coefficients H = {hj} with dimension r × mp, where r is the number of reduced dimensions
• Then reassemble the matrix H into an rp × m matrix R
[Diagram: H (r = 5, m·p = 12) is reassembled into R (r·p = 15, m = 4) with p = 3; the p coefficient columns h_{(i-1)p+1}, …, h_{ip} belonging to object i are stacked vertically to form column Ri of R]
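Likewise, the H → R reassembly can be sketched as follows (again the names and layout are illustrative assumptions; the shapes r = 5, m = 4, p = 3 come from the slide):

```python
import numpy as np

# Reassemble H (r x mp) into R (rp x m): the p coefficient columns that
# belong to the same object are stacked vertically into one column of R.
r, m, p = 5, 4, 3
H = np.arange(r * m * p).reshape(r, m * p, order="F")  # toy coefficients h1..h12

R = np.hstack([H[:, i * p:(i + 1) * p].reshape(r * p, 1, order="F")
               for i in range(m)])
assert R.shape == (r * p, m)  # (15, 4)
```

Each column of R is then the final reduced representation of one object, combining the coefficients of all its sub-patterns.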
Experiments
• Configurations:
▫ Gaussian kernel with standard variance
▫ Nearest-neighbor classifier (1-NN)
▫ UCI data sets: 10 independent runs, results averaged; random 50% training / 50% testing splits; tested at different reduced dimensions
Experiments
[Accuracy-vs-dimension plots on four UCI data sets: Ionosphere, Bupa, Glass, PID]
Experiments: FERET Face Database
• 400 gray-level frontal-view face images from 200 persons are used
• 2 images per person
• There are 71 females and 129 males
Accuracy on FERET:
NMF      KNMF     SpKNMF
69.23%   80.37%   84.44%
Conclusions
• KNMF can:
▫ Extract more useful features hidden in the original data via a kernel-induced nonlinear mapping
▫ Deal with relational data, where only the relationships between objects are known
▫ Process data with negative values by using specific kernel functions (e.g., Gaussian)
• KNMF is more general than NMF
• SpKNMF improves on KNMF by performing KNMF on sub-patterns of the original data
• Open issues:
▫ Selection of kernels and their parameters
▫ Choosing the appropriate size for the reassembled matrix