Mining Labelled Tensors by Discovering both their Common and Discriminative Subspaces

Wei Liu∗, Jeffrey Chan∗, James Bailey∗, Christopher Leckie∗ and Kotagiri Ramamohanarao∗

Abstract

Conventional non-negative tensor factorization (NTF) methods assume there is only one tensor that needs to be decomposed into low-rank factors. However, in practice data are usually generated from different time periods or by different class labels, and are represented by a sequence of multiple tensors associated with different labels. This raises the problem that when one needs to analyze and compare multiple tensors, existing NTF is unsuitable for discovering all potentially useful patterns: 1) if one factorizes each tensor separately, the common information shared by the tensors is lost in the factors, and 2) if one concatenates these tensors together and forms a larger tensor to factorize, the intrinsic discriminative subspaces that are unique to each tensor are not captured. This issue arises because conventional factorization methods handle data observations in an unsupervised way, which considers only the features and not the labels of the data. To tackle this problem, in this paper we design a novel factorization algorithm called CDNTF (common and discriminative subspace non-negative tensor factorization), which takes both features and class labels into account in the factorization process. CDNTF uses a set of labelled tensors as input and computes both their common and discriminative subspaces simultaneously as output. We design an iterative algorithm that solves the common and discriminative subspace factorization problem with a proof of convergence. Experimental results on solving graph classification problems demonstrate the power and the effectiveness of the subspaces discovered by our method.

1 Introduction

In this paper we focus on the problem of non-negative tensor factorization (NTF) for a sequence of multiple tensors. Computing low-rank non-negative approximations of high dimensional data has become a major focus in the data mining domain. The non-negativity property is desirable in many practical domains where nonzero data entries are usually positive, such as the adjacency matrices in graph mining domains and the images/videos analyzed in computer vision. Many tensor factorization methods have been proposed in the past, such as Tucker decomposition [24], canonical polyadic decomposition [5] (CP, also known as CANDECOMP/PARAFAC [11]) and the NTF [25]. All of these methods and their later variants can be considered as higher-order generalizations of matrix factorizations. However, these existing methods are all restricted to decomposing a single instance of a tensor object in an unsupervised manner. This raises the question of what strategy should be used when dealing with multiple tensor objects which are associated with class labels.

Given a tensor of M dimensions¹, existing NTF methods decompose that tensor into M low-rank matrix factors, each of which explains a compact basis of one dimension of the tensor. Two common approaches for using NTF to factorize a sequence of m tensors are: (option 1) decompose each tensor separately (Figure 1a). This approach generates a low-rank factor matrix for each mode of each tensor, and does not necessarily identify a potentially important “common factor” matrix that these tensors may share; or (option 2) concatenate all tensors to form one big tensor and then decompose it (Figure 1b). In contrast to the first option, this strategy may discard all possible “discriminative factor” matrices in the concatenated dimension, and only produces factors that are a consensus of the original tensors. Although it is possible to treat these tensors as a data stream and use sliding windows to analyze them incrementally [22, 23], the actual decomposition on each element within each window is still limited to the above two options.

In this research we propose a novel strategy for an-
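The two conventional strategies described above can be made concrete with a small sketch. The code below is not CDNTF; it is a plain non-negative CP factorization using standard multiplicative updates for the squared-error objective, written with NumPy only. The function names (`khatri_rao`, `reconstruct`, `ntf_cp`), the tensor sizes, the rank, and the random data are all hypothetical choices made for illustration. "Concatenation" is interpreted here as stacking the tensors along a new leading mode, which is an assumption about Figure 1b rather than something the text specifies.

```python
import numpy as np

def khatri_rao(mats):
    """Column-wise Kronecker product; rows follow C-order (last index fastest)."""
    out = mats[0]
    for m in mats[1:]:
        out = (out[:, None, :] * m[None, :, :]).reshape(-1, out.shape[1])
    return out

def reconstruct(factors, shape):
    """Rebuild the full tensor as a sum of rank-one terms from CP factors."""
    return khatri_rao(factors).sum(axis=1).reshape(shape)

def ntf_cp(X, rank, n_iter=200, eps=1e-9, seed=0):
    """Non-negative CP via multiplicative updates (illustrative, not CDNTF)."""
    rng = np.random.default_rng(seed)
    factors = [rng.random((dim, rank)) for dim in X.shape]
    for _ in range(n_iter):
        for n in range(X.ndim):
            others = [f for k, f in enumerate(factors) if k != n]
            kr = khatri_rao(others)
            # Mode-n unfolding whose column order matches khatri_rao(others).
            Xn = np.moveaxis(X, n, 0).reshape(X.shape[n], -1)
            numer = Xn @ kr
            denom = factors[n] @ (kr.T @ kr) + eps
            factors[n] *= numer / denom  # keeps entries non-negative
    return factors

# Hypothetical input: a sequence of m = 3 non-negative tensors of equal shape.
data_rng = np.random.default_rng(1)
tensors = [data_rng.random((6, 5, 4)) for _ in range(3)]

# Option 1: factorize each tensor separately; one factor set per tensor,
# with nothing tying the factor sets together.
separate = [ntf_cp(T, rank=2) for T in tensors]

# Option 2: stack into one 4-mode tensor and factorize once; modes 1..3 get
# a single shared factor each, so per-tensor differences survive only as
# rows of the small mode-0 factor.
stacked = np.stack(tensors, axis=0)
joint = ntf_cp(stacked, rank=2)
```

In option 1 the output contains three unrelated factor sets, so a shared basis is not identified explicitly. In option 2 each original mode has exactly one factor matrix, which illustrates why tensor-specific (discriminative) structure in those modes is not represented.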

∗ Department of Computing and Information Systems, The University of Melbourne, VIC 3010, Australia. Correspondence goes to [email protected]


¹ We use the notion of “cardinality” to describe the length of a dimension (i.e., a mode) of a tensor/matrix. For example, in a two-dimensional matrix, the numbers of rows and columns are the cardinalities of the two dimensions of the matrix.

Copyright © SIAM. Unauthorized reproduction of this article is prohibited.
