Multimodal Visualization Based On Non-negative Matrix Factorization

Jorge Camargo

Juan Caicedo

Fabio González

BioIngenium Research Group, National University of Colombia

April 26, 2010


Outline
1 Introduction
2 Problem Definition
3 Multimodal Image Collection Visualization
4 Experimental Evaluation
5 Conclusion

Introduction

Motivation

- Flickr receives about 5,000 new photos per minute.
- Pitkanen et al. [2] reported a production of about 70,000 new images per day in a radiology department.
- Image collection exploration has been shown to be a good strategy:
  - Summarization
  - Visualization
  - Interaction

Problem Definition

Problem

Traditionally, image collection visualization approaches use only visual content to represent images and to project similarity relationships in the visualization space. However, other information sources, such as text, are useful for visualizing image collections better.
- How can visual and textual content be used to improve image collection visualization?
- How can text and images be projected into the same visualization space?
- How can the quality of the visualization be measured?

Multimodal Image Collection Visualization

Non-negative Matrix Factorization

The general problem of matrix factorization is to decompose a matrix $X$ into two matrix factors $A$ and $B$:

$$X_{n \times l} = A_{n \times r} B_{r \times l} \qquad (1)$$

There are different ways to find an NMF [1]. The most obvious one is to minimize

$$\|X - AB\|^2 \qquad (2)$$

An alternative objective function is

$$D(X \mid AB) = \sum_{ij} \left( X_{ij} \log \frac{X_{ij}}{(AB)_{ij}} - X_{ij} + (AB)_{ij} \right) \qquad (3)$$

In both cases, the constraint is $A, B \geq 0$.
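As a rough illustration of minimizing the squared error under the non-negativity constraint, the sketch below uses the multiplicative update rules described by Lee and Seung [1]. The slides do not say which solver was actually used, so the function names, iteration count, and toy data are assumptions made only for this example.

```python
import numpy as np

def nmf_multiplicative(X, r, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative X (n x l) as A @ B with A, B >= 0."""
    rng = np.random.default_rng(seed)
    n, l = X.shape
    A = rng.random((n, r)) + eps
    B = rng.random((r, l)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry non-negative.
        B *= (A.T @ X) / (A.T @ A @ B + eps)
        A *= (X @ B.T) / (A @ B @ B.T + eps)
    return A, B

def kl_divergence(X, Y, eps=1e-9):
    """Generalized divergence D(X | Y) in the form of equation (3)."""
    return float(np.sum(X * np.log((X + eps) / (Y + eps)) - X + Y))

# Toy data (hypothetical): a random non-negative 20 x 15 matrix.
rng = np.random.default_rng(1)
X = rng.random((20, 15))
A, B = nmf_multiplicative(X, r=4)
frobenius_error = np.linalg.norm(X - A @ B) ** 2
divergence = kl_divergence(X, A @ B)
```

Both objectives can be evaluated on the same factors; the update rules shown here target the squared error, and a divergence-based variant would use different updates.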


NMF-based Multimodal Image Representation

The image database is composed of two data modalities, herein denoted by $X_v$ and $X_t$. The proposed strategy consists of constructing a multimodal matrix $X = [X_v^T \; X_t^T]^T$. The matrix is then decomposed using NMF as follows:

$$X_{(n+m) \times l} = W_{(n+m) \times r} H_{r \times l} \qquad (4)$$

where $W$ is the basis of the latent space, in which each multimodal object is represented by a linear combination of the $r$ columns of $W$. The corresponding coefficients of the combination are encoded in the columns of $H$.
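The stacking in equation (4) can be sketched as follows. The sizes, the random toy data, and the short multiplicative-update loop are assumptions for illustration; only the shapes and the row-wise split of $W$ follow from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, l, r = 6, 3, 10, 2   # toy sizes: visual features, text terms, images, latent factors

Xv = rng.random((n, l))                          # visual modality, one column per image
Xt = (rng.random((m, l)) > 0.5).astype(float)    # text modality, binary term indicators

# X = [Xv^T Xt^T]^T: visual rows on top, text rows below, shape (n + m) x l.
X = np.vstack([Xv, Xt])

# A few multiplicative updates as a stand-in NMF solver (assumed, not from the slides).
W = rng.random((n + m, r))
H = rng.random((r, l))
for _ in range(100):
    H *= (W.T @ X) / (W.T @ W @ H + 1e-9)
    W *= (X @ H.T) / (W @ H @ H.T + 1e-9)

Wv, Wt = W[:n], W[n:]    # rows of W tied to visual features and to text terms
image_codes = H.T        # row j holds the r coefficients of image j
```

Because both modalities share the same $H$, each image gets a single latent code that has to explain its visual and its textual rows at once.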


Multimodal Visualization

We use PCA to reduce the dimensionality of text data and images, taking their representation in the latent space. As input, PCA receives a transformation matrix $T$ obtained as follows:

$$T = \left[\, W^T_{r \times m} \;\; H_{r \times l} \,\right]$$

where $W^T_{r \times m}$ is the representation of concepts in the latent space and $H_{r \times l}$ is the representation of images in the latent space. We reduce the dimensionality of images and concepts with PCA using the matrix $T$ as input.
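A minimal sketch of this projection step, assuming PCA is computed through an SVD of the centered data and that the target space is two-dimensional (the slides do not state either). The factors below are random stand-ins, and `pca_project` is a hypothetical helper name.

```python
import numpy as np

def pca_project(points, k=2):
    """Project row vectors onto their first k principal components."""
    centered = points - points.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[:k].T

rng = np.random.default_rng(0)
r, m, l = 5, 3, 12
Wt = rng.random((m, r))   # stand-in latent representation of the m concepts
H = rng.random((r, l))    # stand-in latent representation of the l images

T = np.hstack([Wt.T, H])          # r x (m + l): concepts first, then images
coords = pca_project(T.T, k=2)    # one 2-D point per concept or image
concept_xy, image_xy = coords[:m], coords[m:]
```

Concepts and images pass through the same projection, which is what lets them be drawn in one shared plot.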


Multimodal Visualization (2)

Figure: Process to obtain the transformation matrix T

Experimental Evaluation

Experimental Setup
- 2500 images from the Corel image database (100 images per class)
- Data representation:
  - BoF: blocks of 8x8 pixels, a SIFT descriptor for each block, and a codebook of 1000 patches (k-means)
  - Each image is represented by a histogram that counts the occurrences of each codebook patch in the image (each block is assigned to the closest patch)
  - $X_v^T$ is a vector in $\mathbb{R}^{1000}$
  - $X_t^T$ is a binary vector in $\mathbb{R}^{25}$
- NMF factorization: $X_{(1000+25) \times 2500} = W_{(1000+25) \times 30} H_{30 \times 2500}$
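The histogram construction described in the setup can be sketched as below. The codebook and descriptors are random stand-ins with small sizes chosen to keep the example light (the setup itself uses 1000 patches, and SIFT descriptors are 128-dimensional); `bof_histogram` is a hypothetical helper name.

```python
import numpy as np

def bof_histogram(descriptors, codebook):
    """Count, for each codebook patch, how many descriptors are closest to it."""
    # Squared Euclidean distance between every descriptor and every codebook entry.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    return np.bincount(nearest, minlength=len(codebook))

rng = np.random.default_rng(0)
codebook = rng.random((20, 8))       # toy codebook: 20 patches, 8-D descriptors
descriptors = rng.random((64, 8))    # hypothetical: 64 block descriptors from one image
hist = bof_histogram(descriptors, codebook)
```

With the real codebook, `hist` would have 1000 entries and would form one column of the visual matrix.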


Experiment 1

Figure: Multimodal visualization with concepts and images


Experiment 2

We select the closest image to the i-th concept in the latent space. This is done by taking the minimum distance between each concept and all the images in the latent space:

$$I_i = \arg\min_j \, d(w_{t_i}, w_{v_j}), \quad w \in W$$

where $I_i$ is the i-th image to visualize, $w_{t_i}$ is the i-th concept, $w_{v_j}$ is the j-th image, $W$ is the latent space matrix obtained from the NMF, and $d(\cdot, \cdot)$ is the Euclidean distance between two vectors.
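The nearest-image selection can be sketched with a pairwise distance table. The two small arrays are made-up latent vectors, and `closest_images` is a hypothetical helper name.

```python
import numpy as np

def closest_images(concepts, images):
    """Return, for each concept vector, the index of the nearest image vector."""
    diff = concepts[:, None, :] - images[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))   # dist[i, j] = Euclidean distance
    return dist.argmin(axis=1)

# Toy latent vectors: two concepts and three images in a 2-D latent space.
concepts = np.array([[0.0, 0.0], [1.0, 1.0]])
images = np.array([[0.9, 1.2], [0.1, -0.1], [5.0, 5.0]])
nearest = closest_images(concepts, images)   # array([1, 0])
```

Here the first concept is matched to the second image and the second concept to the first image; the third image is far from both and is never selected.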


Experiment 2

Figure: Visualization of the 25 concepts and their corresponding closest images (one per class)


Experiment 2

Figure: Confusion matrix of experiment 1. A "1" indicates that the closest image to the i-th concept matches the correct image (same class).


Experiment 3

We visualize some pairs of classes, highlighting the associated concepts. All images belonging to both classes are visualized.


Experiment 3

Figure: Visualization of aviation and butterfly


Experiment 3

Figure: Visualization of cards and forest


Experiment 3

Figure: Visualization of cats and dogs


Class Distance Matrix (KL Divergence)

Distance matrix (KL) using PCA

Distance matrix (KL) using NMF-Asymmetric


Classes as p.d.f.

In this experiment we model each class visualization as a probability distribution function:
- We divide the visualization space into a grid of 10x10 cells.
- We count the number of images in each cell.
- We generate a vector with the probability of occurrence of images in each cell.
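The three steps above can be sketched with a 2-D histogram. The point coordinates and the bounds of the visualization space are assumptions for this example, and `class_distribution` is a hypothetical helper name.

```python
import numpy as np

def class_distribution(points, bounds, grid=10):
    """Fraction of a class's images falling in each cell of a grid x grid partition."""
    (xmin, xmax), (ymin, ymax) = bounds
    counts, _, _ = np.histogram2d(points[:, 0], points[:, 1], bins=grid,
                                  range=[[xmin, xmax], [ymin, ymax]])
    # Flatten the grid into one vector and normalize the counts to probabilities.
    return counts.ravel() / counts.sum()

rng = np.random.default_rng(0)
points = rng.random((100, 2))   # hypothetical 2-D positions of the images of one class
p = class_distribution(points, bounds=((0.0, 1.0), (0.0, 1.0)))
```

Using the same bounds for every class keeps the cells aligned, so the resulting vectors can be compared cell by cell.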


Histogram Intersection

Then, we calculate the intersection between each pair of histograms:

$$\mathrm{Int}(h_i, h_j) = \sum_{k=1}^{n} \min\left(h_i(k), h_j(k)\right)$$

Now, we build a histogram intersection matrix, which tells us how close the classes are to each other.
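A small sketch of the intersection matrix, using three made-up normalized histograms; `intersection_matrix` is a hypothetical helper name.

```python
import numpy as np

def intersection_matrix(hists):
    """hists holds one normalized histogram per row; returns all pairwise intersections."""
    return np.minimum(hists[:, None, :], hists[None, :, :]).sum(axis=2)

# Toy histograms over three cells: classes 0 and 2 coincide, class 1 overlaps them by half.
hists = np.array([[0.5, 0.5, 0.0],
                  [0.0, 0.5, 0.5],
                  [0.5, 0.5, 0.0]])
M = intersection_matrix(hists)
```

For normalized histograms the score lies between 0 (disjoint) and 1 (identical), so in this toy case `M[0, 2]` is 1.0 and `M[0, 1]` is 0.5.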


Graph

Figure: Graph of the intersection matrix. Edges are drawn when the intersection score is higher than 0.5.


Convex Combination of X

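We use a convex combination of $X_v$ and $X_t$ to see the impact of each component on the multimodal visualization:

$$\begin{bmatrix} (1 - \alpha) X_v \\ \alpha X_t \end{bmatrix} = \begin{bmatrix} (1 - \alpha) W_v \\ \alpha W_t \end{bmatrix} H$$

where $\alpha$ ranges from 0 to 1.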
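The weighting can be sketched as a scaled stack of the two modality matrices before factorization. The helper name and the all-ones toy matrices are assumptions made for illustration.

```python
import numpy as np

def weighted_multimodal_matrix(Xv, Xt, alpha):
    """Stack (1 - alpha) * Xv on top of alpha * Xt, for alpha in [0, 1]."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return np.vstack([(1.0 - alpha) * Xv, alpha * Xt])

Xv = np.ones((4, 5))   # toy visual matrix: 4 visual features, 5 images
Xt = np.ones((2, 5))   # toy text matrix: 2 terms, 5 images
X_visual_heavy = weighted_multimodal_matrix(Xv, Xt, alpha=0.1)
X_text_heavy = weighted_multimodal_matrix(Xv, Xt, alpha=0.9)
```

A small $\alpha$ lets the visual rows dominate the reconstruction error, while a large $\alpha$ shifts the weight to the text rows.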


Results Convex Combination (α = 0.1)

Figure: Visualization for r = 0.5 and α = 0.1


Results Convex Combination (α = 0.1)

Figure: Graph for r = 0.5 and α = 0.1


Results of Convex Combination (α = 0.9)

Figure: Visualization for r = 0.5 and α = 0.1


Results of Convex Combination (α = 0.9)

Figure: Graph for r = 0.5 and α = 0.1


Results of Convex Combination (α = 0.5) and Normalizing Xv

Figure: Visualization for r = 0.5 and α = 0.1 when Xv is normalized


Conclusion
- This paper presented a first step towards the construction of a semantic image exploration system that allows users to understand the distribution of images in the collection.
- We used Non-negative Matrix Factorization to build a latent space for multimodal data, in which images and text terms can be represented together.
- We performed a qualitative evaluation of the resulting collection visualizations.
- To study the full potential of this approach, a more systematic evaluation will be required, involving quantitative measures and interactions with real users.


References

[1] Lee, D. D., and Seung, H. S. Algorithms for nonnegative matrix factorization. Advances in Neural Information Processing Systems 13 (2001), 556–562.
[2] Pitkanen, M. J., Zhou, X., [...], and Muller, H. Using the grid for enhancing the performance of a medical image search engine. In 21st IEEE International Symposium on Computer-Based Medical Systems, CBMS '08 (2008), pp. 367–372.
