An Integrated Recommendation, Browsing and Search Interface Using Tags

Christian Wartena, Rogier Brussee, Martin Wibbels, Ynze van Houten

Novay, Enschede, The Netherlands

E-mail: [email protected]

Abstract: Tagging with free form tags is becoming an increasingly important indexing mechanism. However, tags have characteristics that require special treatment when used for searching or recommendation because they show much more variation than controlled keywords. In this paper we present a method that puts this large variation to good use. We introduce second order co-occurrence and a related distance measure for tag similarities that is robust against the variation in tags. From this distance measure between tags it is straightforward to derive methods to analyze user interest, compute recommendations and search in a tagged collection in a unified way. We present an interface in which recommendation, browsing and searching in a tagged collection based on the proposed techniques are integrated according to principles of information foraging theory.

Keywords: Tagging, Browsing, Recommendation, User Interfaces

1 INTRODUCTION

The European FP7 project MyMedia is developing techniques to help users deal with the overwhelming and ever growing amount of content that is available on digital channels. Personalization and recommendation are the key technologies pursued in the project. In this paper we present a method for helping users navigate through huge collections by exploiting the possibilities of collaborative tagging. The focus of the paper is not on the tagging process but on the possibilities to use tags for search and recommendation and on the way these techniques can be integrated in a user interface.

Tagging is becoming an increasingly important tool for people to organize their information in collections of various types. For example, it allows people to bookmark the items they are interested in and to organize them into various topic sets. Once sufficiently many items are tagged, the tags can also be used to search for items on a certain topic. Since tags are associated with both items and users, tags can also be used for generating personalized recommendations. However, unlike keywords or subject headings assigned by information professionals, tags usually lack any form of explicit organization and normalisation. Therefore search and recommendation need to be adapted to these characteristics of tagging systems.

In this paper we present a theoretical base for treating tags that can be used for searching and recommendation in tagged collections. After introducing this theoretical base we present a user interface that combines recommendation, browsing along a personalized structure and searching in one interface. Thus this interface provides an example of breaking down the traditional borders between browsing, receiving recommendations and searching. We believe that the proposed mixture of techniques serves the diffuse information need of people exploring large collections of multimedia items better than offering each of the techniques separately.

The organisation of this paper is as follows. In section 2 we discuss related work. In sections 3 and 4 we discuss the statistical techniques used to relate tags, users and items, and our approach to user interests based on clustering related tags. In section 5 we discuss the user interface that was created to facilitate tag based searching, browsing and recommendation. Finally we draw some conclusions.

2 RELATED WORK

There is a growing literature on tagging and the use of tagging in search and retrieval. The user interface described in this paper covers several of these aspects. The base of the system is a similarity or distance measure between tags. Other aspects are the lay-out of tag clouds and user profiling and recommendation of unseen items using tags.

2.1 Tag classification and similarity Measures

Hotho et al. [15] consider a weighting scheme for tags based on the same (user, tag, resource) co-occurrence data we use in this paper. However, inspired by PageRank, they use the stationary distribution of an associated Markov chain on the weighted bipartite graph of tags and resources (FolkRank). The ternary (user, tag, resource) structure is also used by Clements et al. [6] to detect synonyms. Their main point is that synonyms are seldom used as tags by the same people, so that for synonymous terms the item distributions of the tags should have a positive correlation and the user distributions of the tags a negative correlation. We use the same LibraryThing data collection that they do.

The work by Begelman et al. [1] stresses the importance of browsing tags and the use of clustering in that context. Their methods for finding similar tags and clusterings are completely different from ours. They are based on finding cut-offs for the frequency of co-occurring tags based on the shape of the co-occurrence distribution and using a spectral clustering algorithm on the remaining relatively sparse graph.

The similarity measure that seems most closely related to ours can be found in Cattuto et al. [5]. They introduce several similarity measures based on (user, tag, resource) triple data, one of which, tag context similarity, is similar to ours. However, this measure is based on co-occurrence of tags in posts, where a post is a set of tags that is added in one bookmarking step by a user. Moreover, to define similarity they use a cosine distance rather than the information theoretic distance that we use. The resource context similarity they define uses the same data we use, but their measure is a first order co-occurrence measure whereas we use second order co-occurrence. They find that the tag context measure (also a second order co-occurrence measure) is well suited to find semantically closely related or synonymous terms in the global context of all tagged data.

In a different direction, Bischoff et al. [3] classify tags based on the apparent intended usage of the tags by the tagger. They find that, depending on the source of the tags, only about half of the tags can be considered topic related. For the music related site they studied, roughly half of the tags can be related to the genre of the music.

Corresponding author: Christian Wartena, Novay, P.O. Box 589, 7500 AN Enschede, The Netherlands, +31-53-4850355, [email protected]

2.2 Tag Based Recommendation

Automatic content recommendation has already become a mature field of academic study. We refer to [10] for an overview. A number of standard algorithms have evolved, most of which are based on implicit or explicit feedback from users on items, usually using some form of item rating. As large collaboratively tagged data collections have only become available recently, there are no standard techniques for tag based recommendation yet.

We find two methods to use tags for recommendation in the literature. First, tag-aware recommender systems are based on user feedback, but they use tags to compute additional user-user or item-item similarities in order to improve the results of collaborative filtering techniques. A representative example of this category is [23]. With the second approach, which we will follow in this paper, recommendation is completely based on tags.

Hung et al. [18] recommend items with a set of tags that is most similar to the set of tags used by a user. As in our approach, this similarity is based on a tag-tag similarity. As a base for this similarity, they take co-occurrence of tags in the user-tag matrix. To compute the similarity between a user and an item, they determine the most similar user tag for each item tag. The similarity between the item and the user is defined as the sum over all item tags of these similarities.

Firan et al. [8] discuss several variants. In a first group of methods they use collaborative filtering on the user-tag matrix to find a new set of tags for a user. This set is compared with the tags from an item to compute a user-item similarity. Again the most similar items are recommended. The similarity measure is defined by the cosine between the tag vectors. The second group of methods is similar, but directly uses the tags of a user, skipping the recommendation of tags. They evaluate their algorithms in a user experiment with Last.fm data. They find that the first group of algorithms performs significantly worse than the baseline collaborative filtering algorithm based on ratings. The second group of algorithms clearly outperforms the collaborative filtering.

In Jäschke et al. [19], FolkRank is used for recommendation in combination with collaborative filtering.

2.3 Tag Cloud Lay-Out

Tag clouds are a popular way of displaying tags on web sites. Usually tags are ordered alphabetically, at random or by frequency. One of the first studies that considers using tag similarities for ordering tags in tag clouds is [14]. Here clustering is used to find similar tags that are displayed on the same line. From a more general perspective the problem of tag cloud layout using tag similarities is an instance of multidimensional scaling [28]. However, we are not aware of any applications of this technique to tag cloud layout.

A number of user studies have also been published on the usefulness of tag clouds [22], [21], [12]. These studies show above all that font size and position of words have a clear influence on the time users need to find a tag. However, none of these studies compares a semantically motivated grouping of tags as proposed by [14] with other orderings. Even if this intelligent kind of ordering is not used, it turns out that tag clouds can be helpful for tasks like browsing or giving a visual summarization. Nonetheless, for a task like searching for a known item, an alphabetically ordered list is much more efficient.

3 TAG DISTRIBUTIONS

One of the main characteristics of collaborative tagging systems is that users are not restricted in their choice of tags. Most users experience this as a feature that makes the use of the system easy. At the same time, this freedom is the main bottleneck for using tags in retrieval tasks. Different tags might have been used for the same concept, which makes it difficult to find all items relevant for a certain tag. To overcome this problem we have to find the tags that are conceptually related. A natural approach to do this automatically is to use co-occurrence patterns of tags to determine tag similarity. The underlying idea here, the so called distributional hypothesis [9], [13], is that tags that have often been associated with the same items (or, with some caveats, have been given by the same people) are likely to be semantically related (see also [5], [25] for experimental evidence in the tagging domain). For example, if many people associate the tags zoo and lion with the same picture, one might conclude that lions and zoos have something to do with each other. Of course people may tag the same picture with tags that only make sense in their private context, for example birthday for a picture of a lion in the zoo if the picture happened to be taken at a birthday party. However, such co-occurrences will be less frequent if many people tagged the same item, and they will yield a less pronounced pattern because people have birthday parties at many different places.

In [26] and [25] we have presented an approach that uses second order co-occurrence of tags and shown that it gives better results for tag similarity than direct co-occurrence. Whereas first order co-occurrence only looks at the co-occurrence of one tag with one other tag, second order co-occurrence considers the co-occurrence of one tag with all other tags. This whole pattern of co-occurrences is more stable and more informative than a single co-occurrence.

In the remainder of this section we will make the above informal description more precise. Readers not interested in the technical elaboration of these ideas can continue reading with the next subsection.

Technically the above can be understood in the following way: for a fixed tag z we count all tag co-occurrences, i.e. for each tag t, the number of times t is given to an item that also carries the tag z. Normalising by the total number of tags, this gives a distribution over tags, which we call the co-occurrence distribution. The intuitive notion of semantic similarity of tags can now be operationalized as the similarity between the co-occurrence distributions. Note that two tags can have similar co-occurrence distributions while their mutual co-occurrence is actually very low. In [11] the observation is made that this is in fact typical for synonyms in texts. Our results indicate that this observation holds for tags as well.

To make things even more formal, let q(t|d) be the tag distribution of item d, and Q(d|z) be the item distribution of tag z. These are probability distributions that describe how tag occurrences of a given item d are distributed over different tags, and symmetrically how the occurrences of a given tag z are distributed over different items. Now define the co-occurrence distribution of a tag z as:

\[ p_z(t) = \sum_d q(t \mid d)\, Q(d \mid z) \]
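As an illustration of this definition, the following minimal sketch computes co-occurrence distributions from raw tagging events. It assumes the data is available as (item, tag) pairs, one per tagging event; the function name `cooccurrence_distributions` and the toy data are hypothetical and not taken from the system described here.

```python
from collections import Counter, defaultdict

def cooccurrence_distributions(assignments):
    """assignments: iterable of (item, tag) pairs, one per tagging event.

    Returns {z: {t: p_z(t)}} with p_z(t) = sum_d q(t|d) * Q(d|z).
    """
    item_tags = defaultdict(Counter)   # how often tag t was given to item d
    tag_items = defaultdict(Counter)   # the same counts, indexed by tag
    for item, tag in assignments:
        item_tags[item][tag] += 1
        tag_items[tag][item] += 1

    # q(t|d): tag distribution of item d
    q = {}
    for d, counts in item_tags.items():
        total = sum(counts.values())
        q[d] = {t: n / total for t, n in counts.items()}

    p = {}
    for z, items in tag_items.items():
        total = sum(items.values())
        dist = defaultdict(float)
        for d, n in items.items():
            weight = n / total                 # Q(d|z): item distribution of tag z
            for t, q_td in q[d].items():
                dist[t] += q_td * weight
        p[z] = dict(dist)
    return p

# Hypothetical toy data: two pictures tagged by several people.
assignments = [("pic1", "zoo"), ("pic1", "lion"), ("pic1", "zoo"),
               ("pic2", "lion"), ("pic2", "africa")]
p = cooccurrence_distributions(assignments)
print(p["lion"])
```

Each resulting distribution sums to one, and it is sparse: only tags that share at least one item with z receive a non-zero probability.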

The co-occurrence distribution is in fact the weighted average of the tag distributions of the items, where the weight is the relevance of d for z given by the probability Q(d|z). To define similarity between tag distributions we use the Jensen Shannon divergence of the distributions. This is an information theoretic measure for two probability distributions, defined as follows:

\[ \mathrm{JSD}(p, q) = \tfrac{1}{2} D(p \,\|\, m) + \tfrac{1}{2} D(q \,\|\, m) \]

where

\[ m = \tfrac{1}{2} (p + q) \]

is the mean distribution of p and q and where D(p||q) is the Kullback Leibler divergence, defined by

\[ D(p \,\|\, q) = \sum_t p(t) \log\left( \frac{p(t)}{q(t)} \right) \]

The Kullback Leibler divergence has a nice interpretation as the average number of bits per tag saved by using an optimal compression scheme that takes into account the actual distribution of tags p rather than some assumed distribution q. In our setting the distributions are usually very sparse. For efficient computation with sparse distributions it is convenient to rewrite the Jensen Shannon divergence as

\[ \mathrm{JSD}(p, q) = \log 2 + \frac{1}{2} \sum_{t:\, p(t) \neq 0 \,\wedge\, q(t) \neq 0} \left( p(t) \log\left( \frac{p(t)}{p(t) + q(t)} \right) + q(t) \log\left( \frac{q(t)}{p(t) + q(t)} \right) \right) \]
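For readers who want to check the rewritten form, it follows from substituting the mean distribution m = ½(p + q) into the definition. A short derivation, using only the definitions above and the usual convention 0 log 0 = 0:

\[
\begin{aligned}
\mathrm{JSD}(p, q)
 &= \tfrac{1}{2} \sum_t p(t) \log\left( \frac{2\,p(t)}{p(t) + q(t)} \right)
  + \tfrac{1}{2} \sum_t q(t) \log\left( \frac{2\,q(t)}{p(t) + q(t)} \right) \\
 &= \tfrac{1}{2} \Big( \sum_t p(t) + \sum_t q(t) \Big) \log 2
  + \tfrac{1}{2} \sum_t \left( p(t) \log\left( \frac{p(t)}{p(t) + q(t)} \right)
  + q(t) \log\left( \frac{q(t)}{p(t) + q(t)} \right) \right) \\
 &= \log 2 + \tfrac{1}{2} \sum_t \left( p(t) \log\left( \frac{p(t)}{p(t) + q(t)} \right)
  + q(t) \log\left( \frac{q(t)}{p(t) + q(t)} \right) \right)
\end{aligned}
\]

If q(t) = 0, the first term in the sum is p(t) log 1 = 0 and the second term vanishes, and likewise if p(t) = 0. Hence only tags for which both probabilities are non-zero contribute, which is what makes the form attractive for sparse distributions.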

In [7] it is shown that the square root of the Jensen Shannon divergence gives a metric distance measure. In particular, the triangle inequality holds for this measure. This is an important property that we need for an efficient implementation.
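The sparse form of the Jensen Shannon divergence can be sketched in a few lines. The sketch below assumes that a distribution is stored as a dictionary from tags to non-zero probabilities and that natural logarithms are used; the function names `jsd` and `js_distance` are hypothetical.

```python
import math

def jsd(p, q):
    """Jensen Shannon divergence of two sparse distributions stored as dicts."""
    total = math.log(2)
    # Only tags with non-zero probability in both distributions contribute.
    for t in p.keys() & q.keys():
        pt, qt = p[t], q[t]
        if pt > 0 and qt > 0:
            total += 0.5 * (pt * math.log(pt / (pt + qt))
                            + qt * math.log(qt / (pt + qt)))
    return max(total, 0.0)   # guard against tiny negative rounding errors

def js_distance(p, q):
    """Square root of the divergence, which is a metric according to [7]."""
    return math.sqrt(jsd(p, q))

# Hypothetical toy distributions.
a = {"zoo": 0.5, "lion": 0.5}
b = {"lion": 0.5, "africa": 0.5}
c = {"history": 1.0}
print(js_distance(a, b), js_distance(a, c))
```

With natural logarithms the divergence lies between 0 (identical distributions) and log 2 (distributions without any common tag), so the loop only has to visit the intersection of the two supports.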

3.1 Searching with Tags

The divergence of co-occurrence distributions not only gives us an interesting and stable similarity measure between tags; it also allows us to compute similarities between tags and items or between tags and users. In fact, since we represent a tag by its co-occurrence distribution, which is a distribution over tags, we can compare tags with items by representing an item as the distribution of the tags that have been assigned to the item. Likewise we compare tags to users by representing a user as the distribution of tags they have assigned. Since we now have a uniform representation as a tag distribution we can also compute divergences of these distributions with respect to each other.

Using the (co-occurrence) distribution and the divergence between distributions solves a number of problems for search in tagged collections. Consider the following example. We search for items relevant for the tag British history. If we search for items with the tag British history we will miss items that are highly relevant but tagged with e.g. English history, history of Great Britain, Medieval England etc. On the other hand we will find items in a high ranked position that have the tag British history, but that in fact have a different subject or are only vaguely related to British history. If we search using the co-occurrence distribution we are in fact making a kind of query expansion in which every term gets a weight. Thus we improve recall because we will also find items that are not tagged with British history but with related tags. Furthermore, we look at the divergence between the distribution of tags of an item and the co-occurrence distribution of the queried tag, which gives a better match if an item has many tags related to British history. Thus we increase precision as well, because items that are tagged not only with British history but additionally with many related tags will be ranked higher, while items with a large number of tags irrelevant to British history will have a larger divergence and thus be ranked lower.

Regrettably, searching the whole collection for relevant items now becomes a very costly task: searching nearest neighbours in a metric space. Given the approach sketched above we cannot use a simple index to find items with a certain tag. Moreover the computation of a divergence is relatively costly. For efficient search for the k nearest neighbours of a query we use the vantage point tree search algorithm [27]. It requires a true distance measure for which the triangle inequality holds.

Another nice property of the representation of a tag by a distribution over tags is that the (weighted) average of a set of tags can be defined in a natural way as the average of their co-occurrence distributions. This gives us the possibility to compute the centre of a tag cloud and search for items close to its centre. Finally, note that the co-occurrence distribution gives us an alternative representation for a user that has assigned tags or for a tagged item: we can not only represent them directly by the distribution of tags they have given or the tags that are associated with them, but also by the average co-occurrence distribution of these tags.
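To make the role of the triangle inequality concrete, the following is a minimal sketch of a vantage point tree with k nearest neighbour search. It is an illustration under stated assumptions, not the implementation used in our system: the class `VPTree`, the helper `js_distance`, the random choice of vantage points and the toy items are all hypothetical, and labels are assumed to be unique strings.

```python
import heapq
import math
import random

def js_distance(p, q):
    """Square root of the Jensen Shannon divergence for dict distributions."""
    s = math.log(2)
    for t in p.keys() & q.keys():
        s += 0.5 * (p[t] * math.log(p[t] / (p[t] + q[t]))
                    + q[t] * math.log(q[t] / (p[t] + q[t])))
    return math.sqrt(max(s, 0.0))

class VPTree:
    """Vantage point tree over (label, point) pairs for an arbitrary metric."""

    def __init__(self, entries, dist=js_distance, seed=0):
        self.dist = dist
        self.root = self._build(list(entries), random.Random(seed))

    def _build(self, entries, rng):
        if not entries:
            return None
        vp = entries.pop(rng.randrange(len(entries)))   # random vantage point
        if not entries:
            return (vp, 0.0, None, None)
        ds = [self.dist(vp[1], e[1]) for e in entries]
        mu = sorted(ds)[len(ds) // 2]                   # median distance to vp
        inner = [e for e, d in zip(entries, ds) if d < mu]
        outer = [e for e, d in zip(entries, ds) if d >= mu]
        return (vp, mu, self._build(inner, rng), self._build(outer, rng))

    def nearest(self, query, k):
        """Return the k nearest entries as sorted (distance, label) pairs, k >= 1."""
        best = []   # max-heap on distance, stored as (-distance, label)

        def radius():
            return -best[0][0] if len(best) == k else math.inf

        def visit(node):
            if node is None:
                return
            vp, mu, inner, outer = node
            d = self.dist(query, vp[1])
            if len(best) < k:
                heapq.heappush(best, (-d, vp[0]))
            elif d < radius():
                heapq.heapreplace(best, (-d, vp[0]))
            # The triangle inequality tells us which side can be skipped.
            if d < mu:
                visit(inner)
                if d + radius() >= mu:
                    visit(outer)
            else:
                visit(outer)
                if d - radius() <= mu:
                    visit(inner)

        visit(self.root)
        return sorted((-neg_d, label) for neg_d, label in best)

# Hypothetical items, each represented by its tag distribution.
items = [
    ("book_a", {"british history": 0.6, "england": 0.4}),
    ("book_b", {"english history": 0.5, "medieval england": 0.5}),
    ("book_c", {"cooking": 1.0}),
    ("book_d", {"british history": 0.2, "cooking": 0.8}),
]
# Hypothetical co-occurrence distribution of the query tag "british history".
query = {"british history": 0.5, "english history": 0.2,
         "england": 0.2, "medieval england": 0.1}
tree = VPTree(items)
print(tree.nearest(query, k=2))
```

In the toy example an item that does not carry the query tag at all can still be ranked above an item that carries it together with mostly unrelated tags, which is the query expansion effect described above.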

4 USER INTERESTS

Users assign tags to items. This tells us something about these items as well as the users. Tags that someone frequently uses might reflect some of his interests in the context of the collection under consideration. These interests could be used to recommend new items to the user. Therefore we need a way to distil the user’s interests from the set of tags that he has used. The individual tags might be too detailed to represent interests, since people tend to use dozens or hundreds of different tags. On the other hand, the overall weighted average of all user tags blurs all topics into one uniform grey mixture. Clustering tags and finding the user’s distribution over clusters seems a natural way to go.

4.1 Tag clusters

For clustering we used a straightforward agglomerative hierarchical clustering algorithm. Initially each tag is a cluster. Subsequently, in each step two clusters are merged until a stopping criterion is fulfilled. As a stopping criterion we require the number of clusters to be equal to the square root of the number of tags. More advanced criteria, like optimization of the Dunn index or the Calinski Harabasz index, turned out not to give useful results. To select the clusters that are merged in each step we determine the pair of clusters for which the sum of squares of all distances between their elements is minimal. It is easy to see that this criterion guarantees that at each step the merger is chosen that yields the best Calinski Harabasz index [4]. Since the sum of squares of distances tends to be larger when the number of elements increases, this method has a tendency to find more or less equally sized clusters.

We apply the clustering to the set of tags of each user. So there is no overall clustering of tags, but the topic clusters are determined individually. Thus, e.g. the tag British history for one user might end up in a cluster about England, while for another user it belongs to a cluster on general history or to a cluster on the history of London. While the tags involved in the clustering are only those used by a given user, we use for each tag the co-occurrence distribution that is obtained by taking into account the whole collection. The set of items tagged by one user is generally too small for reliable statistics.
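The clustering procedure can be sketched as follows. This is a minimal illustration under two assumptions that the description leaves open: the merge criterion is read as the sum of squared distances over all pairs with one element in each cluster, and the square root of the number of tags is rounded to the nearest integer. The function name `cluster_tags` is hypothetical, and the toy example uses positions on a line as a stand-in for the distance between co-occurrence distributions.

```python
import math
from itertools import combinations

def cluster_tags(tags, dist):
    """Agglomerative clustering of one user's tags.

    tags: list of unique tag names; dist(a, b): distance between two tags.
    """
    target = max(1, round(math.sqrt(len(tags))))   # stopping criterion
    clusters = [[t] for t in tags]                 # initially each tag is a cluster
    while len(clusters) > target:
        def cost(pair):
            a, b = clusters[pair[0]], clusters[pair[1]]
            return sum(dist(x, y) ** 2 for x in a for y in b)
        # Merge the pair of clusters with the smallest sum of squared distances.
        i, j = min(combinations(range(len(clusters)), 2), key=cost)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]                            # safe because i < j
    return clusters

# Hypothetical toy data: nine tags placed on a line.
pos = {"england": 0.0, "britain": 0.1, "london": 0.2,
       "cooking": 5.0, "recipes": 5.1, "baking": 5.2,
       "lion": 10.0, "zoo": 10.1, "africa": 10.2}
clusters = cluster_tags(list(pos), lambda a, b: abs(pos[a] - pos[b]))
print(clusters)
```

Because the cost of a merger is a sum over all cross pairs, it grows with the sizes of both clusters, which illustrates the tendency towards equally sized clusters noted above. The sketch recomputes all pair costs in every step and is therefore only suitable for the small tag sets of individual users.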

4.2 Recommendation

Once we have computed the tag clusters for a user we determine the centre of each cluster by computing the weighted average of the co-occurrence distributions of the tags. As weight for averaging we take the number of times the user used the tag. The average distribution is used to search for the most similar items in the database. The results from each cluster are combined to produce a final recommendation.

This approach for generating recommendations is in some aspects similar to the one from Firan et al. [8]. We also compute the similarity between the vector of tags on the item and the vector of user tags to find a set of items most interesting for the user. However, we first separate the set of tags into clusters representing different interests. The second major difference is that we expand our query vector by taking the average co-occurrence distribution of the tags in the cluster rather than the tags themselves. This is in some sense similar to the approach in the first group of algorithms in [8]. Apart from the different method for expanding the vector, the essential difference is that we use the item-tag matrix for expansion, while Firan et al. use the user-tag matrix. As discussed above, the item-tag matrix is much better suited to find synonyms, while the user-tag matrix is probably better suited to find cross links between various interests.
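The recommendation step can be sketched as follows. The names `cluster_centre` and `recommend` are hypothetical, and two details are assumptions made only for this illustration: the results of the clusters are combined by concatenating the top-n unseen items of each cluster without duplicates, and the toy example uses the total variation distance as a short stand-in for the square root of the Jensen Shannon divergence.

```python
from collections import defaultdict

def cluster_centre(cluster, tag_counts, cooc):
    """Weighted average of the co-occurrence distributions of a cluster's tags.

    tag_counts[t]: number of times the user used tag t (the weight);
    cooc[t]: co-occurrence distribution of tag t, stored as a dict.
    """
    total = sum(tag_counts[t] for t in cluster)
    centre = defaultdict(float)
    for t in cluster:
        weight = tag_counts[t] / total
        for u, prob in cooc[t].items():
            centre[u] += weight * prob
    return dict(centre)

def recommend(clusters, tag_counts, cooc, item_dists, seen, dist, n=2):
    """Top-n items per cluster that the user has not bookmarked yet."""
    result = []
    for cluster in clusters:
        centre = cluster_centre(cluster, tag_counts, cooc)
        ranked = sorted((dist(centre, d), item) for item, d in item_dists.items()
                        if item not in seen and item not in result)
        result.extend(item for _, item in ranked[:n])
    return result

def total_variation(p, q):
    """Stand-in distance for the example; not the measure used in the paper."""
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in p.keys() | q.keys())

# Hypothetical toy data.
cooc = {"lion": {"lion": 0.5, "zoo": 0.5},
        "zoo": {"zoo": 0.6, "lion": 0.4},
        "castle": {"castle": 0.7, "history": 0.3}}
tag_counts = {"lion": 3, "zoo": 1, "castle": 2}
clusters = [["lion", "zoo"], ["castle"]]
item_dists = {"doc_zoo": {"zoo": 1.0},
              "doc_castle": {"castle": 0.5, "history": 0.5},
              "doc_seen": {"lion": 1.0}}
recommended = recommend(clusters, tag_counts, cooc, item_dists,
                        seen={"doc_seen"}, dist=total_variation, n=1)
print(recommended)
```

For a large collection the sorted scan over all items would be replaced by a nearest neighbour search in a metric index, as discussed in section 3.1.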

5 USER INTERFACE

In the previous sections we have presented a convenient representation of tags in their context, their so called co-occurrence distributions. We have shown a number of techniques that follow naturally once we have this representation, such as searching, clustering or computing averages of a set of tags. In this section we show how these various techniques can be integrated in one user interface using concepts from information foraging theory.

5.1 Information Foraging

The user interface supports forms of interaction as described in Information Foraging Theory (IFT) [20]. IFT is a psychological theory explaining how people interact with their information environment. The theory also provides useful tools for design and evaluation of information environments [24]. IFT states that people search for information by minimising their effort in much the same way that animals forage for food. The forager is constantly adapting the decision-making process for the direction to follow, preferring information-seeking strategies that yield the most useful information per unit cost. For a searching person, the information environment has a heterogeneous structure with pieces of information being offered together in patches. Within a patch, a person can decide to forage the patch or switch to another patch. People make navigational decisions guided by scent. When there is a match between (associations with) easily recognisable elements in the information environment and (associations with) the user’s goals or interests, the elements give off scent. Scent can be found within an information source (e.g. a book) as well as in links and metadata that refer to that source (e.g. a tag).

In the user interface presented here, the patches are groups of items that are clustered on the basis of being relevant to the same (set of) tags. This is also how they are represented in the interface: tags are links providing access to the item groups. The tag can carry more or less scent for an individual user, depending on how strongly the tag’s meaning is associated with the user’s goals or interests and how visible it is. The user will follow the tag/link with the highest scent. At the link’s destination, the user will try to detect scent in the available items (within-patch browsing). At any state of interaction, the user is provided with related tags that may tempt the user to explore other patches (between-patches browsing). As such, the user interface supports the “natural” human search behaviour described in IFT, meaning that the user is able to constantly adapt decision-making and direction based on perceived scent in the information environment.

5.2 Dataset

To test our ideas we use two different datasets. The first is a subset of the data from LibraryThing, a web service for managing book collections, compiled by Maarten Clements [16]. The second set is the tagged MovieLens dataset [17]. We did not try to build alternative interfaces for the websites these data stem from, but implemented just a few aspects concerning search and browsing that allow demonstrating the potential of our techniques.

5.3 Description of the Interface

In the interface the user will see two panes: a pane with items (book covers in the LibraryThing case, movie posters in the MovieLens case) along with their (most frequent) tags, and a pane with a tag cloud. An example is given in Figure 1.

Figure 1: Tag cloud and recommendations

Initially, the user will not see a cloud of all the tags he has used, but a reduced set of tags, each tag representing a cluster of tags, which is likely to correspond to an interest of the user. To select the tag representing a cluster we take the tag that is closest to the cluster centre. At the same time, in the item pane the user will see a list of items that he has not yet bookmarked and that best fit into these clusters. This list consists of the top-n recommendations for each cluster. Since we show results for each cluster, the list will have a nice variation of topics.

In the first screen the user can click on one of the tags representing a cluster. He will then move on to a screen in which all tags of the cluster are shown. The item pane will now only show recommendations for that cluster. Such a recommendation is still a personalised view because clusters are composed individually for each user with the tags he used. Thus the clusters provide a kind of personalized structure to use for browsing through the collection.

In each view, the user can also click on a tag, either in the tag cloud or in the list of tags from an item. As a result, a list of items most relevant to that tag will be presented. The corresponding tag cloud is the cloud of the most frequent tags found on the presented items. If the user clicks a tag he is in fact searching with a tag as query. The results he will get are no longer personalized but allow the user to get some insight into the way a tag is used in the collection.

5.4 Tag Cloud Layout

A cluster of tags is roughly associated with one topic. More precisely, the algorithm explained in section 4.1 is intended to cluster semantically related tags and is based on the heuristic that co-occurring tags are semantically related. Nevertheless, tag clouds can cover a range of subtopics. The tag cloud should help the user find interesting tags or topics that might lead him to items he wants to inspect. This is clearly not a search task. Therefore, it is probably not helpful if tags are displayed in alphabetical order (see e.g. [22]). We therefore try to support the user by displaying tags in such a way that related tags appear in the same region. To do so, we basically follow the approach of Hassan and Herrero [14], but using a different clustering algorithm. Hassan and Herrero do not explicitly explain the method used to order tags on a line and lines in the cloud, so our algorithm might slightly differ at that point.

For determining the sub clusters we use the single link algorithm with an additional upper bound on the maximum cluster size. As a result, each cluster can be displayed on a single line. To order tags on a line we use the following rules. We pick the two tags with the largest mutual distance (in the sense of section 3) from the cluster. These tags will be the left and right end points of the line. Next we take the tag closest to one of the end points and add that tag at that end. This process is repeated until all elements are added to one of both ends. To determine the order of the lines the same procedure is used, applied to the most central elements of each line. Finally, it has to be decided which of the end points is on the left and which one is on the right. For the first line a random choice is made. Once this choice is made, the orientation of the other lines is determined by comparing their end points to the left (or right) end of the first line. An example of a resulting tag cloud is given in Figure 2.

Figure 2: Tag cloud
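The rules for ordering the tags on one line can be sketched as follows. The description leaves some details open, so the sketch makes assumptions that are labelled as such: distances are always measured to the two original end points, tags are unique, and a newly added tag is placed next to the tags already attached to its end, so that the line grows from both ends towards the middle. The function name `order_line` and the toy positions are hypothetical.

```python
from itertools import combinations

def order_line(tags, dist):
    """Order the tags of one sub cluster on a line."""
    if len(tags) < 3:
        return list(tags)
    # The two tags with the largest mutual distance become the end points.
    left_end, right_end = max(combinations(tags, 2), key=lambda pair: dist(*pair))
    left, right = [left_end], [right_end]   # right is stored from the outside inwards
    remaining = [t for t in tags if t not in (left_end, right_end)]
    while remaining:
        # Pick the remaining tag that is closest to one of the two end points.
        tag, side = min(
            ((t, s) for t in remaining for s in ("L", "R")),
            key=lambda ts: dist(ts[0], left_end if ts[1] == "L" else right_end),
        )
        (left if side == "L" else right).append(tag)
        remaining.remove(tag)
    return left + right[::-1]

# Hypothetical toy data: tags placed on a line.
pos = {"a": 0, "b": 1, "c": 4, "d": 9, "e": 10}
line = order_line(["a", "b", "c", "d", "e"], lambda x, y: abs(pos[x] - pos[y]))
print(line)
```

Which end point ends up on the left depends on the order in which the tags are passed in; in the procedure described above this orientation is fixed afterwards by comparison with the first line.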

6 CONCLUSION

In the present paper we have presented co-occurrence distributions of tags as a representation of tags that can form a single base for several tasks related to search and recommendation. Divergences of these distributions can be interpreted as a second order co-occurrence measure and serve as a similarity measure between tags. Since users and items can also be characterized by a distribution of tags, we obtain a natural distance measure between tags and items and between users and items as well. To generate recommendations that fit the user’s interests and have enough variation, we cluster related tags for each user individually. Finally, we show how these concepts can be integrated in one user interface.

Future work has to deal with evaluation of the methods presented here. The MovieLens dataset provides nice possibilities to compare the results of our tag based recommendation with other (standard) recommendation techniques based on ratings. Furthermore, the user interface should be tested in a usability study.

Acknowledgements

We would like to thank Maarten Clements (Technical University Delft) for making available his set of LibraryThing data. The research leading to these results is part of the MyMedia project (http://www.mymediaproject.org) and has received funding from the European Community's Seventh Framework Program (FP7/2007-2011) under grant agreement n° 215006. The research was also coordinated within the PetaMedia network of excellence (http://www.petamedia.eu), FP7 grant agreement n° 216444.

References

[1] Begelman, G., Keller, P., Smadja, F. Automated Tag Clustering: Improving search and exploration in tag space. Collaborative Web Tagging Workshop at WWW2006, Edinburgh, Scotland.
[2] Benz, B., Tso, K.H.L., Schmidt-Thieme, L. Supporting collaborative hierarchical classification: bookmarks as an example. Computer Networks 51 (2007), p. 4574-4585.
[3] Bischoff, K., Firan, C., Nejdl, W., Paiu, R. Can All Tags Be Used for Search? CIKM’08.
[4] Calinski, T. and Harabasz, J. A dendrite method for cluster analysis. Communications in Statistics, vol. 3, pp. 1-27 (1974).
[5] Cattuto, C., Benz, D., Hotho, A., Stumme, G. Semantic Grounding of Tag Relatedness in Social Bookmarking Systems. In Sheth et al. (eds.), International Semantic Web Conference (ISWC 2008), LNCS 5318, p. 615-631. Springer, 2008.
[6] Clements, M., de Vries, A.P., Reinders, M.J.T. Detecting synonyms in social tagging systems to improve content retrieval. SIGIR 2008: 739-740.
[7] Endres, D.M. and Schindelin, J.E. A New Metric for Probability Distributions. IEEE Transactions on Information Theory, vol. 49, no. 7, July 2003.
[8] Firan, C., Nejdl, W., Paiu, R. The Benefit of Using Tag-Based Profiles. Proceedings of the 2007 Latin American Web Conference, Santiago, 2007.
[9] Firth, J.R. A synopsis of linguistic theory 1930-55. Studies in Linguistic Analysis (special issue of the Philological Society) 1952-59, 1-32 (1957).
[10] Adomavicius, G. and Tuzhilin, A. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Transactions on Knowledge and Data Engineering, 17(6):734-749, June 2005.
[11] Schütze, H. and Pedersen, J.O. A cooccurrence-based thesaurus and two applications to information retrieval. In Proceedings of RIAO Conference, pp. 266-274, 1994.
[12] Halvey, K. and Keane, M.T. An Assessment of Tag Presentation Techniques. Proc. WWW ’07, 1313-1314.
[13] Harris, Z.S. Mathematical Structures of Language. Wiley, New York (1968).
[14] Hassan-Montero, Y. and Herrero-Solana, V. Improving tag-clouds as visual information retrieval interfaces. Proc. InfoSciT2006.
[15] Hotho, A., Jäschke, R., Schmitz, C., Stumme, G. Information retrieval in folksonomies: Search and ranking. Lecture Notes in Computer Science 4011, p. 411-426. Springer (2006).
[16] http://ict.ewi.tudelft.nl/~maarten/LT
[17] http://www.grouplens.org/taxonomy/term/14
[18] Hung, C.C., Huang, Y.C., Hsu, J.Y., Wu, D.K.C. Tag-Based User Profiling for Social Media Recommendation. https://www.aaai.org/Papers/Workshops/2008/WS-08-06/WS08-06006.pdf
[19] Jäschke, R., Marinho, L., Hotho, A., Schmidt-Thieme, L., Stumme, G. Tag recommendations in folksonomies. Knowledge Discovery in Databases: PKDD 2007, Lecture Notes in Computer Science, volume 4702, p. 506-514. Springer (2007).
[20] Pirolli, P. and Card, S.K. Information foraging. Psychological Review, 106, 643-675 (1999).
[21] Rivadeneira, A.W., Gruen, D.M., Muller, M.J., Millen, D.R. Getting our head in the clouds: toward evaluation studies of tagclouds. Proc. CHI ’07, 995-998.
[22] Sinclair, J. and Cardew-Hall, M. The folksonomy tag cloud: when is it useful? Journal of Information Science, 6 (1), 15-23. Feb 1, 2008.
[23] Tso-Sutter, K.H.L., Marinho, L.B., Schmidt-Thieme, L. Tag-aware Recommender Systems by Fusion of Collaborative Filtering Algorithms. In Proc. of 23rd ACM Symposium on Applied Computing, pages 16-20, 2008.
[24] Van Houten, Y. Searching for Videos: the Structure of Video Interaction in the Framework of Information Foraging Theory. Dissertation, Telematica Instituut Fundamental Research Series, vol. 023. Enschede, the Netherlands: Telematica Instituut (2009).
[25] Wartena, C. and Brussee, R. Instance-Based Mapping between Thesauri and Folksonomies. International Semantic Web Conference 2008: 356-370.
[26] Wartena, C. and Brussee, R. Topic Detection by Clustering Keywords. DEXA Workshops 2008: 54-58.
[27] Yianilos, P.N. Data structures and algorithms for nearest neighbor search in general metric spaces. Proceedings of the Fourth Annual ACM-SIAM Symposium on Discrete Algorithms, p. 311-321, January 25-27, 1993, Austin, Texas, United States.
[28] Young, F.W. and Hamer, R.M. Multidimensional Scaling: History, Theory, and Applications. Lawrence Erlbaum Associates, Hillsdale, N.J., 1987.
