Building a Desktop Search Test-bed

Sergey Chernov1, Pavel Serdyukov2, Paul-Alexandru Chirita1, Gianluca Demartini1, and Wolfgang Nejdl1

1 L3S / University of Hannover, Appelstr. 9a, D-30167 Hannover, Germany
2 Database Group, University of Twente, PO Box 217, 7500 AE Enschede, The Netherlands
{chernov, chirita, demartini, nejdl}@l3s.de, [email protected]

Abstract. In recent years, several top-quality papers have utilized temporary Desktop data and/or browsing activity logs for experimental evaluation. Building a common testbed for the Personal Information Management community is thus becoming an indispensable task. In this paper we present a possible dataset design and discuss the means to create it.

1 Introduction

In recent years, several top-quality papers have utilized Desktop data and/or activity logs for experimental evaluation. For example, in [4], the authors used indexed Desktop resources (i.e., files, etc.) from 15 Microsoft employees of various professions, with about 80 queries selected from their previous searches. In [3], the Google search sessions of 10 computer science researchers were logged for 6 months to gather a set of realistic search queries. Similarly, several papers from Yahoo [2], Microsoft [1] and Google [5] presented approaches to mining their search engine logs for personalization. We want to provide a common, public, Desktop-specific dataset for this research community.

The most closely related dataset creation effort is the TREC-2006 Enterprise Track3. Enterprise search considers a user who searches the data of an organisation in order to complete some task. The most relevant analogy between Enterprise search and Desktop search is the variety of items which compose the collection (e.g., the TREC-2006 Enterprise Track collection contains e-mails, CVS logs, web pages, wiki pages, and personal home pages). The biggest difference between the two collections is the presence of personal documents and especially activity logs (e.g., resource read/write time stamps) within the Desktop dataset.

In this paper we present the approach we envision for generating such a Desktop dataset. We plan our new dataset to include activity logs containing the usage history of each file, email, or clipboard item. This dataset will provide a basis for designing and evaluating special-purpose retrieval algorithms for different Desktop search tasks.

2 Dataset Design

File Formats and Metadata. The data for the Desktop dataset will be collected among the participating research groups. We are going to store several file formats: TXT, HTML, PDF, DOC, XLS, PPT, MP3 (tags only), JPG, GIF, and BMP. Then, each group willing to test its system would submit 1-2 Desktop dumps, produced using logging tools for the applications listed in Table 1. The set of logged applications can be extended in the future. The loggers save the information described in Table 2.

3 http://www.ins.cwi.nl/projects/trec-ent/

Application            File Format
Acrobat Reader         PDF
MS Word                DOC, TXT, RTF
MS Excel               XLS
MS Powerpoint          PPT
MS Internet Explorer   HTML
MS Outlook             PST
Mozilla Firefox        HTML
Mozilla Thunderbird    MSF and empty extension (mbox format)

Table 1. Logged Applications and File Formats

Permanent information                              Applied to
URL                                                HTML
Author                                             All files
Recipients                                         Email messages
Metadata tags                                      MP3
Has/is attachment                                  Emails and attachments
Saved picture's URL and saving time                Graphic files

Timeline information                               Applied to
Time of being in focus                             All files
Time of being opened                               All files
Being edited                                       All files
History of moving/renaming                         All files
Request type: bookmark, clicked link, typed URL    HTML
Adding/editing an entry in calendar and tasks      Outlook Journal
Being printed                                      All files
Search queries in Google/MSN Search/Yahoo!/etc.    Search fields in Internet browsers
Clicked links                                      URL
Text selections from the clipboard                 Text pieces within a file and the filename
Bookmarking time                                   Bookmarks in Internet browsers
Chat client properties                             Status, contact's statuses, sent filenames and links
Running applications                               Task queue
IP address                                         User's address and addresses the user connects to
Email status                                       Change between "received" and "read" status

Table 2. Timeline and Permanent Logged Information

Data Gathering. As privacy is a crucial issue here, we propose two options for gathering the information.

1. Optimistic approach. We assume there are volunteers ready to contribute some of their personal information to the community, given that this information would be redistributed only among a restricted group of participants. As a test case, we gave two laptops to students for half a year. They were able to use them for free, under the condition that all the information on these laptops would be available for future research. They were also warned not to store highly private information like passwords or credit card numbers. As this approach worked well, we expect that all participating groups will find similarly reasonable incentives to attract more volunteers.

2. Pessimistic approach. While some people are ready to share information with their close friends and colleagues, they do not like to disclose it to outsiders. For this case, there is a way to keep the information available only to a small number of people: personal data is collected from the participating groups by a set of coordinators and preprocessed into a publicly documented uniform XML format (a possible record layout is sketched below). Every group can adapt its search prototypes to this format and submit binary files to the coordinators. Runs are then produced locally by a coordinator and the results are sent back to the participants. This way, only trusted coordinators have access to the actual documents, while all participants can still evaluate their results. A similar scheme has been tested in the TREC Spam Track4, and it might be a necessary solution for future TREC tracks as well, whenever they involve highly private data (e.g., medical, legal, etc.).
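For concreteness, a minimal sketch of what one record in such a uniform XML format could look like is given below. The element names and the example file are hypothetical; they merely illustrate how the permanent and timeline metadata from Table 2 could be attached to a Desktop item, since the concrete schema is not fixed in this paper.

# Hypothetical sketch of a uniform XML record for a logged Desktop item.
# The schema (item/path/format/author/activitylog/event) is an assumption
# made for illustration only; the real format would be agreed on by the
# participating groups.
import xml.etree.ElementTree as ET

def desktop_item_to_xml(path, file_format, author, events):
    """events: list of (timestamp, action) pairs, e.g. ("2006-06-12T10:03", "opened")."""
    item = ET.Element("item")
    ET.SubElement(item, "path").text = path
    ET.SubElement(item, "format").text = file_format
    ET.SubElement(item, "author").text = author      # permanent metadata (cf. Table 2)
    log = ET.SubElement(item, "activitylog")          # timeline metadata (cf. Table 2)
    for timestamp, action in events:
        ET.SubElement(log, "event", action=action, time=timestamp)
    return ET.tostring(item, encoding="unicode")

if __name__ == "__main__":
    print(desktop_item_to_xml(
        "reports/project_deliverable.doc", "DOC", "Jane Doe",
        [("2006-06-12T10:03", "opened"), ("2006-06-12T10:41", "edited")]))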

3 Relevance Assessments and Evaluation

As we are aiming at real-world tasks and data, we want to reuse real queries from Desktop users. Since every Desktop is a unique set of information, its user should be involved in both query development and relevance assessment. Thus, Desktop contributors should be ready to give 10 queries selected from their everyday tasks. This also solves the problem of subjective query evaluation, since users know their information needs best. In this setting queries are designed for the collection of a single user, but some more general scenarios can be designed as well, for example finding relevant documents in every considered Desktop. It is thus possible to see the test collection as partitioned into sub-collections that represent single Desktops with their own queries and relevance assessments. This solution would be closely related to the MrX collection used in the TREC Spam Track, which is formed by the set of emails of an unknown person. A query can have the following format:

– <num> KIS01 </num>
– <query> Eleonet project deliverable June </query>
– <metadataquery> date:June topic:Eleonet project type:deliverable </metadataquery>
– <taskdescription> I am compiling a new deliverable for the Eleonet project. </taskdescription>
– <narrative> I am looking for the Eleonet project deliverable; I remember that the main contribution to this document was made in June. </narrative>

We include the <metadataquery> field so that one can specify semi-structured parameters like metadata field names in order to narrow down the query. The set of possible metadata fields would be defined after collecting the Desktop data. The Desktop contributors must be able to assess pooled documents 6 months after they contributed the Desktop. Moreover, each query will be supplemented with a description of its context (e.g., the clicked/opened documents in the respective query session), so that users can provide relevance judgments according to the actual context of the query. As users know their documents very well, the assessment phase should go faster than normal TREC assessments. For the known-item search task, the assessments are quite easy, since only one document (with at most several duplicates) is considered relevant. For the ad hoc search task we expect users to spend about 3-4 hours on relevance assessment per query.
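To illustrate how such queries could be consumed by a retrieval system, the following is a minimal parsing sketch. It assumes each topic is stored as a small XML file with the fields wrapped in a <topic> element; only the field names come from the example above, while the wrapper element and file layout are assumptions.

import xml.etree.ElementTree as ET

# Hypothetical topic file layout; only the field names (num, query,
# metadataquery, taskdescription, narrative) are taken from the example above.
TOPIC = """
<topic>
  <num>KIS01</num>
  <query>Eleonet project deliverable June</query>
  <metadataquery>date:June topic:Eleonet project type:deliverable</metadataquery>
  <taskdescription>I am compiling a new deliverable for the Eleonet project.</taskdescription>
  <narrative>I am looking for the Eleonet project deliverable; the main
  contribution to this document was made in June.</narrative>
</topic>
"""

def parse_topic(xml_text):
    """Return the topic fields as a dictionary keyed by field name."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}

if __name__ == "__main__":
    topic = parse_topic(TOPIC)
    print(topic["num"], "->", topic["query"])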

4 http://plg.uwaterloo.ca/~gvcormac/spam/

4 Proposed Tasks

1. Ad Hoc Retrieval Task. Ad hoc search is the classic type of text retrieval, where the user believes she has relevant information stored somewhere. Several documents can contain pieces of the necessary data, but she does not remember whether or where she stored them, and she is not sure which keywords are best to find them.

2. Known-Item Retrieval Task. Targeted or known-item search is the most common task in the Desktop environment. Here the user wants to find a specific document on the Desktop, but does not know where it is stored or what its exact title is. This document can be an email, a working paper, etc. The task assumes that the user has some knowledge about the context in which the document was used before. Possible additional query fields are: time period, location, a topical description of the task in whose scope the document was used, etc.

3. Folder Retrieval Task. It is very popular among users to have their personal items topically organized in folders. Later they may search not for a specific document, but for a group of documents, in order to use it as a whole: browse the documents manually, reorganize them, or send them to a colleague. The retrieval system should be able to estimate the relevance of folders and sub-folders using simple keyword queries (a possible score aggregation is sketched below).
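As one illustration of the folder retrieval task, a folder could be ranked by aggregating the retrieval scores of the documents it (transitively) contains. The sketch below is only one plausible aggregation (the mean of the top-k member scores), not a method proposed in this paper, and it assumes that per-document keyword scores (e.g., from BM25 or cosine similarity) are already available.

# Sketch of folder ranking by aggregating per-document retrieval scores.
# The aggregation and the example paths are illustrative assumptions only.
import os
from collections import defaultdict

def rank_folders(doc_scores, k=5):
    """doc_scores: dict mapping file paths to per-query relevance scores."""
    folder_scores = defaultdict(list)
    for path, score in doc_scores.items():
        folder = os.path.dirname(path)
        # credit every ancestor folder, so sub-folders contribute to their parents
        while folder:
            folder_scores[folder].append(score)
            folder = os.path.dirname(folder)
    ranked = {
        folder: sum(sorted(scores, reverse=True)[:k]) / min(k, len(scores))
        for folder, scores in folder_scores.items()
    }
    return sorted(ranked.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    scores = {
        "projects/eleonet/deliverable_june.doc": 2.4,
        "projects/eleonet/notes.txt": 1.1,
        "private/photos/trip.jpg": 0.1,
    }
    for folder, score in rank_folders(scores):
        print(f"{score:.2f}  {folder}")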

5 Conclusion

Building a Desktop IR testbed appears to be more challenging than creating a Web search or an XML retrieval dataset. In this paper we presented concrete parameters for defining the features of such a Desktop dataset and discussed possible means of creating it, as well as of using it for algorithm evaluation.

References

1. E. Agichtein, E. Brill, S. Dumais, and R. Ragno. Learning user interaction models for predicting web search result preferences. In SIGIR '06: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 3-10, New York, NY, USA, 2006. ACM Press.
2. R. Kraft, C. C. Chang, F. Maghoul, and R. Kumar. Searching with context. In WWW '06: Proceedings of the 15th International Conference on World Wide Web, pages 477-486, New York, NY, USA, 2006. ACM Press.
3. F. Qiu and J. Cho. Automatic identification of user interest for personalized search. In WWW '06: Proceedings of the 15th International Conference on World Wide Web, pages 727-736, New York, NY, USA, 2006. ACM Press.
4. J. Teevan, S. T. Dumais, and E. Horvitz. Personalizing search via automated analysis of interests and activities. In SIGIR '05: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 449-456, New York, NY, USA, 2005. ACM Press.
5. B. Yang and G. Jeh. Retroactive answering of search queries. In WWW '06: Proceedings of the 15th International Conference on World Wide Web, pages 457-466, New York, NY, USA, 2006. ACM Press.
