Cool, concise, and contemporary: A recommender system for scholarly articles September 2010, Santiago, Chile Tamar Sadeh, Director of Marketing
Copyright Statement ● All of the information and material inclusive of text, images, logos, product names is either the property of, or used with permission by Ex Libris Ltd. The information may not be distributed, modified, displayed, reproduced – in whole or in part – without the prior written permission of Ex Libris Ltd. ● TRADEMARKS Ex Libris, the Ex Libris logo, Aleph, SFX, SFXIT, MetaLib, DigiTool, Verde, Primo, Voyager, MetaSearch, MetaIndex and other Ex Libris products and services referenced herein are trademarks of Ex Libris, and may be registered in certain jurisdictions. All other product names, company names, marks and logos referenced may be trademarks of their respective owners. ● DISCLAIMER The information contained in this document is compiled from various sources and provided on an "AS IS" basis for general information purposes only without any representations, conditions or warranties whether express or implied, including any implied warranties of satisfactory quality, completeness, accuracy or fitness for a particular purpose. ● Ex Libris, its subsidiaries and related corporations ("Ex Libris Group") disclaim any and all liability for all use of this information, including losses, damages, claims or expenses any person may incur as a result of the use of this information, even if advised of the possibility of such loss or damage.
● © Ex Libris Ltd., 2009
The Web became multi-directional
For amazon.com Better understanding of users Better tailoring of services More sales
For users Easier to evaluate materials Easier to find related materials Fun
Two types of user contribution
Explicit Implicit
What about the scholarly arena?
Explicit user contribution Citations Reviews Tags Rating
…Inspired by the success of Amazon, several organizations have created comment sites where scientists can share their opinions of scientific papers. Perhaps the best known was Nature’s 2006 trial of open commentary on papers undergoing peer review at the journal (see Physics World January 2007 pp29–30). The trial was not a success, as Nature’s final report explained: “There was a significant level of expressed interest in open peer review. A small majority of those authors who did participate received comments, but typically very few, despite significant Web traffic. Most comments were not technically substantive. Feedback suggests that there is a marked reluctance among researchers to offer open comments.” Nielsen, M. (2009) Doing Science in the Open. physicsworld.com http://physicsworld.com/cws/article/indepth/38904
Implicit user contribution ● Circulation data ● Clickstreams, recording a search process ● Actions ○ Item viewed ○ Item downloaded ○ Item sent ○ Item bookmarked ○ Item printed ○ Item stored
Potential use of implicit contribution
● Collection development ● Evaluation ● Trend analysis ● Relevance ranking ● Recommendations
Evaluation based on use? ● Shifting from authorship to readership ● Providing timely evaluation ● Covering new types of materials
Not instead of other measurements. In addition to them.
Challenges ● Privacy ● Comprehensiveness ● Validity ● Standardization ● Volume
Interest in usage-based measures ● COUNTER – www.projectcounter.org ● SUSHI - www.niso.org/workrooms/sushi ● JISC MOSAIC – www.sero.co.uk/jisc-mosaic.html ● Metrics for scholarly evaluation: ○ UKSG Usage Factors project - uksg.org/usagefactors ○ Project MESUR - www.mesur.org
Project MESUR
http://www.mesur.org
Project MESUR Where: Los Alamos National Laboratory, USA Who: Johan Bollen and Herbert Van de Sompel How: Andrew W. Mellon Foundation funding What: Investigate metrics derived from the network-based usage of scholarly information
The Data ● One billion usage transactions collected from publishers, aggregators, and libraries (2002-2007) ● Objectives: ○ Define and validate a range of usage-based metrics ○ Map the structure of the scholarly community
Evaluation Metrics 1. Popularity: number of links/connections to/from the journal 2. Shortest Path: network distance and strength 3. Prestige: number of prestigious journals that link to the journal
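The three kinds of metric can be illustrated on a toy network. The sketch below is not MESUR's implementation: the journal names and links are invented, and the choices of degree for popularity, breadth-first search for shortest path, and a PageRank-style iteration for prestige are assumptions made for illustration only.

```python
from collections import deque

# Hypothetical directed links between journals (source -> targets).
LINKS = {
    "J1": ["J2", "J3"],
    "J2": ["J3"],
    "J3": ["J1"],
    "J4": ["J3"],
}

def popularity(links):
    """Degree: number of links to and from each journal."""
    degree = {journal: len(targets) for journal, targets in links.items()}
    for targets in links.values():
        for target in targets:
            degree[target] = degree.get(target, 0) + 1
    return degree

def shortest_path_length(links, start, goal):
    """Breadth-first search; returns the hop count, or None if unreachable."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in links.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def prestige(links, damping=0.85, iterations=50):
    """PageRank-style score: a link from a prestigious journal counts more."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            targets = links.get(src, [])
            if targets:
                share = damping * score[src] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A journal with no outgoing links spreads its score evenly.
                for n in nodes:
                    new[n] += damping * score[src] / len(nodes)
        score = new
    return score
```

In this toy network J3 has the most connections and also ends up with the highest prestige score, while J4, which no journal links to, has the lowest.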
Comparisons made with citation-based data
http://public.lanl.gov/herbertv/papers/jcdl06_accepted_version.pdf
http://arxiv.org/abs/0902.2183
http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0004803
Now being practical:
What is bX? ● A recommender service for scholarly materials, primarily articles ● Derives from the Los Alamos research ● Built on the OpenURL framework: based on data mining and structural analysis of aggregated link resolver logs ● Hosted service since May 2009 ● Subscribed to by over 800 institutions!
Link resolver logs? ● Link resolver logs serve as a good basis ○ Represent users’ information-seeking paths in a standardized way ○ Span information providers ○ Span institutions ● Furthermore, there are many of them
[Diagram: a link resolver at the center, connected via OpenURL to e-journal publisher sites, Google Scholar, e-book publisher sites, the library interface, A&I databases, citation databases, and document delivery]
An Information-seeking “session”
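[Diagram: articles A, B, and C requested in one session are recorded in the SFX usage log between [start session 1] and [end session 1]; the log continues in the same way through [start session n] … [end session n]]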
Article Relationships
[Diagram: SFX usage log (session 1 … session n), article relationships, recommender service, and context object, with steps numbered 1 to 5]
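The step from usage log to article relationships can be sketched as a co-occurrence count: articles requested in the same session are treated as related, and the most frequent co-requests become recommendations. This is a simplified illustration under assumptions, not the bX algorithm; the session layout, article identifiers, and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical usage log: each inner list holds the articles of one session.
SESSIONS = [
    ["A", "B", "C"],
    ["A", "B"],
    ["B", "D"],
]

def build_relationships(sessions):
    """Count how often each pair of articles appears in the same session."""
    related = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        # De-duplicate so a repeated request in one session counts once.
        for x, y in combinations(sorted(set(session)), 2):
            related[x][y] += 1
            related[y][x] += 1
    return related

def recommend(related, article, limit=5):
    """Return co-requested articles, most frequent first (ties by identifier)."""
    neighbours = related.get(article, {})
    ranked = sorted(neighbours.items(), key=lambda item: (-item[1], item[0]))
    return [other for other, _count in ranked[:limit]]

relationships = build_relationships(SESSIONS)
print(recommend(relationships, "A"))
```

A production service would need to weight, filter, and normalize these counts; the sketch only shows how session boundaries turn individual requests into relationships.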
More about the interaction with bX ● Request to bX is sent through an API ● Results are returned as ○ XML (default) ○ Text ○ ATOM ○ RSS
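A request of this kind can be sketched as an OpenURL-style query string describing the article of interest. The rft.* keys follow the OpenURL (Z39.88-2004) key/encoded-value journal format; the base URL, the `format` parameter name, and the spelling of its values are hypothetical placeholders, not the documented bX API.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the real service address is not given in the slides.
BASE_URL = "https://recommender.example.org/recommend"

# Result formats listed on the slide; the parameter values are assumed spellings.
FORMATS = {"xml", "text", "atom", "rss"}

def build_request_url(atitle, jtitle, issn, year, result_format="xml"):
    """Build a request URL carrying an OpenURL-style description of an article."""
    if result_format not in FORMATS:
        raise ValueError(f"unsupported format: {result_format}")
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
        "rft.atitle": atitle,
        "rft.jtitle": jtitle,
        "rft.issn": issn,
        "rft.date": str(year),
        "format": result_format,  # hypothetical parameter name
    }
    return BASE_URL + "?" + urlencode(params)

# Placeholder metadata, not a real article.
print(build_request_url("Example article title", "Example Journal", "0000-0000", 2010))
```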
Current Status
Article references total: 22,285,306 (as of July 09: 14,639,506; added since: 7,645,800)
[Chart: article requests to bX server; data labels 14,244,342, 5,823,192, 2,153,026, 137,174]
Total SFX usage events processed: 200M Approximate rate of increase: 5M/month
To make a long story short… ● Data is harvested from link resolver logs through the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) ● A structure describing relationships between scholarly materials is created ● A list of recommended materials is generated per request
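The harvesting step can be sketched as an OAI-PMH ListRecords request. The verb and argument names come from the OAI-PMH specification; the repository address is hypothetical, and `oai_dc` is used only because every OAI-PMH repository must support it, not because the slides state which metadata format bX harvests.

```python
from urllib.parse import urlencode

def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # In OAI-PMH, resumptionToken is exclusive: no other arguments
        # besides the verb may accompany it.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date  # selective harvesting by datestamp
    return base_url + "?" + urlencode(params)

# Hypothetical link resolver exposing its usage log over OAI-PMH.
print(list_records_url("https://resolver.example.edu/oai", from_date="2010-09-01"))
```

A harvester would repeat the request with each returned resumption token until the repository signals that the list is complete.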
bX Partners ● Australia: Monash University ● Asia: Tsinghua University, China ● Africa: University of Stellenbosch, South Africa ● Continental Europe: Catholic University of Leuven, Belgium; Charles University, Czech Republic; ETH, Switzerland; FineLib, Finland; Karolinska Institute, Sweden; University of Amsterdam, Netherlands; University of Leiden, Netherlands ● UK: British Library; Imperial College; University College London (UCL); University of Manchester ● North America: Boston College; California State University Consortium; University of Chicago; University of Texas at Austin; Princeton University; University of Alberta, Canada
Los Alamos as a development partner
● bX is cool: a current, Web 2.0-type service ● bX is revolutionary: the first of its kind ● bX is easy to adopt: a software-as-a-service (SaaS) offering
bX recommendations are
● Relevant ● Up-to-date ● Easily accessible
“The Web, they say, is leaving the era of search and entering one of discovery. What's the difference? Search is what you do when you're looking for something. Discovery is when something wonderful that you didn't know existed, or didn't know how to ask for, finds you.”
Jeffrey M. O’Brien, “The race to create a 'smart' Google” http://money.cnn.com/magazines/fortune/fortune_archive/
Thank you!