insurance and medical companies, financial institutions, and many more.

Acknowledgements

This material is based upon work supported by the National Science Foundation under grant numbers CNS-1314632, IIS-1408924, and DGE-1252522 as well as a Facebook Fellowship. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation or other funding parties.

References

1. G. Strang, Introduction to Linear Algebra, Wellesley-Cambridge Press, 1998.
2. J. Kleinberg, “Authoritative Sources in a Hyperlinked Environment,” J. ACM, vol. 46, no. 5, 1999, pp. 604–632.
3. B.A. Prakash et al., “Eigenspokes: Surprising Patterns and Scalable Community Chipping in Large Graphs,” Proc. Pacific-Asia Conf. Knowledge Discovery and Data Mining, 2010, pp. 435–448.
4. M. Jiang et al., “Inferring Strange Behavior from Connectivity Pattern in Social Networks,” Proc. Pacific-Asia Conf. Knowledge Discovery and Data Mining, 2014, pp. 126–138.
5. S. Pandit et al., “Netprobe: A Fast and Scalable System for Fraud Detection in Online Auction Networks,” Proc. World Wide Web Conf., 2007, pp. 201–210.
6. D. Koutra et al., “Unifying Guilt-by-Association Approaches: Theorems and Fast Algorithms,” Proc. Pacific-Asia Conf. Knowledge Discovery and Data Mining, 2011, pp. 245–260.
7. A. Beutel et al., “CopyCatch: Stopping Group Attacks by Spotting Lockstep Behavior in Social Networks,” Proc. 22nd Int’l Conf. World Wide Web, 2013, pp. 119–130.

Alex Beutel is a PhD candidate in the Department of Computer Science at Carnegie Mellon University. Contact him at [email protected].

Christos Faloutsos is a professor in the Department of Computer Science at Carnegie Mellon University. Contact him at christos@cs.cmu.edu.

Transfer Learning for Behavior Prediction

Weike Pan, Shenzhen University
Qiang Yang, Hong Kong University of Science and Technology

Behavior prediction, such as forecasting user choices and feedback, is critical to the success of online e-commerce and social networking services. However, the task faces several fundamental challenges, including the scarcity, uncertainty, and heterogeneity of users’ behaviors and preferences. Transfer learning has the potential to address these challenges in a unified framework by sharing common knowledge between different but related sets of user behaviors, and thus learning and predicting users’ behavior patterns from a novel perspective. Transfer learning for behavior prediction is a new interdisciplinary research area that remains largely unexplored; here, we mainly discuss its challenges and opportunities.

Behavior Prediction

User behavior data¹ is one of the most valuable resources for an online service provider because it can be exploited to help predict future behaviors, improve user satisfaction, and contribute to the company’s revenue. Users’ behaviors are usually stored as tuples of the form (user, entity, behavior), where the entity can be a product or another user, and the behavior denotes an interaction between the corresponding user and entity, such as befriending or purchasing.
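As a minimal sketch of the (user, entity, behavior) representation described above, the following Python snippet stores a behavior log as tuples and groups the (user, entity) pairs by behavior type. The class name, field names, helper function, and toy records are illustrative assumptions, not part of any system described in this article.

```python
from collections import defaultdict
from typing import NamedTuple


class BehaviorTuple(NamedTuple):
    """One logged interaction; the field names are illustrative assumptions."""
    user: str
    entity: str    # a product or another user
    behavior: str  # e.g. "befriend" or "purchase"


def group_by_behavior(log):
    """Map each behavior type to the set of (user, entity) pairs exhibiting it."""
    pairs = defaultdict(set)
    for t in log:
        pairs[t.behavior].add((t.user, t.entity))
    return dict(pairs)


# Hypothetical toy log, not taken from any real service.
log = [
    BehaviorTuple("alice", "bob", "befriend"),
    BehaviorTuple("alice", "book_42", "purchase"),
    BehaviorTuple("carol", "book_42", "purchase"),
]
by_behavior = group_by_behavior(log)
print(sorted(by_behavior["purchase"]))
```

Grouping by behavior type is one simple way to separate the different sets of behaviors that the rest of the discussion treats individually.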

The task of behavior prediction aims to exploit historical behaviors to predict users’ future behaviors on (user, entity) pairs. For example, we can predict whether a user will follow a celebrity on a microblogging social network or whether a user will purchase a certain product on an e-commerce platform. Accurate behavior prediction can inform a company’s strategy and policy on advertising, customer service, and even logistics, which is of great importance to both users and service providers. However, the task of behavior prediction is often associated with at least the following three fundamental challenges:

• The scarcity challenge. When the (user, entity, behavior) tuples are few, we might not be able to train a prediction model with current learning and optimization techniques because of the overfitting that commonly occurs in scarce-data learning problems. Note that the total number of tuples may be large, but the percentage of tuples per user or per entity is usually rather small (about 1 percent in the data of the $1 Million Netflix Prize, for example), which makes it difficult to learn a specific user’s or entity’s preferences or characteristics.

• The uncertainty challenge. Users’ behaviors are usually associated with some level of uncertainty. For example, we might not be able to infer a user’s true preference directly from her implicit examination behaviors, such as clicks and browsing. Specifically, browsing an item could indicate a positive preference when she plans to buy it, or a negative preference when she finds it uninteresting after examination. Treating all such implicit examinations without distinction could bias the modeling of users’ preferences or entities’ characteristics.

• The heterogeneity challenge. Users’ behaviors are usually represented in different forms, including implicit examinations and explicit purchases on an online shopping site, or bidirectional friendships and unidirectional followings in a social networking service. This heterogeneity requires more sophisticated modeling techniques to make full use of the different types of behaviors. For example, we have to learn both the behavior-dependent and behavior-independent patterns across different sets of behaviors.

Traditional learning methods for behavior prediction, such as the well-known factorization machine,² usually don’t explicitly address these three challenges.
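The scarcity challenge above can be made concrete by measuring how few of the possible (user, entity) pairs are actually observed. The snippet below is a rough sketch with hypothetical helper names and toy data; it computes the overall density of a behavior log and the number of observed entities per user.

```python
def density(pairs, n_users, n_entities):
    """Fraction of all possible (user, entity) pairs that are observed."""
    return len(set(pairs)) / (n_users * n_entities)


def per_user_counts(pairs):
    """Number of distinct entities each user has interacted with."""
    counts = {}
    for user, _entity in set(pairs):
        counts[user] = counts.get(user, 0) + 1
    return counts


# Hypothetical toy data: 4 users, 5 entities, 3 observed pairs.
observed = [("u1", "e1"), ("u1", "e3"), ("u2", "e3")]
print(density(observed, n_users=4, n_entities=5))  # 3 of 20 possible pairs
print(per_user_counts(observed))
```

Users who never appear in the log (two of the four here) have no observed pairs at all, which is the situation where learning user-specific preferences is hardest.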

Transfer Learning for Behavior Prediction

Transfer learning³ views users’ behaviors at a finer granularity instead of treating all user behaviors as a whole. Typically, we have two sets of user behaviors: target behaviors T that await prediction and auxiliary behaviors A that are different from but related to the target behaviors. For example, we can treat users’ followings as target behaviors and friendships as auxiliary behaviors on a social media website, and users’ purchases as target behaviors and examinations as auxiliary behaviors on an e-commerce platform. Furthermore, we can bridge the two sets of behaviors by sharing some common knowledge K, so as to introduce interactions between them and improve prediction performance on the target behaviors. Figure 4 illustrates the transfer learning paradigm and two specific examples, from which we can see that the major difference between traditional machine learning and transfer learning lies in the “knowledge transfer” component of the learning paradigm.

Figure 4. Illustration of transfer learning for behavior prediction, including (a) a generic transfer learning paradigm, (b) an example of relational knowledge transfer from auxiliary bidirectional friendships to target unidirectional followings, and (c) an example of preference knowledge transfer from auxiliary examinations to target purchases.

[Figure 4 panel labels: (a) target behaviors (T), knowledge (K), auxiliary behaviors (A); (b) following, relational knowledge, friendship; (c) transaction, preference knowledge, examination.]

Designing an appropriate knowledge transfer component is critical to a transfer learning algorithm because it’s closely related to the three fundamental questions in transfer learning,³ that is, what knowledge to transfer, how to transfer it, and when (not) to transfer it. For behavior prediction, previous transfer learning algorithms mainly focus on the first two questions,⁴ including transferring knowledge of model parameters, behavior instances, or compressed behavior patterns in an adaptive, collective, or integrative manner.

Knowledge transfer between target and auxiliary behaviors is a potential solution to the three challenges mentioned earlier. First, for the scarcity challenge, knowledge transfer provides a way to selectively incorporate auxiliary data to mitigate the data scarcity problem. Second, for the uncertainty challenge, rich interactions between target behaviors and auxiliary behaviors via knowledge sharing are likely to help reduce the uncertainty or learn the confidence of user behaviors. Third, for the heterogeneity challenge, common knowledge sharing is able to integrate different behaviors in a principled way. Finally, the goal of transfer learning for behavior prediction is to achieve better prediction performance than traditional machine learning methods that exploit either target behaviors (T) only, that is, ML(T), or both target behaviors (T) and auxiliary behaviors (A), that is, ML(A, T):

ML(T), ML(A, T) < TL(A, T).    (1)
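One simple way to realize a shared-knowledge component K is to let the target and auxiliary behaviors share user latent factors while keeping behavior-specific item factors. The sketch below is only an illustration of that idea under stated assumptions: it is a plain collective-factorization-style model trained by stochastic gradient descent, not an algorithm from this article, and the function names, hyperparameters, and toy data are all hypothetical. It does not demonstrate Equation 1; note also that it reports an error measure, for which lower is better.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def init_factors(rng, n, k):
    """Small random latent factors for n users or items."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n)]


def train_shared_factors(target, auxiliary, n_users, n_items, k=4,
                         epochs=200, lr=0.05, reg=0.01, seed=0):
    """SGD on squared error. User factors U are shared across both behavior
    sets (playing the role of K); item factors V_t and V_a are specific to
    the target and auxiliary behaviors, respectively."""
    rng = random.Random(seed)
    U = init_factors(rng, n_users, k)
    V_t = init_factors(rng, n_items, k)
    V_a = init_factors(rng, n_items, k)
    data = ([(u, i, r, V_t) for u, i, r in target]
            + [(u, i, r, V_a) for u, i, r in auxiliary])
    for _ in range(epochs):
        rng.shuffle(data)
        for u, i, r, V in data:
            err = r - dot(U[u], V[i])
            for f in range(k):
                u_f, v_f = U[u][f], V[i][f]  # use old values for both updates
                U[u][f] += lr * (err * v_f - reg * u_f)
                V[i][f] += lr * (err * u_f - reg * v_f)
    return U, V_t, V_a


def mse(data, U, V):
    """Mean squared error of the factor model on (user, item, value) triples."""
    return sum((r - dot(U[u], V[i])) ** 2 for u, i, r in data) / len(data)


# Hypothetical toy data: (user index, item index, value), 1.0 = behavior observed.
target = [(0, 0, 1.0), (0, 1, 0.0), (1, 1, 1.0), (2, 2, 1.0)]        # e.g. purchases
auxiliary = [(0, 0, 1.0), (0, 2, 1.0), (1, 1, 1.0), (1, 0, 0.0),
             (2, 2, 1.0), (2, 1, 1.0)]                              # e.g. examinations
U, V_t, V_a = train_shared_factors(target, auxiliary, n_users=3, n_items=3)
print(mse(target, U, V_t))
```

Because every update to U comes from both behavior sets, auxiliary examinations influence the user representation used to predict target purchases. Whether this helps in practice depends on how related the two behavior sets are, which is the "when (not) to transfer" question raised above.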

Opportunities

As a new interdisciplinary area combining transfer learning and behavior prediction, the field offers many exciting directions, such as multiobjective transfer, open domain transfer, lifelong transfer, and transfer learning theories. Multiobjective transfer aims to improve not only the accuracy of behavior prediction but also efficiency, interpretability of results, and even robustness against malicious attacks. Open domain transfer aims to emulate the human ability to transfer knowledge from a distant domain rather than from a close one, as most previous work does; this ability is usually called “far transfer of learning” in the education community. Lifelong transfer focuses on improving learning and prediction performance in a never-ending manner with little human intervention.⁵ Transfer learning theories on when (not) to transfer haven’t been studied much yet, and the answers may differ across application domains.

Transfer learning is a promising solution to the strategically important task of behavior prediction in various online services. We expect to see many exciting works addressing the fundamental challenges such as scarcity, uncertainty, and heterogeneity, and exploring the grand opportunities of multiobjective, open domain, and lifelong transfer learning algorithms and theories.

Acknowledgments

We’re thankful for the support of the Natural Science Foundation of China under grant number 61502307, the China National Fundamental Research Project (973 Program) under grant number 2014CB340304, and ITF project ITS/191/13FX.

References

1. R. Zafarani, M. Ali Abbasi, and H. Liu, Social Media Mining: An Introduction, Cambridge Univ. Press, 2014.
2. S. Rendle, “Factorization Machines with Libfm,” ACM Trans. Intelligent Systems and Technology, vol. 3, no. 3, 2012, pp. 57:1–57:22.
3. S. Jialin Pan and Q. Yang, “A Survey on Transfer Learning,” IEEE Trans. Knowledge and Data Eng., vol. 22, no. 10, 2010, pp. 1345–1359.
4. W. Pan, E. Xiang, and Q. Yang, “Transfer Learning in Collaborative Filtering with Uncertain Ratings,” Proc. 26th AAAI Conf. Artificial Intelligence, 2012, pp. 662–668.
5. D.L. Silver, Q. Yang, and L. Li, “Lifelong Machine Learning Systems: Beyond Learning Algorithms,” Proc. AAAI Spring Symp., 2013, pp. 49–55.

Weike Pan is a lecturer in the College of Computer Science and Software Engineering at Shenzhen University. Contact him at [email protected].

Qiang Yang is a chair professor in the Department of Computer Science and Engineering at the Hong Kong University of Science and Technology. Contact him at qyang@cse.ust.hk.

Selected CS articles and columns are also available for free at http://ComputingNow.computer.org.


www.computer.org/intelligent

IEEE INTELLIGENT SYSTEMS
