Your Digital Image: Factors Behind Demographic And Psychometric Predictions From Social Network Profiles (Demonstration)

Yoram Bachrach, Microsoft Research, [email protected]
Thore Graepel, Microsoft Research, [email protected]
Pushmeet Kohli, Microsoft Research, [email protected]
Michal Kosinski, University of Cambridge, [email protected]
David Stillwell, University of Cambridge, [email protected]

ABSTRACT

We demonstrate how information gathered from social network profiles can be used to predict personal attributes such as gender and age, religious and political views, intelligence, happiness and personality traits. Our approach is based on applying machine learning techniques to a large dataset of people who volunteered their Facebook profiles along with their demographic and psychometric test results. We combine various features from the profile, including the numbers or rates of posting status updates, pictures and group memberships, and the specific items liked by individuals. Our system provides insights regarding how the predictions are made, allowing people to understand how they may be perceived by others based on their social network profiles.

Categories and Subject Descriptors K.4 [Computers and Society]: Social Issues

Keywords Social Networks, Analytics, Machine Learning, Personality Predictions, Demographic Predictions, Privacy

1. INTRODUCTION

Social networking sites (SNS), such as Facebook, have become immensely popular in recent years. Such sites record much information about their users, including their opinions, preferences and interactions. Recent work shows that many traits of individuals can be predicted automatically and accurately from the wealth of information on SNS, including age and gender [9], personality [7, 4, 1], and even ethnicity, religious or political views, intelligence and happiness [6]. Automated approaches typically rely on obtaining a large dataset of social network profiles paired with the personal traits of their owners, and applying machine learning techniques to build models that predict an individual’s traits from the features of their social network profile.

Appears in: Alessio Lomuscio, Paul Scerri, Ana Bazzan, and Michael Huhns (eds.), Proceedings of the 13th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2014), May 5-9, 2014, Paris, France. Copyright © 2014, International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved.

Some profile features that are used to predict properties of the profile owner are macro-features — aggregates which quantify the total activity or behavior of a certain type, such as the total number of photos uploaded to a profile in a month, the number of friends or the average number of words in status updates. Other approaches use micro-features, relating to very specific profile properties — whether the user is a member of a certain group, or expressed interest in a particular item (such as a movie, website, or book).
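As an illustration of the distinction between the two feature types, the following minimal sketch computes a few macro-features and a micro-feature vector for a single profile. The profile layout, field names and item vocabulary are hypothetical and chosen only for this example; they are not the representation used by any of the cited systems.

```python
from statistics import mean


def macro_features(profile):
    """Aggregate activity counts (macro-features) for one profile."""
    updates = profile["status_updates"]
    return {
        "n_friends": len(profile["friends"]),
        "n_photos": len(profile["photos"]),
        "n_groups": len(profile["groups"]),
        # Average number of words per status update (0.0 if there are none).
        "mean_words_per_update": mean(len(u.split()) for u in updates) if updates else 0.0,
    }


def micro_features(profile, vocabulary):
    """Binary indicators (micro-features): 1 if the user Liked the item, else 0."""
    liked = set(profile["likes"])
    return [1 if item in liked else 0 for item in vocabulary]


# Hypothetical toy profile; all field names and values are illustrative only.
profile = {
    "friends": ["a", "b", "c"],
    "photos": ["p1", "p2"],
    "groups": ["chess club"],
    "status_updates": ["hello world", "a longer status update"],
    "likes": ["Some Movie", "Some Book"],
}
vocabulary = ["Some Book", "Some Movie", "Some Website"]

macro = macro_features(profile)
micro = micro_features(profile, vocabulary)
```

The macro-features summarize how much activity a profile shows, while the micro-feature vector records exactly which items the user is associated with.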

1.1 Prediction Approaches

One approach that uses micro-features predicts many user traits, such as sexual orientation, ethnicity, religious and political views, personality, intelligence, happiness, use of addictive substances, parental separation, age, and gender [6]. It uses Facebook Likes, which allow Facebook users to express a positive association (“Like”) with online content, including status updates, photos, or websites. Given a set S of objects that can be “Liked”, it represents a user’s SNS profile as a binary vector, with an entry set to 1 if the user Liked the corresponding object (i.e. if there is an association between the user and the object) and 0 otherwise. An encoding of the user’s demographic or psychometric traits is concatenated to the encoding of the SNS profile, so all users are expressed as vectors of equal length. The set of all users is encoded as a matrix in which each row is the encoding of one user. After applying singular-value decomposition (SVD) [3], regression methods are used to predict a user’s traits given their SNS profile.

Another approach uses macro-features to predict personality traits [1], expressed using the Five Factor Model [2, 8], a widespread model used by psychologists to represent the key features underlying human personality. This approach predicts personality by multiple linear regression, using features such as the number of Facebook friends, group associations, “Likes”, photos and status updates.

Both of the above approaches train the machine learning model on the same dataset, released by the myPersonality project (available at: http://mypersonality.org/wiki/). myPersonality is a Facebook application, first deployed in 2007, which lets Facebook users complete a demographics questionnaire and a standard personality questionnaire to obtain feedback regarding their personality. Users may give their consent to record their profile information and responses, allowing the myPersonality dataset to correlate profile features and user traits for a very large user population.

In order to improve the prediction accuracy, it is advisable to use all available features. Further, we wish to help users understand how parts of the profile information affect predictions, as this allows them to determine how others may perceive them and what affects these perceptions. We demonstrate an approach that combines both micro- and macro-features to predict traits of the profile owners, and that helps users understand what drives these predictions. Our approach also relies on the myPersonality dataset, and allows predicting traits such as demographics and personality, similarly to the methods discussed above [6, 1].
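The Like-based pipeline described in this subsection (a binary user-by-item matrix, dimensionality reduction by SVD, then regression) can be sketched roughly as follows. This is a simplified illustration on synthetic data, not the implementation of [6]: the matrix sizes, the number of retained components k, the column centering, and the choice to apply the SVD to the Like columns only are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the user-by-item Like matrix (1 = user Liked item).
n_users, n_items, k = 200, 50, 5
likes = (rng.random((n_users, n_items)) < 0.2).astype(float)

# Synthetic continuous trait that depends on a few items, plus noise.
true_w = np.zeros(n_items)
true_w[:3] = [2.0, -1.5, 1.0]
trait = likes @ true_w + 0.1 * rng.standard_normal(n_users)

# Truncated SVD: keep the top-k right singular vectors as a projection.
# Centering by the column means is an assumption of this sketch.
col_mean = likes.mean(axis=0)
_, _, vt = np.linalg.svd(likes - col_mean, full_matrices=False)
components = vt[:k].T  # shape (n_items, k)


def project(x):
    """Project Like vectors onto the k retained SVD components."""
    return (x - col_mean) @ components


# Ordinary least-squares regression (with intercept) on the reduced features.
design = np.column_stack([np.ones(n_users), project(likes)])
coef, *_ = np.linalg.lstsq(design, trait, rcond=None)


def predict(x):
    """Predict the trait for one Like vector or a matrix of Like vectors."""
    z = np.atleast_2d(project(x))
    return np.column_stack([np.ones(len(z)), z]) @ coef
```

Calling `predict(likes[0])` returns a one-element array with the predicted trait value for the first synthetic user. In practice the number of components and the regression method would be chosen by validation on held-out users.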

2. COMBINED PREDICTION

Our approach uses multiple linear regression for continuous variables, and logistic regression for boolean variables. To process “Liked” items, we use an SVD of the “Liked” item matrix, similarly to the approach discussed in the introduction [6]. However, rather than relying solely on these micro-features, we augment the vector encoding a user with the macro-level features used by personality prediction approaches [4, 1]. We build a separate model for each predicted trait, such as gender, age or each of the Big Five personality traits, and train it using the myPersonality dataset.

To identify the key features that influence the predictions, our system shows which of the “Liked” items changed the predicted value of a trait, and allows viewing the predictions based only on the macro-features or only on the micro-features. This allows users of our prediction engine to better understand how various factors affect our predictions.

Our combined prediction approach is flexible and makes it easy to integrate additional explanatory variables: to use another profile feature to improve the prediction accuracy for all the user traits, one only needs to add it as an additional column to the matrix representing the users. We have built the system using the myPersonality dataset, which only contains Facebook profile features. However, we emphasize that the same approach allows integrating data from several sources, including other social networks, shopping histories or browsing histories (see footnote 2).

Our system only analyzes the status updates in a shallow manner, at the macro level: it does not examine the textual content of the status updates, but only their total number. Recent work indicates that linguistic characteristics are predictive of a user’s traits [9]. Our approach can use such information by augmenting the vector encoding with columns representing specific words in the updates.
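To make the combination step concrete, the sketch below concatenates a user's reduced “Like” components with macro-level features and fits a logistic regression for a boolean trait. It is a minimal illustration under stated assumptions: the data are synthetic, the helper names (`combine`, `fit_logistic`, `predict_proba`) are hypothetical, and plain batch gradient descent stands in for whatever fitting procedure the actual system uses.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def combine(micro_components, macro):
    """Concatenate reduced Like components with macro-level features."""
    return list(micro_components) + list(macro)


def predict_proba(w, x):
    """Probability that the boolean trait is true; w[0] is the bias."""
    return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))


def fit_logistic(rows, labels, lr=0.5, epochs=500):
    """Batch gradient descent on the mean logistic loss."""
    w = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in zip(rows, labels):
            err = predict_proba(w, x) - y
            grad[0] += err
            for j, xj in enumerate(x, start=1):
                grad[j] += err * xj
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


# Synthetic data: two reduced Like components and one rescaled macro-feature.
random.seed(0)
rows, labels = [], []
for _ in range(200):
    micro = [random.gauss(0, 1), random.gauss(0, 1)]
    macro = [random.random()]  # e.g. a friend count rescaled to [0, 1]
    x = combine(micro, macro)
    rows.append(x)
    labels.append(1 if x[0] + 2.0 * x[2] - 1.0 > 0 else 0)

weights = fit_logistic(rows, labels)
accuracy = sum(
    (predict_proba(weights, x) > 0.5) == bool(y) for x, y in zip(rows, labels)
) / len(rows)
```

Adding a further explanatory variable in this sketch only requires appending one more value in `combine`, which mirrors the additional-column extension described above. Predictions based only on macro-features or only on micro-features could be obtained by fitting the same model on the corresponding subset of columns.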

3. CONCLUSION

We demonstrate an approach for aggregating information from SNS profiles to predict user traits. Such predictions have many applications, ranging from tailoring products or recommendations to users, through personalized search engines, to better targeted advertising. However, the ability to infer potentially sensitive information about users is also a cause for privacy concerns. This highlights the need for proper controls that would enable the important applications of this technology while protecting user privacy.

Our system allows users to examine the factors influencing the predictions, so users can determine how “Liking” a certain item changes the predictions regarding their intelligence, or how changing the number of friends they have affects the predictions regarding their personality. Clearly, these factors are under the control of the user, and users may modify their behavior on Facebook to be perceived in a positive manner. As people can form judgments on others based on their social media profiles [4], this phenomenon is not new. However, we believe an automated tool can allow people to easily determine how others may perceive them based on their behavior on social networks.

Many questions are open for future research. How can we design systems that would give people more control over the information they reveal? Can similarly accurate predictions be made based on other publicly available information, such as personal webpages or blogs? How is the availability of tools that allow making such predictions likely to change the way in which people communicate and interact on social media?

Footnote 2: For example, previous work already shows that personality can be predicted using information about the websites people tend to visit [5]. Such information is contained in a user’s browsing history, and can be integrated into our framework.
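The kind of what-if examination of factors discussed in the conclusion can be illustrated with a small sketch. It assumes a linear model defined directly over named features, which is a simplification: the system described here processes “Liked” items through an SVD, so the effect of a single item would pass through the retained components. The feature names, weights and bias below are invented for the example.

```python
def predict(weights, bias, features):
    """Linear prediction: bias plus the weighted sum of feature values."""
    return bias + sum(weights[name] * value for name, value in features.items())


def what_if(weights, bias, features, name, new_value):
    """Change in the prediction if one feature is set to new_value."""
    changed = {**features, name: new_value}
    return predict(weights, bias, changed) - predict(weights, bias, features)


def contributions(weights, features):
    """Per-feature contribution (weight * value), largest magnitude first."""
    items = [(name, weights[name] * value) for name, value in features.items()]
    return sorted(items, key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical weights for one trait; the values are illustrative only.
weights = {"like:Some Movie": 0.5, "like:Some Book": -0.25, "n_friends": 0.01}
bias = 1.0
user = {"like:Some Movie": 0, "like:Some Book": 1, "n_friends": 100}

delta_like = what_if(weights, bias, user, "like:Some Movie", 1)
delta_friends = what_if(weights, bias, user, "n_friends", 150)
ranked = contributions(weights, user)
```

Here `delta_like` is the change in the predicted trait if the user were to “Like” the hypothetical item, and `ranked` orders the features by how strongly they currently drive the prediction.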

4. REFERENCES

[1] Y. Bachrach, M. Kosinski, T. Graepel, P. Kohli, and D. Stillwell. Personality and patterns of Facebook usage. In Proceedings of the 3rd Annual ACM Web Science Conference, pages 24–32. ACM, 2012.
[2] P. Costa Jr and R. McCrae. Neo personality inventory–revised (NEO-PI-R) and NEO five-factor inventory (NEO-FFI) professional manual. Odessa, FL: Psychological Assessment Resources, 1992.
[3] G. Golub and W. Kahan. Calculating the singular values and pseudo-inverse of a matrix. Journal of the Society for Industrial & Applied Mathematics, Series B: Numerical Analysis, 2(2):205–224, 1965.
[4] S. D. Gosling, S. Gaddis, S. Vazire, et al. Personality impressions based on Facebook profiles. In ICWSM, 2007.
[5] M. Kosinski, P. Kohli, D. Stillwell, Y. Bachrach, and T. Graepel. Personality and website choice. In ACM Web Science Conference, pages 251–254, 2012.
[6] M. Kosinski, D. Stillwell, and T. Graepel. Private traits and attributes are predictable from digital records of human behavior. Proceedings of the National Academy of Sciences, 110(15):5802–5805, 2013.
[7] C. Ross, E. S. Orr, M. Sisic, J. M. Arseneault, M. G. Simmering, and R. R. Orr. Personality and motivations associated with Facebook use. Computers in Human Behavior, 25(2):578–586, 2009.
[8] M. Russell and D. Karol. The 16PF fifth edition administrator’s manual. Institute for Personality and Ability Testing, Champaign, IL, 1994.
[9] H. A. Schwartz, J. C. Eichstaedt, M. L. Kern, L. Dziurzynski, S. M. Ramones, M. Agrawal, A. Shah, M. Kosinski, D. Stillwell, M. E. Seligman, et al. Personality, gender, and age in the language of social media: The open-vocabulary approach. PLoS ONE, 8(9):e73791, 2013.
