A Privacy-Protecting Architecture for Collaborative Filtering via Forgery and Suppression of Ratings
Javier Parra-Arnau, David Rebollo-Monedero and Jordi Forné
http://sites.google.com/site/javierparraarnau/
Department of Telematics Engineering, Technical University of Catalonia (UPC), Barcelona, Spain
Leuven, Belgium

September 15, 2011


Outline 2

• Introduction
• State of the Art
• An Architecture for Privacy Protection in Collaborative Filtering based Recommendation Systems
• Formulation of the Optimal Trade-Off between Privacy and Utility
• Conclusions

Introduction

Information Overload 3

• The amount of information on the Web has grown exponentially since the advent of the Internet

Collaborative Filtering 4

• A recommendation system is a filtering system that suggests information items that are likely to be of interest to the user
• Recommendation systems based on collaborative filtering (CF) algorithms
• Examples include Amazon, Digg, Movielens and Netflix

User Profiles 5

• Users need to communicate their preferences to the recommender in order to obtain a prediction for those items they have not yet considered

[Figure: histogram over movie genres. Genres: Drama, Thriller, Comedy, Action, Adventure, Crime, Romance, Sci-Fi, War, Mystery, Documentary, Animation, Fantasy, Horror, Children, Musical, Western, Film-Noir, IMAX. Bar values: 80, 76, 71, 71, 67, 62, 54, 51, 38, 34, 25, 25, 16, 12, 7, 7, 7, 3, 3.]

Privacy Risk 6

• The privacy risks perceived by users include computers “figuring things out” about them, unsolicited marketing, court subpoenas, and government surveillance [Cranor 03]

[Figure: a recommendation system returning predictions to a user, with the inference “she’s pregnant!”]
Forgery and Suppression of Ratings 7

• Submitting false information and refusing to give private information are strategies accepted by users concerned with their privacy [Fox 00, Hoffman 99]
• Our approach relies upon the forgery and suppression of ratings

[Figure: SUPPRESSION of ratings for books the user has read; the recommendation system returns predictions.]

Contribution (I) 8

• Our architecture protects user privacy to a certain extent
  - utility loss measured as forgery rate and suppression rate

Contribution (II) 9

• Mathematical formulation of the optimal trade-off among privacy, forgery rate $\rho$ and suppression rate $\sigma$
  - Privacy as the Shannon entropy of the user’s apparent profile

$$\mathcal{P}(\rho,\sigma) = \max_{\substack{r,\,s \\ r_i \geq 0,\ \sum_i r_i = \rho \\ q_i \geq s_i \geq 0,\ \sum_i s_i = \sigma}} \mathrm{H}\!\left(\frac{q + r - s}{1 + \rho - \sigma}\right)$$

• Our proposal could be used in combination with other existing approaches

State of the Art

Privacy Protection in Recommendation Systems 10

• The state-of-the-art approaches may be classified according to these main strategies
  - perturbing the information provided by users [Polat 03, 05, Agrawal 01, Kargupta 03, Huang 05],
  - using cryptographic techniques [Canny 02, Ahmad 07, Zhan 10], and
  - distributing the information collected [Miller 04, Berkovsky 07]

[Figure: perturbation example [Polat 03]. Users send perturbed ratings to the recommendation system, e.g. 3.2 + 1.5, 2.9 - 0.7, 4.1, 4.4 - 2.7 and 5.6, 3.3 + 1.0, 1.1, 3.4 - 0.1.]
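The perturbation strategy in the figure above can be sketched in a few lines of Python. This is only an illustration of the general idea of disguising ratings with additive noise. The function name `perturb_ratings`, the uniform noise distribution and the `spread` parameter are assumptions made for this sketch and are not taken from the cited works.

```python
import random

def perturb_ratings(ratings, spread, rng):
    """Add independent zero-mean uniform noise in [-spread, spread] to each rating.

    The uniform distribution is an assumption of this sketch.
    """
    return [x + rng.uniform(-spread, spread) for x in ratings]

rng = random.Random(0)            # fixed seed only to make the example repeatable
ratings = [3.2, 2.9, 4.1, 4.4]    # rating values borrowed from the figure
disguised = perturb_ratings(ratings, spread=1.0, rng=rng)
```

The recommender only ever receives `disguised`, so an individual rating is hidden behind the noise, while aggregates over many users remain approximately correct because the noise has zero mean.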

[Figure: cryptographic example [Canny 02]. Users hold profiles q1, ..., q5, and aggregation is performed on encrypted data: Enc(q1) + ... + Enc(q5) = Enc(q1 + ... + q5).]

[Figure: distribution example [Miller 04], with labels "ratings" and "central server".]

An Architecture for Privacy Protection in CF-based Recommendation Systems

Overview 11

• Profiling is accomplished on the basis of user ratings
• Information items are classified as known or unknown
• Users may wish to submit ratings to unknown items (forgery) and refrain from rating known items (suppression)

[Figure: a user facing known items and unknown items, and the recommendation system.]

User Profile Model 12

[Figure: two example user profiles. Movielens: histogram over movie genres (Drama, Thriller, Comedy, Action, Adventure, Crime, Romance, Sci-Fi, War, Mystery, Documentary, Animation, Fantasy, Horror, Children, Musical, Western, Film-Noir, IMAX) with bar values 80, 76, 71, 71, 67, 62, 54, 51, 38, 34, 25, 25, 16, 12, 7, 7, 7, 3, 3. Jinni: tag cloud with Witty, Buddies, Clever, Fall in Love, Humorous, Couple Relations, Parents and Children, Feel Good, Best Friends, Offbeat, Emotional, Human Spirit, Slow, Teenage Life, Sincere, Human Nature, Coming of Age, Touching, Village Life.]

• [Toubiana 10, Fredrikson 11] suggest representing user profiles as histograms of absolute frequencies
• We model the profile of a user as a probability mass function (PMF)
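As a rough illustration of the step from a histogram of absolute frequencies to a PMF, the counts can simply be divided by their total. The function name `histogram_to_pmf` and the sample counts below are hypothetical and chosen only for this sketch.

```python
def histogram_to_pmf(counts):
    """Normalize a histogram of absolute frequencies into a PMF."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("histogram must contain at least one rating")
    return {category: n / total for category, n in counts.items()}

# Hypothetical number of rated items per category.
counts = {"science": 6, "sports": 3, "parenting": 1}
q = histogram_to_pmf(counts)
```

Each entry of `q` is then the relative frequency of a category, and the entries sum to 1.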

User Profile Construction 13

• Our architecture requires estimating the actual profile of a user to help them decide which items should be rated and which should not
  - Histogram based on the categories provided by the recommender
  - Categorize items by exploring web pages and using the vector space model [Salton 75]

[Figure: an item with category books \ literature & fiction \ genre fiction, and an item whose category is unknown (…?).]

Adversarial Model 14

• Passive attacker capable of crawling through the items rated by a user
• The attacker observes the apparent user profile t, a perturbed version of the actual user profile q

[Figure: the user exchanges ratings and predictions with the recommender. With NO PROTECTION the recommender observes the actual profile q; with forgery and suppression of ratings it observes the apparent profile t.]

Privacy Measure 15

• We measure privacy as the Shannon entropy of the user’s apparent profile t

$$\mathrm{H}(t) = -\sum_{i=1}^{n} t_i \log_2 t_i$$

where n is the number of categories

• Accordingly, privacy is compromised whenever the user’s preferences are biased towards certain categories of interest

[Figure: two example profiles over categories 1 to 4, one labeled "minimum privacy" and the other labeled "maximum privacy".]
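The entropy-based privacy measure can be checked numerically with a short Python sketch. The function name `shannon_entropy` and the two four-category profiles are illustrative assumptions, not part of the slides.

```python
import math

def shannon_entropy(t):
    """Shannon entropy in bits of a PMF t, taking 0 * log2(0) as 0."""
    if abs(sum(t) - 1.0) > 1e-9 or any(p < 0 for p in t):
        raise ValueError("t must be a probability mass function")
    return -sum(p * math.log2(p) for p in t if p > 0)

uniform = [0.25, 0.25, 0.25, 0.25]   # no category stands out
peaked = [1.0, 0.0, 0.0, 0.0]        # every rating falls in one category

h_uniform = shannon_entropy(uniform)
h_peaked = shannon_entropy(peaked)
```

With four categories the uniform profile reaches the largest possible value, log2(4) = 2 bits, while the profile concentrated on a single category has zero entropy.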

Architecture 16

[Figure: block diagram spanning a user side and a network side. Blocks: Known / Unknown Items Classifier, Category Extractor, Forgery Alarm, Suppression Alarm, Communication Manager, User Profile Constructor, Forgery and Suppression Generator, Information Provider, Recommendation System. Legend: uncategorized item, categorized item, known item, unknown item, rated item.]

Architecture 17
Block Functionality

• Communication with the recommender
• Retrieve information about the items explored by the user

Example items:
  - Description - Starting at the beginning, the book explores how JavaScript originated and evolved into what it is today. A detailed discussion of the components that make up a JavaScript implementation follows, with specific focus on standards such as ECMAScript and the Document Object Model (DOM).
    Category - books \ computers & internet \ web development
    Average Customer Review 4.5/5
  - Description - Stephen Hawking, one of the most brilliant theoretical physicists in history, wrote the modern classic A Brief History of Time to help nonscientists understand the questions being asked by scientists today.
    Category - books \ science
    Average Customer Review 4/5
  - Description - Written by soccer great and championship Stanford coach Bobby Clark, this book tells you how, starting at point zero, an uninitiated coach can meld kids into a team and help them enjoy one of the most rewarding experiences of their youth.
    Category - books \ sports \ coaching \ soccer
    Average Customer Review 4.5/5
  - Description - You’ve made it! Your baby has turned one! Now the real fun begins. From temper tantrums to toilet training, raising a toddler brings its own set of challenges and questions, and Toddler 411 has the answers.
    Category - books \ parenting & families \ parenting
    Average Customer Review 3/5

Architecture 18
Block Functionality

• Obtain categories associated with the items downloaded by the Communication Manager

Architecture 19
Block Functionality

• The user classifies the items as known or unknown

[Figure: the example items, categorized as books \ computers & internet \ web development, books \ science, books \ sports \ coaching \ soccer and books \ parenting & families \ parenting, are split into known items and unknown items.]

Architecture 20
Block Functionality

• Computes the actual user profile q

Architecture 21
Block Functionality

• Centerpiece of the architecture
• The user specifies a forgery rate $\rho$ and a suppression rate $\sigma$

[Figure: example with $\rho$ = 5% and $\sigma$ = 10%; FORGERY is labeled 5%, SUPPRESSION is labeled 8% and 2%.]

Architecture 22
Block Functionality

• Generate an alarm when an item should be suppressed

[Figure: suppression example with labels science 8% and parenting 2%.]

Architecture 23
Block Functionality

• Generate an alarm when an item should be forged

[Figure: forgery example with labels population’s rating, computers, sports and 5%.]

Formulation of the Optimal Trade-Off between Privacy and Utility

Trade-Off between Privacy and Utility 24

• The degradation in the accuracy of predictions is measured as $\sigma$ and $\rho$
• We model items as random variables taking on values in a common finite alphabet of n categories
• We define
  - $q$ as the actual user profile
  - $\rho \in [0, 1)$ as the forgery rate, with forgery tuple $r = (r_1, \ldots, r_n)$, $r_i \geq 0$, $\sum_i r_i = \rho$
  - $\sigma \in [0, 1)$ as the suppression rate, with suppression tuple $s = (s_1, \ldots, s_n)$, $q_i \geq s_i \geq 0$, $\sum_i s_i = \sigma$
• Accordingly, the user’s apparent profile is defined as

$$t = \frac{q + r - s}{1 + \rho - \sigma}$$
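The definition of the apparent profile can be made concrete with a small numerical example. The sketch below is a minimal Python illustration; the function name `apparent_profile`, the tolerance `tol` and the three-category profile are assumptions introduced here.

```python
def apparent_profile(q, r, s, tol=1e-9):
    """Return t = (q + r - s) / (1 + rho - sigma) after checking the constraints."""
    if not (len(q) == len(r) == len(s)):
        raise ValueError("q, r and s need one entry per category")
    if any(ri < 0 for ri in r):
        raise ValueError("forgery components must be non-negative")
    if any(si < 0 or si > qi + tol for qi, si in zip(q, s)):
        raise ValueError("suppression must satisfy 0 <= s_i <= q_i")
    rho, sigma = sum(r), sum(s)          # forgery and suppression rates
    total = 1 + rho - sigma              # normalization constant
    return [(qi + ri - si) / total for qi, ri, si in zip(q, r, s)]

q = [0.5, 0.3, 0.2]   # hypothetical actual profile
r = [0.0, 0.0, 0.1]   # forge ratings in the third category (rho = 0.1)
s = [0.1, 0.0, 0.0]   # suppress ratings in the first category (sigma = 0.1)
t = apparent_profile(q, r, s)
```

Here the apparent profile is approximately (0.4, 0.3, 0.3): flatter than the actual profile, which is what the entropy measure rewards.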

Trade-Off between Privacy and Utility 25

• Privacy is measured as the Shannon entropy of the user’s apparent profile
• The privacy-forgery-suppression function

$$\mathcal{P}(\rho,\sigma) = \max_{\substack{r,\,s \\ r_i \geq 0,\ \sum_i r_i = \rho \\ q_i \geq s_i \geq 0,\ \sum_i s_i = \sigma}} \mathrm{H}\!\left(\frac{q + r - s}{1 + \rho - \sigma}\right)$$

• This formulation specifies the key functional block of our architecture, namely the ‘Forgery and Suppression Generator’
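To get a feel for the optimization problem, the following Python sketch searches for forgery and suppression tuples with a simple greedy heuristic: in small steps it forges into the currently least-populated category and suppresses from the currently most-populated one. This is an illustrative approximation chosen for this example, not the solution method of the slides; the function name `greedy_forgery_suppression`, the `steps` parameter and the sample profile are hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy in bits, taking 0 * log2(0) as 0."""
    return -sum(x * math.log2(x) for x in p if x > 0)

def greedy_forgery_suppression(q, rho, sigma, steps=1000):
    """Heuristically pick r and s that flatten the apparent profile."""
    n = len(q)
    r, s = [0.0] * n, [0.0] * n
    x = list(q)                                # unnormalized q + r - s
    dr, ds = rho / steps, sigma / steps
    for _ in range(steps):
        i = min(range(n), key=lambda k: x[k])  # forge into the smallest entry
        r[i] += dr
        x[i] += dr
        # suppress from the largest entry that still satisfies s_j <= q_j
        allowed = [k for k in range(n) if q[k] - s[k] >= ds]
        if allowed:
            j = max(allowed, key=lambda k: x[k])
            s[j] += ds
            x[j] -= ds
    total = sum(x)                             # equals 1 + sum(r) - sum(s)
    return r, s, [v / total for v in x]

q = [0.7, 0.2, 0.1]                            # hypothetical actual profile
r, s, t = greedy_forgery_suppression(q, rho=0.1, sigma=0.1)
gain = entropy(t) - entropy(q)                 # privacy gained, in bits
```

For this profile the heuristic suppresses ratings in the dominant first category and forges ratings in the rarest third category, so the apparent profile `t` has higher entropy than `q`.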

Conclusions

Conclusions 26

• The forgery and suppression of ratings arise as two simple mechanisms in terms of infrastructure,
  - but they come at the cost of a loss in utility, namely the degradation in the accuracy of the predictions
• We propose an architecture that implements these two mechanisms in those CF-based recommendation systems that profile users exclusively from their ratings
  - The centerpiece of our approach is a module responsible for computing the tuples of forgery r and suppression s
  - This information is used to warn the user when their privacy is being compromised
  - It is up to the user to decide whether to forge or eliminate a rating
• We present a formulation of the optimal trade-off among privacy, forgery rate and suppression rate
