Overview Collaborative filtering Privacy-preserving collaborative filtering (PPCF) Feasibility on the cloud Conclusions Question time!

Feasibility of a privacy preserving collaborative filtering scheme on the Google App Engine – a performance case study

Anirban Basu, Jaideep Vaidya, Theo Dimitrakos, Hiroaki Kikuchi

Department of Electrical Engineering, Faculty of Engineering, Tokai University (Japan)

ACM SAC 2012, March 27, 2012

Anirban Basu, et al.

PPCF performance case study on the GAE/J

1/16

What and why?

What are we doing and why?

- Cloud computing is attractive for many reasons: low total cost of ownership through virtualised resource sharing, rapid on-demand scaling, high-speed network access, and so on.
- Recommender systems help users tackle vast amounts of information, but recommendation (e.g., using collaborative filtering) requires computing power.
- The cloud is a solution for building a recommendation system, but there is a problem: the privacy of users' preferential data, for which there is privacy preserving collaborative filtering (PPCF).

This talk:

- is about the feasibility (in terms of performance) of a privacy preserving collaborative filtering (PPCF) scheme on the Google App Engine for Java (GAE/J);
- refers to the PPCF scheme proposed in: A. Basu, H. Kikuchi, and J. Vaidya. 2011. Privacy Preserving weighted Slope One predictor for Item-based Collaborative Filtering. In: International Workshop on Trust and Privacy in Distributed Information Processing (workshop at the IFIPTM), Copenhagen, Denmark.

What is Slope One The generalised weighted Slope One

Collaborative filtering (CF), briefly!

We often have user-item rating data like this (“-” indicates the absence of a rating):

Alice Bob Carol Dave
Canon 7D 5 3 4
Leica M9 4 5 ? 3
Nikon D7000 2 4 -
... ... ... ... ...
Olympus OM-D 3 3 -

The objective is to find the rating for Leica M9 for Carol. A well-known recommendation technique, based on the preferences of the community, is collaborative filtering (CF).

CF using Slope One predictors

Based on: Lemire, D., Maclachlan, A. 2005. Slope one predictors for online rating-based collaborative filtering. In: Society for Industrial Mathematics.

- A CF scheme with predictors of the form f(x) = x + b, hence “slope one”.
- Simple yet efficient: compared with CF using cosine similarity and singular value decomposition, Slope One has the lowest mean absolute error (MAE) and root mean squared error (RMSE) in the Apache Mahout reference implementation on the MovieLens 100K and 1M datasets.

Figure: The general CF problem. Given a sparse user-item rating matrix (m × n) over users u_1 ... u_m and items i_1 ... i_n, predict the rating of user u_x for item i_k.

To predict using Slope One, two matrices must be pre-computed:

- Deviation matrix, ∆: each element is the total deviation of ratings between a pair of items, calculated over the cases where both items have been rated by the same user. If the ratings matrix is of dimension m × n (i.e., m users and n items), then ∆ is of dimension n × n.
- Cardinality matrix, φ: each element is the count of the cases where both items in a pair have been rated by the same user. It is of the same dimension as ∆.
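The pre-computation can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `precompute`, the dictionary-based sparse representation and the toy ratings are all assumptions made for illustration.

```python
from itertools import permutations

def precompute(ratings):
    """Build the sparse deviation (delta) and cardinality (phi) matrices.

    ratings: dict mapping user -> dict mapping item -> rating (hypothetical layout).
    Returns two dicts keyed by ordered item pairs (a, b).
    """
    delta, phi = {}, {}
    for user_ratings in ratings.values():
        # Every ordered pair of distinct items rated by the same user
        for a, b in permutations(user_ratings, 2):
            delta[(a, b)] = delta.get((a, b), 0) + user_ratings[a] - user_ratings[b]
            phi[(a, b)] = phi.get((a, b), 0) + 1
    return delta, phi

# Toy data, invented for this sketch
ratings = {
    "alice": {"x": 4, "y": 3},
    "bob": {"x": 3, "y": 2, "z": 4},
    "carol": {"y": 4, "z": 5},
}
delta, phi = precompute(ratings)
print(delta[("x", "y")], phi[("x", "y")])
```

Only pairs that were co-rated by at least one user appear in the dictionaries, which mirrors the sparsity of the n × n matrices.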

Figure: Slope One pre-computation creates a ‘model’ which is used for prediction. The pre-computation phase turns the sparse user-item rating matrix (m × n) into the sparse item-item deviation and cardinality matrices (n × n).

The weighted Slope One predictor

The average deviation of ratings from item a to item b is given as:

$$\delta_{a,b} = \frac{\Delta_{a,b}}{\phi_{a,b}} = \frac{\sum_i (r_{i,a} - r_{i,b})}{\phi_{a,b}} = \frac{\sum_i \delta_{i,a,b}}{\phi_{a,b}} \tag{1}$$

where $\phi_{a,b}$ is the number of users who have rated both items, while $\delta_{i,a,b} = r_{i,a} - r_{i,b}$ is the deviation of the rating of item a from that of item b, both given by user i. Thus, the rating for user u and item x using the weighted Slope One is predicted as:

$$r_{u,x} = \frac{\sum_{a \mid a \neq x} (\delta_{x,a} + r_{u,a})\,\phi_{x,a}}{\sum_{a \mid a \neq x} \phi_{x,a}} = \frac{\sum_{a \mid a \neq x} (\Delta_{x,a} + r_{u,a}\,\phi_{x,a})}{\sum_{a \mid a \neq x} \phi_{x,a}} \tag{2}$$
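Equation (2) can be evaluated directly on plaintext ratings, as in the following sketch. The function name `weighted_slope_one`, the data layout and the toy ratings are assumptions for illustration; for brevity the deviations and cardinalities are computed on the fly rather than read from pre-computed matrices.

```python
def weighted_slope_one(ratings, u, x):
    """Predict r_{u,x} with the second form of equation (2).

    ratings: dict mapping user -> dict mapping item -> rating (hypothetical layout).
    Returns None when no item rated by u has been co-rated with x.
    """
    numerator, denominator = 0.0, 0
    for a, r_ua in ratings[u].items():
        if a == x:
            continue
        # Users who rated both x and a contribute to Delta_{x,a} and phi_{x,a}
        pairs = [(r[x], r[a]) for r in ratings.values() if x in r and a in r]
        delta_xa = sum(r_x - r_a for r_x, r_a in pairs)
        phi_xa = len(pairs)
        numerator += delta_xa + r_ua * phi_xa
        denominator += phi_xa
    return numerator / denominator if denominator else None

# Toy data, invented for this sketch
ratings = {
    "alice": {"x": 4, "y": 3},
    "bob": {"x": 3, "y": 2, "z": 4},
    "carol": {"y": 4, "z": 5},
}
prediction = weighted_slope_one(ratings, "carol", "x")
print(round(prediction, 3))
```

For Carol and item x, the numerator is (2 + 4·2) + (−1 + 5·1) = 14 and the denominator is 2 + 1 = 3, so the predicted rating is 14/3.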

Additively homomorphic encryption

Preserving privacy with Slope One CF

Uses an additively homomorphic public-key cryptosystem: the Damgård-Jurik cryptosystem. Denoting the encryption and decryption functions by E() and D() respectively, it supports homomorphic addition:

$$E(m_1 + m_2) = E(m_1) \cdot E(m_2)$$

and homomorphic multiplication of an encrypted message by a plaintext constant $\pi$:

$$E(m_1 \cdot \pi) = E(m_1)^{\pi}$$
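The two homomorphic properties can be checked with a toy Paillier cryptosystem, which is the s = 1 special case of Damgård-Jurik. This is a sketch under stated assumptions: the primes are tiny and insecure, the function names are hypothetical, a single private key stands in for threshold keys, and Python 3.8+ is assumed for the modular inverse via `pow`.

```python
import math
import random

# Toy key material (INSECURE: tiny primes chosen only for illustration)
p, q = 1009, 1013
n = p * q
n2 = n * n
g = n + 1
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
mu = pow(lam, -1, n)  # valid shortcut because g = n + 1

def encrypt(m):
    """E(m) = g^m * r^n mod n^2 for a random r coprime to n."""
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    """D(c) = L(c^lam mod n^2) * mu mod n, with L(x) = (x - 1) // n."""
    return ((pow(c, lam, n2) - 1) // n * mu) % n

c1, c2 = encrypt(20), encrypt(22)
added = decrypt((c1 * c2) % n2)   # homomorphic addition: 20 + 22
scaled = decrypt(pow(c1, 3, n2))  # multiplication by the plaintext constant 3
print(added, scaled)
```

Multiplying ciphertexts adds the plaintexts, and raising a ciphertext to a plaintext power multiplies the plaintext by that constant; neither operation needs the private key.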

Based on the previous equation for plaintext Slope One predictors, we can write:

$$\sum_{a \mid a \neq x} (\Delta_{x,a} + r_{u,a}\,\phi_{x,a}) = D\Big( \prod_{a \mid a \neq x} \big( E(\Delta_{x,a}) \cdot E(r_{u,a})^{\phi_{x,a}} \big) \Big) \tag{3}$$

and thus, the final prediction is given as:

$$r_{u,x} = \frac{D\Big( \prod_{a \mid a \neq x} \big( E(\Delta_{x,a}) \cdot E(r_{u,a})^{\phi_{x,a}} \big) \Big)}{\sum_{a \mid a \neq x} \phi_{x,a}} \tag{4}$$
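Equations (3) and (4) can be traced end to end with a toy example. The sketch below is not the authors' protocol: it uses Paillier (the s = 1 case of Damgård-Jurik) with tiny insecure primes, a single private key in place of threshold decryption, and hypothetical names and toy values throughout. Negative deviations are encoded modulo n and mapped back to signed integers after decryption.

```python
import math
import random

# Toy Paillier key (INSECURE, illustration only)
P, Q = 1009, 1013
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
MU = pow(LAM, -1, N)

def enc(m):
    """Encrypt an integer; negative values are reduced modulo N."""
    r = random.randrange(1, N)
    while math.gcd(r, N) != 1:
        r = random.randrange(1, N)
    return pow(N + 1, m % N, N2) * pow(r, N, N2) % N2

def dec(c):
    """Decrypt and map the result back to a signed integer."""
    m = (pow(c, LAM, N2) - 1) // N * MU % N
    return m - N if m > N // 2 else m

def encrypted_prediction(enc_delta, phi, enc_ratings, x):
    """Evaluate equations (3) and (4).

    enc_delta[(x, a)] holds E(Delta_{x,a}), phi[(x, a)] the plaintext count,
    and enc_ratings[a] holds E(r_{u,a}) from the user's encrypted query.
    """
    product, denominator = enc(0), 0
    for a, c_rua in enc_ratings.items():
        if a == x or phi.get((x, a), 0) == 0:
            continue
        # E(Delta_{x,a}) * E(r_{u,a})^phi_{x,a} encrypts Delta_{x,a} + r_{u,a} * phi_{x,a}
        product = product * enc_delta[(x, a)] * pow(c_rua, phi[(x, a)], N2) % N2
        denominator += phi[(x, a)]
    return dec(product) / denominator if denominator else None

# Toy model and query, invented for this sketch
enc_delta = {("x", "y"): enc(2), ("x", "z"): enc(-1)}
phi = {("x", "y"): 2, ("x", "z"): 1}
enc_ratings = {"y": enc(4), "z": enc(5)}
prediction = encrypted_prediction(enc_delta, phi, enc_ratings, "x")
print(round(prediction, 3))
```

The site answering the query only multiplies and exponentiates ciphertexts; the numerator (here 14) becomes visible only after decryption, giving a prediction of 14/3.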

- The E(∆) and φ matrices are pre-computed.
- An encrypted query is run and the result is decrypted using threshold decryption keys. Note that the encrypted query can be answered by any of the collaborating sites without leakage of private information.
- Supports horizontal and vertical partitioning of the rating dataset, using the secure scalar product (Vaidya and Clifton).

The Google App Engine Experimental results Inference

Google App Engine for Java (GAE/J)

Figure: An expected deployment on the Google App Engine. There are k collaborative filtering (CF) sites (CF site 1, CF site 2, ..., CF site k), each a GAE app running on the Google App Engine (GAE/J); a user can query any of the CF sites.

- A Platform-as-a-Service (PaaS) cloud for constructing Software-as-a-Service (SaaS) applications.
- It offers on-demand, transparent scalability at low cost, including a daily free quota.
- The computation model is based on Java servlets, but it also allows batch computations using task queues, computationally more powerful backend instances, and cron.

The experiments and our test platform

We run three experiments, the building blocks of the PPCF scheme:

- Cryptographic primitives of the Damgård-Jurik cryptosystem.
- Secure scalar product.
- Reading in and storing the MovieLens 100K dataset.

The experiments were run on:

- the Google App Engine for Java 1.5.2 production environment with the daily free quota;
- the GAE/J SDK 1.5.2 on a 2.53 GHz Intel Core 2 Duo 64-bit processor with 8 GB RAM, running Mac OS X 10.6.8 and 64-bit Java 1.6.

The tests are correct as of July 22, 2011; note that the GAE/J billing model changed substantially after August 31, 2011.

Cryptographic primitives

Figure: Damgård-Jurik threshold key generation. The timings for GAE/J at 1024 bits are linearly estimated because the test failed due to the 30 s execution time limit.

Figure: Damgård-Jurik encryption and threshold decryption.

Figure: Damgård-Jurik homomorphic addition and multiplication.

Secure scalar product (run on the task queue)

Target  Bits  v-size  enc        ssp       t-dec
GAE/J   256   100     157.5ms    22.9ms    13.7ms
SDK     256   100     43.7ms     9.55ms    4.5ms
GAE/J   256   1000    1511ms     223ms     12.8ms
SDK     256   1000    362ms      88.5ms    3.4ms
GAE/J   256   10K     15649ms    2231ms    13.1ms
SDK     256   10K     3689ms     835ms     4.1ms
GAE/J   256   100K    154995ms   22674ms   12.6ms
SDK     256   100K    36056ms    7962ms    3.5ms

v-size: secure scalar product vector size; enc: total encryption; ssp: secure scalar product; t-dec: threshold decryption.
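The timings above can be summarised with some simple arithmetic. The sketch below copies the enc and ssp figures from the table and derives two quantities that are not stated on the slides: the GAE/J-to-SDK slowdown factor and the per-element encryption cost on the GAE/J. The helper names are hypothetical.

```python
# Timings in ms copied from the table: vector size -> (GAE/J, SDK)
enc = {100: (157.5, 43.7), 1000: (1511, 362), 10_000: (15649, 3689), 100_000: (154995, 36056)}
ssp = {100: (22.9, 9.55), 1000: (223, 88.5), 10_000: (2231, 835), 100_000: (22674, 7962)}

def slowdown(timings):
    """Ratio of GAE/J time to SDK time for each vector size."""
    return {size: gae / sdk for size, (gae, sdk) in timings.items()}

def per_element_ms(timings):
    """GAE/J time divided by vector size, in ms per element."""
    return {size: gae / size for size, (gae, _) in timings.items()}

enc_slowdown = slowdown(enc)
ssp_slowdown = slowdown(ssp)
enc_per_element = per_element_ms(enc)

for size in enc:
    print(size, round(enc_slowdown[size], 2), round(ssp_slowdown[size], 2),
          round(enc_per_element[size], 3))
```

On these figures, total encryption on the GAE/J is roughly 3.6 to 4.3 times slower than on the SDK machine, the secure scalar product roughly 2.4 to 2.8 times slower, and the encryption cost grows about linearly at around 1.5 to 1.6 ms per vector element.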

Reading in and storing the MovieLens 100K dataset

- Multiple approaches had only partial success on the GAE/J due to time quota restrictions.
- Blob storage is not allowed in the free quota!
- Even using just the in-memory cache, the task completed in 53,233 ms on the SDK, while it failed after 598,055 ms as a DeferredTask on the task queue on the GAE/J.
- Deletion of the dataset was fast using MapReduce, but the Google App Engine only supports the Map function in the Java version so far.

The limitations of the GAE/J and alternatives

- Execution time limit.
- High replication but slow access to the datastore.
- Lack of support for concurrency.
- Lack of control over resource allocation (it is a bit better now).
- Alternative, similar SaaS engine: Amazon Elastic Beanstalk.

Conclusions

The GAE/J in its current state (as of version 1.5.2) is unusable for our PPCF implementation scenario. In general:

- the cloud may not always provide better performance, especially with public-key encryption;
- GAE/J performance should be improved to match competitors;
- theoretical algorithms that are intended to be deployed on the cloud must be tested on real cloud platforms.

Future work

Run a more exhaustive set of experiments and compare the various SaaS engines. Investigate, design and implement privacy preserving recommender systems that are less computationally demanding and better suited to the cloud.


Thank you for listening!

Any questions?

