Rating Prediction using Feature Words Extracted from Customer Reviews

Masanao Ochi (1), Makoto Okabe (2), Rikio Onai (3)
The University of Electro-Communications (1,2,3), JST PRESTO (2)

Our objective is to predict the target customer's review rate more accurately.

[Figure: changing the feature vector. Example reviews of Restaurants A, B, and C by customers c1-c6 ("It is very delicious" rated 5, "The food is terrible" rated 2, "I enjoyed delicious cuisine" rated 4, "Not so terrible food" rated 3, "It is not so terrible" rated 3, "dishes looked delicious" rated 3) illustrate replacing the per-customer feature vector, which suffers from "too many customers" (about 85,000 customers in the experimental dataset) and "sparse reviews" (0.5% full), with a feature vector over extracted words such as "delicious" and "terrible", which gives "dense data" (30% full) and a reduced dimension (100 extracted words).]

[Graph: RankLoss vs. DataDensity. Increasing the data density reduces the RankLoss, i.e., accuracy improves. Legend: using the customers' review rates as the feature vector (existing) vs. using the polarized words feature vector (contribution).]
We applied the word feature vector to the Pranking algorithm and evaluated it on a corpus of Japanese golf course reviews. Our results outperformed those obtained on the original sparse dataset for all rating aspects.
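For concreteness, here is a minimal sketch of the PRank (Pranking) online update we understand this to refer to (Crammer & Singer's algorithm); the function name and training-loop details are our own assumptions, not taken from the poster:

import numpy as np

def prank_train(X, y, n_ranks, epochs=10):
    """Online PRank: learn a weight vector w and ordered thresholds b
    so that the predicted rank of x is the smallest r with w.x < b[r]."""
    w = np.zeros(X.shape[1])
    # b[0..k-2] are the finite thresholds; the k-th threshold is +infinity.
    b = np.zeros(n_ranks - 1)

    def predict(x):
        below = np.nonzero(w @ x - b < 0)[0]
        return (below[0] + 1) if below.size else n_ranks  # ranks are 1..k

    for _ in range(epochs):
        for x, rank in zip(X, y):
            if predict(x) == rank:
                continue
            # For each threshold r: target is +1 if the true rank is above r.
            targets = np.where(rank > np.arange(1, n_ranks), 1.0, -1.0)
            errors = (w @ x - b) * targets <= 0  # thresholds on the wrong side
            tau = targets * errors
            w += tau.sum() * x                   # move w toward the correct side
            b -= tau                             # move the offending thresholds
    return w, b

Prediction for a new feature vector x is then the smallest rank r with w.x < b[r] (defaulting to the top rank); training nudges w and the offending thresholds whenever the predicted rank misses the true one.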

Experimental Result

Extracted words example (word, average distributed rate):

Overall (positive): great 4.6; Rope (golf course) 4.5; praise 4.5; splendid 4.4; mindful 4.4; beautiful 4.4; thoughtful 4.4; perfect 4.4; rich 4.4; thrilling 4.4
Overall (negative): nasty 2.5; cheap 2.6; bad 2.7; asking 2.8; apology 2.8; anger 2.8; finally 2.8; embarrassingly 2.8; ass 2.9; dirty 2.9
Course (positive): Inoue (name) 4.5; Seiichi (name) 4.5; great 4.4; skillful 4.4; Ishioka (golf course) 4.4; agitate 4.4; motivate 4.3; rarity 4.3; variety 4.3; championship 4.3
Course (negative): lack 3.0; nasty 3.0; cheap 3.0; weed 3.0; enforced 3.0; river 3.1; sand pit 3.1; monotone 3.1; gully 3.1; patchy 3.1
Cost_performance (positive): killer price 4.6; real cheap 4.6; performance 4.5; better 4.4; cost 4.4; CP 4.4; cheap 4.4; fire-sale 4.3; great 4.3; outclassing 4.3
Cost_performance (negative): relative expense 3.0; penny-wise 3.1; asking 3.2; terrible 3.2; apology 3.2; embarrassingly 3.3; arrogant 3.3; messy 3.3; ass 3.3; complaint 3.3

We select only 100 words (the 50 most positively and the 50 most negatively polarized words) for each aspect, among words used more than 100 times in the review corpus, and we adopt these word lists as the feature vector elements for each rating aspect.
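As a sketch, this selection step could look like the following Python, assuming we already have each word's average distributed rate and corpus frequency (all names here are ours, not the poster's):

def select_polarized_words(word_avg, word_freq, freq_cutoff=100, n_each=50):
    """Keep words used more than freq_cutoff times, sort them by average
    distributed rate, and take the 50 lowest and 50 highest as the
    100 feature words for one rating aspect."""
    frequent = [w for w in word_avg if word_freq[w] > freq_cutoff]
    ranked = sorted(frequent, key=lambda w: word_avg[w])
    return ranked[:n_each] + ranked[-n_each:]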

Accuracy Measurement

The RankLoss accuracy improved by 15.6% on average.

These word lists show that our method successfully extracts interesting feature words. Our method extracts not only positive and negative words but also words that explain the semantic context of the aspect. For example, "Inoue" and "Seiichi," shown in the Course table, refer to Seiichi Inoue, a famous designer who has designed many golf courses in Japan. The negative side of the Course table includes words such as "weed," "river," and "sand pit," because customers' low ratings are often caused by complaints about the condition of a golf course.


$\mathrm{RankLoss} = \frac{1}{T}\sum_{t=1}^{T} |\hat{y}_t - y_t|$, where $T$ is the number of products, $\hat{y}_t$ is the $t$-th predicted output score, and $y_t$ is the $t$-th desired output score. RankLoss averages the absolute difference between $\hat{y}_t$ and $y_t$ over each iteration (in this case, each restaurant).
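In code, this measure is a one-liner (a sketch; the helper name is ours):

def rank_loss(predicted, desired):
    """Mean absolute difference between predicted and desired scores
    over the T products of one iteration."""
    T = len(predicted)
    return sum(abs(p - d) for p, d in zip(predicted, desired)) / T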

[Graph: the relationship between the data density and the RankLoss (prediction accuracy) using the Book-Crossing Dataset; x-axis: DataDensity (0.05-0.15), y-axis: RankLoss.]

[Figure: the difference between the existing feature vector and our feature vector.]

[Result graphs for each review aspect: RankLoss vs. iteration (1-16) for the overall, course, and cost_performance aspects, comparing the base, customer, and word_avg feature vectors.]

Extracting Feature Words

We extract feature words to use as the feature vector. We define feature words as words whose average distributed rates are polarized.

1. Score Distribution
2. Extracting Polarized Words

We distribute a review rate to each word in a review comment, and do this for all review & rate pairs (about 320,000 reviews). For example, for the food aspect:

Food Review 1 (rate 1): "... foods are terrible, too." — the rate "1" is distributed to all words equally.
Food Review 2 (rate 2): "... monotone taste, and terrible foods" — the rate "2" is distributed to all words equally.
Food Review 3 (rate 5): "... tastes are also better than these other terrible foods" — the rate "5" is distributed to all words equally.

It is then easy to find negative rates when the word "terrible" was used in review comments on the food aspect, and positive rates when the word "delicious" was used.

[Histograms: rate distribution (frequency vs. rate, 1-5) of "delicious" and of "terrible" for the food aspect.]

Rate distribution for each word. If you have any questions, please email me at [email protected]. If you are recruiting new Ph.D. students, please make me an offer!
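A minimal sketch of the score-distribution step above, assuming the corpus is available as (tokenized comment, rate) pairs for one aspect (function and variable names are ours):

from collections import defaultdict

def word_averages(reviews):
    """Distribute each review's rate to every word in its comment and
    return per-word average rates and per-word frequencies."""
    total = defaultdict(float)
    count = defaultdict(int)
    for words, rate in reviews:
        for w in words:
            total[w] += rate   # the rate is distributed to all words equally
            count[w] += 1
    averages = {w: total[w] / count[w] for w in total}
    return averages, count

These averages and frequencies are exactly what the selection sketch in the Experimental Result section consumes.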

Rating Prediction Background

Existing Task 1 (Task A): finding users who have similar preferences & predicting "5".

[Illustration: Tokyo Steakhouse's reviews. User1, User2, and User3 rate it 2/5, 5/5, and 3/5; the target user's rate is unknown (?/5). Problem: there are no users who have the same buying history!]

Many rating predictors use other users' rating rates as the feature vector.

Rating prediction, which informs customers in advance of their likely evaluation of things they want, is a very practical task. Users can decide what they want very quickly, because they only check the predicted rate instead of seeking the few useful reviews among too many long ones. Since existing rating predictors basically use other users' review rates, they share a common problem: the feature vector is too large and sparse. Review sites often include far too many things for each customer to buy and evaluate, and the dimension of the feature vector grows as fast as the number of products. On the other hand, the task that transforms a review comment into review rates for each review aspect uses words as the feature vector. To address the problem of the large, sparse feature vector in rating prediction, we propose a new task that integrates both other users' review rates and their comments. We developed a simple method that improves the accuracy of rating prediction and reduces the dimension of the feature vector using feature words extracted from customer reviews.

Existing Task 2 (Task B): the target user's review on Tokyo Steakhouse.

[Illustration: a review for the item ("... Everything was perfect! The food is mostly delicious twists on classics. ...") is transformed into rates per aspect (Food ?/5, Price ?/5, Overall ?/5, ...). Problem: no practical use for the rating prediction!]

This task, which transforms a review comment into a review rate for each review aspect, uses words as the feature vector.

Proposing Task (Task C): finding the target user's favorite words & predicting "5".

[Illustration of the difference of each task: Task A takes review values written by other users who bought the item and predicts the target user's value. Task B transforms a review sentence into a numerical review value. Task C takes all of Tokyo Steakhouse's reviews, transforms the review sentences written by other users into per-word rates (Word1 "delicious" 4.5/5, Word2 "terrible" 2.8/5, Word3 "crazy" 4.4/5), and predicts the target user's rate (?/5).]

We propose a new task that predicts a review rate using both other users’ review rates and comments.
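The poster does not spell out how the extracted words become the Task C feature vector; one plausible reading, suggested by the "word_avg" label in the result graphs, is sketched below (the construction and all names are our assumptions):

def word_feature_vector(item_reviews, feature_words, word_avg):
    """A hypothetical Task-C feature vector for one item: for each of the
    100 feature words, use that word's corpus-wide average rate if the
    word occurs in the item's reviews, and 0.0 otherwise."""
    seen = {w for words, _rate in item_reviews for w in words}
    return [word_avg[w] if w in seen else 0.0 for w in feature_words]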

Setting of the Experiment

We use the Rakuten golf review dataset in this experiment. This dataset is provided by Rakuten, a Japanese e-commerce company, and all reviews are written in Japanese. The dataset has about 320,000 reviews of about 1,700 golf courses written by 85,000 customers. Customers rate a golf course on a 5-point scale for 8 rating aspects. The review comments contain a total of 15.6 million words with a vocabulary of 43,000 words. We extract feature words using the simple score distribution method: we select the 100 feature words with the highest and lowest average scores among words used more than 100 times in this review dataset, i.e., we reduce the dimension of the feature vector from 85,000 to 100. We run the experiment on the 520 customers who wrote comments for more than 20 golf courses.

[The example of a review (written in Japanese and translated into English here).]

The summary of review rates:

Aspect       1      2      3      4      5     Avg.  Var.
Overall      0.9%   3.2%  26.0%  50.2%  19.7%  3.85  0.64
Staff        1.5%   4.4%  32.7%  42.8%  18.6%  3.73  0.75
Equipment    1.1%   8.3%  44.4%  34.2%  11.9%  3.47  0.72
Food         1.2%   5.9%  44.1%  35.8%  13.1%  3.54  0.70
Course       0.5%   4.1%  34.1%  45.8%  15.4%  3.71  0.63
Cost         0.7%   4.0%  30.6%  37.7%  27.0%  3.86  0.78
Length       2.2%  14.9%  52.3%  24.9%   5.8%  3.17  0.69
Width        1.7%  13.4%  46.9%  29.0%   9.1%  3.30  0.76

If you are interested in my research and want more information, please visit my web site at https://sites.google.com/site/masanaoochi/ !

Rhetorical zone analysis is an application of natural lan- guage processing in .... Support for recognizing specific types of information and ex- ploiting discourse ...