Journal of Internet Business, Issue 8 – 2010

Deriving Customer Loyalty and Its Determinants from Online Reviews using Support Vector Machine

Thomas Sebastian
Curtin University of Technology, Australia

About the Authors

Thomas Sebastian is an Honours student at the School of Information Systems of Curtin Business School at Curtin University of Technology, Australia.

Corresponding Author

Thomas Sebastian
Email: [email protected]


Abstract

With the advent of the Social Web, a great number of end users constantly generate content through various social media, including online reviews. It is known that customer opinions and loyalty play important roles in facilitating retailing. In this paper, we aim to address two key research problems: (1) can we accurately predict customer loyalty from online reviews, and (2) can we identify the main driving factors behind customer loyalty from online reviews? To achieve this goal, we developed a software system that crawls the social web and extracts useful information, and we employed the Support Vector Machine (SVM) to conduct feature selection and classification prediction. Through experimentation, we show that our method can accurately predict customer loyalty towards a brand or service based on online social reviews.

Keywords: Support Vector Machine, Customer Loyalty, Online Reviews


Introduction

It has been reported that 3.5 billion brand-related conversations were generated through various online social media in the US every day.1 In these conversations, customers may express their opinions about a certain brand or a feature of a product. Many more customers read these opinions in order to discover real user experiences before purchasing the product or service. In fact, nearly 80% of consumers consider recommendations from "influential" persons (e.g. family members and friends) and persons "like themselves" to be more trustworthy than any form of advertising (Keller, Wester, & Green, 2006). Bickart and Schindler (2001) suggested that online customer forums hold far more trustworthiness, understanding, and significance than information found on official corporate Web pages.

At the same time, the rapid growth of e-commerce allows more manufacturers and consumers to participate in global e-commerce, and as a result there are millions of goods and services for sale across the Web. On one hand, customers have a greater variety of options; on the other hand, they are now faced with a great many offers without knowing which one best suits their needs at a reasonable price. To tackle this issue, consumers and experts have started to provide online reviews of the products and services they have experience with. These comments and reviews can be found almost anywhere, most notably on retail websites and in discussion forums. There are also websites that specialise in collecting customer reviews, where people give comments and ratings for products they have bought; for some popular products there may be hundreds or even thousands of reviews.

It is known that customer opinions and customer loyalty are important in e-commerce. Retailers around the world therefore strive to glean such consumer intelligence through many methods, including surveys, interviews and field studies. However, collecting opinions through these methods is cost-prohibitive, labour-intensive and time-consuming. In this paper, we aim to address two key research questions, namely: (1) can we accurately predict customer loyalty from online reviews, and (2) can we identify the main driving factors behind customer loyalty from online reviews? To achieve this goal, we developed a software system that crawls the social web and extracts useful information, and we employed the Support Vector Machine (SVM) to conduct feature selection and classification prediction.

1 http://www.pqmedia.com/word-of-mouth-marketing-forecast-2007.html


The paper is organised as follows. Section 2 provides a succinct theoretical background on customer loyalty. Sections 3 and 4 describe our proposed customer review analysis framework and loyalty prediction model. We implement our framework in Section 5 and refine our model in Section 6. A brief discussion of the results is provided in Section 7, and we conclude the paper in Section 8.

Background

Customer loyalty is defined by Oliver (1999) as "a deeply held commitment to rebuy or repatronise a preferred product/service consistently in the future, thereby causing repetitive same-brand or same brand-set purchasing, despite situational influences and marketing efforts having the potential to cause switching behaviour". Heskett et al. (1994) and Reichheld and Teal (1996) argue that customer loyalty is the key to long-term profitability. Reichheld and Schefter (2000) stated that in e-commerce customer loyalty is not only one of many ways to make a profit, but essential for survival. They also noted that increasing customer retention by only 5% could increase profits by 25% to 95% in the long run. One quote from their paper sums it up: "Price does not rule the Web; trust does." In a competitive market such as e-commerce, building customer loyalty is a key factor in capturing larger market share and producing sustainable competitive advantage (Jarvis & Mayo, 1986).

What factors determine customer loyalty? Ribbink et al. (2004) proposed three major factors. The first and most prominent factor is satisfaction, which has been emphasised by Heskett et al. (1994). Second to satisfaction is trust, which prolongs customer relationships (Papadopoulou, Andreou, Kanellis, & Martakos, 2001; Singh & Sirdeshmukh, 2000) and decreases the perceived risk in transactions (Garbarino & Johnson, 1999). The last is quality, or in other words, the perceived value received by the customer (Van Riel, Liljander, & Jurriens, 2001).

It is one thing to recognise the importance of customer loyalty and its main driving factors, but it is quite another to pinpoint whether customers are satisfied, whether they receive the intended value, or whether they believe that the price is worth the product.


The answers to these questions can be found in customer feedback in the form of reviews. These reviews may reflect the loyalty determinants such as satisfaction, trust and the perceived value received by customers; their judgement of the services and their experience during the transaction can also be derived from reviews. However, capturing reviews and comments from websites is not an easy task. One product can be sold on more than one website and one website can sell products from many manufacturers, which makes it harder to collect all of the data. Moreover, not all websites allow access to their databases. Textual information also poses problems; for instance, it is not feasible for retailers to read all the comments or blogs one by one. While there is a great body of knowledge on determining customer loyalty, it has been difficult to apply it to online social reviews. The solution to this problem is a method that captures these reviews, creates a Loyalty Prediction Model from them, and uses the model to predict customer loyalty and derive customer preferences (namely, the loyalty driving factors). To build this model, we employ a supervised machine learning approach.

Supervised Machine Learning

The Support Vector Machine (SVM) is a popular machine learning algorithm used to perform accurate classification.

Liu (2007) claims that in most applications SVM is among the best methods for classification, especially for very high dimensional data. SVM was introduced by Vapnik and co-workers (Boser, Guyon, & Vapnik, 1992; Cortes & Vapnik, 1995; Vapnik, 2000) and has been developed greatly ever since. The aim of SVM is to generate a model from training data that can predict the target value of vectors, or instances, in testing data, given only their attributes (Hsu, Chang, & Lin, 2003). There are several common kernels: polynomial, Radial Basis Function (RBF) and sigmoid. For classification purposes, Hsu, Chang, and Lin (2003) suggest using the RBF kernel; they note that the RBF kernel maps instances nonlinearly, so it is more flexible than a linear kernel. Previous studies (Keerthi & Lin, 2003) have shown that a linear SVM with penalty parameter C achieves the same performance as an RBF kernel with certain parameters (C, γ). Furthermore, the sigmoid kernel behaves like the RBF kernel for certain parameters (Lin & Lin, 2003).
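For reference, the RBF kernel referred to throughout this paper is the standard formulation (this formula is not stated explicitly in the original text):

K(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^2\right), \qquad \gamma > 0

where x_i and x_j are two instance vectors and γ is the kernel width parameter tuned alongside the penalty parameter C.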


The RBF kernel also involves fewer hyperparameters during model selection and, compared with other kernels, presents fewer numerical difficulties. Therefore, in this project, we use the Radial Basis Function (RBF) as the kernel function that transforms the features of the original space. Linear classification requires a clear separation between positive and negative vectors, meaning that the decision boundary must be a hyperplane. However, in many datasets the decision boundaries are nonlinear. Such datasets must be transformed from the original space into a much higher-dimensional space called the feature space.

When using RBF kernels, there are two parameters (C, γ) that affect the prediction result. Unfortunately, the best parameters for a given problem are not known beforehand; the only way to find good values for C and γ is by trial and error. Cross validation and grid search are used for this purpose. Cross validation splits the dataset into v subsets of equal size. Each subset in turn is used as testing data while the rest are used to generate a classifier model; after the model is tested, another subset is chosen as testing data against a new model generated from the remaining subsets, so that each vector in the dataset is predicted exactly once. A grid search is then performed to find the best C and γ using cross validation: combinations of (C, γ) are tested, and the one with the best accuracy is picked. It can be thought of as finding grid points in an X-Y plane.

Conceptual Framework

The conceptual framework is shown in Figure 1. It includes information extraction, web crawling, data conversion and machine learning. These processes are used to capture the data on the website and store it in a database, which is later processed by the Loyalty Prediction Model to produce the prediction. Each shape in the figure represents a component that performs one of these tasks. As these components are not integrated, connectors are needed to make them work together. Another important point is the nature of the output: the output produced by each component cannot be used directly by the other components, so data conversion and scaling are needed in this project.


Even components like the Web Crawler and the Converter must be built from scratch, since they are tailored to the specific website that we used. Each component, connector and data element of this conceptual framework is explained briefly in this section and in detail in later sections.

The first component is the source of data for this project, a website. The website we used is a review-specialised website that stores customer reviews and ratings about services the reviewers have experienced. Its distinguishing feature is that the reviews come from real people who have experienced the services, with no professional or paid reviewers posting comments. For this project, we only use reviews for one of the services offered on the website. However, this website does not provide APIs for extracting the information posted there, so some kind of information extractor is needed to retrieve the data. For this purpose, a web crawler was created specifically for this website. The data produced by this module is the source code of the website's pages.

Figure 1. Conceptual Framework (Website → extracted by the Crawler → stored in the Database → converted by the Converter → supplied to Supervised Learning → Loyalty Prediction Model → produces the Prediction)


The web crawler is the second component of this project. Its function is to extract the selected data from the source code and store it in a format that suits the schema of the relational database. Each website has a different structure and content, so the crawler must be custom built to handle anomalies and structures specific to this website. The crawler selects and retrieves the parts of the source code that are essential or useful, using a combination of obvious structural markers, string splitting and regular expressions. The use of each of these tools is explained in the next section.

The main focus of this project is the reviews written by people. There are two types of review data on the website; we term the first type star ratings and the second yes/no questions. The star ratings rate the quality of the services in five categories, each on a five-level scale, while the yes/no questions answer further questions related to quality. Data conversion is needed to turn the star ratings into a format acceptable to the database. The output of this component is stored directly in the database once the crawling process is completed.

This project uses a relational database, such as MySQL, as the storage platform, since relational databases are widely used and proven reliable. The database consists of four tables, each with its own function: three tables store the data taken from the website and one table functions as a connector. The purpose of the database is to hold the information extracted by the web crawler so that it can be used by the other components. Most of the stored data is of two types, numbers and strings. The focus of this project is the Review table, which stores reviewers' information as well as their ratings and comments. This review data needs to be converted before it can be used by the supervised learning algorithm, so it is sent to the next component.

The Converter converts the format of the data, since supervised learning only accepts certain formats. It retrieves data from the Review table in the database and converts it to a format compatible with supervised learning. Supervised learning only accepts data in the form of vectors of real numbers within a supported range, so the task of the converter is to rearrange the data and scale it. Integer data does not need to be converted, only rearranged, while string data needs to be converted to a numerical value that represents it accurately. One record in the Review table is converted to one vector of real numbers, as sketched below.
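As an illustration of this conversion and scaling step, the following is a minimal sketch; the field names, the encoding of the yes/no answers and the scaling range are assumptions for illustration, not the authors' actual code.

# Minimal sketch of the conversion and scaling step (illustrative only).
# Field names, the label encoding and the scaling range are assumptions.

def encode_label(would_return):
    """Map the 'Would you return' answer to a class label."""
    return {"yes": 1, "not sure": 0, "no": -1}[would_return.lower()]

def scale(value, lo, hi, new_lo=-1.0, new_hi=1.0):
    """Linearly scale a raw value from [lo, hi] into [new_lo, new_hi]."""
    return new_lo + (value - lo) * (new_hi - new_lo) / (hi - lo)

def record_to_libsvm_line(record):
    """Convert one Review-table record (a dict) into one line of the
    LIBSVM format: '<label> 1:<food> 2:<service> ... 9:<credit card>'."""
    label = encode_label(record["would_return"])
    features = [scale(record[f], 1, 5) for f in
                ("food", "service", "price_value", "atmosphere",
                 "overall", "experience")]                        # 1-5 star ratings
    features.append(1.0 if record["reservation"] else -1.0)        # yes/no question
    features.append(scale(record["people_seated"], 1, 20))         # assumed maximum
    features.append(1.0 if record["credit_card"] else -1.0)        # yes/no question
    pairs = " ".join(f"{i + 1}:{v:.4f}" for i, v in enumerate(features))
    return f"{label} {pairs}"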


The result of this conversion is another file containing thousands of lines of numbers. That is not the converter's only task, however: supervised learning takes two inputs, training data and testing data, so the converter is also programmed to split the data into two sets, a training set and a testing set. The data at this stage consists of lines of real numbers.

The aim of this project is to predict whether customers will return to rebuy or repatronise a particular type of service based on the review they wrote on the website. Supervised learning is used to create the Loyalty Prediction Model, which is explained further in the next section. The scaled and converted training data is fed to the supervised learning method to produce a classification model, and this model is then used to predict the testing data in order to assess its accuracy. One of the questions answered by each reviewer is whether they will return or not; the answer to this question becomes the label, or class, of each record in the database. The produced model tries to predict this class from the rest of the data, and the predictions are compared with the labels of the testing data to measure the accuracy of the model. The outcome of this process is the accuracy of the model and a prediction for each record in the testing set.

The Loyalty Prediction Model

The process of creating the Loyalty Prediction Model is shown in Figure 2. The stored data in the database is sent to the conversion and scaling mechanism, which converts the database records into a format readable by the supervised learning method and within the supported scale. To validate our model we need testing data, so the data is split into two sets, training and testing data. To produce the model, the training data first needs to be processed by the training algorithm. The model can be created with or without cross-validation training; we decided to use both options to compare the results and see which is better. If the model is trained without cross-validation training, it is created with the default parameters. When cross-validation training is used, the training set is first processed by another function to generate new parameters, and these new parameters are then used to re-train the data and create another model.


Both models are validated against the testing data to produce predictions, from which we can derive their accuracy. Furthermore, in order to provide marketers with a clear structural understanding, it is essential to find out the contribution level of each attribute of a product or service. We use a feature selection method that determines the importance of each attribute; in particular, we use the F-score to achieve this goal.

Figure 2. Loyalty Prediction Model (Database → Conversion and Scaling → split into Training Data and Testing Data; Training Data → Training, optionally with Cross Validation and Grid Search producing new C and γ followed by Re-Training → Model; the Model is validated against the Testing Data to produce the Prediction and Accuracy; the Feature Selection Method produces the Feature Importance Levels)


The F-score is defined as "a simple technique which measures the discrimination of two sets of real numbers" (Chen & Lin, 2006). Given the training vectors x_k, k = 1, ..., m, with n_+ positive instances and n_- negative instances, the F-score of the ith feature is defined as:

F(i) = \frac{\left(\bar{x}_i^{(+)} - \bar{x}_i\right)^2 + \left(\bar{x}_i^{(-)} - \bar{x}_i\right)^2}{\frac{1}{n_+ - 1}\sum_{k=1}^{n_+}\left(x_{k,i}^{(+)} - \bar{x}_i^{(+)}\right)^2 + \frac{1}{n_- - 1}\sum_{k=1}^{n_-}\left(x_{k,i}^{(-)} - \bar{x}_i^{(-)}\right)^2}

where \bar{x}_i, \bar{x}_i^{(+)} and \bar{x}_i^{(-)} are the averages of the ith feature over the whole, positive and negative data sets respectively, x_{k,i}^{(+)} refers to the ith feature of the kth positive instance, and x_{k,i}^{(-)} is the ith feature of the kth negative instance.
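To make the computation concrete, here is a minimal sketch of this F-score calculation, written directly from the formula above (the study used the feature selection tool distributed with LIBSVM; this stand-alone version is illustrative, and treating labels as one-vs-rest is an assumption):

# Minimal, illustrative F-score computation written from the formula above.
# X is a list of feature vectors and y the corresponding labels; instances with
# label +1 are treated as positive and all others as negative (an assumption).
def f_scores(X, y):
    """Return the F-score of each feature for vectors X and labels y."""
    n_features = len(X[0])
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label != 1]

    def mean(rows, i):
        return sum(r[i] for r in rows) / len(rows)

    def spread(rows, i, m):
        return sum((r[i] - m) ** 2 for r in rows) / (len(rows) - 1)

    scores = []
    for i in range(n_features):
        m_all, m_pos, m_neg = mean(X, i), mean(pos, i), mean(neg, i)
        numerator = (m_pos - m_all) ** 2 + (m_neg - m_all) ** 2
        denominator = spread(pos, i, m_pos) + spread(neg, i, m_neg)
        scores.append(numerator / denominator if denominator else 0.0)
    return scores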

Implementation and Results

We used www.we8there.com to build the original data set. This website is one of the top review websites specialising in restaurant, hotel, and bed and breakfast reviews.

Each restaurant review includes a textual comment and numeric ratings for a restaurant the consumer recently patronised. For the ratings, the website defines a five-star scale: 5 – Excellent, 4 – Above Average, 3 – Average, 2 – Below Average, and 1 – Poor. The ten rating criteria are shown in the example review in Figure 3. With SVM, we built a loyalty model that predicts whether or not a customer will return to the restaurant, using a supervised learning algorithm. As discussed in Section 2, "rebuy or repatronise" behaviour is deemed customer loyalty, so we consider the answer to the question "Would you return" a reliable indicator of customer loyalty. One may doubt the usefulness of prediction since the data also records the answer to this question. It should be noted, however, that our aim is to study the loyalty determinants (namely, the nine criteria in Table 1) that provide customer insight, i.e. the reasons why some customers are loyal while others are not, or are not sure. The answers to this question become either part of the training set used to learn the model or part of the prediction set used to validate it.

Data Processing

Since this website does not provide an API that allows access to its data, we built a web crawler to extract the data and store it in a MySQL 5.0 database.


Next, the data was classified and converted to real numbers in the range of -1, 0 and 1. Each attribute was then given an ordered number that serves as an index, consistent throughout the whole dataset. Each record in the database was converted to one line in the file, starting with the label that indicates the class to which the instance belongs. The attributes of each restaurant used in this research are: Food, Service, Price over value, Atmosphere, Overall, Experience, Reservation, Number of people seated, and Credit Card facility.

Figure 3. A Sample Review

After the data was converted, we created the Loyalty Prediction Model by training on the data. To do so, the data was split into training and testing sets: the training data was used to create a model for predicting future instances, while the testing data was used to validate the model. The records in the testing data were not altered at all, so the first number of each line (i.e. the label or class) remains unchanged; this label was used to compare the predicted result with the actual result. There were 7539 records of review data in the database, of which 3000 were used as training data and the remaining 4539 as testing data.


Learning the First Model

To realise the SVM algorithms, we used the open source software LIBSVM (Chang & Lin, 2001). The prediction function was used after training to validate the model and produce its accuracy. The accuracy of the prediction is defined by:

\text{Accuracy} = \frac{\text{number of correctly predicted instances}}{\text{total number of instances in the testing set}} \times 100\%
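A minimal sketch of this training and validation step, using the Python interface (svmutil) bundled with LIBSVM, is shown below; the file names are assumptions, and the first model in the paper used LIBSVM's default parameters.

# Illustrative sketch of training and validating the first model with LIBSVM's
# Python interface. File names are assumptions; default parameters (RBF kernel).
from svmutil import svm_read_problem, svm_train, svm_predict

train_y, train_x = svm_read_problem("training.txt")   # labels, sparse feature vectors
test_y, test_x = svm_read_problem("testing.txt")

model = svm_train(train_y, train_x, "-t 2")            # -t 2 selects the RBF kernel
labels, accuracy, _ = svm_predict(test_y, test_x, model)
# accuracy[0] is the classification accuracy (%) over the testing set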

The accuracy of the first Loyalty Prediction Model was 92.4653%, which is already quite high. However, we did not stop there. As stated above, the SVM training model uses the RBF kernel, which has a number of parameters, the two most important being C and γ. At the outset, the best C and γ are unknown for any given problem, and a good choice of C and γ may yield better accuracy, so this project sought the best parameters for the Loyalty Prediction Model. For the previous validation, C and γ were not specified, meaning the default values were used: the default value of γ is 1/k, where k is the number of attributes, and the default value of C is 1. Normally, the best C and γ are found by trial and error, with the user trying a range of values and testing them against several randomly split training sets until the best result is obtained; this experimentation is usually realised through cross validation. Hsu, Chang, and Lin (2003) found that it is better to search for the best C and γ using exponentially growing sequences, for example C = 2^-6, 2^-4, ..., 2^8 and γ = 2^10, 2^8, ..., 2^-2. This can also be viewed as finding grid points in an X-Y plane, which is why it is also known as grid search.

Model Selection

The grid-search function is provided with LIBSVM and uses the Python language and gnuplot to produce the two-dimensional graph shown in Figure 4. The grid-search function only needs the training data as input.

The function automatically finds the best C and γ by cross-validating the training set, trying a range of parameter values and comparing their accuracy. First it randomly splits the training data into n subsets; each subset is then used, in turn, to validate a model produced from the remaining subsets. For example, supplying n = 3 splits the training set into three subsets.


The training starts with data sets 1 and 2 to create a model that predicts data set 3, followed by training on data sets 1 and 3 to predict data set 2, and so on. This process continues until the best C and γ are found. Figure 4 shows that, based on the grid search, the best C is 2^13 (8192) and the best γ is 2^-9 (0.001953125), with a cross-validation accuracy of 94%. The graph shows the accuracy as coloured contour lines in the X-Y plane of log2 C and log2 γ. The new C and γ were then used to produce a new Loyalty Prediction Model by retraining the original data with the new parameters. This new model was validated against the testing data, and the resulting prediction accuracy was 92.5975%, a 0.1322% increase over the original result.
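The paper used the grid.py tool shipped with LIBSVM; the following is an equivalent minimal sketch of the same idea, using LIBSVM's built-in n-fold cross-validation option. The parameter ranges match the example sequences above and are illustrative.

# Illustrative grid search over (C, gamma) using LIBSVM's built-in 5-fold
# cross-validation (the study used LIBSVM's grid.py tool; ranges are examples).
from svmutil import svm_read_problem, svm_train

train_y, train_x = svm_read_problem("training.txt")

best = (None, None, 0.0)
for log2c in range(-6, 9, 2):            # C = 2^-6, 2^-4, ..., 2^8
    for log2g in range(10, -3, -2):      # gamma = 2^10, 2^8, ..., 2^-2
        params = f"-t 2 -c {2 ** log2c} -g {2 ** log2g} -v 5 -q"
        accuracy = svm_train(train_y, train_x, params)  # -v returns CV accuracy
        if accuracy > best[2]:
            best = (2 ** log2c, 2 ** log2g, accuracy)

print("best C = %g, best gamma = %g, CV accuracy = %.2f%%" % best)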

Figure 4. Grid Search Graph

Loyalty Determinants

The next step is to determine the importance of each loyalty determinant using feature selection. The purpose of feature selection is to find the level of contribution that each feature, or attribute, makes to the prediction model, based on its F-score. The feature selection process takes two inputs, the training and testing data. To determine the importance of the ith attribute, each record containing the original ith attribute is compared with the same record in which the ith value has been replaced by a randomly generated value.


The result of the feature selection process is shown in Table 1 below. It ranks each attribute's importance in descending order, with a higher score representing a more important feature in determining customer loyalty. The Experience attribute has the highest F-score, while the Overall attribute comes second. The difference between the top two attributes and the remaining attributes is quite substantial, so it can be concluded that these two attributes (i.e. Experience and Overall) are the most influential factors in determining customer loyalty. The last three attributes (Credit Card, Reservation, Number of people seated) make the least contribution towards customer loyalty.

Loyalty Determinant            F-score
Feature 6 (Experience)         3.850842
Feature 5 (Overall)            3.024809
Feature 1 (Food)               1.942560
Feature 3 (Price/Value)        1.672877
Feature 2 (Service)            1.604872
Feature 4 (Atmosphere)         1.008777
Feature 9 (Credit Card)        0.004103
Feature 7 (Reservation)        0.001464
Feature 8 (Seat No.)           0.000192

Table 1. Prioritised Loyalty Determinants

Further Experiment and Results

While we achieved an accuracy rate of more than 92 percent, marketers and managers are more interested in those who answered "no" or "not sure", as these customers provide insightful intelligence for firms to improve their services and are potential prospects. To understand these "no/not sure" answers, we studied the result further. We found that the dataset collected from we8there.com is rather unbalanced: in fact, more than 70% of the reviews answered "yes" to the "Would you return" question.


"Would you return" answer. We need to understand what effect this will have on our prediction result. To do this, we first provide a break-down analysis of our result. Table 2. Prediction Breakdown without Weighting presents the result breakdown under current learning model. It shows the detailed prediction for each label of loyalty response. The shaded cells represent the number of correctly predicted instances for each class label. For example, there are 3560 reviews with actual "yes" response. Out of these 3560 reviews, our model predicts 3492 as "yes" (thus correct), but mistakes 67 of them for "No" (-1), and 1 as "Not Sure" (0), giving the prediction accuracy 98.09%. The worst prediction performance lies in the "Not Sure" loyalty response. Out of 219 reviews, our model only correctly identifies 2 instances, thus yielding a 0.91% poor prediction accuracy rate. Thus, for majority of those customers who are not sure whether or not they will come back, our model put them either as "loyal" or "not loyal". This is undesirable for managers to really identify customers to whom they need to specifically target and may have provided misleading intelligence. Predicted 1 Actual 1 3492 0 101 -1 56

                     Predicted 1   Predicted 0   Predicted -1   Sub-total   Accuracy
Actual Yes (1)           3492            1             67          3560      98.09%
Actual Not sure (0)       101            2            116           219       0.91%
Actual No (-1)             56            1            703           760      92.50%

Table 2. Prediction Breakdown without Weighting
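Such a per-class breakdown can be computed directly from the actual and predicted labels; a minimal sketch (not part of the original system) is:

# Minimal sketch: per-class breakdown (confusion counts and per-class accuracy)
# from actual vs. predicted labels, as in Table 2. Not part of the original system.
from collections import Counter

def breakdown(actual, predicted, labels=(1, 0, -1)):
    """Return {actual_label: (predicted-label counts, sub-total, accuracy %)}."""
    counts = Counter(zip(actual, predicted))
    result = {}
    for a in labels:
        row = {p: counts[(a, p)] for p in labels}
        sub_total = sum(row.values())
        accuracy = 100.0 * row[a] / sub_total if sub_total else 0.0
        result[a] = (row, sub_total, accuracy)
    return result

# Example use after svm_predict: breakdown(test_y, labels)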

We believe this issue is caused by the unbalanced nature of our dataset, as mentioned earlier in this section. To overcome this limitation, we re-trained our model using penalty parameters. The basic idea is to penalise the model when it makes a wrong decision on the "minorities" (e.g. the "not sure" label) by putting a larger "weight" on the cost function during optimisation. Thus each class label has a uniquely weighted version of the original cost function as a constraint for optimisation. This is referred to as class-imbalance SVM (Osuna, Freund, & Girosi, 1997).


A practical issue is how to decide the value of these new weights for each class label. We adopt the method used in Shi, Zhang, Pan, Cheng, and Xie (2007) and assign the weights as

W_c = N / N_c,   c = 1, 2, ..., m

where m is the total number of labels, N_c is the number of samples of class c in the training set, and N is the total number of samples in the training set. Table 3 presents the weights produced by this method for our training set of 3000 records.

Class label        Number in training set    Weight
Yes (1)                    2315              3000/2315 = 1.30
Not sure (0)                105              3000/105 = 28.57
No (-1)                     580              3000/580 = 5.17

Table 3. Weights for Three Class Labels
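A minimal sketch of this weighted re-training with LIBSVM's Python interface is given below. The per-class weights are those of Table 3, the (C, γ) pair is the one reported with Table 4, the -wi options follow LIBSVM's documented per-class weighting, and the file names are assumptions.

# Illustrative sketch of the class-weighted re-training step with LIBSVM.
# Weights come from Table 3 and (C, gamma) from Table 4; file names are assumed.
from svmutil import svm_read_problem, svm_train, svm_predict

train_y, train_x = svm_read_problem("training.txt")
test_y, test_x = svm_read_problem("testing.txt")

# -wi multiplies C for class i: label 1 -> 1.30, label 0 -> 28.57, label -1 -> 5.17
params = "-t 2 -c 2.0 -g 0.5 -w1 1.30 -w0 28.57 -w-1 5.17"
weighted_model = svm_train(train_y, train_x, params)
labels, accuracy, _ = svm_predict(test_y, test_x, weighted_model)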

It can be seen that we assigned a much larger cost weight to the "Not sure" class label in order to reduce the error for this class. We re-trained our model and re-selected the C and γ values for the new model based on these weights. The new prediction result under this weighting scheme is provided in Table 4.

                     Predicted 1   Predicted 0   Predicted -1   Sub-total   Accuracy
Actual Yes (1)           3379           62            119          3560      94.92%
Actual Not sure (0)        83           26            110           219      11.87%
Actual No (-1)             50           42            668           760      87.89%

(C = 2.0, γ = 0.5, Overall Accuracy = 89.7334%)
Table 4. Prediction Breakdown with Weighting

Table 4 shows that the prediction accuracy for the "Not sure" class has improved more than twelve-fold, from 0.91% to 11.87%. However, this accuracy is still rather low compared with the other classes: out of 219 reviews with a "Not sure" loyalty response, our new model picks up only 26 of them. In addition, the prediction accuracy for the "No" class has dropped, which is not desirable from a managerial perspective.


Future work needs to be done to improve the imbalanced, multi-class SVM model to better fit our data.

Discussion

With regard to the two key research problems defined in Section 1, we have found the following answers:

1. For the first problem (can we accurately predict customer loyalty from customer reviews?), our findings suggest that, in the food industry, our Loyalty Prediction Model can predict customer loyalty from customer reviews with an accuracy of 92.5975%. However, the model has shown poor prediction accuracy for reviews with a "Not sure" loyalty response; future work has to be carried out to investigate this issue.

2. For the second problem (can we identify the main driving factors behind customer loyalty from customer reviews?), our findings suggest that, in the food industry, customer experience and overall impression are the most influential factors determining customer loyalty, based on their F-scores. These two factors are followed by Food, Price/Value, Service, Atmosphere, Credit Card, Reservation, and Number of people seated.

The loyalty determinant results show that customer loyalty in the restaurant domain is collectively determined by the two most important factors, Experience and Overall, whose F-scores differ substantially from those of the other determinants. This implies that the experiences customers had and their overall impressions during their visit to a restaurant are the main determinants of future customer loyalty, and restaurant managers need to address these two factors first in order to gain customer loyalty. We believe that our Loyalty Prediction Model can help people, especially managers, to predict future customer loyalty from the reviews customers have already written. Furthermore, managers can also work on improving the important factors that influence customers to return.


Conclusion

The experiment showed that machine learning can support classification decisions and outcome prediction within our proposed Loyalty Prediction Model. The first model was created using default parameters, which resulted in an accuracy of 92.4653%. To improve the performance of the model, a grid-search function was used to produce new parameters, which proved to be better, improving the accuracy of the model by 0.1322%, or 6 records, to 92.5975%. This shows that grid search helps to improve accuracy. A feature selection method was then applied to determine the factors behind customer loyalty. Given the training and testing data, the feature selection method produced the importance of each attribute. The results show two main influential attributes, Customer Experience and Overall Impression, with a large margin between these two attributes and the rest.

References

Bickart, B., & Schindler, R. M. (2001). Internet forums as influential sources of consumer information. Journal of Interactive Marketing, 15(3), 31-40.

Boser, B. E., Guyon, I. M., & Vapnik, V. N. (1992). A training algorithm for optimal margin classifiers.

Chang, C. C., & Lin, C. J. (2001). LIBSVM: A library for support vector machines.

Chen, Y. W., & Lin, C. J. (2006). Combining SVMs with various feature selection strategies. Studies in Fuzziness and Soft Computing, 207, 315.

Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273-297.

Garbarino, E., & Johnson, M. S. (1999). The different roles of satisfaction, trust, and commitment in customer relationships. The Journal of Marketing, 70-87.

Heskett, J. L., Jones, T. O., Loveman, G. W., Sasser, W. E., & Schlesinger, L. A. (1994). Putting the service-profit chain to work. Harvard Business Review.

Hsu, C. W., Chang, C. C., & Lin, C. J. (2003). A practical guide to support vector classification.

Jarvis, L. P., & Mayo, E. J. (1986). Winning the market-share game. Cornell Hotel and Restaurant Administration Quarterly, 27(3), 72.


Keerthi, S. S., & Lin, C. J. (2003). Asymptotic behaviors of support vector machines with Gaussian kernel. Neural computation, 15(7), 1667-1689. Keller, E., Wester, G., & Green, A. (2006). Stats, and Data, and Numbers, oh my! All the facts and figures about word of mouth that you can eat. Paper presented at the Master the Art of Word of Mouth, Viral, Buzz and Blog Marketing, San Francisco, USA. Lin, H. T., & Lin, C. J. (2003). A study on sigmoid kernels for SVM and the training of non-PSD kernels by SMO-type methods. submitted to Neural Computation. Liu, B. (2007). Web data mining: exploring hyperlinks, contents, and usage data: Springer. Oliver, R. L. (1999). Whence consumer loyalty? The Journal of Marketing, 33-44. Osuna, E., Freund, R., & Girosi, F. (1997). Support vector machines: Training and applications. CBCL-144. Papadopoulou, P., Andreou, A., Kanellis, P., & Martakos, D. (2001). Trust and relationship building in electronic commerce. Internet Research: Electronic Networking Applications and Policy, 11(4), 322-332. Reichheld, F. F., & Schefter, P. (2000). E-loyalty: your secret weapon on the web. Harvard business review, 78(4), 105-113. Reichheld, F. F., & Teal, T. (1996). The loyalty effect: The hidden force behind growth, profits, and lasting value: Harvard Business School Pr. Ribbink, D., van Riel, A. C. R., Liljander, V., & Streukens, S. (2004). Comfort your online customer: quality, trust and loyalty on the internet. Managing Service Quality, 14(6), 446-456. Shi, J., Zhang, S., Pan, Q., Cheng, Y., & Xie, J. (2007). Prediction of protein subcellular localization by support vector machines using multi-scale energy and pseudo amino acid composition. Amino Acids, 33(1), 69-74. Singh, J., & Sirdeshmukh, D. (2000). Agency and trust mechanisms in consumer satisfaction and loyalty judgments. Journal of the Academy of Marketing Science, 28(1), 150-167. Van Riel, A. C. R., Liljander, V., & Jurriens, P. (2001). Exploring consumer evaluations of eservices: a portal site. International Journal of Service Industry Management, 12(4), 359377. Vapnik, V. N. (2000). The nature of statistical learning theory: Springer Verlag.
