Fraud Detection by Human Agents: A Pilot Study

Vinicius Almendra and Daniel Schwabe
Departamento de Informática, Pontifícia Universidade Católica do Rio de Janeiro, Rio de Janeiro, Brazil
[email protected], [email protected]

Abstract. Fraud is a constant problem for online auction sites. Besides failing to detect fraudsters, the currently employed methods yield many false positives: bona fide sellers who end up harassed by the auction site as suspects. We advocate the use of human computation (also called crowdsourcing) to improve the precision and recall of current fraud detection techniques. To examine the feasibility of our proposal, we conducted a pilot study with a set of human subjects, testing whether they could distinguish fraudsters from common sellers before negative feedback arrived, looking only at a snapshot of seller profiles. Here we present the methodology used and the results obtained, in terms of precision and recall of the human classifiers, showing positive evidence that detecting fraudsters with human computation is viable.

Keywords: fraud, human computation, e-commerce, classification.

1 Introduction

In recent years we have witnessed tremendous growth in electronic markets, of which online auction sites have an important share. Fraud levels have been increasing at a similar pace [8,12,15]. Online auction sites actively try to reduce fraud levels, for example through internal controls that filter out suspicious users or limit the number of negotiations they can be involved in. These restrictions reduce losses from fraud and lower the perceived risk, as fewer people end up being swindled. However, these controls are far from perfect, as fraudsters actively try to disguise themselves as honest users. This means that many honest sellers will be erroneously treated as suspicious. When treated as fraudsters, hitherto honest sellers might abandon the market, go to a competitor or at least slow down their sales. As online auction sites' primary source of revenue is the commissions charged to sellers, such mistakes must be kept to a minimum. The investigations spawned by fraud detection mechanisms also carry some cost, so the number of users that have to be examined also has a negative impact on profit.

We do not have access to the online auction sites' controls, but we can regard them as classifiers that categorize users to support risk analysis: users with a higher risk of fraudulent behavior face more operational restrictions. The challenge is to balance the benefits of fraud reduction against the losses due to misclassification. In other words, online auction sites need to balance the precision and the recall of their classifiers, as both influence the net profit from fraud detection activities. We advocate a new


approach to improve this balance. Instead of relying solely on automatic classifiers to find suspects, we propose a subsequent analysis of their results using human computation [1,9]: "outsourcing" this refinement task to people recruited on the Internet, who would check suspicious seller profiles in order to confirm or reject the suspicion. Given the previous classification done by current mechanisms, the number of seller profiles to be analyzed would be manageable. The paradigm of human computation has been successfully applied by the site Mechanical Turk (www.mturk.com), where thousands of tasks are distributed each day to virtually anonymous users. This has also been called "crowdsourcing" [5].

In order to check the viability of this proposal, we conducted a pilot study with a group of non-specialized people. The task we assigned to them was to predict non-delivery fraud, the most common complaint in online auction sites [8,11,12]. Given a group of active sellers from MercadoLivre (www.mercadolivre.com.br), the biggest Brazilian auction site, we wanted to check whether those people could predict which sellers would commit fraud in the near future; we also wanted to know how precise this prediction would be. In other words, looking at humans as classifiers, we wanted to know their precision and recall.

1.1 Why Use Human Computation?

There are several reported uses of human computation in the literature: labeling images from the web [1], conducting user studies [14], credit analysis [7], collection of human-reviewed data [18], and moderation of textual comments in discussion forums [13]. All these studies confirm the usefulness of this new paradigm. In the domain of fraud detection, we argue that human computation provides a scalable solution, as the use of human agents can be adjusted to match current needs: if fraud increases, more agents can be employed; if it decreases, fewer agents are needed. This is much harder to accomplish with traditional organizational structures, as hiring and dismissing people are expensive operations. The performance of human agents recruited arbitrarily on the Internet is certainly lower than that of people an enterprise could hire through traditional, focused methods (interviews, résumé analysis etc.). However, this can be counterbalanced by the use of multiple agents to solve the same problem, a strategy already employed by Mechanical Turk.

1.2 Related Work

There is ongoing research on finding fraudsters through data mining in social networks [6,16,19]. The basic idea is to uncover one of the strategies used by fraudsters in online auction sites, which leads to the formation of a bipartite negotiation graph: "honest" identities on one side and fraudulent identities on the other. The "honest" identities are used to boost the reputation of fraudulent identities through false trades. The reputation obtained is then used to commit fraud. The "honest" identities are also used in normal negotiations, making it difficult to link them with fraudulent activity. These identities can be "reused" many times by fraudsters, reducing the total cost of carrying out deceptive activities.



While offering an interesting solution, these works target a very specific fraudulent strategy. Our work complements theirs and offers a more general and long-lasting solution: fraudsters change strategies, but are always fraudsters, which implies some sort of deception that humans can detect [15].

On the side of human computation, the best known work is the ESP game [1], a computer game where people generate textual labels for arbitrary images. Players are matched in pairs and on each round both are exposed to an image. They have some time to type words, knowing that they gain points when one of them writes a word that the other has already typed. As the only means of communication between them is the image, it turns out that when a match occurs, there is a high probability that the matching word is a good descriptive label for the image. The game is based on the fact that the problem of describing images is very easy for people, but extremely difficult for computers. If one can make lots of people solve such problems for a low-cost reward, then we have some sort of viable "human computer".

A notable difference between our work and the preceding one is the complexity of the task. Detecting fraudsters is certainly more difficult than giving descriptive labels to images. This raises two important questions: whether this task is viable, i.e. whether the results are good enough, and whether the amount spent on rewards exceeds what the online auction site recovers through aborted frauds. One may argue that Mechanical Turk "crowdsources" tasks that are not easy. However, we should first check whether people are really capable of detecting fraudsters, at least in a reasonable time; this is the objective of our work.

Gentry et al. [9] proposed a framework for human computation, deriving some theoretical results but without empirical ones. Belenkiy et al. [4] present an incentive mechanism for outsourced computation in a strategic setting, where "contractors" solve computational problems for a "boss". They focus on computational tasks solvable through some known deterministic algorithm, which is not our case.

2 Problem Statement

In order to use human computation, we need to model the problem we are going to solve. Let VE be the set of sellers with active listings at a certain moment in time. We have the following subsets of interest:

• VEF: the subset of VE containing sellers whose listings are fraudulent, that is, who will commit non-delivery fraud.
• VEN: the subset of VE containing bona fide users (hereafter called normal users), that is, those who will deliver the merchandise currently advertised in their listings. This set is the complement of the previous one.

Obviously the auction site does not know beforehand which sellers belong to each category. The true class of fraudsters is revealed when buyers start complaining about non-delivery fraud; this happens several days after sales have been made. When this does not happen for a specific seller, his/her true class is the other one: normal seller. The problem in question is to predict the correct classification of each seller. This prediction yields two new subsets of VE:


• VECF: the subset of sellers classified as fraudsters.
• VECN: the subset of sellers classified as normal.

The problem of interest for the auction site is to find a binary classifier that best approximates the true nature of sellers. We can measure this approximation using information retrieval metrics – precision, recall, fall-out and F-measure [17] – adapted to our scenario. Precision is the fraction of true fraudsters classified as such among all sellers classified as fraudsters:

$$\mathrm{PREC} = \frac{|VE_{CF} \cap VE_{F}|}{|VE_{CF}|}$$

Recall is the fraction of true fraudsters classified as such among all fraudsters:

$$\mathrm{REC} = \frac{|VE_{CF} \cap VE_{F}|}{|VE_{F}|}$$

To obtain a joint value of these two metrics, we can use the F-measure:

$$F = 2 \cdot \frac{\mathrm{PREC} \cdot \mathrm{REC}}{\mathrm{PREC} + \mathrm{REC}}$$

Another relevant metric is the fall-out, which is the fraction of false positives:

$$\mathrm{FO} = \frac{|VE_{CF} \cap VE_{N}|}{|VE_{N}|}$$

This last measure is important, as it quantifies the impact of fraud detection on normal sellers: a large fall-out means more harassed (and dissatisfied) sellers. We will say that a human classifier contributes if s/he does better than a random classifier, that is, if his/her precision is greater than $|VE_{F}| / |VE|$. To rank human classifiers, we will use the F-measure.
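For concreteness, these metrics can be computed directly from the sets defined above. The following Python sketch is illustrative only: the function is ours and the set contents are hypothetical, not taken from the study.

def classifier_metrics(ve_f, ve_n, ve_cf):
    # ve_f: true fraudsters (VEF); ve_n: normal sellers (VEN);
    # ve_cf: sellers flagged as fraudsters by the classifier (VECF).
    tp = len(ve_cf & ve_f)                       # true positives
    fp = len(ve_cf & ve_n)                       # false positives
    prec = tp / len(ve_cf) if ve_cf else 0.0
    rec = tp / len(ve_f) if ve_f else 0.0
    fo = fp / len(ve_n) if ve_n else 0.0
    f = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
    return prec, rec, fo, f

# Hypothetical example: 7 fraudsters, 13 normal sellers, one subject's answers.
fraudsters = {"s1", "s2", "s3", "s4", "s5", "s6", "s7"}
normal = {"s%d" % i for i in range(8, 21)}
flagged = {"s1", "s2", "s3", "s4", "s9"}         # 4 true positives, 1 false positive

prec, rec, fo, f = classifier_metrics(fraudsters, normal, flagged)
base_rate = len(fraudsters) / (len(fraudsters) + len(normal))   # |VEF| / |VE|
print(prec, rec, fo, f)                          # 0.8, ~0.57, ~0.08, ~0.67
print("contributes:", prec > base_rate)          # precision above the random baseline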

3 Pilot Test Design

We designed a pilot test to evaluate the performance of human agents in the classification task outlined in the previous section. We selected a sample of people and, for each subject, followed this procedure:

1. We presented written training material about fraud in auction sites, suggesting they study it for at most 30 minutes;
2. We presented a written explanation of what s/he was expected to do in the pilot test, how to fill in the questionnaire, what they could not do, etc.;
3. We provided a link to a worksheet where answers and some personal data should be recorded: age, profession, and whether they had already used MercadoLivre (hereafter referred to as ML);
4. We presented a list of links to seller profiles, each one pointing to an offline snapshot of the profile of a seller in ML;
5. We instructed them to fill in the worksheet, recording their opinion about each seller (whether or not s/he would commit fraud); the moment they started


analyzing the seller; the moment they gave the answer; and the degree of certainty of his/her answer (more certain or less certain).

We also instructed them to do the test sequentially, in order to better capture the time spent on each seller and to prevent later experiences from influencing earlier answers. After all subjects had done the test, we compared the answers supplied with the real outcomes and computed, for each subject, the metrics presented in the previous section. In the next sections we give a more detailed account of some aspects.

3.1 Sample Selection

We used a convenience sample of 26 people, recruited by the authors. None of them had an earlier background in fraud detection, although seven had already done a simplified version of the pilot study, gaining some experience. We highlight the demographics of the sample in Table 1.

Table 1. Demographics of pilot study sample

Age: from 14 to 56 years old (mean: 28.3 years; median: 25.5 years)
Occupation: IT professionals (5), engineers (1), physicians (3), high school teacher (1), economists (3), undergraduate students (11), lawyers (1), high school student (1)
Experience with ML: 13 had already used ML before
Earlier experience with pilot study: 7 did a previous version of the pilot study

3.2 Seller Profiles

Subjects were asked to evaluate 20 profiles of real sellers from ML, containing seven fraudsters and 13 normal sellers. Instead of evaluating the current profile of a seller, they were allowed to view only a snapshot of it. This snapshot contained a local copy of the following web pages, crawled from the online auction site:

• The seller profile, containing received qualifications (positive, negative or neutral), along with textual comments;
• Profiles of users who qualified the seller;
• The seller's current listings;
• The seller's past listings.

These pages are interlinked, mirroring the actual user experience of a buyer examining a seller profile on the real, live site. Links to web pages not contained in the snapshot were removed. This approach was adopted for two reasons: (i) to guarantee that all members of the sample analyzed the same information; (ii) to guarantee a minimum number of fraudsters. The latter deserves some explanation. In order to build a set of profiles containing both normal sellers and fraudsters, we could not choose sellers randomly, as the probability of finding a fraudster would be very small. Besides that, we could not ask the human agents to evaluate a huge number of


profiles. It was necessary to build the set of profiles in a more controlled manner. For that, we chose a set of active sellers, took snapshots of their profiles, and started monitoring their evolution over time, i.e. the outcome of their sales. When some of them displayed signs of fraudulent activity, we labeled their past snapshots as belonging to a fraudster. When they did not display any signs of fraudulent activity after some weeks, we labeled them as belonging to normal sellers. Dubious cases were excluded from the analysis.

Auction sites do not give an official confirmation that a seller has committed non-delivery fraud. So, we used the following set of indicators, validated in a previous work [3]:

• The seller received several negative feedbacks over a short period of time;
• His/her account was suspended by the auction site and was not reinstated;
• Textual comments explicitly mention non-delivery fraud.

When none of these criteria was satisfied, we considered the seller to be a normal one. When just one or two of them were satisfied, we treated it as a dubious case.

The selection of sellers to be monitored and the moment chosen to take the snapshot of their profiles also obeyed some criteria, in order to increase the probability of finding fraudsters while avoiding profiles that already displayed signs of fraud, as we wanted to test the ability to predict fraud occurrence when there are no clear signs indicating it (e.g. recent negative qualifications). The criteria chosen to decide the moment to take a snapshot were based on the temporal model of non-delivery fraud shown in Figure 1. Many cases of non-delivery fraud fit this model. First, the fraudster obtains an account with enough reputation. Then, s/he lists many good-selling products at attractive prices, becoming part of the set of sellers with fraud potential. After a short time, people start buying from the fraudster, as his/her listings are very attractive. When some buyers realize the fraudulent scheme, they complain to the auction site, besides giving negative qualifications to the seller. A short time after that, the auction site suspends the account and the fraudster disappears with the buyers' money. We call the fraud window the time interval between the first sales and account suspension: this is the "latency" period of the swindle. Of course, many normal sellers also "fit" this model, except for the final part, as they usually deliver the merchandise sold.
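As a concrete restatement of the labeling rule described above (all three indicators satisfied means fraudster, none means normal, otherwise dubious), here is a minimal Python sketch; the function and argument names are ours, chosen for illustration only.

def label_snapshot(several_recent_negatives, suspended_not_reinstated,
                   comments_mention_non_delivery):
    # Count how many of the three indicators are satisfied (booleans).
    satisfied = sum([several_recent_negatives,
                     suspended_not_reinstated,
                     comments_mention_non_delivery])
    if satisfied == 3:
        return "fraudster"   # all indicators satisfied
    if satisfied == 0:
        return "normal"      # no indicator satisfied
    return "dubious"         # one or two satisfied: excluded from the analysis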

[Figure 1 depicts a timeline with the following events: registration or identity theft; entrance in the set of sellers with fraud potential; sales beginning; first signs of fraud (negative qualifications, complaints etc.); account gets suspended. The fraud window spans from the first sales to the account suspension.]

Fig. 1. Temporal model of non-delivery fraud


Our intention was to take the snapshot after the occurrence of the first three events: account acquisition, becoming part of the set of sellers with fraud potential, and beginning of sales. We devised the following criteria to guarantee this (a code sketch restating them follows the list):

• Seller was active: suspended sellers were ignored.
• Seller had no negative qualifications in the previous 30 days: recent negative qualifications trigger suspicion, and a fraudster would avoid them.
• Seller had received fewer than 400 positive qualifications: we had evidence that most fraudsters use identities with fewer positive qualifications.
• Active listings were posted no earlier than 16 days before: in many cases, complaints about the seller appear earlier than this.
• Seller had negotiated fewer than 100 products in the last six months (excluding the current one): again, we have evidence that fraudsters display this behavior.
• Four or more products from the current listings were already sold: we focus on attractive listings, i.e. those that end up in sales.
• Value at stake (sum of listed products' prices) was more than US$ 1,500.00: a bigger value at stake means more profit for the fraudster.
• At least one of the listed products belonged to a category known to attract fraudsters (cell phones, laptops, digital cameras etc.).
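The following Python sketch restates these eight criteria as a filter over a hypothetical summary of a monitored profile. The field names and the category set are ours and do not correspond to an actual ML data structure.

from dataclasses import dataclass

FRAUD_PRONE_CATEGORIES = {"cell phones", "laptops", "digital cameras"}  # illustrative examples

@dataclass
class SellerSnapshot:
    active: bool
    negatives_last_30_days: int
    positive_qualifications: int
    days_since_listings_posted: int
    products_negotiated_last_6_months: int
    current_listing_products_sold: int
    value_at_stake_usd: float
    listing_categories: set

def eligible_for_snapshot(s: SellerSnapshot) -> bool:
    # True only when all eight monitoring criteria are satisfied.
    return (s.active
            and s.negatives_last_30_days == 0
            and s.positive_qualifications < 400
            and s.days_since_listings_posted <= 16
            and s.products_negotiated_last_6_months < 100
            and s.current_listing_products_sold >= 4
            and s.value_at_stake_usd > 1500.00
            and bool(s.listing_categories & FRAUD_PRONE_CATEGORIES))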

We did a daily inspection of thousands of seller profiles over several weeks. When one of them satisfied the conditions above, we took the snapshot and started monitoring it. When we finished this process, we had snapshots of profiles from 7 fraudsters and 59 normal sellers. The average fraud window was five days, that is, on average sellers were suspended five days after we had taken the snapshot.

We took 13 normal seller profiles and mixed them randomly with the 7 fraudster profiles. As we expected subjects to be insecure in their first answers, we added 3 more normal sellers at the beginning of the list. These first three answers were ignored in the data analysis. We informed subjects that the number of fraudsters among the sellers was random: there could be many, or there could be none.

3.3 Training Material

The paradigm of human computation demands that human agents obtain benefits from their collaboration in solving problems. These benefits should depend on the quality of the job done; otherwise opportunistic agents would supply bogus answers just to get the benefits, as happens in some systems based on distributed computation [4]. The benefits may range from fun, as in the "games with a purpose" scenario [1], to real money, as in Mechanical Turk. The important point here is that the presence of benefits creates an incentive to solve the problem well: agents will end up learning how to improve their performance. Capturing this aspect involves long-running experiments, which we could not afford due to practical limitations. However, putting it aside would be a hindrance to our test, as the phenomenon we wish to observe is not the fraud detection abilities of a


random Internet user; we are focusing on unspecialized users with an incentive and the means to improve their results. The simplest way to improve fraud detection skills is to search the Internet, as there are many sites discussing this issue, presenting famous cases, giving hints on how to avoid fraudsters, etc. In order to take the presence of an incentive into account, we opted to give the people in the sample a 10-page training document containing information about fraud in online auction sites. Following is a more detailed description of this material's content:

• Links to ML help pages, for those who were not familiar with the site;
• An explanation of non-delivery fraud. This explanation presented some common strategies to obtain reputation: selling items of small value; identity theft of good sellers; simulating negotiations through the use of other identities; building up reputation through buying instead of selling, since ML does not make a distinction when awarding points;
• Hints about common misunderstandings regarding the meaning of some features that may unduly increase trust in sellers (e.g. the "Safe User" label, posting tracking numbers of delivered merchandise);
• Text excerpts from fraud information sites, three detailed descriptions of alleged frauds, and an explanation of non-delivery fraud, all of them with links to their sources;
• Links to five seller profiles that fit our description of non-delivery fraud. For each profile, we pointed out some signals that could raise suspicion about the seller. These signals were based on the preceding information and on the fraud strategies depicted in a research paper [15]. The links pointed to the real seller profiles, that is, after the alleged fraud had been committed.

We also urged those who would do the test to be as precise as possible, as their performance would depend on both their recall and their precision: classifying many sellers as fraudsters to "guarantee" a good recall would be of no help. To avoid biasing this training material, the author did not examine the seven fraudster profiles used in the pilot study.

4 Data Collected

In Table 2 we display the metrics associated with each person in the sample. Notice that "true positives" is the number of fraudsters correctly identified as such and "false positives" is the number of normal sellers wrongly identified as fraudsters. Some outstanding values of precision, recall and fall-out are highlighted. In Table 3 we show the number of "votes" received by each seller. A participant "voted" on a seller when s/he pointed him/her out as a fraudster. In the fourth row we indicate with an "X" whether the seller was really a fraudster.


Table 2. Metrics of participant performance (ordered by F-measure)

Person   True positives (of 7)   False positives (of 13)   Precision   Recall   Fall-out   F-measure
1                7                        2                   78%       100%      15%        88%
10               6                        2                   75%        86%      15%        80%
4                5                        1                   83%        71%       8%        77%
23               4                        0                  100%        57%       0%        73%
9                4                        0                  100%        57%       0%        73%
7                4                        0                  100%        57%       0%        73%
12               4                        0                  100%        57%       0%        73%
18               6                        4                   60%        86%      31%        71%
11               5                        3                   63%        71%      23%        67%
5                4                        1                   80%        57%       8%        67%
13               6                        6                   50%        86%      46%        63%
26               6                        7                   46%        86%      54%        60%
20               6                        7                   46%        86%      54%        60%
2                5                        5                   50%        71%      38%        59%
21               5                        5                   50%        71%      38%        59%
15               5                        6                   45%        71%      46%        56%
14               3                        1                   75%        43%       8%        55%
6                5                        7                   42%        71%      54%        53%
25               5                        7                   42%        71%      54%        53%
22               5                        8                   38%        71%      62%        50%
8                3                        2                   60%        43%      15%        50%
19               4                        6                   40%        57%      46%        47%
17               2                        0                  100%        29%       0%        44%
3                2                        0                  100%        29%       0%        44%
16               2                        2                   50%        29%      15%        36%
24               1                        2                   33%        14%      15%        20%
Averages       4.4                      3.2                   66%        63%      25%        60%

Table 3. Results by seller (ordered by the number of votes, from left to right)

Rank        1st  2nd  3rd  4th  5th  6th  7th  8th  9th  10th  11th  12th  13th  14th  15th  16th  17th  18th  19th  20th
Seller      16   15   7    12   19   21   8    9    10   17    18    4     13    11    14    20    6     22    5     23
Votes       22   21   19   19   17   13   11   9    8    7     7     7     6     6     5     5     5     5     3     3
Fraudster?  X    X    X    X    X    X    -    -    -    -     -     -     -     -     -     -     -     -     -     X

5 Results Analysis

The results obtained are encouraging: only one person (no. 24) did not contribute, i.e., had a precision below 35%, which is the proportion of fraudsters in the set of seller profiles. There were 10 subjects with both precision and recall greater than 50%. It is worth noticing that five subjects achieved a precision of 100%, i.e., had no false positives; those subjects also had a recall greater than 25%: they were able to catch more than a quarter of the fraudsters without harassing any normal seller.


Observing Table 2, we notice that almost 70% of the subjects (18) correctly classified a number of fraudsters equal to or greater than the number of normal sellers incorrectly classified as fraudsters. That is remarkable, given that there are almost twice as many normal sellers as fraudsters. The relevance of these results shows up in Table 3. If we count each answer as a vote, all fraudsters except one received more votes than any normal seller. This result is important, as it supports the use of human computation to find fraudsters using a voting scheme. Gentry et al. [9] discussed the theoretical viability of majority voting in human computation; in our case, majority voting (more than 13 votes) would uncover 5 of the 7 fraudsters without penalizing any normal seller, a relevant result.

Another interesting fact is the time spent by the subjects. We told them to avoid spending more than ten minutes analyzing each seller. This time limit was chosen based on Mechanical Turk: most tasks with small payments have an allotted time of around ten minutes. In fact, only 1% of the analyses took more than ten minutes to complete, and the average time spent on each seller profile was 3.72 minutes.

Nevertheless, we must look at these results with caution, due to methodological limitations. A convenience sample was used, preventing the use of average values as estimators of the population mean. The small number of fraudsters and normal sellers analyzed also reduces the utility of precision and recall values as estimators, as their variances will be high. It should be noted, however, that the average precision and recall were high (66% and 63%, respectively), giving more confidence in the results. The representativeness of the set of sellers examined in the pilot test is limited, especially regarding fraudsters; perhaps these were much easier to detect. It would be necessary to choose fraudsters from a larger sample. We also analyzed only one type of fraud (non-delivery), although it is the most common one according to the literature. The fraction of fraudsters (35%) present in the set of sellers is also significantly higher than what is observed in online auction sites (reportedly between 0.01% and 0.2% [3,10]). Even though we count on the existence of a previous automatic classification to increase the proportion of fraudsters, we did not find any published results about how well this automatic classification could perform.
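The majority-voting scheme mentioned above can be stated compactly. The sketch below applies it to the fraudsters' vote counts shown in Table 3; the threshold follows from the 26 participants, and the helper names are ours.

def majority_vote(votes_per_seller, num_voters=26):
    # Flag a seller only when more than half of the voters pointed at him/her.
    threshold = num_voters / 2          # more than 13 votes out of 26
    return {seller for seller, votes in votes_per_seller.items() if votes > threshold}

# Vote counts for the seven fraudsters in Table 3 (seller id -> votes).
fraudster_votes = {16: 22, 15: 21, 7: 19, 12: 19, 19: 17, 21: 13, 23: 3}
print(majority_vote(fraudster_votes))   # flags sellers 16, 15, 7, 12 and 19: five of the seven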

6 Conclusions and Future Work

We presented a pilot study on fraudster identification by non-specialized people, where each one was asked to serially analyze snapshots of seller profiles from an online auction site, pointing out the ones they believed to be fraudsters. Those snapshots were taken prior to the first signs of fraudulent behavior, in many cases several days before, mimicking what a common buyer would see. Results were positive: more than a third of the subjects had a precision and recall greater than 50%, and several of them achieved a precision of 100% with a recall greater than 25%, indicating the viability of the approach and encouraging further research. Nonetheless, these results cannot be easily generalized, due to methodological limitations. A full-fledged experiment would be necessary to confirm our findings.

This pilot study is part of a larger research effort on fraudster identification in electronic markets through human computation [2], also comprising an empirical model for


assessing fraud occurrence in online auction sites using publicly available data; an empirical evaluation of fraud occurrence at MercadoLivre; and a fraudster identification mechanism using human computation with incentives, dealing with strategic issues such as remuneration and fraudsters' attempts to manipulate results. The next step of this research is to strengthen the empirical results through a more refined methodology. The pilot study with human subjects can be improved through a prototype implementation of a game with the purpose of predicting fraudulent behavior, rewarding "players" just with points, as the ESP game does. The challenge would be to motivate enough people to use the system in order to obtain a good sample.

Acknowledgments. This research was sponsored by UOL (www.uol.com.br), through its UOL Bolsa Pesquisa program, process number 20060601215400a, and by CNPq, process number 140768/2004-1. Daniel Schwabe was partially supported by a grant from CNPq.

References

1. Ahn, L.V., Dabbish, L.: Labeling images with a computer game. In: Conference on Human Factors in Computing Systems. ACM Press, New York (2004)
2. Almendra, V.: A study on fraudster identification in electronic markets through human computation. PhD thesis, PUC-Rio (2008)
3. Almendra, V., Schwabe, D.: Analysis of Fraudulent Activity in a Brazilian Auction Site. In: Latin American Alternate Track of the 18th International World Wide Web Conference (2009)
4. Belenkiy, M., Chase, M., Erway, C.C., Jannotti, J., Küpçü, A., Lysyanskaya, A.: Incentivizing outsourced computation. In: Third International Workshop on Economics of Networked Systems. ACM Press, New York (2008)
5. Brabham, D.C.: Crowdsourcing as a Model for Problem Solving: An Introduction and Cases. Convergence 14(1), 75–90 (2008)
6. Chau, D.H., Faloutsos, C.: Fraud Detection in Electronic Auction. In: European Web Mining Forum (2005), http://www.cs.cmu.edu/~dchau/papers/chau_fraud_detection.pdf
7. Duarte, J., Siegel, S., Young, L.A.: Trust and Credit. SSRN eLibrary (2009), http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1343275
8. Gavish, B., Tucci, C.: Fraudulent auctions on the Internet. Electronic Commerce Research 6(2), 127–140 (2006)
9. Gentry, C., Ramzan, Z., Stubblebine, S.: Secure distributed human computation. In: 6th ACM Conference on Electronic Commerce. ACM Press, New York (2005)
10. Gregg, D.G., Scott, J.E.: The Role of Reputation Systems in Reducing On-Line Auction Fraud. International Journal of Electronic Commerce 10(3), 95–120 (2006)
11. Gregg, D.G., Scott, J.E.: A typology of complaints about eBay sellers. Communications of the ACM 51(4), 69–74 (2008)
12. Internet Crime Complaint Center: 2007 Internet Crime Report, http://www.ic3.gov/media/annualreport/2007_IC3Report.pdf (accessed June 10, 2008)
13. Jøsang, A., Ismail, R., Boyd, C.: A survey of trust and reputation systems for online service provision. Decis. Support Syst. 43(2), 618–644 (2007)


14. Kittur, A., Chi, E.H., Suh, B.: Crowdsourcing user studies with Mechanical Turk. In: 26th Annual SIGCHI Conference on Human Factors in Computing Systems. ACM Press, New York (2008)
15. Nikitkov, A., Stone, D.: Towards a Model of Online Auction Deception. In: ECAIS European Conference on Accounting Information Systems (2006), http://www.fdewb.unimaas.nl/marc/ecais_new/files/2006/Paper5.pdf
16. Pandit, S., Chau, D.H., Wang, S., Faloutsos, C.: NetProbe: A Fast and Scalable System for Fraud Detection in Online Auction Networks. In: International World Wide Web Conference. ACM Press, New York (2007)
17. Rijsbergen, C.J.V.: Information Retrieval. Butterworth-Heinemann (1979)
18. Su, Q., Pavlov, D., Chow, J., Baker, W.C.: Internet-scale collection of human-reviewed data. In: 16th International Conference on World Wide Web. ACM Press, New York (2007)
19. Zhang, B., Zhou, Y., Faloutsos, C.: Toward a Comprehensive Model in Internet Auction Fraud Detection. In: 41st Annual Hawaii International Conference on System Sciences. IEEE Computer Society Press, Los Alamitos (2008)
