© Detica 2005

Data Matching: Can fraud detection techniques be used for CRM?
Authors: David Porter & Nicolas Mallison

Detica
• We are the UK’s leading specialist IT consultancy
• As one of SAS UK’s partners we help clients generate intelligence from large, enterprise-wide sources of information and apply it to solving business problems – we call this Information Intelligence
• Like SAS, our clients are blue-chip companies and major government departments

Key facts
• 660 people
• 200 SAS consultants
• Turnover £71m (2005)
• Offices in UK and USA

Key markets
• Financial Services
• Telecoms
• Government
• Utilities
• Life Sciences

Agenda
• Detica overview
• From CRM to fraud and back again
• Introducing Social Network Analysis
• Why traditional data matching doesn’t work
• Demonstration of how Network Matching works
• Social Network Analysis Vs traditional approaches
• Insurance fraud case study
• What Network Matching cannot do
• Viral marketing thoughts
• Questions

Detica ‘solutions & technologies’ framework
• Our Information Intelligence specialisation delivers expertise in a number of solution areas, leveraging a number of technologies

[Diagram: Solutions and Technologies]

Detica services
• Within the solutions and technologies we deliver a number of services

BUSINESS CONSULTING
  - Business case development
  - Business requirements capture
  - Business process design
  - Creative services

TECHNOLOGY CONSULTING
  - Technology strategy
  - System requirements capture
  - System architecture design
  - Creative services

SYSTEM INTEGRATION
  - Data warehouse/analytics platform build
  - Bespoke Java/.NET development
  - Content management, workflow, portal implementation
  - High performance hardware design & build
  - System integration & testing

SYSTEM SUPPORT & MAINTENANCE
  - Application management
  - 24/7, SLA-based support
  - Reactive & proactive support

Detica markets and clients
• We have experience in working with a large number of industry leaders and government departments

[Figure: client logos. Commercial clients: Financial Services; Telecoms; Utilities & Life Sciences. Government clients: National Security; Government.]


From CRM to Fraud and back again
• Original methods taken from CRM
  - First we considered the problem of data matching for fraud investigations as simply another CRM Single Customer View (SCV) project
  - However, commercial products were found to be inadequate because they could not identify aliases and associates, or deal with the typically poor and heterogeneous nature of the data available
• Background to discovery of method – insurance data
  - We asked the question: “Why could a human see the obvious relationships within the data and yet the algorithms would not create matches?”
  - We began to modify rules, loosen the criteria, use our knowledge of criminal investigations and Social Network Analysis and we took a more holistic approach
  - The tools that worked in the end bore little resemblance to their CRM origins and this prompted another question: “What would happen if we applied these fraud detection techniques back to marketing?”


What is Social Network Analysis?
• Family of tools and techniques
• Visual and quantitative analysis of any kind of complex system
• Used to identify characteristics of networks, emerging sub-groups and the key players involved
• Types of Network Analysis include:
  - Structural Analysis of Networks: key players, groups and links
  - Network Visualisation: qualitative behavioural analysis
  - Simulation and Agent-based network modelling: generating what-if scenarios to inform decision making and deal with problems of imperfect data
  - Critical Path Analysis: identify cut points in networks that flow
  - Time Series and Trend Analysis (Traffic Analysis): activity-based identification of areas of interest

Why is Network Analysis useful?
• Appropriately undertaken, Network Analysis enables activities and provides results that other forms of analysis are unable to deliver, including:
  - Exploiting available information to extract and reconstruct the network;
  - Dealing with missing or imperfect data and establishing a level of confidence through practical techniques and human validation;
  - Dealing with networks as complex adaptive systems to model likely scenarios;
  - Generating sets of targets / strategies to affect the network in the desired way (given the level of confidence in the data); and
  - Measuring the effectiveness of the operations conducted.

Ability to identify, understand and evaluate networks of collaborating individuals and organisations


Why traditional data matching does not work
• There are a number of reasons why traditional data matching is less successful than expected, including:
  ✗ Bad data (e.g. corrupted by transfer or inaccurately keyed);
  ✗ Sparse data (e.g. missing fields including names); and
  ✗ Poor data (e.g. unrelated data, resulting in little useful information).
• In addition:
  ✗ Criminals don’t want to be identified, so they will try to put minimum information into documents or lie about their identity
  ✗ Fundamentally, there is rarely enough information contained within two individual records to categorically match them to one another – ever!


Network Matching – Insurance fraud example
• Our example highlights the use of two separate policies and two separate claims within a potential fraud ring

Policy A
  Driver: Tom Brown, 55 Acacia Gdns, 0207 435 7660, 01.07.67
  2nd Driver: Betty Brown, 55 Acacia Gdns, 6.8.72
Claim A
  Injured Passenger: John Smith, 0208 675 8876, 11.8.73
Policy B
  Driver: T Smith, 44 Hape Street, 0208 675 8876, 07.01.67
  2nd Driver: B Smith, 44 Hope Street, 0208 675 8876, 6.8.72
Claim B
  Witness: Thomas Braun, 0207 435 7660, 01.07.67

We can link claim to policy (not always as easy as it should be).
Note: lack of address in some records and phone number in others, also mis-spellings in address etc.

Network Matching – Insurance fraud example (cont…)
• With only limited information a potential fraud ring is detected

[Diagram: the records from both policies and both claims drawn as a linked network]
Policy A: Driver Tom Brown, 55 Acacia Gdns, 0207 435 7660, 01.07.67
Policy A: Named Driver Betty Brown, 55 Acacia Gdns, 6.8.72
Claim A: Injured Passenger John Smith, 0208 675 8876, 11.8.73
Policy B: Driver T Smith, 44 Hape Street, 0208 675 8876, 07.01.67
Policy B: B Smith, 44 Hope Street, 0208 675 8876, 6.8.72
Claim B: Witness Thomas Braun, 0207 435 7660, 01.07.67

The ring connects!
Softer Links – coincidence or identity abuse?
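The linking in this example can be sketched in a few lines of Python. This is an illustrative sketch only, not the SAS process used by the authors: the record ids, the `build_link_list` helper and the rule "link two records when they share a phone number or a date of birth" are all assumptions made for the illustration.

```python
from itertools import combinations

# The six records from the example, keyed by hypothetical record ids.
RECORDS = {
    "A-driver":  {"name": "Tom Brown",    "phone": "0207 435 7660", "dob": "01.07.67"},
    "A-driver2": {"name": "Betty Brown",  "phone": None,            "dob": "6.8.72"},
    "A-claim":   {"name": "John Smith",   "phone": "0208 675 8876", "dob": "11.8.73"},
    "B-driver":  {"name": "T Smith",      "phone": "0208 675 8876", "dob": "07.01.67"},
    "B-driver2": {"name": "B Smith",      "phone": "0208 675 8876", "dob": "6.8.72"},
    "B-claim":   {"name": "Thomas Braun", "phone": "0207 435 7660", "dob": "01.07.67"},
}


def normalise_phone(phone):
    """Strip spaces so that differently formatted numbers compare equal."""
    return phone.replace(" ", "") if phone else None


def normalise_dob(dob):
    """Turn 'd.m.yy' into a tuple of ints so '01.07.67' equals '1.7.67'."""
    day, month, year = dob.split(".")
    return (int(day), int(month), int(year))


def build_link_list(records):
    """Return (id_a, id_b, shared_fields) for every pair with something in common."""
    links = []
    for (id_a, a), (id_b, b) in combinations(records.items(), 2):
        shared = []
        phone_a, phone_b = normalise_phone(a["phone"]), normalise_phone(b["phone"])
        if phone_a and phone_a == phone_b:
            shared.append("phone")
        if normalise_dob(a["dob"]) == normalise_dob(b["dob"]):
            shared.append("dob")
        if shared:  # assumed threshold: one shared field is "enough" commonality
            links.append((id_a, id_b, shared))
    return links


if __name__ == "__main__":
    for id_a, id_b, shared in build_link_list(RECORDS):
        print(id_a, "<->", id_b, "via", ", ".join(shared))
```

Under this assumed rule, Tom Brown (Policy A) links to the witness Thomas Braun (Claim B) on both phone and date of birth, and the injured passenger John Smith (Claim A) links to both Policy B drivers by phone. The exact-match rule does not link Tom Brown's 01.07.67 to T Smith's 07.01.67; a looser comparison would be needed to surface that kind of near miss.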

Techniques used
• Complex SAS data steps & SQL were used to:
  - Extract & identify meaningful information from multiple sources of different quality;
  - Generate multiple possible versions of the truth through combination;
  - Enable fuzzy information comparison using phonetic encoding (such as SOUNDEX) and asymmetric spelling distance (such as SPEDIS); and
  - Code logic to create a link list between records when “enough” commonality exists.
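To make the fuzzy comparison step concrete, here is a minimal Python sketch. It is not the authors' SAS code. `soundex` follows the standard American Soundex rules, which may differ in detail from the SAS SOUNDEX function, and `edit_distance` is plain symmetric Levenshtein distance, swapped in as a stand-in because SPEDIS is specific to SAS and asymmetric. The `names_similar` helper and its threshold are assumptions for illustration.

```python
# Letter-to-digit table for standard American Soundex.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(name):
    """Return the 4-character American Soundex code of a name."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            encoded += code
        prev = code  # a vowel resets prev, so a repeated code is kept
    return (encoded + "000")[:4]


def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete from a
                           cur[j - 1] + 1,              # insert into a
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


def names_similar(a, b, max_distance=1):
    """Assumed rule: same Soundex code, or at most max_distance edits apart."""
    return soundex(a) == soundex(b) or edit_distance(a.lower(), b.lower()) <= max_distance


if __name__ == "__main__":
    print(soundex("Brown"), soundex("Braun"))
    print(edit_distance("44 Hape Street", "44 Hope Street"))
```

With these definitions the surnames Brown and Braun share a Soundex code, and the two spellings of the Policy B address are one edit apart, which is the kind of near match an exact comparison would miss.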

• Complex SAS macro statements were used to:
  - Merge link lists recursively to themselves in order to:
    - determine entities (such as locations, persons, vehicles, businesses, etc); and
    - build networks of original records linked together via those entities.
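The effect of merging a link list with itself until nothing changes is to group records into connected networks. The sketch below reaches the same grouping with a union-find structure in Python. This is a swapped-in technique chosen for brevity, not the SAS macro approach described above, and the record ids and the `build_networks` name are hypothetical.

```python
def build_networks(links):
    """Group record ids into networks given (id_a, id_b) links.

    Uses union-find: every record starts as its own network, and each
    link merges the two networks its endpoints belong to.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    networks = {}
    for node in parent:
        networks.setdefault(find(node), set()).add(node)

    # Largest networks first; ties broken alphabetically for stable output.
    return sorted((sorted(members) for members in networks.values()),
                  key=lambda members: (-len(members), members))


if __name__ == "__main__":
    # Hypothetical link list: r1-r2 and r2-r3 chain into one network.
    example_links = [("r1", "r2"), ("r2", "r3"), ("r4", "r5")]
    print(build_networks(example_links))
```

Records r1 and r3 are never compared directly, yet they end up in the same network because both link to r2. That transitive grouping is what allows a network to form from records that individually hold too little information to match.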


Social Network Analysis vs. traditional methods
• Social Network Analysis is more likely to present an accurate ‘picture’ of who we are

Demographics: Assumes that we are who we live near.
However, are we really like our neighbours?

Behavioural Analysis: Assumes that we are what we eat/buy/do.
Whilst it is a better predictor than demographics alone, volume data can be expensive to collect, and Mahatma Gandhi, Albert Schweitzer and Ozzy Osbourne were/are all vegetarians!

Lifestyle Survey Analysis: Assumes that we are what we say we do/behave.
However, would you tell the truth about your ‘Crazy Frog’ ringtone?

Network Analysis: We are who we know, live with and share our bank accounts and telephones with.
Network Analysis is based on facts and does not try to guess associations, e.g. would you share the same bank account with a stranger?

Why Network Matching works in CRM
• There are a number of reasons why Network Matching works within a Customer Relationship Management context, including:
  ✓ It swaps data quality for data volume;
  ✓ It makes use of any available additional name and address data - all data adds value, not just customer lists;
  ✓ It is ‘provable’ - you can see how it works;
  ✓ It can absorb any existing customer metrics that you know work and make them better. For example, you could judge the value of a network by how many high spenders it included; and
  ✓ It has a precedent in ‘Household’ analytics and Viral Marketing.
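The high-spender example can be sketched as follows. This is a hypothetical illustration in Python: the customer ids, the spend figures, the 1000 threshold and the `score_networks` function are all assumptions, and any existing customer metric could be substituted for annual spend.

```python
def score_networks(networks, annual_spend, high_spend_threshold=1000.0):
    """Rank networks by how many high spenders each one contains.

    networks: iterable of collections of customer ids
    annual_spend: dict of customer id -> spend (missing ids count as 0)
    """
    scored = []
    for members in networks:
        high_spenders = [m for m in members
                         if annual_spend.get(m, 0.0) >= high_spend_threshold]
        scored.append({
            "members": sorted(members),
            "size": len(members),
            "high_spenders": len(high_spenders),
        })
    # Most high spenders first; larger networks win ties.
    scored.sort(key=lambda n: (-n["high_spenders"], -n["size"]))
    return scored


if __name__ == "__main__":
    networks = [["c1", "c2", "c3"], ["c4", "c5"]]
    spend = {"c1": 1500.0, "c2": 200.0, "c4": 2500.0, "c5": 1200.0}
    for row in score_networks(networks, spend):
        print(row)
```

Here the smaller network ranks first because both of its members are high spenders, which shows how an existing per-customer metric can be carried over to the network level.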


Insurance fraud case study
• Crime is spread across insurers; the same individuals target lots of companies
• The criminals’ techniques have evolved to evade current matching systems
• “Multiple” identities are prevalent, but within the bounds of data quality
• Key individuals can be identified as knowledge propagators

Example is based on a proof of concept in the insurance industry

Identification of fraud networks
• The representation opposite provides a sample set from the hundreds of potentially suspect networks detected in 0.5% of the UK
• The largest network contains around fifty claims
• The 100th largest in 0.5% of the UK contained 20 claims
• Some individuals on some rings were known, but this was not the case for all

Our approach
• Our approach brought together innovative thinking to solve a problem that has been costing the industry £millions each year for a significant period of time, by:
  - Creating a new way to match the customer data using SAS software
  - Using data from 20 companies (80+ file formats)
  - Scoring each network of associated customer records to prioritise probable fraudulent behaviour
  - Creating a set of web-based tools so that the investigators could:
    - visualise the data as networks
    - visualise the data geographically and temporally
    - search through the underlying unstructured data with ‘Google’-like functionality
    - collaborate pan-organisation on investigations


Caveat – what Network Analysis cannot do
• Network Analysis is based on facts, not hypotheses, which naturally presents some limitations, including:
  - You will need data - so the method will predominantly favour existing customer retention;
  - It will not help you directly acquire unknown customers; and
  - It can still be wrong – it can link incorrectly – hence investigators are taught not to assume guilt by association.


Viral marketing overview
• Viral marketing:
  - Designing marketing campaigns that are passed by word of mouth between associates
  - Enables effortless (cost-free) transfer to others
  - Exploits common motivations and behaviours
  - Utilises existing communication networks
• To be successful, it requires:
  - The recipient to ‘buy in’ to the message; and
  - The recipient to forward the offer to an associate.
  … and this is difficult to achieve
• In reality, organisations therefore need to deliver ‘targeted viral marketing’, to ensure that marketing messages cut through the ‘spam’ we all receive and to minimise the need to offer generic incentives to pass messages on


How Network Analysis can help
• Network Analysis assists in identifying who lies at the hub of targeted ‘human’ networks and therefore who can lead and drive the exponential growth of your offer/message
• Network Analysis provides the tools to identify who the ‘movers and shakers’ are in human networks; by understanding their motivations and behaviours, you can begin to reinvent your viral and target marketing activities

*** RESULT *** Greater return from your viral marketing and a more integral tool for supporting all your customer communication activities
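One simple way to find who lies at the hub of a network is to count each person's direct connections (degree centrality). The Python sketch below assumes that measure purely for illustration; the slides do not say which measure was used, and the names and the `find_hubs` function are hypothetical.

```python
from collections import defaultdict


def find_hubs(links, top_n=1):
    """Return the top_n (person, connection_count) pairs by number of direct links."""
    neighbours = defaultdict(set)
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Most connected first; ties broken alphabetically for stable output.
    ranked = sorted(neighbours, key=lambda person: (-len(neighbours[person]), person))
    return [(person, len(neighbours[person])) for person in ranked[:top_n]]


if __name__ == "__main__":
    # Hypothetical network: ann is linked to three people, dan to two.
    example_links = [("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("dan", "eve")]
    print(find_hubs(example_links, top_n=2))
```

In this toy network ann would be the first target for a message, while responses from bob, cat, dan and eve can still be monitored to see how far it spreads.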


Final thoughts
• The focus is on ‘evolution’ not ‘revolution’ – your trusted analytics can be successfully absorbed by network analysis
• Traditionally, success in marketing has been about targeting a group and measuring success by feedback - this is no different
• The 1990s saw the emergence of 1-to-1 marketing, customer relationship management and micro segmentation – Network Analysis is simply a way to rationalise spending and target resources
• Hence… Only spend marketing effort on the most important member of a network, but monitor all of its nodes to see where the responses come from


Questions


Contact details

David C. Porter
Tel: 01483 816809
Email: [email protected]

Head Office
Surrey Research Park, Guildford, Surrey GU2 7YP
Tel: +44 (0)1483 816000
Fax: +44 (0)1483 816144

London Office
2 Arundel Street, London WC2R 3AZ
Tel: +44 (0)20 7812 4000
Fax: +44 (0)20 7812 4100

Cheltenham Office
1220 Lansdowne Court, Gloucester Business Park, Gloucester GL3 4AB, UK
Tel: +44 (0)1452 632400
Fax: +44 (0)1452 632424

Network Intelligence Systems: for enacting Network Effects

[Architecture diagram. Labels:]
• Web/Open Source; Open Source Media Monitoring; Existing SNA Database; Transaction Data
• Data capture/Preparation
• Database/Document Repository
• Automated Extraction of Networks Entities/Links; Extraction of themes and context; Automated Extraction of Networks Entities/Links from SNA/Transactional
• Network Filtering: Temporal; Geographic; Thematic/Context
  - Networks entities and links (e.g. who, where)
  - Context themes (e.g. what and why)
• Network Analysis and Monitoring
  - Trend Analysis: network and themes
  - Gist summaries of network context
  - Collation of information of key nodes in networks
• Analyst Interface
  - Detica Network Monitor: Network Viz; Trend Analysis
  - Cross Source Search and Information Retrieval
  - Export to tools (e.g. I2, UCINET VI etc)
• Advanced Network Tools
  - Automated data mining and pattern detection
  - Network Validation & Sensitivity analysis
  - Network Effects: planning, prediction tools

Insurance Model

[Process-flow diagram. Labels:]
• Data store
• Data Preparation: Load data into SAS; Data cleanse and fix; Review
• Identify linking fields; Construct matching keys; Build link list
• Produce networks (use custom SAS process)
• Perform network analytics/score
• Construct table for visualisation
• Evaluate result: Interpret / Monitor
• Investigation
• Deploy: Utilise insight
• Update
• Legend: Optional step; Required step

Handover product is network list ready for investigation

Datalab data analytics process

[Process-flow diagram. Labels:]
• Collect & Load data
• Audit data
• Cleanse & Fix data
• Link / Merge data
• Enhance data
• Transform data: Derive information
• Unsupervised Analytics: Explore data; Discover patterns
• Supervised Analytics: Model data; Predict outcomes
• Evaluate result: Interpret / Monitor
• Deploy: Utilise insight
• Review (between stages); Update
• Several stages are marked "time boxed"
• Legend: Optional step; Required step

Engagement time scales typically vary from 3 weeks to 3 months

Data Matching - sasCommunity.org
