© Detica 2005
Data Matching: Can fraud detection techniques be used for CRM?
Authors: David Porter & Nicolas Mallison
Detica
We are the UK’s leading specialist IT consultancy.
As one of SAS UK’s partners, we help clients generate intelligence from large, enterprise-wide sources of information and apply it to solving business problems. We call this Information Intelligence.
Like SAS, our clients are blue-chip companies and major government departments.
Key facts
• 660 people
• 200 SAS consultants
• Turnover £71m (2005)
• Offices in UK and USA
Key markets
• Financial Services
• Telecoms
• Government
• Utilities
• Life Sciences
Agenda
• Detica overview
• From CRM to fraud and back again
• Introducing Social Network Analysis
• Why traditional data matching doesn’t work
• Demonstration of how Network Matching works
• Social Network Analysis vs. traditional approaches
• Insurance fraud case study
• What Network Matching cannot do
• Viral marketing thoughts
• Questions
Detica ‘solutions & technologies’ framework
Our Information Intelligence specialisation delivers expertise in a number of solution areas, leveraging a number of technologies.
[Figure: framework diagram with two groups, Solutions and Technologies]
Detica services
Within the solutions and technologies we deliver a number of services:
BUSINESS CONSULTING
• Business case development
• Business requirements capture
• Business process design
• Creative services
TECHNOLOGY CONSULTING
• Technology strategy
• System requirements capture
• System architecture design
• Creative services
SYSTEM INTEGRATION
• Data warehouse/analytics platform build
• Bespoke Java/.NET development
• Content management, workflow, portal implementation
• High performance hardware design & build
• System integration & testing
SYSTEM SUPPORT & MAINTENANCE
• Application management
• 24/7, SLA-based support
• Reactive & proactive support
Detica markets and clients
We have experience in working with a large number of industry leaders and government departments.
[Figure: client logos grouped as Commercial clients (Financial Services; Telecoms; Utilities & Life Sciences) and Government clients (National Security; Government)]
From CRM to Fraud and back again
Original methods taken from CRM
• First we considered the problem of data matching for fraud investigations as simply another CRM Single Customer View (SCV) project
• However, commercial products were found to be inadequate because they could not identify aliases and associates, or deal with the typically poor and heterogeneous nature of the data available
Background to discovery of method – insurance data
• We asked the question: “Why could a human see the obvious relationships within the data and yet the algorithms would not create matches?”
• We began to modify rules, loosen the criteria and draw on our knowledge of criminal investigations and Social Network Analysis, taking a more holistic approach
• The tools that worked in the end bore little resemblance to their CRM origins, and this prompted another question: “What would happen if we applied these fraud detection techniques back to marketing?”
What is Social Network Analysis?
• Family of tools and techniques
• Visual and quantitative analysis of any kind of complex system
• Used to identify characteristics of networks, emerging sub-groups and the key players involved
Types of Network Analysis include:
• Structural Analysis of Networks: key players, groups and links
• Network Visualisation: qualitative behavioural analysis
• Simulation and Agent-based network modelling: generating what-if scenarios to inform decision making and deal with problems of imperfect data
• Critical Path Analysis: identify cut points in networks that flow
• Time Series and Trend Analysis (Traffic Analysis): activity-based identification of areas of interest
Why is Network Analysis useful?
Ability to identify, understand and evaluate networks of collaborating individuals and organisations
Appropriately undertaken, Network Analysis enables activities and provides results that other forms of analysis are unable to deliver, including:
• Exploiting available information to extract and reconstruct the network;
• Dealing with missing or imperfect data and establishing a level of confidence through practical techniques and human validation;
• Dealing with networks as complex adaptive systems to model likely scenarios;
• Generating sets of targets / strategies to affect the network in the desired way (given the level of confidence in the data); and
• Measuring the effectiveness of the operations conducted.
Why traditional data matching does not work
There are a number of reasons why traditional data matching is less successful than expected, including:
• Bad data (e.g. corrupted by transfer or inaccurately keyed);
• Sparse data (e.g. missing fields including names); and
• Poor data (e.g. un-related data, resulting in little useful information).
In addition:
• Criminals don’t want to be identified, so they will try to put minimum information into documents or lie about their identity
• Fundamentally, there is rarely enough information contained within two individual records to categorically match them to one another – ever!
Network Matching – Insurance fraud example
Our example highlights the use of two separate policies and two separate claims within a potential fraud ring
Policy A
• Driver: Tom Brown, 55 Acacia Gdns, 0207 435 7660, 01.07.67
• 2nd Driver: Betty Brown, 55 Acacia Gdns, 6.8.72
Claim A
• Injured Passenger: John Smith, 0208 675 8876, 11.8.73
Policy B
• Driver: T Smith, 44 Hape Street, 0208 675 8876, 07.01.67
• 2nd Driver: B Smith, 44 Hope Street, 0208 675 8876, 6.8.72
Claim B
• Witness: Thomas Braun, 0207 435 7660, 01.07.67
We can link claim to policy (not always as easy as it should be)
Note: lack of address in some records and phone number in others, also mis-spellings in address etc.
Network Matching – Insurance fraud example (cont…)
With only limited information a potential fraud ring is detected
Policy A
• Driver: Tom Brown, 55 Acacia Gdns, 0207 435 7660, 01.07.67
• Named Driver: Betty Brown, 55 Acacia Gdns, 6.8.72
Claim A
• Injured Passenger: John Smith, 0208 675 8876, 11.8.73
Policy B
• Driver: T Smith, 44 Hape Street, 0208 675 8876, 07.01.67
• 2nd Driver: B Smith, 44 Hope Street, 0208 675 8876, 6.8.72
Claim B
• Witness: Thomas Braun, 0207 435 7660, 01.07.67
The ring connects!
Softer Links – coincidence or identity abuse?
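The links in the example above can be reproduced with a minimal Python sketch. This is an illustration only, not the SAS process used in the work described: the record labels, field names and the choice of phone, date of birth and address as exact-match keys are all assumptions made for the example. Only exact matches on normalised values are used, so the "Hape"/"Hope" address variants and the swapped day and month in T Smith's date of birth are deliberately left unmatched here.

```python
from itertools import combinations

# Records transcribed from the example slides. The "id" labels and field
# names are hypothetical and exist only for this illustration.
RECORDS = [
    {"id": "PolicyA/driver", "name": "Tom Brown", "address": "55 Acacia Gdns",
     "phone": "0207 435 7660", "dob": "01.07.67"},
    {"id": "PolicyA/2nd_driver", "name": "Betty Brown", "address": "55 Acacia Gdns",
     "phone": None, "dob": "6.8.72"},
    {"id": "ClaimA/passenger", "name": "John Smith", "address": None,
     "phone": "0208 675 8876", "dob": "11.8.73"},
    {"id": "PolicyB/driver", "name": "T Smith", "address": "44 Hape Street",
     "phone": "0208 675 8876", "dob": "07.01.67"},
    {"id": "PolicyB/2nd_driver", "name": "B Smith", "address": "44 Hope Street",
     "phone": "0208 675 8876", "dob": "6.8.72"},
    {"id": "ClaimB/witness", "name": "Thomas Braun", "address": None,
     "phone": "0207 435 7660", "dob": "01.07.67"},
]


def normalise_phone(phone):
    """Keep digits only so that spacing differences do not block a match."""
    return "".join(ch for ch in phone if ch.isdigit()) if phone else None


def normalise_dob(dob):
    """Turn 'd.m.yy' into a (day, month, year) tuple; '6.8.72' equals '06.08.72'."""
    if not dob:
        return None
    day, month, year = (int(part) for part in dob.split("."))
    return (day, month, year)


def shared_fields(a, b):
    """Return the names of the fields on which two records agree exactly."""
    shared = []
    phone_a, phone_b = normalise_phone(a["phone"]), normalise_phone(b["phone"])
    if phone_a and phone_a == phone_b:
        shared.append("phone")
    dob_a, dob_b = normalise_dob(a["dob"]), normalise_dob(b["dob"])
    if dob_a and dob_a == dob_b:
        shared.append("dob")
    if a["address"] and a["address"] == b["address"]:
        shared.append("address")
    return shared


def find_links(records):
    """Compare every pair of records and keep the pairs that share a field."""
    links = {}
    for a, b in combinations(records, 2):
        shared = shared_fields(a, b)
        if shared:
            links[(a["id"], b["id"])] = shared
    return links


if __name__ == "__main__":
    for pair, fields in find_links(RECORDS).items():
        print(pair, fields)
```

A link that rests on a date of birth alone (Betty Brown and B Smith) is the kind of softer link the slide asks about: it may be coincidence, so it is better treated as a lead to investigate than as a confirmed match.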
Techniques used
Complex SAS data steps & SQL were used to:
• Extract & identify meaningful information from multiple sources of differing quality;
• Generate multiple possible versions of the truth through combination;
• Enable fuzzy information comparison using phonetic encoding (such as SOUNDEX) and asymmetric spelling distance (such as SPEDIS); and
• Code logic to create a link list between records when “enough” commonality exists.
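As a rough illustration of fuzzy comparison, the Python sketch below implements the classic four-character Soundex code and a plain Levenshtein edit distance. Both are stand-ins: the SAS SOUNDEX function may differ in detail from this textbook version, and SPEDIS is an asymmetric, cost-weighted measure that Levenshtein distance does not reproduce. The helper names, the first-initial rule and the `max_edits` threshold are assumptions made for the example.

```python
# Classic Soundex digit groups.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name):
    """Return the classic 4-character Soundex code, e.g. 'Robert' gives 'R163'."""
    letters = [ch for ch in name.upper() if ch.isalpha()]
    if not letters:
        return ""
    encoded = letters[0]
    previous = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters that share a code
        code = _CODES.get(ch, "")
        if code and code != previous:
            encoded += code
        previous = code  # a vowel resets the previous code to ""
    return (encoded + "000")[:4]


def edit_distance(a, b):
    """Plain (symmetric) Levenshtein distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similar_name(a, b):
    """Hypothetical rule: same first initial and same surname Soundex code."""
    first_a, last_a = a.split()[0], a.split()[-1]
    first_b, last_b = b.split()[0], b.split()[-1]
    return (first_a[0].upper() == first_b[0].upper()
            and soundex(last_a) == soundex(last_b))


def similar_address(a, b, max_edits=2):
    """Hypothetical rule: addresses within a small number of character edits."""
    return edit_distance(a.upper(), b.upper()) <= max_edits


if __name__ == "__main__":
    print(soundex("Brown"), soundex("Braun"))
    print(similar_name("Tom Brown", "Thomas Braun"))
    print(similar_address("44 Hape Street", "44 Hope Street"))
```

Under these assumed rules "Tom Brown" and "Thomas Braun" compare as similar, as do the two spellings of the Hope Street address. How much agreement counts as "enough" to write a pair to the link list is a tuning decision that the slides do not specify.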
Complex SAS macro statements were used to:
• Merge link lists recursively to themselves in order to:
  - determine entities (such as locations, persons, vehicles, businesses, etc); and
  - build networks of original records linked together via those entities.
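Merging a link list with itself until nothing changes amounts to finding the connected components of the link graph. The Python sketch below shows one way to get the same result with a union-find structure. It is a stand-in for the SAS macro logic, not a description of it, and the function name and record labels are hypothetical.

```python
def build_networks(links):
    """Group record ids into networks (connected components) from a link list.

    `links` is an iterable of (record_id, record_id) pairs.
    Returns a list of sorted id lists, largest network first.
    """
    parent = {}

    def find(node):
        # Follow parent pointers to the root, halving the path on the way.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for a, b in links:
        union(a, b)

    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return sorted((sorted(group) for group in groups.values()),
                  key=lambda group: (-len(group), group))


if __name__ == "__main__":
    # Hypothetical link list: the labels are illustrative only.
    example_links = [
        ("Tom Brown", "Betty Brown"),
        ("Tom Brown", "Thomas Braun"),
        ("Betty Brown", "B Smith"),
        ("John Smith", "T Smith"),
        ("John Smith", "B Smith"),
        ("Unrelated X", "Unrelated Y"),
    ]
    for network in build_networks(example_links):
        print(network)
```

Each link is processed once, so this avoids the repeated self-joins of a recursive merge, but the networks it produces are the same.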
Social Network Analysis vs. traditional methods
Social Network Analysis is more likely to present an accurate ‘picture’ of who we are
• Demographics: Assumes that we are who we live near. However, are we really like our neighbours?
• Behavioural Analysis: Assumes that we are what we eat/buy/do. Whilst it is a better predictor than demographics alone, volume data can be expensive to collect, and Mahatma Gandhi, Albert Schweitzer and Ozzy Osbourne were/are all vegetarians!
• Lifestyle Survey Analysis: Assumes that we are what we say we do and how we say we behave. However, would you tell the truth about your ‘Crazy Frog’ ringtone?
• Network Analysis: We are who we know, live with and share our bank accounts and telephones with. Network Analysis is based on facts and does not try to guess associations, e.g. would you share the same bank account with a stranger?
Why Network Matching works in CRM
There are a number of reasons why Network Matching works within a Customer Relationship Management context, including:
• It swaps data quality for data volume;
• It makes use of any available additional name and address data: all data adds value, not just customer lists;
• It is ‘provable’: you can see how it works;
• It can absorb any existing customer metrics that you know work and make them better. For example, you could judge the value of a network by how many high spenders it included; and
• It has a precedent in ‘Household’ analytics and Viral Marketing.
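The high-spender example above can be sketched in a few lines of Python. Everything in the sketch is hypothetical: the network names, customer ids, spend figures and threshold are invented for illustration, and counting high spenders is only one of many ways an existing customer metric could be rolled up to network level.

```python
def rank_networks_by_high_spenders(networks, annual_spend, threshold):
    """Rank networks by how many members spend at or above `threshold`.

    networks:     dict of network name -> list of customer ids
    annual_spend: dict of customer id -> spend (customers missing here count as 0)
    Returns (network name, high-spender count) pairs, best network first.
    """
    scored = []
    for name, members in networks.items():
        high_spenders = sum(1 for member in members
                            if annual_spend.get(member, 0) >= threshold)
        scored.append((name, high_spenders))
    # Sort by count descending, then by name so that ties are stable.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    networks = {"net_1": ["c1", "c2", "c3"], "net_2": ["c4", "c5"]}
    annual_spend = {"c1": 5000, "c2": 200, "c3": 7500, "c4": 9000, "c5": 100}
    print(rank_networks_by_high_spenders(networks, annual_spend, threshold=1000))
```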
Insurance fraud case study
Example is based on a proof of concept in the insurance industry
• Crime is spread across insurers: the same individuals target lots of companies
• The criminals’ techniques have evolved to evade current matching systems
• “Multiple” identities are prevalent, but within the bounds of data quality
• Key individuals can be identified as knowledge propagators
Identification of fraud networks
• The representation opposite provides a sample set from the hundreds of potentially suspect networks detected in 0.5% of the UK
• The largest network contains around fifty claims
• The 100th largest in 0.5% of the UK contained 20 claims
• Some individuals on some rings were known, but this was not the case for all
Our approach
Our approach brought together innovative thinking to solve a problem that has been costing the industry £millions each year for a significant period of time, by:
• Creating a new way to match the customer data using SAS software
• Using data from 20 companies (80+ file formats)
• Scoring each network of associated customer records to prioritise probable fraudulent behaviour
• Creating a set of web-based tools so that the investigators could:
  - visualise the data as networks
  - visualise the data geographically and temporally
  - search through the underlying unstructured data with Google-like functionality
  - collaborate pan-organisation on investigations
Caveat – what Network Analysis cannot do
Network Analysis is based on facts, not hypotheses, which naturally presents some limitations, including:
• You will need data, so the method will predominantly favour existing customer retention;
• It will not help you directly acquire unknown customers; and
• It can still be wrong: it can link incorrectly, hence investigators are taught not to assume guilt by association.
Viral marketing overview
Viral marketing:
• Designing marketing campaigns that are passed by word of mouth between associates
• Enables effortless (cost-free) transfer to others
• Exploits common motivations and behaviours
• Utilises existing communication networks
To be successful, it requires:
• The recipient to ‘buy in’ to the message; and
• The recipient to forward the offer to an associate.
… and this is difficult to achieve
In reality, organisations therefore need to deliver ‘targeted viral marketing’, to ensure that marketing messages cut through the ‘spam’ we all receive and to minimise the need to offer generic incentives to pass messages on
How Network Analysis can help Network Analysis assists in identifying who lies at the hub of targeted ‘human’ networks and therefore who can lead and drive the exponential growth of your offer/message
Network Analysis provides the tools to identify who the ‘movers and shakers’ are in human networks. By understanding their motivations and behaviours, you can begin to reinvent your viral and targeted marketing activities
*** RESULT *** Greater return from your viral marketing and a more integral tool for supporting all your customer communication activities
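One simple way to find the hub of a network is degree centrality: the share of other members to whom each person is directly linked. The Python sketch below is a minimal illustration under that assumption. The slides do not say which centrality measure was used, and the names in the example are hypothetical.

```python
from collections import defaultdict


def degree_centrality(links):
    """Return each node's degree divided by (number of nodes - 1).

    `links` is an iterable of (person, person) pairs. Self-links are ignored.
    """
    neighbours = defaultdict(set)
    for a, b in links:
        if a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    node_count = len(neighbours)
    if node_count < 2:
        return {}
    return {node: len(adjacent) / (node_count - 1)
            for node, adjacent in neighbours.items()}


def find_hubs(links, top=1):
    """Return the `top` best-connected nodes, most central first."""
    scores = degree_centrality(links)
    return sorted(scores, key=lambda node: (-scores[node], node))[:top]


if __name__ == "__main__":
    # Hypothetical customer network for illustration only.
    example_links = [("ann", "bob"), ("ann", "cat"), ("ann", "dan"), ("bob", "cat")]
    print(find_hubs(example_links))
```

Degree centrality only counts direct links. Other measures, such as betweenness, pick out people who bridge otherwise separate groups and may suit some campaigns better.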
Final thoughts
• The focus is on ‘evolution’ not ‘revolution’ – your trusted analytics can be successfully absorbed by network analysis
• Traditionally, success in marketing has been about targeting a group and measuring success by feedback; this is no different
• The 1990s saw the emergence of 1-to-1 marketing, customer relationship management and micro segmentation – Network Analysis is simply a way to rationalise spending and target resources
Hence…
Only spend marketing effort on the most important member of a network, but monitor all of its nodes to see where the responses come from
Questions
Contact details
David C. Porter
Tel: 01483 816809
Email: [email protected]
Head Office
Surrey Research Park, Guildford, Surrey GU2 7YP
Tel: +44 (0)1483 816000
Fax: +44 (0)1483 816144
London Office
2 Arundel Street, London WC2R 3AZ
Tel: +44 (0)20 7812 4000
Fax: +44 (0)20 7812 4100
Cheltenham Office
1220 Lansdowne Court, Gloucester Business Park, Gloucester GL3 4AB, UK
Tel: +44 (0)1452 632400
Fax: +44 (0)1452 632424
Network Intelligence Systems: for enacting Network Effects
[Figure: system architecture diagram. The components below are recovered from its labels; the connections between them are not recoverable.]
• Sources: Web/Open Source; Open Source Media Monitoring; Existing SNA Database; Transaction Data
• Data capture/Preparation
• Database/Document Repository
• Automated Extraction of Networks Entities/Links
• Automated Extraction of Networks Entities/Links from SNA/Transactional
• Extraction of themes and context
• Network Filtering (Temporal, Geographic, Thematic/Context): network entities and links (e.g. who, where); context themes (e.g. what and why)
• Network Analysis and Monitoring: trend analysis of network and themes; gist summaries of network context; collation of information on key nodes in networks
• Analyst Interface: Detica Network Monitor (Network Viz, Trend Analysis); Cross Source Search and Information Retrieval; Export to tools (e.g. I2, UCINET VI etc)
• Advanced Network Tools: automated data mining and pattern detection; network validation & sensitivity analysis; Network Effects planning and prediction tools
Insurance Model
[Figure: process flow diagram. The steps below are recovered from its labels; the exact flow between them is not recoverable.]
• Data store
• Data Preparation: Load data into SAS; Data cleanse and fix
• Review
• Identify linking fields
• Construct matching keys
• Build link list
• Produce networks (use custom SAS process)
• Perform network analytics/score
• Construct table for visualisation
• Evaluate result: Interpret / Monitor
• Deploy: Utilise insight; Investigation
• Update
Legend: Required step; Optional step
Handover product is network list ready for investigation
Datalab data analytics process
[Figure: process flow diagram. The steps below are recovered from its labels; the exact flow between them is not recoverable. Several stages are marked "time boxed" and are followed by a Review.]
• Collect & Load data
• Audit data
• Cleanse & Fix data
• Link / Merge data
• Enhance data
• Transform data: Derive information
• Unsupervised Analytics: Explore data; Discover patterns
• Supervised Analytics: Model data; Predict outcomes
• Evaluate result: Interpret / Monitor
• Deploy: Utilise insight
• Update
Legend: Required step; Optional step
Engagement time scales typically vary from 3 weeks to 3 months