Big Data Technical Workshop Sept 24 Minneapolis, MN
Data Discovery, Modern Architecture & Visualization
Big Data Discovery Demo: Financial Services Customer Journey
1
9/25/2014
AGENDA
• Key Functionality Highlighted • Business Scenario • Demo • Questions and Hands-On Time
2
9/25/2014
Key Highlights • UDA in Action: Loading Data with SQL-H • Multi-Genre analytics - SQL / SQL-MR / Statistical • Analytic Functions > Attribution > nPath > Naïve Bayes prediction
• Visualization in Tableau with custom SQL parameters • Business Insights • Iterative Discovery • Ease of use for the Business User
3
9/25/2014
Retail banking customer journey covers important interactions over time
Customer Journey 4
9/25/2014
Consumer Banking Interactions • Customers interact through multiple channels generating multi-structured data Teller Withdrawal
Online Transfer
BRANCH
ATM ID, Event, $$
ATM Deposit
5
9/25/2014
CALL CENTER
ONLINE
Teller ID, Event
ATM
Call Center Inquiry
Page Type, Action
TELLER Teller ID, Event
Teller Complaint
Event
E-MAIL Text Category (Intent)
Email Complaint
BRANCH Branch ID, Mgr ID, Event
Cancel account
Data Preparation Teradata: SQL-H
Hadoop MR
Data
{"customer_skey":"7549","customer_identifier":"607529531vmmhmxa555434805"," customer_cookie":"sj-yCLAX1MzaWsOpOA45Rc","customer_online_id":"Y","customer_offline_id":"443999128","c ustomer_type":"Lapsed Customer","customer_days_active":"1017","interaction_session_number":"3942284" ,"interaction_timestamp":"[2012/02/24:11:57:26 AM]","interaction_source":"WEBSITE","interaction_type":"STARTS_APPLICATION","s ales_channel":"-1","conversion_id":"1","product_category":"CHECKING","product_type":"-1","conversion_sales":"1","conversion_cost":"-1","conversion_margin":"-1","conversion_units":"1","marketing_code":"-1","marketing_category":"1","marketing_description":"(null)","marketing_placement":"1","mobile_flag":"N","updt":"-1"},
• SQL-H Call: CREATE VIEW hcat_banking_web AS SELECT * FROM load_from_hcatalog( ON mr_driver SERVER('39.0.32.7') USERNAME('horton') DBNAME('default') TABLENAME('ich_banking') );
6
9/25/2014
Data Filtering
• Weblogs in JSON on HDFS
HCatalog
Hive
Pig
Hadoop Layer: HDFS
Customer acquisition – Credit Cards • Business Objective > Identify which channels perform the best > Which promotions are the most effective in driving credit card
signups? > Where should I increase my marketing spend to drive higher ROI?
• Analytics > Multi-channel Marketing Attribution
• Business Impact > Quantify marketing effectiveness to drive revenue > Optimize Marketing spend & placement > Run time-sensitive promotions
7
9/25/2014
Path to Adoption – Wealth Management • Business Objective > Adoption of additional high value products > Opening Wealth Management account > An important sign of customer loyalty
• Analytics > What are the common paths customers take? – nPath
> Path/Pattern time series analysis that is hard to do in SQL > Affiliation with other accounts - What other accounts do wealth
management customers hold?
• Business Impact > Fast growing business segment > Retail banks are looking to build their business in this area
8
9/25/2014
Churn Overview • Business Objective > Detect and prevent churn
• Analytics > What are the common churn paths customers take? – nPath > Analyze the path to churn we just found to look for common
steps > Identify customers at risk of churn - Naïve Bayes prediction algorithm
• Business Impact > Identify root causes of churn through novel visualizations
and rapid, iterative analysis > Proactively retain high value customers at risk of churn > Preventing churn helps protect our brand as well as revenue
9
9/25/2014
Find the common paths to churn • Basic SQL-MR nPath function call • Easy for SQL savvy business users to use • Looking for the last 4 events before account closure 1.SELECT * FROM nPath ( 2. ON (SELECT * FROM ich_banking) 3. PARTITION BY customer_identifier 4. ORDER BY interaction_timestamp 5. MODE (NONOVERLAPPING) 6. -- Limit to a depth of 4 7. PATTERN ('(EVENT){4}.CLOSED') 8. SYMBOLS ( 9. interaction_type <> 'ACCOUNT_CLOSED' AS EVENT, 10. interaction_type = 'ACCOUNT_CLOSED' AS CLOSED 11. ) 12. RESULT ( 13. ACCUMULATE( interaction_type OF ANY(EVENT,CLOSED) ) 14. COUNT( interaction_type OF ANY(EVENT,CLOSED) ) 15. )) a;
10
9/25/2014
AS interaction_type_list, AS click_depth
Iterate to clarify the path to churn • Add in key events: > balance transfer, call complaint, fee reversal, link external
accounts steps to the pattern
• Help reduce noise and see the main churner paths 1.PATTERN ('EVENT.(TRANSFER|COMPLAINT|LINK|REVERSAL){1,3}.CLOSED') 2. SYMBOLS ( 3. interaction_type <> 'ACCOUNT_CLOSED' AS EVENT, 4. interaction_type = 'BALANCE_TRANSFER' AS TRANSFER, 5. interaction_type = 'FEE_REVERSAL' AS REVERSAL, 6. interaction_type = 'CALL_COMPLAINT' AS COMPLAINT, 7. interaction_type = 'LINK_EXTERNAL_ACCOUNT' AS LINK, 8. interaction_type = 'ACCOUNT_CLOSED' AS CLOSED 9. )
11
9/25/2014
Predict future churners before it is too late • Business Objective > Identify potential future churners
• Analytics > Aster’s Naïve Bayes prediction algorithm > Analyze the path to churn we just found to look for common steps
• Business Impact > Allow for proactive customer retention actions > Protect our brand image
12
9/25/2014
Steps leading to Prediction • Run nPath for known churners & non-churners • Generate training model (naiveBayesText function)
SELECT token, SUM(churn) AS churn, SUM(no_churn) AS no_churn FROM naiveBayesText( ON churn_nb_training TEXT_COLUMN( 'interaction_type_list' ) CATEGORY_COLUMN( 'churned' ) CATEGORIES( 'churn', 'no_churn' ) DELIMITER('[,]') PUNCTUATION('[\\[\\] ]+')) GROUP BY token;
13
9/25/2014
Step brows complete_appl balance_transf account_clos olb starts_appl account_booked_onlin account_booked_offlin loan_calc call_complaint referr click compar link_external_account fee_revers mortgage_calc
Churn 142 621 2014 2014 21 120 999 250 89 1987 1 2 432 656 1654 27
No Churn 1615 563 4 0 578 1429 335 70 328 28 4 22 1766 5 8 149
Steps leading to Prediction • Load the resulting training model into Aster (\install) • Run nPath for other customers • Run prediction algorithm (naiveBayesTextPredict function) SELECT * FROM naiveBayesTextPredict( ON churn_nb_input DOCUMENT_ID ( 'customer_identifier' ) TEXT_COLUMN ( 'interaction_type_list' ) CATEGORIES ( 'churn', 'no_churn' ) MODEL_FILE ( 'churn_model.csv' ));
14
9/25/2014
Improving Customer Churn Models
Higher Business Value with Multiple Analytics on All Data Space of all possible customers at risk of Churn Customers that can be identified through Classical Statistics
Churn Potential Current Statistical Model
Aster Behavioral Common Pattern Prediction Model
Customers identified through Aster Behavioral Pattern Model
The statistical model predicted it would happen, Aster is seeing it happen. Very likely customer will churn.
15
9/25/2014
Operationalize Insights from Aster Churn Prediction Model
Campaign Management ( e.g. RM)
Customer Path Identified 16
9/25/2014
Churn Risk Predictions
Real Time Interaction Manager (e.g. RTIM)
Retention Campaign for At Risk High Value Customers
Additional Questions?
17
9/25/2014