mAnalytics: A Big Data Analytic Platform for Precision Marketing
China Mobile Research Institute
2014-04-07
Keyun Hu, Yanhong Yan, Junlan Feng
Outline
• Background
• Platform Architecture & Deployment
• Case Studies
Background: China Mobile is one of the Largest Internet Content Providers in China
Content   URL               Users    Resources
Reading   read.10086.cn     130M     400K Books
Music     music.10086.cn    90M      2.8M Songs/Music
Video     video.10086.cn    >100M    Unclear
Cartoon   dm.10086.cn       >100M    220K
Other Content Sites: Games, Messages, Email, Mobile Market
mAnalytics 1.0: provides operation analysis and real-time recommendation for these sites as well as for customer care
mAnalytics 2.0: One Platform Serves Multiple Business Needs
Business units: Headquarters Market Department, Headquarters Data Department, Local Company, Internet Vertical Portal
Business needs served by mAnalytics:
• Operation Analysis
• Ads Effectiveness Tracking
• Local Precision Marketing
• Cross-Channel Recommendation
• Real-Time Operation Analysis
• Network Data Analysis
• User Modeling
• Content Modeling
• Personalized Recommendation
• Open Capability APIs
Platform Architecture & Deployment
Architecture of mAnalytics: Cloud Based, Built on Big Data Platform, Fused Data, Plug & Play Model Structure, Supports Multi-Apps
Application Layer
• Targeted marketing and advertising
• Operational analytics
• Real-time and periodical usage reports
• Real-time recommendations
• User experience reports
• Cross-site recommendations
Modelling Layer
• Recommendation models
• Statistic models
• Pluggable marketing models
• Operational statistical models
• User preference models
• User experience models
Data Processing and Storage Layer
• Task scheduler: resource monitoring and alerting, task monitoring and scheduling
• Distributed computing
• NoSQL DB, SQL DB
• Data preparation for reports: user behavior history, computed statistics
• Data preparation for models: user profiles, meta and configuration data
Data Interface Layer
• Web/WAP JS API
• Android/iOS SDK
• Offline data adaptor
Data Interface Layer: Integrating Real-Time Data and Offline Data
Data sources: Content Sites, E-Business Sites, Service Platform
Collection channels: embedded JS and SDK to collect data; batch data upload
Behavior data collected:
• Search, Click, Download, Saved Link, Comment, Share, Invite, Label, Upload, Review, Page Active Time, Scroll, etc.
• Search, shopping cart, purchase, saved links, review, return, etc.
• Review, Listen, Download, Click, etc.
Batch-uploaded data:
• Index
• User Info
• Updated Product Info / Updated Products
• Data Content
• Business Rule
• Updated Label
Data Fusion
Problem: heterogeneous data
Development: for each application case, develop JS, SDK, API, and data adapter to form a 360-degree view of users
Sources fused into integrated data: Internet server log, payment data, mobile app / WAP, network data, ……
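The fusion step above can be illustrated with a minimal sketch. It assumes that every source already exposes a shared user identifier, which the slides do not state; the record fields (`user_id`, `action`, `item`, `ts`) and the function name `fuse_records` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def fuse_records(sources):
    """Merge records from several sources into one view per user.

    `sources` maps a source name to a list of records. Each record is a dict
    with the assumed keys user_id, action, item, ts (hypothetical schema).
    """
    views = defaultdict(lambda: {"sources": set(), "events": []})
    for source_name, records in sources.items():
        for rec in records:
            view = views[rec["user_id"]]
            view["sources"].add(source_name)
            view["events"].append(
                {
                    "source": source_name,
                    "action": rec["action"],
                    "item": rec["item"],
                    "ts": rec["ts"],
                }
            )
    # Order each user's events by time so the view reads as one history.
    for view in views.values():
        view["events"].sort(key=lambda e: e["ts"])
    return dict(views)


# Toy data standing in for server logs, payment data, and mobile app data.
sources = {
    "server_log": [{"user_id": "u1", "action": "click", "item": "song42", "ts": 3}],
    "payment": [{"user_id": "u1", "action": "purchase", "item": "song42", "ts": 5}],
    "mobile_app": [
        {"user_id": "u2", "action": "search", "item": "book7", "ts": 1},
        {"user_id": "u1", "action": "search", "item": "song42", "ts": 1},
    ],
}
views = fuse_records(sources)
```

In practice the hard part is usually resolving identities across sources (for example anonymous web visitors versus logged-in subscribers), which this sketch sidesteps by assuming a common key.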
Data Processing and Storage
Inputs to data preprocessing:
• Behavior: user online behavior / historical preference track / active time / anonymous user behavior
• Content: name / type / property / label / URL
• User: label / profiling
① Data Conversion: incoming data is pre-processed into a pre-defined data format; incremental update is supported
② Data Validation: data completeness, data logical relation, data rejection
③ Data Storage: converted logs -> HDFS; user views -> HBase; analytic results -> MySQL; data processing errors and alarms -> reports
④ Log Monitoring: alarm when anomalies occur
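The conversion and validation steps can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform's implementation: the tab-separated log format, the field names in `REQUIRED_FIELDS`, and the "known user" check used as an example of a logical-relation rule are all hypothetical.

```python
REQUIRED_FIELDS = ("user_id", "action", "item", "ts")  # assumed target schema


def convert(raw_line):
    """Data conversion: parse one tab-separated line into the predefined format."""
    parts = raw_line.rstrip("\n").split("\t")
    if len(parts) != len(REQUIRED_FIELDS):
        return None  # cannot be mapped onto the schema
    return dict(zip(REQUIRED_FIELDS, parts))


def validate(record, known_users):
    """Data validation: return a rejection reason, or None if the record is valid."""
    if record is None:
        return "malformed line"
    if any(not record[f] for f in REQUIRED_FIELDS):
        return "missing field"  # completeness check
    if not record["ts"].isdecimal():
        return "bad timestamp"
    if record["user_id"] not in known_users:
        return "unknown user"  # example of a logical-relation check
    return None


def preprocess(lines, known_users):
    """Split raw lines into accepted records and (line, reason) rejections."""
    accepted, rejected = [], []
    for line in lines:
        record = convert(line)
        reason = validate(record, known_users)
        if reason:
            rejected.append((line, reason))
        else:
            record["ts"] = int(record["ts"])
            accepted.append(record)
    return accepted, rejected


lines = [
    "u1\tclick\tsong42\t100",
    "u9\tclick\tsong42\t101",
    "u1\tclick\t\t102",
    "broken line",
]
accepted, rejected = preprocess(lines, known_users={"u1"})
```

Accepted records would then go to the storage step, while the rejection list corresponds to the errors and alarms that feed the reports.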
Modeling Layer: Personalized Recommendation
Task: analyze and mine obtained data for precision marketing and product improvement
Key technology: domain-specific recommendation
Domain-specific precision marketing modeling in mAnalytics:
• Book recommendation modeling (Read.10086)
• Music recommendation modeling (Music.10086)
• Video recommendation modeling
• ……
Configuration for Training and Online Recommendation with Standard APIs
Recommendation platform:
• Data collection: Beha-Collect.js
• Training configuration -> model training (periodical offline training): SlopeOne modeling, rule-based modeling, other models
• Recommendation template (predefined recommendation algorithm and parameters): a template is chosen for recommendation
• Output configuration -> computing (real-time, human can intervene): Recommendation-ouput.js
• Recommendation positions: Recomm-path.js
Multi-Way Recommendation Display (Music.10086):
• Music of your interest……
• Similar users like: ……
• Music recommended for you by the music platform……
• Same type of music…
Managing Recommendation
2014/4/14
Recommendation Algorithms
• Collaborative Filtering
  - Item based: Slope One
  - Matrix factorization: SVD
  - Factorization Machines (FM): consider other properties beyond the rating matrix
  - FM-based mixture recommendation: items are obtained from FM results, and item similarity is mixed with the FM results
• Content-Based Recommendation
• Association Rule Mining for Recommendation
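As an illustration of the item-based Slope One algorithm listed above, the following is a minimal sketch of the weighted Slope One predictor. The toy ratings and the function names are assumptions made for the example; the slides do not describe how mAnalytics implements it.

```python
from collections import defaultdict


def train_slope_one(ratings):
    """Compute the average rating difference and co-rating count per item pair.

    `ratings` maps user -> {item: rating}.
    diffs[i][j] is the mean of (rating of i - rating of j) over users who rated both.
    """
    diffs = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    diffs[i][j] += r_i - r_j
                    counts[i][j] += 1
    for i in diffs:
        for j in diffs[i]:
            diffs[i][j] /= counts[i][j]
    return diffs, counts


def predict(user_ratings, item, diffs, counts):
    """Weighted Slope One prediction of `item` for one user, or None if impossible."""
    numerator, denominator = 0.0, 0
    for j, r_j in user_ratings.items():
        c = counts.get(item, {}).get(j, 0)
        if c:
            numerator += (diffs[item][j] + r_j) * c
            denominator += c
    return numerator / denominator if denominator else None


# Toy rating data (hypothetical users and items).
ratings = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}
diffs, counts = train_slope_one(ratings)
lucy_a = predict(ratings["lucy"], "A", diffs, counts)  # (2.5 * 2 + 8 * 1) / 3
```

Slope One needs only pairwise average differences, so the model can be refreshed incrementally as new ratings arrive, which fits a setup with periodical offline training and real-time lookups.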
Case Study: read.10086. Y axis: number of clicks. X axis: number of recommended items.
Application Layer: 17 Applications
Web Hot Spot Analysis
Configuration
Stages: Data Collect, Preprocessing, Configuration Training, Generate, Managing
User Side:
  - JS, SDK: parameter configuration, template download
  - Algorithm configuration: parameter setting
  - Display configuration
  - Site log analysis
Server Side:
  - User managing: user registration, user right setting
  - Service management: network, resource, and data processing configuration
  - User / usage statistical analysis
Deployment Mode: Real-Time Recommendation
• Web / mobile app
• Client sites and what they receive:
  - News website: personalized news
  - Resource website: content recommendation
  - Mobile Market: product / service recommendation
• Platform pipeline: Data Collect -> Preprocessing -> Model Conf & Training -> Result Display
Industry Market Research Institute
Deployment Mode: Offline Log Mining
• Mobile WAP / app
• WAP client -> log -> logging server
• Product database, WAP server, MM server, …: product updates
• Periodical synchronization between these servers and mAnalytics
• mAnalytics pipeline: Data Collect -> Preprocessing -> Model Conf & Training -> Recommendation -> Result Display
• Recommendation server
Clusters Easy to be Extended
[Figure: cluster layout across servers s00 to s17, comprising 2 load balancers, the web service, a Hadoop cluster, MooseFS, HBase, Memcached, and a recommendation server cluster of 6 nodes. Other node counts shown in the figure: 4, 12, 5, 12, 2.]
Deployed on Local and Remote Servers
• mAnalytics management and control, with analytic APIs
• Small clusters deployed per province (Province A, Province B)
Case Studies
Case Study 1: Music Recommendation
Music: Personalized Recommendation
music.10086 is one of the top music sharing sites, with >3M songs, >90M members, and >150M music downloads per month.
Work flow: user behavior -> mAnalytics -> mAnalytics calculates recommendation results -> music.10086 server -> display: "Guess What You Like"
Results: the music purchase rate increases by 50% compared with editor recommendation, and the music download rate doubles.
Case Study 2: Cartoon Recommendation
Personalized Cartoon Recommendation
dm.10086 is the largest cartoon publication and sharing platform, with >100M users and 220K cartoons.
Work flow: user behavior -> mAnalytics -> mAnalytics calculates recommendation results -> dm.10086 server -> display: "Guess What You Like"
Results: the recommendation conversion rate increases by 42% compared with the editors' version.
Case Study 3: Book Recommendation
Personalized Book Recommendation
read.10086 is one of the top reading sites, with >130M users and >400K books. The most popular book has 2.6 billion hits.
Work flow: JS, SDK on web / mobile app of read.10086 -> mAnalytics -> real-time recommendation
Results: the book click rate increases by 28% compared with a baseline.
Case Study 4: Cross Recommendation
Music and Video Cross Recommendation
Cross-site recommendation between music.10086 and video.10086.
On music.10086, 57.4% of the clicks are for video recommendation. On video.10086, 48.2% of the clicks are for music recommendation.
[Figure labels: mobile video ad slot; wireless music ad slot]
THANKS!
[email protected]