Where is the Goldmine? Finding Promising Business Locations through Facebook Data Analytics

Jovian Lin, Richard Oentaryo, Ee-Peng Lim, Casey Vu, Adrian Vu, Agus Kwee

Part I:

Motivation

Location, Location, Location …
•  Location is a vital aspect of retail success: 94% of retail sales are still transacted in physical stores.
•  To increase the chance of success, business owners traditionally conduct ground surveys to gather relevant data and evaluate the location of interest.
•  But this is a herculean task.
Ø  Time-consuming, costly, and not scalable
Ø  Cannot cope with fast-changing environments (e.g., neighborhood rental, local population size)

How Facebook Data Can Help •  Fortunately, we can use Facebook to capture the activities of users. •  An important type of activity is location check-ins.

Our Research

Research Questions
•  Where should a retail store be set up to optimize its popularity?
•  What are the important factors affecting a store’s popularity?
•  Can new businesses benefit from more established businesses?

Task Formulation •  Given a target location, how can we extract the relevant data of businesses within its vicinity, and use them to estimate the popularity of the target location?

Our Key Contributions
1.  New study on business location analytics using Facebook data
Ø  Study of 20,877 Facebook Pages of food-related businesses in Singapore (SG)
Ø  Detailed analysis of key features affecting business popularity, at both chunk (feature group) and individual feature levels

2.  Location analytics framework that includes rich feature extraction module and accurate prediction model Ø  Our model can estimate on the fly the popularity of an arbitrary point on the map, unlike previous work that relies on discretized areas

3.  Interactive web application – User may select a point on a map and get an estimated popularity score of that location

Part II:

Facebook Data

What Do the Data Look Like? Example: Wimbly Lu Chocolates

Key Attributes
i.  Business ID
ii.  Categories
iii.  Check-in counts
iv.  Location (lat-long)

Data Collection
We study 20,877 food-related businesses in SG, collected based on a manually curated list of 133 food-related business categories.

Exploration of Facebook Data 1. Categorical data –  There are 357 unique category labels for all food-related businesses in Singapore –  Example: A Starbucks outlet in Changi Airport may have both food and non-food labels such as: “airport” , “café” , “coffee shop” , “train station” –  Categories are important features … because we can scrutinize the relationship between different categories of the neighboring businesses in a local area.

Exploration of Facebook Data 1. Categorical data
[Figure: Top 25 categories extracted from our dataset. Bar chart with axes "Top Categories" and "Number of Businesses" (0 to 3,600). Categories recoverable from the chart: Food & Restaurant, Restaurant, Cafe, Shopping Mall, Coffee Shop, Bakery, Chinese Restaurant, Fast Food Restaurant, Food & Grocery, Bar, Japanese Restaurant, Train Station, Food Stand, Seafood Restaurant, Movie Theatre.]

Exploration of Facebook Data 2. Location data –  We want to analyze a target business’s neighbors. –  For a target location l, we define its neighborhood as the set of places p within radius r around l. •  dist(p, l) = Haversine distance between two places p and l. •  P = Set of food-related businesses in Singapore.

–  This allows us to retrieve the k-nearest neighbors of a target business location.
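The neighborhood definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the place records are hypothetical dictionaries with `lat` and `lon` keys, the Earth radius constant is the usual mean value, and the boundary is treated as inclusive (dist(p, l) <= r), which the slides do not specify.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle (Haversine) distance in kilometres between two lat-long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def neighborhood(target, places, radius_km):
    """Return (distance_km, place) pairs within radius_km of target, nearest first.

    target is a (lat, lon) tuple; each place is a dict with 'lat' and 'lon'
    keys (a hypothetical record layout, not the authors' schema).
    """
    lat, lon = target
    hits = []
    for place in places:
        d = haversine_km(lat, lon, place["lat"], place["lon"])
        if d <= radius_km:  # inclusive boundary is an assumption
            hits.append((d, place))
    hits.sort(key=lambda pair: pair[0])
    return hits
```

Because the result is sorted by distance, taking the first k entries gives the k nearest neighbors inside the radius.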

Exploration of Facebook Data 3. Popularity Indicator –  We use “check-ins” instead of “likes” to measure a store’s popularity, because “check-ins” indicate physical presence. –  Check-ins can be repeated: a user could check in to a place on Monday and do so again on Tuesday. –  Check-ins therefore allow us to track how many times users visit a store.

Part III

Methodology

Location Analytics Framework

Location Analytics Framework Step 1: Neighbors Extraction
•  Extract the location (i.e., lat-long) of a target business.
•  This location is used to extract all the neighbors within 1 km of the target.
•  For each neighbor, extract:
Ø  its categories
Ø  its check-ins data (a.k.a. “hotspot”)

Location Analytics Framework Step 2: Feature Engineering
•  Based on the neighbors data, we construct a feature vector representing the target business’ profile.
•  The constructed features consist of six different groups called “chunks” (to be described shortly)
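As a rough illustration of this step, the sketch below turns a list of neighbors into a flat feature dictionary. It is not the authors' feature set: the distance bands, the feature names, and the record layout (`categories`, `checkins`) are all assumptions made for the example. It only mirrors the kinds of features the slides mention, namely categories of the target, categories of its neighbors, and total and average neighbor check-ins.

```python
from collections import Counter

# Hypothetical cumulative distance bands in km; the slides only state
# that neighbors are collected within 1 km of the target.
BANDS_KM = (0.25, 0.5, 0.75, 1.0)


def build_features(target_categories, neighbors):
    """Build a feature dict for one target business.

    neighbors is a list of (distance_km, place) pairs, where each place is a
    dict with 'categories' (list of str) and 'checkins' (int). This layout is
    an assumption for illustration.
    """
    features = {}
    # Categories of the target business, as binary indicators.
    for cat in target_categories:
        features["target_cat:" + cat] = 1
    # Categories of the neighbors, as counts.
    counts = Counter(cat for _, p in neighbors for cat in p["categories"])
    for cat, n in counts.items():
        features["neighbor_cat:" + cat] = n
    # Total and average neighbor check-ins within each distance band.
    for band in BANDS_KM:
        checkins = [p["checkins"] for d, p in neighbors if d <= band]
        total = sum(checkins)
        features[f"total_checkins<={band}km"] = total
        features[f"avg_checkins<={band}km"] = total / len(checkins) if checkins else 0.0
    return features
```

A dictionary keeps the sketch readable; a real pipeline would map these keys to fixed vector positions before training.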

Business Location Analytics Framework

Location Analytics Framework Step 3: Regression
•  We use a supervised regression model to learn the association between (i) the features and (ii) the actual check-ins score.
•  The trained model is then used to predict the number of check-ins for a new/unseen profile.
•  We tested several models, and settled on the gradient boosting machine (GBM).
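To make the idea of gradient boosting concrete, here is a minimal sketch for squared-error regression that uses one-split decision stumps as weak learners. It is a teaching example, not the model or library used in the study; the function names, the number of rounds, and the learning rate are assumptions.

```python
def fit_stump(X, residuals):
    """Find the single-feature threshold split that minimises squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm = sum(left) / len(left)
            rm = sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    return best[1:] if best else None


def fit_gbm(X, y, n_rounds=50, learning_rate=0.1):
    """Fit stumps sequentially, each one on the residuals of the current model."""
    base = sum(y) / len(y)  # start from the mean target
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        if stump is None:  # no valid split exists
            break
        j, t, lm, rm = stump
        stumps.append(stump)
        preds = [p + learning_rate * (lm if row[j] <= t else rm)
                 for row, p in zip(X, preds)]
    return base, learning_rate, stumps


def predict_gbm(model, row):
    """Sum the base value and the shrunken contribution of every stump."""
    base, lr, stumps = model
    return base + sum(lr * (lm if row[j] <= t else rm) for j, t, lm, rm in stumps)
```

Each stump alone is a poor predictor, but because every round fits what the previous rounds got wrong, the summed prediction improves steadily.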

Location Analytics Framework

Feature Engineering

Part IV

Experiments

Experiment Setup
Evaluation metrics
Averaged over 10-fold cross-validation
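The averaging procedure can be sketched as follows. The evaluation metrics themselves are not reproduced in this text, so root-mean-square error (RMSE) is used here purely as an assumed stand-in; the `fit` and `predict` callables and the fixed shuffle seed are also hypothetical.

```python
import math
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def rmse(y_true, y_pred):
    """Root-mean-square error (an assumed metric, not taken from the slides)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Average the held-out RMSE over k folds.

    fit(X_train, y_train) returns a model; predict(model, row) returns a number.
    """
    scores = []
    for fold in k_fold_indices(len(y), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train], [y[i] for i in train])
        preds = [predict(model, X[i]) for i in fold]
        scores.append(rmse([y[i] for i in fold], preds))
    return sum(scores) / len(scores)
```

Any model can be plugged in through `fit` and `predict`, so the same loop compares all candidates on identical folds.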

Predictive models 1.  Distance-based nearest neighbors (DNN) 2.  Linear support vector regression (SVR-Linear) 3.  Radial basis support vector regression (SVR-RBF) 4.  Gradient boosting machine (GBM)

Performance Assessment

•  As expected, SVR-RBF is better than SVR-Linear → the RBF kernel maps the original features into a high-dimensional space, giving more discriminative power
•  GBM outperforms all the other methods → GBM combines weak learners into a strong learner whose aggregate prediction is better than that of its constituents

Chunk Contribution

Observations
•  GBM is robust to chunk variations
•  Categories of the target business (chunk C1) appear in the top 10 GBM variants
•  Total “check-in” chunks (C3 and C5) are ranked higher than average “check-in” chunks (C4 and C6)
•  No substantial difference between food-related hotspots and all (food + non-food) hotspots

Feature Importance

•  The more “check-ins” in the neighborhood, the more popular the target location
•  Nearer “check-ins” are stronger → 14 of the 20 hotspot features are within 500 meters
•  Total “check-ins” are more important than average “check-ins”
•  Categories of neighbors (C2) are more crucial than those of the target business (C1)
•  Food-related categories of neighbors are more crucial than non-food categories

Part V

Application Prototype

Web Application Demo

Website: http://research.larc.smu.edu.sg/bizanalytics/

THANK YOU!
