Three Data Partitioning Strategies for Building Local Classifiers Indrė Žliobaitė TU Eindhoven 2010, September 20
Set up
Ensembles Training set for each member Randomized procedure
Evaluation Competence of each member Assigned region of competence
Deterministic procedure
● Specific types of ensembles, which
  ● partition the data into non-intersecting regions
  ● train one classifier per partition
  ● use classifier assignment for the final decision
[Figure: data partitioned into five regions, labelled Classifier 1 to Classifier 5]
● We will explore three data partitioning strategies
● We will build a meta ensemble consisting of local experts
Motivation
● divide and conquer
● use different views of the same learning problem
● assess the impact of class labels on partitions
● building blocks for handling contexts / concept drift
Partitioning
Three partitioning techniques
● Cluster the input data
● Cluster each class separately
● Partition based on a selected feature
Toy data
Clustering all (CLU)
● Cluster the input data
● Build classifiers
● Select the relevant classifier
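The three CLU steps above can be sketched roughly as follows. This is a minimal illustration, not the implementation behind the slides: it assumes plain k-means as the clustering method and uses a nearest-class-mean rule as a stand-in base classifier. All function names (`kmeans`, `fit_clu`, `predict_clu`, and so on) and the toy data are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def nearest(centroids, x):
    """Index of the centroid closest to x."""
    return min(range(len(centroids)), key=lambda i: sq_dist(x, centroids[i]))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed clustering method); returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def fit_nearest_mean(X, y):
    """Stand-in base classifier: one mean vector per class label."""
    by_label = {}
    for x, label in zip(X, y):
        by_label.setdefault(label, []).append(x)
    return {label: mean(pts) for label, pts in by_label.items()}


def predict_nearest_mean(model, x):
    return min(model, key=lambda label: sq_dist(x, model[label]))


def fit_clu(X, y, k):
    """Cluster all inputs (labels ignored), then train one classifier per cluster."""
    centroids = kmeans(X, k)
    fallback = fit_nearest_mean(X, y)  # used only if a cluster has no members
    models = []
    for i in range(k):
        idx = [j for j, x in enumerate(X) if nearest(centroids, x) == i]
        local = fit_nearest_mean([X[j] for j in idx], [y[j] for j in idx])
        models.append(local if idx else fallback)
    return centroids, models


def predict_clu(centroids, models, x):
    """Select the classifier of the nearest cluster and apply it."""
    return predict_nearest_mean(models[nearest(centroids, x)], x)


# Hypothetical XOR-like toy data: a single nearest-mean rule cannot separate it.
X = [(0, 0), (1, 0), (0, 3), (1, 3), (10, 0), (11, 0), (10, 3), (11, 3)]
y = [0, 0, 1, 1, 1, 1, 0, 0]
centroids, models = fit_clu(X, y, k=2)
preds = [predict_clu(centroids, models, x) for x in X]
```

Any base classifier could replace the nearest-mean rule here; it was chosen only to keep the sketch dependency-free.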
Clustering within classes
● Cluster the first class (clusters A, B)
● Cluster the second class (clusters C, D)
● Build the classifiers (pairwise)
● Select two closest clusters = the relevant classifier
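A rough sketch of the clustering-within-classes procedure is given here, under several assumptions that the slides do not spell out: clustering is plain k-means, "pairwise" means one classifier per pair of clusters taken from different classes, and the "two closest clusters" are the nearest cluster of each class. The base classifier is again a nearest-mean stand-in, and all names and data are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def nearest(centroids, x):
    """Index of the centroid closest to x."""
    return min(range(len(centroids)), key=lambda i: sq_dist(x, centroids[i]))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed clustering method); returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


def fit_clu2(X, y, k):
    """Cluster each of the two classes separately, then train pairwise models."""
    a, b = sorted(set(y))  # assumes exactly two class labels
    centroids, members = {}, {}
    for label in (a, b):
        pts = [x for x, t in zip(X, y) if t == label]
        centroids[label] = kmeans(pts, k)
        members[label] = [[] for _ in range(k)]
        for p in pts:
            members[label][nearest(centroids[label], p)].append(p)
    models = {}
    for i in range(k):
        for j in range(k):
            # Stand-in pairwise classifier: the mean of each of the two clusters.
            # An empty cluster falls back to its centroid as a single pseudo-point.
            pts_a = members[a][i] or [centroids[a][i]]
            pts_b = members[b][j] or [centroids[b][j]]
            models[(i, j)] = {a: mean(pts_a), b: mean(pts_b)}
    return (a, b), centroids, models


def predict_clu2(labels, centroids, models, x):
    """Pick the closest cluster of each class and apply that pair's classifier."""
    a, b = labels
    pair = (nearest(centroids[a], x), nearest(centroids[b], x))
    model = models[pair]
    return min(model, key=lambda label: sq_dist(x, model[label]))


# Hypothetical XOR-like toy data with two clusters per class.
X = [(0, 0), (1, 0), (0, 3), (1, 3), (10, 0), (11, 0), (10, 3), (11, 3)]
y = [0, 0, 1, 1, 1, 1, 0, 0]
labels, centroids, models = fit_clu2(X, y, k=2)
preds = [predict_clu2(labels, centroids, models, x) for x in X]
```

With k clusters per class this yields k × k pairwise classifiers, four in the A, B, C, D example.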
Partitioning based on a feature
● Slice the data and build classifiers
● Select the relevant classifier
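The feature-based partitioning could look roughly like the sketch below. The slides do not say how the slices are chosen, so equal-width intervals over the observed range of the selected feature are an assumption here, as are the nearest-mean stand-in classifier, the function names and the toy data.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def fit_nearest_mean(X, y):
    """Stand-in base classifier: one mean vector per class label."""
    by_label = {}
    for x, label in zip(X, y):
        by_label.setdefault(label, []).append(x)
    return {label: mean(pts) for label, pts in by_label.items()}


def predict_nearest_mean(model, x):
    return min(model, key=lambda label: sq_dist(x, model[label]))


def slice_index(value, lo, width, n_slices):
    """Map a feature value to an equal-width slice, clamping out-of-range values."""
    if width == 0:
        return 0
    return max(0, min(n_slices - 1, int((value - lo) / width)))


def fit_fea(X, y, feature, n_slices):
    """Slice the data along one feature and train one classifier per slice."""
    values = [x[feature] for x in X]
    lo = min(values)
    width = (max(values) - lo) / n_slices
    fallback = fit_nearest_mean(X, y)  # used only for slices with no training data
    models = []
    for s in range(n_slices):
        idx = [j for j, v in enumerate(values)
               if slice_index(v, lo, width, n_slices) == s]
        local = fit_nearest_mean([X[j] for j in idx], [y[j] for j in idx])
        models.append(local if idx else fallback)
    return lo, width, models


def predict_fea(lo, width, models, feature, x):
    """Select the classifier of the slice that x falls into and apply it."""
    s = slice_index(x[feature], lo, width, len(models))
    return predict_nearest_mean(models[s], x)


# Hypothetical toy data, sliced along feature 0 into two intervals.
X = [(0, 0), (1, 0), (0, 3), (1, 3), (10, 0), (11, 0), (10, 3), (11, 3)]
y = [0, 0, 1, 1, 1, 1, 0, 0]
lo, width, models = fit_fea(X, y, feature=0, n_slices=2)
preds = [predict_fea(lo, width, models, 0, x) for x in X]
```

Equal-frequency slices would be an equally plausible choice; only `slice_index` would need to change.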
Experiments
● CLU, CLU2, FEA and meta ensemble (MMM)
● Baselines: naive (NAI), random partitioning (RAN) and no partitioning (ALL)
● Classification datasets from various domains
  ● dimensionalities 7-58
  ● sizes 500-44000
  ● two classes ...