Software modularity and reuse
Paulo Borba
Informatics Center, Federal University of Pernambuco
[email protected] ◈ twitter.com/pauloborba
Wednesday, August 18, 2010
Reuse opportunities
Different devices, 15 to 60 different applications…
Different clients, different products
http://www.androidauthority.com
Little reuse and agility, high costs
Even with J2ME!
Problem might also appear in the context of a single product...
How to detect it?
Code cloning tool, file
Code cloning tool, packages
But what is a clone?
Clones? Strings? Tokens?
    if (first == null && second != null) { return -1; }
    if (first == null && second != null) { return -1; }
    if (f == null && s != null) { return -1; }
    if (f == null && s != null) { return 0; }
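The question on this slide can be made concrete with a small sketch that compares fragments at three levels: raw strings, token sequences, and token sequences with identifiers replaced by a placeholder. The class name `CloneNotions`, the `$id` placeholder, the tiny keyword list, and the reformatted variant `b` are all assumptions made for illustration; a real clone detector uses a full lexer and may also normalize literals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CloneNotions {
    // Identifiers/keywords, integer literals, a few two-character operators, then any other symbol.
    private static final Pattern TOKEN =
        Pattern.compile("[A-Za-z_]\\w*|\\d+|==|!=|&&|\\|\\||\\S");
    // Deliberately tiny keyword list, enough for the slide's fragments only.
    private static final Set<String> KEYWORDS = Set.of("if", "else", "return", "null", "int", "new");

    /** Splits source text into tokens, discarding whitespace and layout. */
    static List<String> tokens(String src) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(src);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    /** Same as tokens(), but every non-keyword identifier becomes the placeholder "$id". */
    static List<String> normalized(String src) {
        List<String> out = new ArrayList<>();
        for (String t : tokens(src)) {
            boolean identifier = Character.isJavaIdentifierStart(t.charAt(0)) && !KEYWORDS.contains(t);
            out.add(identifier ? "$id" : t);
        }
        return out;
    }

    public static void main(String[] args) {
        String a = "if (first == null && second != null) { return -1; }";
        String b = "if (first == null && second != null) {\n    return -1;\n}"; // same code, other layout
        String c = "if (f == null && s != null) { return -1; }";
        String d = "if (f == null && s != null) { return 0; }";

        System.out.println(a.equals(b));                          // false: strings differ
        System.out.println(tokens(a).equals(tokens(b)));          // true: same tokens
        System.out.println(tokens(a).equals(tokens(c)));          // false: identifiers differ
        System.out.println(normalized(a).equals(normalized(c)));  // true: same shape
        System.out.println(normalized(c).equals(normalized(d)));  // false: returned value differs
    }
}
```

Whether the last pair should count as a clone depends on whether literals are normalized as well, which is exactly the kind of choice the slide asks about.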
Clones? Reuse opportunities?
    int getDay() { return day; }
    int getMonth() { return month; }
    int getDay() { return day+1; }
Google Wave clones
    int getDay() { return day; }
    int getMonth() { return month; }
    int getDay() { return day+1; }
    int sum(int x, int y) { return x + y; }
    int sumInt(int a, int b) { return a + b; }
Changing clone notion, small
Changing clone notion, large
Minimum clone length (tokens)
    if (o.toString().equals("/")) { div = new DivExp(null, null); }
    if (o.toString().equals("*")) { mult = new MultExp(null, null); }
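Since the minimum clone length is measured in tokens rather than characters or lines, a rough sketch of the filter may help. The class `MinCloneLength`, the method names, and the thresholds 20 and 30 are hypothetical; the regular expression is a simplified tokenizer, not the one used by any particular tool.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MinCloneLength {
    // String literals first (so "/" is a single token), then identifiers, numbers, any other symbol.
    private static final Pattern TOKEN =
        Pattern.compile("\"[^\"]*\"|[A-Za-z_]\\w*|\\d+|\\S");

    /** Counts the tokens in a fragment; whitespace does not contribute. */
    static int tokenCount(String src) {
        Matcher m = TOKEN.matcher(src);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    /** A fragment is reported only if it has at least minTokens tokens. */
    static boolean reportable(String fragment, int minTokens) {
        return tokenCount(fragment) >= minTokens;
    }

    public static void main(String[] args) {
        String div = "if (o.toString().equals(\"/\")) { div = new DivExp(null, null); }";
        System.out.println(tokenCount(div));      // 25 with this tokenizer
        System.out.println(reportable(div, 30));  // false: too short for a threshold of 30
        System.out.println(reportable(div, 20));  // true: long enough for a threshold of 20
    }
}
```

With a high threshold, a short pair such as the two `if` statements on the slide is not reported at all; lowering the threshold makes it visible.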
P-match... maps different variables and function names to different tokens
    if (x>0) {y=1; x=2; x=w;} else {y=1; x=2; x=z;}
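One common reading of parameterized matching is that two fragments match when their identifiers can be renamed consistently and one-to-one. The sketch below follows that reading; it is an illustration under that assumption and not necessarily how the tool implements P-match. The class `PMatch`, its tokenizer, and its keyword list are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PMatch {
    private static final Pattern TOKEN = Pattern.compile("[A-Za-z_]\\w*|\\d+|\\S");
    private static final Set<String> KEYWORDS = Set.of("if", "else", "return", "int");

    static List<String> tokens(String src) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(src);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    static boolean isIdentifier(String t) {
        return Character.isJavaIdentifierStart(t.charAt(0)) && !KEYWORDS.contains(t);
    }

    /** True if the fragments are equal up to a consistent, one-to-one renaming of identifiers. */
    static boolean pMatch(String left, String right) {
        List<String> a = tokens(left);
        List<String> b = tokens(right);
        if (a.size() != b.size()) {
            return false;
        }
        Map<String, String> forward = new HashMap<>();   // left name  -> right name
        Map<String, String> backward = new HashMap<>();  // right name -> left name
        for (int i = 0; i < a.size(); i++) {
            String s = a.get(i);
            String t = b.get(i);
            if (isIdentifier(s) != isIdentifier(t)) {
                return false;
            }
            if (!isIdentifier(s)) {
                if (!s.equals(t)) {
                    return false;  // keywords, literals and symbols must be identical
                }
                continue;
            }
            String mappedTo = forward.putIfAbsent(s, t);
            String mappedFrom = backward.putIfAbsent(t, s);
            if ((mappedTo != null && !mappedTo.equals(t)) || (mappedFrom != null && !mappedFrom.equals(s))) {
                return false;  // a name was already paired with a different name
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The two branches from the slide: w is consistently renamed to z.
        System.out.println(pMatch("{y=1; x=2; x=w;}", "{y=1; x=2; x=z;}"));  // true
        // Here x would have to map to both x and y, so the renaming is inconsistent.
        System.out.println(pMatch("{y=1; x=2; x=w;}", "{y=1; x=2; y=z;}"));  // false
        System.out.println(pMatch("int sum(int x,int y){return x+y;}",
                                  "int sumInt(int a,int b){return a+b;}"));  // true
    }
}
```

The second pair would still match if every identifier were simply replaced by one placeholder, which is the difference that P-match is meant to capture.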
Minimum TKS (kinds of tokens)
    int x = 0; int y = 0; int z = 0;
    int a = 0; int b = 0; int c = 0;
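The slide's example is long in tokens but uses very few kinds of tokens, which is what a minimum TKS is meant to filter out. The following sketch counts both numbers. It assumes that all identifiers count as one kind (the `$id` placeholder) and that each keyword, literal, and symbol counts as its own kind; the tool's exact definition of a token kind may differ, and the class `TokenSetSize` is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TokenSetSize {
    private static final Pattern TOKEN = Pattern.compile("[A-Za-z_]\\w*|\\d+|\\S");
    private static final Set<String> KEYWORDS = Set.of("int", "if", "else", "return", "new", "null");

    /** Tokens of the fragment, with every non-keyword identifier replaced by "$id". */
    static List<String> normalized(String src) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(src);
        while (m.find()) {
            String t = m.group();
            boolean identifier = Character.isJavaIdentifierStart(t.charAt(0)) && !KEYWORDS.contains(t);
            out.add(identifier ? "$id" : t);
        }
        return out;
    }

    /** LEN: number of tokens in the fragment. */
    static int length(String src) {
        return normalized(src).size();
    }

    /** TKS: number of distinct token kinds in the fragment (under the assumption above). */
    static int tks(String src) {
        return new HashSet<>(normalized(src)).size();
    }

    public static void main(String[] args) {
        String decls = "int x = 0; int y = 0; int z = 0; int a = 0; int b = 0; int c = 0;";
        System.out.println(length(decls));  // 30 tokens
        System.out.println(tks(decls));     // 5 kinds: int, $id, =, 0, ;
    }
}
```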
Shaper level, hard and soft
Shaper level, easy and none
File metrics
• LEN: File length (in tokens)
• CLN: Number of code clones
• NBR: Neighbors, other files that share a code clone in this file
• RSA: Ratio of similarity to another file
• RSI: Ratio of similarity within the file
• CVR: Coverage, percentage of tokens covered by another code clone
• RNR: Ratio of non-repeated code
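As an illustration of how a file metric such as CVR could be computed, the sketch below marks every token that falls inside at least one clone fragment and divides by LEN. It assumes that fragments are given as half-open token ranges and that overlapping fragments are counted once; the class `FileCoverage` and its `Range` type are hypothetical, and the tool's own computation may differ.

```java
import java.util.List;

final class FileCoverage {
    /** Half-open token range [start, end) of one clone fragment inside a file. */
    static final class Range {
        final int start;
        final int end;

        Range(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Fraction of the file's tokens that lie inside at least one clone fragment. */
    static double coverage(int fileLengthTokens, List<Range> cloneRanges) {
        if (fileLengthTokens <= 0) {
            return 0.0;
        }
        boolean[] covered = new boolean[fileLengthTokens];
        for (Range r : cloneRanges) {
            int from = Math.max(0, r.start);
            int to = Math.min(fileLengthTokens, r.end);
            for (int i = from; i < to; i++) {
                covered[i] = true;  // overlapping fragments mark a token only once
            }
        }
        int count = 0;
        for (boolean c : covered) {
            if (c) {
                count++;
            }
        }
        return (double) count / fileLengthTokens;
    }

    public static void main(String[] args) {
        // A file of 100 tokens with two overlapping fragments covering tokens 10..59.
        List<Range> ranges = List.of(new Range(10, 40), new Range(30, 60));
        System.out.println(coverage(100, ranges));  // 0.5
    }
}
```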
Clone set metrics
• LEN: Length of a code fragment of the code clone
• POP (population): Count of code fragments of the code clone
• NIF: Count of source files that include one or more code fragments of the code clone
• RAD (radius): Range of the source code fragments of a code clone in the directory hierarchy
• RNR (ratio of non-repeated tokens): Ratio (percentage) of tokens that are not included in repeated part of a code fragment of the code clone
• TKS (token set size): Size of a set of tokens of a code fragment of the code clone
• LOOP, COND, McCabe
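Two of the clone set metrics, POP and NIF, follow directly from the list of fragments in a clone set. The sketch below shows one way to derive them; the `Fragment` type, its fields, and the file names are hypothetical and only serve the example.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CloneSetMetrics {
    /** One code fragment of a clone set: a file plus a half-open token range. */
    static final class Fragment {
        final String file;
        final int startToken;
        final int endToken;

        Fragment(String file, int startToken, int endToken) {
            this.file = file;
            this.startToken = startToken;
            this.endToken = endToken;
        }
    }

    /** POP: number of code fragments in the clone set. */
    static int pop(List<Fragment> cloneSet) {
        return cloneSet.size();
    }

    /** NIF: number of distinct source files containing at least one fragment. */
    static int nif(List<Fragment> cloneSet) {
        Set<String> files = new HashSet<>();
        for (Fragment f : cloneSet) {
            files.add(f.file);
        }
        return files.size();
    }

    public static void main(String[] args) {
        List<Fragment> cloneSet = List.of(
            new Fragment("A.java", 0, 30),
            new Fragment("A.java", 120, 150),
            new Fragment("B.java", 40, 70));
        System.out.println(pop(cloneSet));  // 3 fragments
        System.out.println(nif(cloneSet));  // 2 files
    }
}
```

A clone set with a high POP but an NIF of 1 is repetition inside a single file, while a high NIF indicates duplication spread across files.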
Strategy
• Start with stronger constraints
• Analyze clones
• Reduce constraints
• Cycle until no more interesting clones
Take notes, now!
Reuse opportunities