Mining Contexts for Recommending Source Locations to Explore

Seonah Lee and Sungwon Kang Dec. 14, 2013

International Workshop on ICT, 2013


Outline
1. Introduction
2. Proposed Approach
3. Evaluation
4. Conclusion

1. Introduction
- Research Background
- Related Work
- Question

Research Background (1/2)
- The relative cost of software evolution now represents more than 90% of total software cost [Erlikh 2000]
- In software evolution tasks:
  - Programmers spend more than 50% of their time understanding source code [Fjeldstad 1983]
  - Programmers spend 35% of their time navigating code bases [Ko 2005]

[Figure: activities in a software evolution task: Understanding, Navigating, Changing]

Software evolution task: the smallest identifiable and essential piece of a job that serves as a unit of work that changes a software system (i.e., fixing bugs and enhancing features)

Research Background (2/2)
- Programmers look for new source locations that may be related to a given task [Letovsky 1986] [Ko 2005] [Latoza 2010]
- Potential navigation paths increase exponentially

[Figure: navigation paths through source code; labeled element: createArrowMenu]

Related Work: Interaction History-based Code Recommenders
- Mylyn [Kersten 2006]
  - Displays a collection of source locations when a programmer selects a task ID
  - Counts the frequencies of source locations
  - Requires a programmer's manual indication of a task
- TeamTracks [DeLine 2005]
  - Recommends source locations historically associated with the location that a programmer selects
  - Determines associations between source locations
  - Limited to recommending co-visited source locations

Research Question
- To effectively guide programmers' code navigation, collections of source locations to explore should be provided automatically
- How can these collections of source locations be automatically created and visualized?

[Figure: a programmer visits source locations; the collection of source locations relevant to a given task is marked "?"]

2. Proposed Approach
- Definition (Navigation Context)
- Principles (for Mining Contexts)
- Steps (for Mining Contexts)
- Tool (for Mining and Recommending Contexts)

Definition (Navigation Context)
- Conceptually
  - Information that a programmer needs to explore and understand during a software evolution task
- Technically
  - A collection of frequently visited source locations relevant to similar tasks

Principles (for Mining Contexts)
- Relevance by Frequency
  - The source locations that programmers frequently visited are likely to be highly relevant to the tasks of the programmers
- Relevance by Context
  - If a source location of a navigation sequence is highly relevant to a task, it is likely that the other source locations in the same navigation sequence are relevant to the same task

Steps (for Mining Contexts in Programmer Interaction Histories)
- Navigation Context = Retrieve (Mine (Segment (InteractionHistories)), Navigation Path)

Interaction Traces:
  a b c a b d b d, e f g e f c f c f, c a b x b d

Segment:
  (a b c) (a b d) (b d) (e f g) (e f c) (f c) (f) (c a b x) (b d)

Mine:
  Micro-clustering (Cosine Similarity: A • B / (||A|| ||B||))
    { (a, 2), (b, 2), (c, 1), (d, 1) }
    { (b, 2), (d, 2) }
    { (e, 2), (f, 2), (g, 1), (c, 1) }
    { (f, 2), (c, 1) }
    { (c, 1), (a, 1), (b, 1), (x, 1) }
  Macro-clustering (K-nearest clustering)
    { (b, 5), (a, 3), (d, 3), (c, 2), (x, 1) }
    { (e, 2), (f, 4), (c, 2), (g, 1) }

Retrieve (TF • IDF Similarity):
  Navigation path { c, d } -> (b, 5), (a, 3), (d, 3), (c, 2), (x, 1)
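The Segment step can be sketched as follows. The slide does not state the segmentation rule, so this sketch assumes that a new navigation sequence starts whenever the next source location already occurs in the current one; that assumption happens to reproduce the nine segments listed for the three example traces. The function name `segment` is hypothetical.

```python
def segment(trace):
    """Split one interaction trace into navigation sequences.

    Assumed rule (not stated on the slide): cut before a location
    that already occurs in the current segment.
    """
    segments, current = [], []
    for location in trace:
        if location in current:
            segments.append(tuple(current))
            current = []
        current.append(location)
    if current:
        segments.append(tuple(current))
    return segments


# The three example interaction traces from the slide.
traces = ["a b c a b d b d", "e f g e f c f c f", "c a b x b d"]
segments = [s for t in traces for s in segment(t.split())]
for s in segments:
    print(" ".join(s))
```

Other cut rules (for example time gaps between visits) would also be plausible; the revisit rule is used here only because it matches the example.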

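For the micro-clustering step, the slide names cosine similarity, A • B / (||A|| ||B||), as the measure. A minimal sketch of that measure over visit-count vectors is given below, using three of the micro-clusters from the slide as input. The function name `cosine_similarity` and the variable names are hypothetical, and the threshold or linkage rule that decides which items are merged is not shown on the slide, so it is not modeled here.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity A . B / (||A|| ||B||) of two visit-count vectors."""
    dot = sum(count * b[loc] for loc, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Micro-clusters taken from the slide (location -> visit count).
cluster_1 = Counter({"a": 2, "b": 2, "c": 1, "d": 1})
cluster_2 = Counter({"b": 2, "d": 2})
cluster_3 = Counter({"e": 2, "f": 2, "g": 1, "c": 1})

print(round(cosine_similarity(cluster_1, cluster_2), 3))  # shares b and d
print(round(cosine_similarity(cluster_1, cluster_3), 3))  # shares only c
```

The first pair shares two locations and scores much higher than the second pair, which shares only `c`.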

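The macro-cluster counts on the slide equal the element-wise sums of the micro-cluster counts. The sketch below checks that arithmetic. The grouping of micro-clusters is inferred from the counts on the slide rather than computed, because the slide does not detail the K-nearest clustering procedure; the variable names are hypothetical.

```python
from collections import Counter

# Micro-clusters from the slide (location -> visit count).
micro_clusters = [
    Counter({"a": 2, "b": 2, "c": 1, "d": 1}),
    Counter({"b": 2, "d": 2}),
    Counter({"e": 2, "f": 2, "g": 1, "c": 1}),
    Counter({"f": 2, "c": 1}),
    Counter({"c": 1, "a": 1, "b": 1, "x": 1}),
]

# Assumed grouping (indices into micro_clusters), inferred from the
# macro-cluster counts; the clustering that yields it is not reproduced.
groups = [[0, 1, 4], [2, 3]]

# A macro-cluster is the sum of the visit counts of its members.
macro_clusters = [
    sum((micro_clusters[i] for i in group), Counter()) for group in groups
]

for cluster in macro_clusters:
    print(cluster.most_common())
```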

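The Retrieve step matches the current navigation path against the mined contexts. The slide names TF • IDF similarity but gives no weighting formula, so the sketch below assumes a simple scheme: tf is the visit count of a location in a context, idf is log(N / df), and a context's score is the sum of tf × idf over the locations on the path. The names `tf_idf_score` and `retrieve` are hypothetical.

```python
import math

# The two mined contexts from the slide (location -> visit count).
contexts = [
    {"b": 5, "a": 3, "d": 3, "c": 2, "x": 1},
    {"e": 2, "f": 4, "c": 2, "g": 1},
]


def tf_idf_score(path, context, all_contexts):
    """Sum of tf * idf over the locations on the navigation path.

    Assumed weighting: tf is the visit count in the context and
    idf = log(N / df), where df counts contexts containing the location.
    """
    n = len(all_contexts)
    score = 0.0
    for location in path:
        df = sum(1 for c in all_contexts if location in c)
        if df == 0:
            continue  # location never seen in any context
        score += context.get(location, 0) * math.log(n / df)
    return score


def retrieve(path, all_contexts):
    """Return the best-scoring context, most-visited locations first."""
    best = max(all_contexts, key=lambda c: tf_idf_score(path, c, all_contexts))
    return sorted(best.items(), key=lambda item: -item[1])


print(retrieve({"c", "d"}, contexts))
```

Under this weighting, `c` occurs in both contexts and contributes nothing, while `d` occurs only in the first, so the path { c, d } selects the first context, in line with the slide's example.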

Tool (for Mining & Recommending Contexts)
- A graphical code recommender that visualizes source locations to visit
- It incorporates the proposed approach
  - Display history
  - Display recommendations
  - Collect interaction traces
  - Jump to source locations
  - Update diagram layout

[Figure: the tool; labels: Recommendations, Histories]

3. Evaluation
- Evaluation Plan
- Evaluation Results

Evaluation Plan
Experiment in an early phase
- Simulations: Simulation using Experimental Data
  - 12 programmers performed the same 4 tasks
- User Studies: Wizard-of-oz Study
  - 11 programmers performed the same 4 tasks
Application in a later phase
- Simulations: Simulation using Real Data
  - 4,397 interaction traces, extracted from the Eclipse Bugzilla system
- User Studies: Diary Study
  - 10 programmers used the tool in their environment for a month

Evaluation Results
Simulations
- NavClus showed two times higher recommendation accuracy than TeamTracks

  F-measure    Mylyn   Platform   PDE     ECF     MDT
  TeamTracks   0.082   0.058      0.055   0.091   0.034
  NavClus      0.14    0.122      0.2     0.191   0.051

User Studies
- Wizard-of-oz study: 9 out of 11 programmers evaluated it positively:
  - "It provides a crucial hint"
  - "Uh, here are all answers"
- Diary study: it is limited to the individual use of the tool
  - Although all 10 programmers rated NavClus highly, it was not because of the NavClus recommendations
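The simulation results are reported as F-measure. As a reminder of what that number combines, the sketch below computes it for a single recommendation, assuming the standard balanced F-measure F = 2PR / (P + R), where precision P is the share of recommended locations that were actually visited and recall R is the share of visited locations that were recommended. The slide does not say which variant was used or how recommendations were matched against visits; the function name and the sample data are hypothetical and not taken from the study.

```python
def f_measure(recommended, visited):
    """Balanced F-measure (harmonic mean of precision and recall)."""
    recommended, visited = set(recommended), set(visited)
    hits = len(recommended & visited)
    if hits == 0:
        return 0.0
    precision = hits / len(recommended)
    recall = hits / len(visited)
    return 2 * precision * recall / (precision + recall)


# Hypothetical data, not taken from the study.
recommended = ["a", "b", "d", "x"]
visited = ["b", "d", "e"]
print(round(f_measure(recommended, visited), 3))
```

Here two of the four recommendations are hits (P = 0.5) and two of the three visited locations were recommended (R = 2/3).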

4. Conclusion
- Conclusion
- Future Work

Conclusion
RQ: How can navigation contexts be automatically created and visualized?
Navigation context: the information that a developer needs to explore and understand during a software evolution task
- We propose a clustering technique that automatically forms the navigation contexts of past programmers
- We implemented the NavClus tool and investigated its effectiveness in real-world development

Future Work
- Comparison of Recommendation Techniques
  - Data clustering
  - Association rule mining
  - Hidden Markov model
- Additional User Studies for Collaboration
  - Contextual knowledge transfer
  - Training newcomers

Question?

Seonah Lee [email protected]

