2009. 9. 23 ICSM 2009

TOKYO INSTITUTE OF TECHNOLOGY DEPARTMENT OF COMPUTER SCIENCE

Recovering Traceability Links between a Simple Natural Language Sentence and Source Code Using Domain Ontologies

Takashi Yoshikawa, Shinpei Hayashi, Motoshi Saeki
Department of Computer Science, Tokyo Institute of Technology, Japan

Background 

Doc-to-code traceability is important
− For reducing maintenance costs
− For software reuse & extension

Focus: recovering sentence-to-code traceability
− In some software products, there are only documents of simple sentences without any detailed descriptions, e.g. under an agile process

Aim 

To precisely detect the set of methods related to the input sentence

NL sentence (a set of words): "Users can draw a plain oval."

Source code (a set of methods): drawOval(), writeLog(), getCanvas(), getColor(), setPixel(), getColorPallete(), DrawPanel(), OvalTool()

Problem 

How to precisely get the set?
− Word similarity: leads to false positives/negatives

NL sentence (a set of words): "Users can draw a plain oval."

Source code (a set of methods): drawOval(), writeLog(), getCanvas(), getColor(), setPixel(), getColorPallete(), DrawPanel(), OvalTool()

Figure annotations: "False positives", shown with void setPixel(...) { ... draw ... }, and "False negatives"
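A minimal Python sketch of why pure word matching misfires. The word sets per method are made up for illustration (the slide only shows that the body of setPixel() mentions "draw"), and `match_by_words` is a hypothetical helper, not the authors' tool.

```python
# Hypothetical word sets per method (identifier words plus body words).
# Only the "draw" inside setPixel() comes from the slide; the rest is assumed.
METHOD_WORDS = {
    "drawOval": {"draw", "oval"},
    "setPixel": {"set", "pixel", "draw"},  # body mentions "draw"
    "getCanvas": {"get", "canvas"},
    "writeLog": {"write", "log"},
}


def match_by_words(sentence_words, method_words):
    """Return the methods sharing at least one word with the sentence."""
    return {name for name, words in method_words.items()
            if words & sentence_words}


# "Users can draw a plain oval." with stop words removed (assumed).
sentence = {"user", "draw", "plain", "oval"}
matched = match_by_words(sentence, METHOD_WORDS)
print(sorted(matched))
```

Here setPixel() is matched only through an incidental "draw" in its body, while getCanvas() shares no word with the sentence and is missed, which is the kind of false positive and false negative the slide points at.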

Problem 

How to precisely get the set?
− Word similarity: leads to false positives/negatives
− Method invocation: leads to false positives

NL sentence (a set of words): "Users can draw a plain oval."

Source code (a set of methods): drawOval(), writeLog(), getCanvas(), getColor(), setPixel(), getColorPallete(), DrawPanel(), OvalTool()

Figure annotation: "False positive"

Problem 

Another criterion required
− To judge whether a method invocation is needed
− Considering the problem domain

NL sentence (a set of words): "Users can draw a plain oval."

Source code (a set of methods): drawOval(), writeLog(), getCanvas(), getColor(), setPixel(), getColorPallete(), DrawPanel(), OvalTool()

Figure annotations: getCanvas() is marked "important"; writeLog() is marked "Not important"

Domain Ontology 

Formally representing the knowledge of the target problem domain
− As relationships between concepts (words)

Figure: An ontology for painting tools (excerpt), with the concepts canvas, draw, oval, and color
− A concept “canvas” is a possible target to “draw”.
− The “draw” function concerns a “color” concept.
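One possible way to hold such an ontology in code is a set of labeled triples. This is only a sketch: the relation labels "target" and "concerns" are hypothetical names for the two relationships described on the slide, and no relation for "oval" is encoded because the excerpt does not state one.

```python
# Concepts named in the excerpt.
CONCEPTS = {"canvas", "draw", "oval", "color"}

# Relationships as (source, label, target) triples. The labels are assumed
# names for the two relations the slide describes in words.
RELATIONS = {
    ("draw", "target", "canvas"),
    ("draw", "concerns", "color"),
}


def related(a, b, relations=RELATIONS):
    """True if concepts a and b are directly related, in either direction."""
    return any({a, b} == {src, dst} for src, _, dst in relations)


print(related("draw", "canvas"))
print(related("canvas", "color"))
```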

Our Solution 

Choosing method invocations by using domain ontologies

NL sentence (a set of words): "Users can draw a plain oval."

Source code (a set of methods and their invocations): drawOval(), writeLog(), getCanvas(), getColor(), setPixel(), getColorPallete(), DrawPanel(), OvalTool()

Ontology: canvas, draw, oval, color

System Overview

Inputs: Sentence, Source Code, Domain Ontologies

− Extracting Words (splitting, stemming, removing stop words, ...) → Words in the Sentence
− Extracting identifiers → Words in the Code
− Call-graph analysis → method invocations
− Traversing call-graph → Sentence-related Code fragments
− Prioritizing

Outputs: Ordered Sentence-related Code fragments
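The word-extraction step could look roughly like the following Python sketch. The stop-word list and the suffix-stripping `naive_stem` are stand-ins chosen for this example; the slides do not say which stemmer or stop-word list the tool uses.

```python
import re

# A tiny stop-word list chosen for this example (assumption).
STOP_WORDS = {"a", "an", "the", "can", "of", "and"}


def split_identifier(identifier):
    """Split a camelCase or snake_case identifier into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return [p.lower() for p in parts]


def naive_stem(word):
    """Very rough stemmer: strip a few common English suffixes (assumption)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract_words(text):
    """Split text into words, drop stop words, and stem the rest."""
    tokens = []
    for raw in re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text):
        tokens.extend(split_identifier(raw))
    return {naive_stem(t) for t in tokens if t not in STOP_WORDS}


print(sorted(extract_words("Users can draw a plain oval.")))
print(sorted(extract_words("drawOval")))
```

Applying the same function to both the sentence and the identifiers keeps the two word sets comparable, even when the stemmer is crude.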

Procedure

NL sentence (a set of words): {wa, wb}

Source code (a set of methods and their invocations): m1, m2, m3, m4, m5, m6, m7, m8

1. Root selection
2. Traversal
3. Results extraction

Procedure

NL sentence (a set of words): {wa, wb}

Source code (a set of methods and their invocations): m1, m2, m3, m4, m5, m6, m7, m8

1. Root selection
   – Choose the methods having the words of the input sentence
   – The words become the methods' roles
2. Traversal
3. Results extraction

Figure: the call graph of m1 to m8; the methods containing {wa, ...} and {wb, ...} are marked with role: {wa} and role: {wb}

Procedure

NL sentence (a set of words): {wa, wb}

Source code (a set of methods and their invocations): m1, m2, m3, m4, m5, m6, m7, m8

1. Root selection
2. Traversal
   – Traverse method invocations from the roots iff the invocation satisfies one of the traversal rules
3. Results extraction

Figure: the call graph of m1 to m8, with the roots marked role: {wa} and role: {wb}

Traversal 

3 traversal rules
1. Sentence-based rule
2. Ontology-based rule
3. Inheritance rule

Role extraction
− Words used in the rules are extracted as the callee’s role

Figure: Rule #2 (ontology-based). Caller: drawOval(), role: {draw, oval}. Callee: getCanvas(), role: {canvas}. Ontology: draw, canvas.
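The ontology-based rule can be sketched in Python as below. This is one assumed reading of the figure, not the authors' definition: an invocation is followed when a word in the caller's role is related in the ontology to a word of the callee, and the matching callee words become the callee's role. `ontology_rule` and the `ONTOLOGY` encoding are hypothetical.

```python
# Ontology excerpt as unordered concept pairs (from the painting-tool example).
ONTOLOGY = {frozenset({"draw", "canvas"}), frozenset({"draw", "color"})}


def ontology_rule(caller_role, callee_words, ontology=ONTOLOGY):
    """Return the callee's role under the assumed reading of rule #2.

    An empty set means the invocation is not traversed.
    """
    return {w for w in callee_words
            if any(frozenset({r, w}) in ontology for r in caller_role)}


# drawOval() -> getCanvas(): "draw" and "canvas" are related, so traverse.
print(ontology_rule({"draw", "oval"}, {"get", "canvas"}))
# drawOval() -> writeLog(): no related concepts, so do not traverse.
print(ontology_rule({"draw", "oval"}, {"write", "log"}))
```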

Procedure

NL sentence (a set of words): {wa, wb}

Source code (a set of methods and their invocations): m1, m2, m3, m4, m5, m6, m7, m8

1. Root selection
2. Traversal
   – Traverse method invocations from the roots iff the invocation satisfies one of the traversal rules
   – Method roles are also extracted by the rules
3. Results extraction

Figure: the call graph of m1 to m8; the roots are marked role: {wa} and role: {wb}, and the traversed methods are marked role: {…}

Procedure

NL sentence (a set of words): {wa, wb}

Source code (a set of methods and their invocations): m1, m2, m3, m4, m5, m6, m7, m8

1. Root selection
2. Traversal
3. Results extraction
   – The traversed sets of methods are the candidates for sentence-related code fragments

Figure: the call graph of m1 to m8, with two traversed sets labeled Sa and Sb
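The three steps can be put together in a short Python sketch. It assumes one candidate set per root (suggested by the two sets Sa and Sb in the figure) and takes the traversal rule as a parameter; `recover`, `RELATED`, and `rule` are hypothetical names, and the example rule is a simplified stand-in for the three rules of the talk.

```python
from collections import deque


def recover(sentence_words, method_words, calls, rule):
    """Return {root: set of traversed methods} for each root method.

    method_words: method name -> set of words in the method
    calls:        method name -> list of invoked method names
    rule:         function(caller_role, callee_words) -> callee role (set);
                  an empty set means the invocation is not followed
    """
    results = {}
    # 1. Root selection: methods sharing words with the sentence.
    for root, words in method_words.items():
        root_role = words & sentence_words
        if not root_role:
            continue
        # 2. Traversal: breadth-first over the accepted invocations.
        roles = {root: root_role}
        queue = deque([root])
        while queue:
            caller = queue.popleft()
            for callee in calls.get(caller, []):
                if callee in roles:
                    continue
                role = rule(roles[caller], method_words[callee])
                if role:
                    roles[callee] = role
                    queue.append(callee)
        # 3. Results extraction: the traversed set is one candidate.
        results[root] = set(roles)
    return results


# Assumed ontology lookup: concept -> concepts it is related to.
RELATED = {"draw": {"canvas", "color"}}


def rule(caller_role, callee_words):
    """Simplified ontology-style rule used only for this example."""
    return {w for w in callee_words
            if any(w in RELATED.get(r, ()) for r in caller_role)}


method_words = {
    "drawOval": {"draw", "oval"},
    "getCanvas": {"get", "canvas"},
    "writeLog": {"write", "log"},
}
calls = {"drawOval": ["getCanvas", "writeLog"]}
found = recover({"draw", "plain", "oval"}, method_words, calls, rule)
print(found)
```

With these made-up inputs, getCanvas() is reached from drawOval() while writeLog() is left out, mirroring the "important" versus "Not important" invocations shown earlier.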

Case Study 

Evaluation target: JDraw 1.1.5
− Picked 7 sentences from JDraw's manual on the Web
− Prepared an ontology for painting tools
  • including 38 concepts, 45 relationships
− Prepared control (answer) sets by an expert

Evaluation Criteria
− Calculating precision and recall values by comparing the extracted sets with the control sets
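For reference, precision and recall over method sets can be computed as in this Python sketch. The two example sets are invented, and returning 0.0 for an empty set is a convention assumed here rather than something the slides specify.

```python
def precision_recall(extracted, control):
    """Compare an extracted method set with the expert's control set."""
    hits = len(extracted & control)
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(control) if control else 0.0
    return precision, recall


# Invented sets, for illustration only.
extracted = {"drawOval", "getCanvas", "writeLog"}
control = {"drawOval", "getCanvas", "getColor", "setPixel"}
p, r = precision_recall(extracted, control)
print(f"precision={p:.2f} recall={r:.2f}")
```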

Results

                                                     Use of ontology: Yes   Use of ontology: No
Input sentences                                      Prec.    Recall        Prec.    Recall
1. "plain, filled and gradient filled rectangles"    0.83     0.94          1.00     0.19
2. "plain, filled and gradient filled ovals"         0.82     0.98          1.00     0.21
3. "image rotation"                                  1.00     0.35          0.00     0.00
4. "image scaling"                                   0.22     0.68          1.00     0.58
5. "save JPEGs of configurable quality"              0.40     1.00          0.67     1.00
6. "colour reduction"                                0.74     0.95          0.74     0.95
7. "grayscaling"                                     -        -             -        -

Accurate results by using the ontology
− precision > 0.7 for 3 cases
− recall > 0.9 for 4 cases

Results

Improvement by using the ontology
− Improved recall for 1st and 2nd cases
− Detected traceability for 3rd case

Bad results (degradation, no effect) also occurred

Conclusion

Domain ontologies give us valuable guides for traceability recovery and feature location.

Summary:
− Proposed a technique to find a set of methods related to the given NL sentence by using domain ontologies
− Showed the feasibility of our approach with a case study of JDraw

Future work
− Automated construction of domain ontologies
− Case study++
