A Combinatorial Approach to Building Navigation Graphs for Dynamic Web Applications

W. Wang1, Y. Lei1, S. Sampath2, R. Kacker3, R. Kuhn3, J. Lawrence4

1 University of Texas at Arlington
2 University of Maryland, Baltimore County
3 National Institute of Standards and Technology
4 George Mason University

9/24/2009

Outline
• Introduction
  • Basic concepts, Challenges
• Our approach
  • Abstract URL, Pairwise strategy, Algorithm design, Tool
• Experiments
  • Design, Subject applications, Empirical results
• Related Work
• Conclusion


Navigation graph
• A navigation graph represents the navigation structure of a web application.
  • A node represents a web page.
  • An edge represents one transition between two nodes.
• Usage: regression testing, impact analysis
  • Has an expected navigation path been implemented?
  • Has an unexpected navigation path been introduced?
  • What pages will be affected if one page is changed?
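The usage questions above map onto simple graph queries. The following is a minimal sketch, assuming the navigation graph is stored as a dictionary from each page to the set of pages it links to. The page names and the helpers `has_path` and `pages_linking_to` are hypothetical, and treating "affected" pages as those with a direct edge to the changed page is only one possible reading of impact analysis.

```python
from collections import deque

# Hypothetical navigation graph: page -> pages reachable in one transition.
graph = {
    "home": {"searchResults", "error"},
    "searchResults": {"home", "details"},
    "error": {"home"},
    "details": set(),
}

def has_path(graph, src, dst):
    """Return True if dst can be reached from src by following edges (BFS)."""
    seen, queue = {src}, deque([src])
    while queue:
        page = queue.popleft()
        if page == dst:
            return True
        for nxt in graph.get(page, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

def pages_linking_to(graph, changed):
    """Pages with a direct edge to the changed page (assumed notion of 'affected')."""
    return {page for page, targets in graph.items() if changed in targets}
```

Checking an expected path amounts to calling `has_path` on a graph built from the new version of the application; an unexpected path shows up as an edge present in the new graph but absent from the old one.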

Challenges
• Page explosion problem
  • A web application can generate an astronomical, possibly infinite, number of dynamic web pages.
  • Example: a web application may dynamically generate greeting pages for different users.
• Navigation structure capture problem
  • Some dynamic web pages cannot be reached unless appropriate requests are supplied.
  • Example: searching for flights on the studentuniverse web site.


Challenges
• Form parameters: departure city, arrival city, departure date, return date.
  • City name: Dallas, Denver, Detroit, Edmonton.
  • Date: Sep. 29, Sep. 30.
• home page -> error page, captured by special combinations of two parameters.
  • The departure city is the same as the arrival city.
  • The return date is before the departure date.
• home page -> searchResults page, captured by other, ordinary combinations.
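To make the example concrete, the sketch below enumerates every input combination of the flight-search form and classifies the page each one leads to. The function `target_page` is a hypothetical model of the server's behavior based only on the two error conditions listed above; the real site may apply other rules.

```python
from itertools import product

cities = ["Dallas", "Denver", "Detroit", "Edmonton"]
dates = [(9, 29), (9, 30)]  # (month, day) for Sep. 29 and Sep. 30

def target_page(dep_city, arr_city, dep_date, ret_date):
    """Hypothetical model: which page a form submission navigates to."""
    if dep_city == arr_city or ret_date < dep_date:
        return "error"
    return "searchResults"

# Exhaustive enumeration: 4 * 4 * 2 * 2 = 64 input combinations.
all_inputs = list(product(cities, cities, dates, dates))
pages = [target_page(*combo) for combo in all_inputs]
error_count = pages.count("error")
```

Under this model, 28 of the 64 combinations lead to the error page, and each error condition depends on the values of exactly two parameters. That is why covering all value pairs is enough to reach both target pages.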


Abstract URL
• One abstract URL represents a group of concrete URLs.
  • These concrete URLs have the same base component and the same parameters in the query component.
• Example:
  • u1 = “http://test.com/foo.jsp?x=1&y=2”
  • u2 = “http://test.com/foo.jsp?x=0&y=3”
  • U = “http://test.com/foo.jsp?x&y”

Figure 1: A URL example
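The abstraction can be sketched with Python's standard `urllib.parse` module. The function name `abstract_url` is hypothetical, and sorting the parameter names (so that parameter order does not matter) is an assumption that the slides do not state.

```python
from urllib.parse import urlsplit, parse_qsl

def abstract_url(url):
    """Keep the base component and the query parameter names; drop the values."""
    parts = urlsplit(url)
    # Assumption: parameter order is irrelevant, so names are de-duplicated and sorted.
    names = sorted({name for name, _ in parse_qsl(parts.query, keep_blank_values=True)})
    base = f"{parts.scheme}://{parts.netloc}{parts.path}"
    return base + ("?" + "&".join(names) if names else "")

u1 = "http://test.com/foo.jsp?x=1&y=2"
u2 = "http://test.com/foo.jsp?x=0&y=3"
same_node = abstract_url(u1) == abstract_url(u2)
```

Both u1 and u2 map to the same abstract URL, so they become a single node in the navigation graph.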

Pairwise strategy
• Given any two of the k parameters, every combination of values of those two parameters is covered at least once.
• Our approach generates pairwise input combinations for forms to capture navigation structures behind forms.

p1  p2  p3
0   0   0
0   0   1
0   1   0
0   1   1
1   0   0
1   0   1
1   1   0
1   1   1
Figure 2: Combinations from exhaustive testing

p1  p2  p3
0   0   0
0   1   1
1   0   1
1   1   0
Figure 3: Combinations from pairwise testing
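The reduction from Figure 2 to Figure 3 can be reproduced with a simple greedy selection. This is only an illustrative sketch: it is not the algorithm used by Fireeye, the names `pairs_of` and `pairwise_tests` are hypothetical, and it enumerates all exhaustive combinations first, so it is practical only for small forms.

```python
from itertools import combinations, product

def pairs_of(test):
    """All value pairs (over every two parameter positions) covered by one test."""
    return {((i, test[i]), (j, test[j]))
            for i, j in combinations(range(len(test)), 2)}

def pairwise_tests(domains):
    """Greedily pick tests until every value pair of every two parameters is covered."""
    candidates = list(product(*domains))
    uncovered = set()
    for test in candidates:
        uncovered |= pairs_of(test)
    chosen = []
    while uncovered:
        # Take the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda t: len(pairs_of(t) & uncovered))
        chosen.append(best)
        uncovered -= pairs_of(best)
    return chosen

tests = pairwise_tests([[0, 1], [0, 1], [0, 1]])
```

For three binary parameters there are 12 value pairs to cover and each test covers 3 of them, so 4 tests is the minimum, compared with 8 for exhaustive testing.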

Algorithm design

Figure 4: Algorithm flow graph

Tansuo’s architecture

Figure 5: Tansuo’s architecture

Tansuo’s architecture
• Builder: drives the entire exploration process.
• Fetcher: fetches a page from the web server.
• Parser: extracts static links and forms from a page.
• Form Handler:
  • Obtains values for form parameters.
  • Fills forms with combinations.
  • Obtains URLs from form submissions.
• Fireeye: generates pairwise input combinations.
• State Manager:
  • Resets the database.
  • Re-exercises the path from the starting page to the current page.
• Viewer: displays the page our approach is currently working on.
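How these components might cooperate can be sketched as a breadth-first exploration loop. This is an assumption-laden illustration, not the algorithm of Figure 4: the function `explore` and its callback parameters are hypothetical, the Parser and Form Handler roles are folded into one `extract_targets` callback, and state restoration (resetting the database and replaying the path) is omitted entirely.

```python
from collections import deque

def explore(start_url, fetch, extract_targets, abstract):
    """Breadth-first exploration that merges concrete URLs into abstract nodes."""
    nodes = {abstract(start_url)}
    edges = set()
    visited = {start_url}          # concrete URLs already fetched or queued
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        page = fetch(url)                      # Fetcher role
        for target in extract_targets(page):   # Parser and Form Handler roles
            edges.add((abstract(url), abstract(target)))
            nodes.add(abstract(target))
            if target not in visited:
                visited.add(target)
                queue.append(target)
    return nodes, edges

# Hypothetical in-memory "site": concrete URL -> URLs found on that page.
site = {
    "/home": ["/search?q=a", "/search?q=b"],
    "/search?q=a": ["/home"],
    "/search?q=b": ["/home", "/item?id=1"],
    "/item?id=1": [],
}
nodes, edges = explore(
    "/home",
    fetch=lambda url: site[url],
    extract_targets=lambda page: page,
    abstract=lambda url: url.split("?")[0],
)
```

In this toy site the two concrete search URLs collapse into one node, yet the item page is still discovered because each concrete URL is fetched once. This mirrors the idea that abstraction bounds the graph size while input combinations drive discovery.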

Exploration demo

Figure 6: Exploration demo of Tansuo

Features of Tansuo
• Define exploration scope.
  • Define keywords for the exploration scope.
  • Example:
    • Navigation structures for ordinary users.
    • Navigation structures for administrators.
• Semi-automated/automated exploration
  • GUI interaction.
  • Predefined files.
• Extract option values
  • Values of select menus
  • Values of check boxes
  • Values of radio buttons
  • Default values of text fields


Experiment design
• Environment:
  • Hardware: CPU: 1.66 GHz, RAM: 2 GB, Hard disk: 80 GB.
  • Software: Windows XP SP2, Resin 2.1.8 web server, Apache 2.0.48, MySQL Server 4.1.
• Subject applications:
  • Source: www.gotocode.com
  • Five JSP web applications are used, because the Clover tool processes only JSP web applications.
  • Source code statistics of the subject applications are obtained with Clover.

Application statistics

Subject Application   NLOC    Classes   Methods   Branches
Bookstore             18385   27        925       4392
BugTrack              8094    13        438       1946
Classifieds           11599   18        618       2730
Links                 8849    13        499       2074
Portal                17621   27        915       4084

Table 1: Source code statistics of subject applications

Subject Application   Forms   Actions   Params   APA    AVP
Bookstore             18      63        66       1.05   3.35
BugTrack              8       19        27       1.42   6.15
Classifieds           11      29        27       0.93   5.07
Links                 11      24        26       1.08   5.77
Portal                19      39        95       2.44   3.40

Table 2: Form statistics of subject applications

Results: navigation graph size

Subject Application   Nodes   Edges   Conn.
Bookstore             93      484     10.17
BugTrack              43      175     7.85
Classifieds           50      313     12.53
Links                 52      259     9.72
Portal                80      652     17.77

Table 3: Size statistics of generated navigation graphs

Notes: Conn. (Connectivity): the average number of incoming and outgoing edges per node.

Results: performance & cost

Subject Application   Total Time (hours)   State Restoration Time (hours)   Memory Usage (MBytes)
Bookstore             33.4415              27.7654                          42.6328
BugTrack              0.1321               0.0641                           19.5625
Classifieds           0.2999               0.2123                           39.0078
Links                 0.1275               0.0581                           19.4570
Portal                1.2218               0.9519                           80.3554

Table 4: Time and memory usage

Notes: Bookstore contains a large number of images, which increased exploration time dramatically. For example, a search result page for Bookstore contained 20 images, whereas a search result page for Portal contained no images.
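The share of exploration time spent on state restoration follows directly from Table 4. The short computation below uses only the figures from the table; the variable names are illustrative.

```python
# (total hours, state restoration hours) copied from Table 4.
timings = {
    "Bookstore": (33.4415, 27.7654),
    "BugTrack": (0.1321, 0.0641),
    "Classifieds": (0.2999, 0.2123),
    "Links": (0.1275, 0.0581),
    "Portal": (1.2218, 0.9519),
}

# Fraction of the total time that went into restoring state.
restoration_share = {
    app: restore / total for app, (total, restore) in timings.items()
}

for app, share in restoration_share.items():
    print(f"{app}: {share:.1%} of total time spent restoring state")
```

For Bookstore the share is roughly 83%, which suggests that state restoration dominates the cost for the larger applications.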

Results: completeness

Subject Application   Manual Nodes   Manual Edges   Tansuo Nodes   Nodes %   Tansuo Edges   Edges %
Bookstore             97             596            93             95.9      484            81.2
Portal                91             836            80             87.9      652            78.0

Table 5: Completeness result statistics

Notes: Some nodes and edges are missed because some complicated scenarios are not exercised. For example, page flipping is missed because our approach, for efficiency, places just one order in the ShoppingCartRecord page.

Results: comparison

Subject Application   WebSphinx Nodes   WebSphinx Edges   LCP Nodes   LCP Edges   Tansuo Nodes   Tansuo Edges
Bookstore             11                11                11          11          93             484
BugTrack              7                 7                 7           7           43             175
Classifieds           15                16                9           9           50             313
Links                 11                12                11          11          52             259
Portal                17                22                17          22          80             652

Table 6: Comparison results

Notes: LCP: Link Checker Pro. VeriWeb is not publicly accessible.


VeriWeb [WWW 02]
• Page explosion problem
  • Solution: sets length limits on navigation paths.
  • Results:
    • Does not actually solve the page explosion problem.
    • May lose navigation structures.
• Navigation structure capture problem
  • Does not consider input combinations for forms.
  • May miss navigation structures behind forms.

WebSphinx [WWW 98]
• Page explosion problem
  • Does not consider the page explosion problem.
  • Uses concrete URLs as nodes directly.
• Navigation structure capture problem
  • Cannot handle forms.
  • Misses navigation structures behind forms.

Google’s deep-web crawl [VLDB 08]
• Page explosion problem
  • Solution: uses a content discovery strategy to pick the pages with the most information.
  • Example: a “Login” page will be discarded because it contains little information.
  • Results: loses navigation structures.
• Navigation structure capture problem
  • Solution: generates input combinations for forms in a bottom-up fashion.
  • In effect, this solution works like exhaustive testing, which may produce a huge number of test cases.
  • Results: low efficiency.

Conclusion
• Our approach is effective for generating practical navigation graphs.
  • Abstracting URLs controls navigation graph size effectively.
  • Pairwise input combinations of forms help capture most navigation structures.
• Future work:
  • Constraint support.
  • Improve the efficiency of state restoration.
  • Improve the user interface.

References
• [WWW 02] M. Benedikt, J. Freire, and P. Godefroid, “VeriWeb: Automatically Testing Dynamic Web Sites”, Proc. of 11th Int’l Conf. on WWW, 2002.
• [WWW 98] R.C. Miller and K. Bharat, “SPHINX: A Framework for Creating Personal, Site-specific Web Crawlers”, Proc. of 7th Int’l Conf. on WWW, pp. 119-130, 1998.
• [VLDB 08] J. Madhavan, D. Ko, Ł. Kot, V. Ganapathy, A. Rasmussen, and A. Halevy, “Google’s Deep-Web Crawl”, Proc. of the VLDB Endowment, 1(2): 1241-1252, 2008.

Thanks!
