Partial Information Extraction Approach to Lightweight Integration on the Web
Junxia Guo1, Prach Chaisatien1, Hao Han1,2, Tomoya Noro1, and Takehiro Tokuda1
1 Department of Computer Science, Tokyo Institute of Technology, Ookayama 2-12-1-W8-71, Meguro, Tokyo 152-8552, Japan
{guo,prach,han,noro,tokuda}@tt.cs.titech.ac.jp
2 Digital Content and Media Sciences Research Division, National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda, Tokyo 101-8430, Japan
[email protected]
Abstract. We present a partial information extraction approach to lightweight integration on the Web. Our approach allows us to extract dynamic contents created by scripts as well as static HTML contents. It has three application areas: automatic generation of Web services from Web applications, automatic integration of Web applications with Web services on desktop computers, and automatic integration of mobile phone applications with Web applications and Web services on mobile phones.
Keywords: partial information extraction, lightweight integration, desktop computers, mobile phones.
1 Introduction
The purpose of this paper is to present a partial information extraction approach to lightweight integration on the Web and to show that our approach has three important application areas in lightweight integration.
The traditional way of lightweight integration on the Web is to integrate Web services by writing a script or a program that invokes those Web services. This approach is natural, but it is not directly applicable to the integration of Web applications that do not have Web service APIs.
Some of the other existing approaches to lightweight integration on the Web are as follows. The first approach is to integrate the Web sources in a predefined library through a GUI, as in Yahoo Pipes [24]. Users do not need to write programs, but they cannot add new Web sources to the library. The second approach is to use static Web contents extracted from Web applications, as in Dapp Factory [11]. Users can use information extracted from Web applications and RSS to do lightweight integration, but they cannot extract dynamic Web contents created by a script, such as the clock part of Localtimes.info [18].
F. Daniel and F.M. Facca (Eds.): ICWE 2010 Workshops, LNCS 6385, pp. 372–383, 2010. © Springer-Verlag Berlin Heidelberg 2010
Our partial information extraction approach is a technique to extract partial contents from Web pages in one Web site according to a GUI-based definition of the partial contents of a sample Web page in that Web site. Our technique is able to extract both static contents and dynamic contents created by scripts, using a method called the hide-and-display method. Thanks to the hide-and-display method, our approach allows us to construct Web service functions from Web applications that do not have Web services. Our approach allows us to integrate both Web applications and Web services on desktop computers using descriptions. Our approach also allows us to integrate Web applications with Web services and mobile phone applications on mobile phones using descriptions.
The organization of the rest of this paper is as follows. In Section 2, we explain the details of our partial information extraction approach. Section 3 explains the methods that can generate Web services from Web applications automatically. In Section 4, we explain our method to integrate Web applications with Web services on desktop computers. In Section 5, we present the method that integrates Web applications with Web services and mobile phone applications on mobile phones. We evaluate our approach in Section 6. Finally, in Section 7, we give our conclusion.
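As a rough illustration of the idea stated in this introduction, that a part defined once on a sample page can be extracted from other pages of the same Web site, the following minimal sketch may help. It is not the hide-and-display method and it handles only static, well-formed markup. The two pages, the path string `PART_PATH`, and the function `extract_part` are hypothetical names introduced here for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical sample page and a second page with the same layout
# (both are made-up, well-formed stand-ins for pages of one Web site).
SAMPLE_PAGE = """<html><body>
  <div><h1>Tokyo</h1></div>
  <div><span>Weather</span><span>Sunny</span></div>
</body></html>"""

OTHER_PAGE = """<html><body>
  <div><h1>Osaka</h1></div>
  <div><span>Weather</span><span>Rainy</span></div>
</body></html>"""

# Assumption: the part is recorded once, on the sample page, as a
# positional path from the root element (a stand-in for a GUI-based
# definition of the partial contents).
PART_PATH = "body/div[2]/span[2]"


def extract_part(page_source: str, path: str) -> Optional[str]:
    """Return the text of the node at `path`, or None if no node matches."""
    root = ET.fromstring(page_source)  # root is the <html> element
    node = root.find(path)             # limited XPath, 1-based positions
    if node is None:
        return None
    return "".join(node.itertext()).strip()


if __name__ == "__main__":
    print(extract_part(SAMPLE_PAGE, PART_PATH))
    print(extract_part(OTHER_PAGE, PART_PATH))
```

Because the path is positional, it is reusable only across pages that share the sample page's layout. Contents generated by scripts at run time would not be present in the page source parsed here.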