IJRIT International Journal of Research in Information Technology, Volume 3, Issue 3, March 2015, Pg. 234-238

International Journal of Research in Information Technology (IJRIT) www.ijrit.com

ISSN 2001-5569

Prediction of Hard Keyword Queries

Deema Liyakath1, Deepika Ravichandran2, Divya Angappan3, Priyadharsini Murugesan4 and Mohankumar Bharathiyar5

1 UG Scholar, Department of Information Technology, Sri Ramakrishna Engineering College Coimbatore, TamilNadu, India [email protected]

2 UG Scholar, Department of Information Technology, Sri Ramakrishna Engineering College Coimbatore, TamilNadu, India [email protected]

3 UG Scholar, Department of Information Technology, Sri Ramakrishna Engineering College Coimbatore, TamilNadu, India [email protected]

4 Assistant Professor, Department of Information Technology, Sri Ramakrishna Engineering College Coimbatore, TamilNadu, India [email protected]

5 Assistant Professor, Department of Information Technology, Sri Ramakrishna Engineering College Coimbatore, TamilNadu, India [email protected]

Abstract

Keyword queries provide easy access to data in databases, but often suffer from low ranking quality. Using benchmarks, we identify the queries that are likely to have low ranking quality, so that the system can suggest alternative queries to the user for such hard queries. This paper analyzes the characteristics of hard queries and measures the degree of difficulty of a query over a database. Hard queries are predicted at three levels: attribute value, attribute and entity set. Hard queries over databases have the following properties: less specificity, attribute-level ambiguity and entity-set-level ambiguity. The expected outcome is a system that effectively predicts hard queries over structured data, improving performance and reducing the time consumed.

Keywords: Keyword Query Interfaces (KQIs), Noise Generation, Ranking Robustness Principle, Structured Robustness (SR) Score, Approximation Algorithm.

1. Introduction

Keyword query interfaces (KQIs) provide flexibility and ease of use in searching and exploring the data in a database. Because a keyword query can have many possible answers, a KQI must identify the information need behind the query and rank the answers accordingly. Databases contain entities, and entities contain attributes that take attribute values. We predict hard keyword queries based on the entity sets, attributes and attribute values in the database. One difficulty in answering a query is ambiguity. For instance, query Q1: King Kong on the IMDB database (http://www.imdb.com) does not specify whether the user is interested in movies whose title is King Kong or in movies distributed by the King Kong Cinema Industry. The KQI should recognize such queries and suggest alternatives to the user.

In this paper, we use two databases, an original and a corrupted one. Noise is generated in the original database to obtain the corrupted database. The query is run against both databases, and the answers are ranked into top-k result lists. The SR score measures the difficulty of a query based on these rankings. To estimate the SR score, an approximation algorithm is used, and the performance of the query is evaluated.

2. Related Work

In past years, hard queries have been predicted over unstructured text documents. The prediction methods for unstructured text documents fall into two groups: pre-retrieval and post-retrieval methods. Pre-retrieval methods predict the difficulty of a query without computing its results. Post-retrieval methods predict the difficulty of a query by computing its results, and fall into three categories: clarity-score based, ranking-score based and robustness based. Clarity-score based methods predict the difficulty of a query more accurately than pre-retrieval methods on text documents, but predict poorly on databases. Ranking-score based methods rank the answers into a top-k result list and use the ranking scores to predict query difficulty efficiently. Compared with the clarity-score based and ranking-score based methods, robustness based methods predict query difficulty more accurately.

3. Existing System

Today it is difficult to predict hard queries over databases, because most existing methods are applicable only to unstructured data. The existing system for databases uses two databases, an original and a corrupted one. Both contain entity sets, attributes and attribute values. Noise is generated only at the attribute value level, and the result is referred to as the corrupted database.

[Figure: blocks labeled Query, Databases, Ranking Robustness Principle, Structured Robustness Algorithm, Approximation Algorithm and Quality result]

Fig. 1 Existing System Structure


Fig. 1 shows the architecture of the existing system, in which the searching quality and the reliability rate are low. To overcome these drawbacks, we perform noise generation at three levels.

4. Proposed System

We propose a novel framework to measure the degree of difficulty of a keyword query over a database. Hard queries are predicted at three levels: attribute value, attribute and entity set. The properties of hard queries over databases are less specificity, attribute-level ambiguity and entity-set-level ambiguity.

Fig. 2 Proposed System Structure
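The three properties listed for the proposed system can be illustrated with a small Python sketch. It assumes a toy in-memory database (a list of records, each with an `entity_set` name and a dictionary of tokenized attribute values) and a hypothetical helper `ambiguity_profile`; neither comes from the paper. The reading used here is also an assumption: a query is less specific when it matches many entities, and more ambiguous when its terms appear in several attributes or several entity sets.

```python
def ambiguity_profile(query_tokens, db):
    """Count the entities, attributes and entity sets that a query touches.

    db is a list of records shaped like
    {"entity_set": str, "attributes": {attribute_name: [tokens]}}.
    Larger counts suggest a harder (less specific, more ambiguous) query.
    """
    matching_entities = 0
    attributes = set()
    entity_sets = set()
    for record in db:
        hit = False
        for attr, tokens in record["attributes"].items():
            if any(word in tokens for word in query_tokens):
                hit = True
                # Track the attribute together with its entity set.
                attributes.add((record["entity_set"], attr))
        if hit:
            matching_entities += 1
            entity_sets.add(record["entity_set"])
    return {
        "entities": matching_entities,
        "attributes": len(attributes),
        "entity_sets": len(entity_sets),
    }


# Toy data modeled on the King Kong example from the introduction.
toy_db = [
    {"entity_set": "movie",
     "attributes": {"title": ["king", "kong"], "distributor": ["universal"]}},
    {"entity_set": "company",
     "attributes": {"name": ["king", "kong", "cinema"]}},
]
profile = ambiguity_profile(["king", "kong"], toy_db)
```

On this toy data the query "king kong" touches two entity sets and two attributes, while a query such as "universal" touches only one of each, which matches the intuition that the former is harder to answer.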


4.1 Noise Generation in Databases

Noise is generated at three levels: attribute values, attributes and entity sets. Besides changing attribute values, the noise can change the attribute or the entity set that an attribute value belongs to in the corrupted database. The query result is then obtained from both the original and the corrupted database.
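A minimal Python sketch of three-level noise generation follows. It is an illustration under stated assumptions, not the authors' procedure: the record layout, the function name `corrupt_database`, the single `rate` parameter shared by all three levels, and the uniform random choices are all hypothetical.

```python
import random


def corrupt_database(db, vocabulary, rate=0.3, seed=0):
    """Return a noisy copy of db; the original is left untouched.

    db is a list of records shaped like
    {"entity_set": str, "attributes": {attribute_name: [tokens]}}.
    """
    rng = random.Random(seed)
    entity_sets = sorted({record["entity_set"] for record in db})
    corrupted = []
    for record in db:
        attr_names = sorted(record["attributes"])
        new_attrs = {name: [] for name in attr_names}
        for name in attr_names:
            tokens = []
            for token in record["attributes"][name]:
                # Attribute value level: replace a term with a random one.
                if rng.random() < rate:
                    token = rng.choice(vocabulary)
                tokens.append(token)
            # Attribute level: move the value to another attribute.
            target = name
            if len(attr_names) > 1 and rng.random() < rate:
                target = rng.choice([a for a in attr_names if a != name])
            new_attrs[target].extend(tokens)
        # Entity set level: reassign the record to another entity set.
        entity_set = record["entity_set"]
        if len(entity_sets) > 1 and rng.random() < rate:
            entity_set = rng.choice([e for e in entity_sets if e != entity_set])
        corrupted.append({"entity_set": entity_set, "attributes": new_attrs})
    return corrupted


original_db = [
    {"entity_set": "movie",
     "attributes": {"title": ["king", "kong"], "distributor": ["universal"]}},
    {"entity_set": "company",
     "attributes": {"name": ["king", "kong", "cinema"]}},
]
corrupted_db = corrupt_database(original_db, vocabulary=["noise"], rate=1.0)
```

With `rate=0.0` the copy equals the original, and with `rate=1.0` every term, attribute assignment and entity set that can change does change, so the parameter spans the range from no noise to full corruption.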

4.2 Ranking in Original and Corrupted Database

After the query results are obtained from the original and corrupted databases, a cosine similarity value is computed for each result, and the results are ranked into a top-k result list. PRMS (Probabilistic Retrieval Model for Semistructured Data) is used as the ranking method, and the Spearman rank correlation is used to compare the two rankings. Cosine similarity is computed using the dot product as follows:

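$$\cos(\theta) = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert} \qquad (1)$$

where $A$ and $B$ denote the two term vectors being compared.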
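The cosine similarity and top-k ranking described in this section can be sketched in Python as follows. The bag-of-words term vectors, the function names `cosine_similarity` and `top_k`, and the tie-breaking by result identifier are assumptions made for illustration; the paper does not specify them.

```python
import math
from collections import Counter


def cosine_similarity(a_tokens, b_tokens):
    """Cosine of the angle between two bag-of-words term vectors."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[term] * b[term] for term in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    # An empty vector has no direction; treat its similarity as zero.
    return dot / norm if norm else 0.0


def top_k(query_tokens, results, k):
    """Return the ids of the k results most similar to the query."""
    scored = [(cosine_similarity(query_tokens, tokens), result_id)
              for result_id, tokens in results.items()]
    # Highest score first; ties are broken by id so the order is stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [result_id for _, result_id in scored[:k]]


candidates = {
    "m1": ["king", "kong"],
    "m2": ["king", "arthur"],
    "m3": ["star", "wars"],
}
ranking = top_k(["king", "kong"], candidates, k=2)
```

Running the same `top_k` call against the original and the corrupted database yields the two top-k lists whose agreement is later measured.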

4.3 Structured Robustness Algorithm

Structured Robustness (SR) score measures the difficulty of a query based on the differences between the rankings of the same query over the original and noisy (corrupted) versions of the same database.

Algorithm 1 CorruptTopResults(Q, L, M, I, N)
Input: Query Q, top-K result list L of Q by ranking function g, metadata M, inverted indexes I, number of corruption iterations N.
Output: SR score for Q.

SR ← 0; C ← {}; // C caches λT, λS for keywords in Q
FOR i = 1 → N DO
    I′ ← I; M′ ← M; L′ ← ∅; // Corrupted copies of I and M; L′ starts empty
    FOR each result R in L DO
        FOR each attribute value A in R DO
            A′ ← A; // Corrupted version of A
            FOR each keyword w in Q DO
                Compute # of w in A′ by Equation 10; // If λT,w, λS,w are needed but not in C, calculate and cache them
                IF # of w varies in A′ and A THEN
                    Update A′, M′ and entry of w in I′;
            Add A′ to R′;
        Add R′ to L′;
    Rank L′ using g, which returns L′, based on I′, M′;
    SR += Sim(L, L′); // Sim computes Spearman correlation
RETURN SR ← SR / N; // AVG score over N rounds

Algorithm 1: Structured Robustness Algorithm
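The core loop of Algorithm 1 (corrupt, re-rank, compare, average) can be sketched in Python. This is a simplified illustration, not the algorithm itself: it omits the metadata M, the inverted indexes I and the cache C, it takes the ranking function and the noise generator as caller-supplied parameters, and it uses the tie-free form of the Spearman rank correlation. The names `spearman`, `sr_score`, `rank_fn` and `corrupt_fn` are hypothetical.

```python
def spearman(ranking_a, ranking_b):
    """Spearman rank correlation of two orderings of the same items (no ties)."""
    n = len(ranking_a)
    if n < 2:
        return 1.0
    position_b = {item: i for i, item in enumerate(ranking_b)}
    d_squared = sum((i - position_b[item]) ** 2
                    for i, item in enumerate(ranking_a))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def sr_score(query, top_results, rank_fn, corrupt_fn, rounds):
    """Average rank correlation between the original and corrupted rankings.

    rank_fn(query, results) returns result ids, best first.
    corrupt_fn(results, round_index) returns a noisy copy with the same ids.
    """
    original_ranking = rank_fn(query, top_results)
    total = 0.0
    for i in range(rounds):
        noisy_results = corrupt_fn(top_results, i)
        total += spearman(original_ranking, rank_fn(query, noisy_results))
    return total / rounds


def count_rank(query, results):
    # Toy ranking function: order by number of query-term occurrences.
    return sorted(results,
                  key=lambda r: (-sum(results[r].count(w) for w in query), r))


top_results = {"r1": ["king", "kong"], "r2": ["king"], "r3": ["queen"]}
stable = sr_score(["king", "kong"], top_results, count_rank,
                  lambda results, i: results, rounds=5)
```

A score near 1 means the ranking survives corruption, which suggests an easy query; a low score means the ranking is fragile, which suggests a hard query.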

4.4 Approximation Algorithms

Approximation algorithms are used to estimate the SR score and to evaluate the performance of the query. The two techniques used are Query-specific Attribute values Only Approximation (QAO-Approx) and Static Global Stats Approximation (SGS-Approx).


5. Conclusions

The existing work analyzes the characteristics of hard queries and proposes a novel framework to measure the degree of difficulty of a keyword query over a database, considering the structure and content of the database and the query results. However, that system leaves a number of issues to address: its searching quality and its reliability rate are low. To overcome these drawbacks, we perform noise generation at three levels in the database: attribute, attribute value and entity set. The proposed system improves the reliability of difficult-query prediction. In other words, noise generation at all three levels supports more efficient results. From the experimental results, we find that the proposed system is more effective than the existing system in terms of accuracy and quality of results.

