GENERATING QUERY FACETS USING KNOWLEDGE BASES
ABSTRACT
A query facet is a significant list of information nuggets that explains an underlying aspect of a query. Existing algorithms mine facets of a query by extracting frequent lists contained in top search results. The coverage of facets and facet items mined by such methods might be limited, because only a small number of search results are used. To solve this problem, we propose mining query facets by using knowledge bases, which contain high-quality structured data. Specifically, we first generate facets based on the properties of the entities that are contained in Freebase and correspond to the query. Second, we mine initial query facets from search results, then expand them by finding similar entities in Freebase. Experimental results show that our proposed method significantly improves the coverage of facet items over state-of-the-art algorithms.
EXISTING SYSTEM:
Query facets summarize a query in different aspects. They may help users quickly understand important aspects of the query and help them explore information. Dou et al. first introduced this problem and proposed the QDMiner algorithm. QDMiner first extracts frequent lists in top search results using predefined patterns, then weights each list and groups the lists into final facets. Similar to QDMiner, Kong and Allan developed supervised approaches, namely QF-I and QF-J, to mine query facets. Facet item candidates are extracted from frequent lists, which are obtained in a similar way to QDMiner. Two Bayesian models are then learned to estimate how likely a candidate is to be a facet item and how likely two candidates are to belong to the same facet. All the existing works are based on top search results, hence the quality of the final facets might be limited. If a word or phrase doesn’t appear in a list within the top search results, it has no opportunity to become a facet item.
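The coverage limitation described above can be illustrated with a minimal sketch. It is not QDMiner itself: the function name `mine_candidate_items`, the `min_docs` threshold, and the toy input are assumptions made for illustration, and list extraction, weighting, and grouping are reduced to counting how many documents mention each item.

```python
from collections import defaultdict

def mine_candidate_items(result_lists, min_docs=2):
    """Count, for each item, the distinct documents whose lists contain it.

    result_lists maps a document id to the lists already extracted from
    that document (the extraction patterns are outside this sketch).
    Items seen in fewer than min_docs documents are dropped.
    """
    docs_per_item = defaultdict(set)
    for doc_id, lists in result_lists.items():
        for lst in lists:
            for item in lst:
                # Normalize case and whitespace so variants are merged.
                docs_per_item[item.strip().lower()].add(doc_id)
    return {
        item: len(docs)
        for item, docs in docs_per_item.items()
        if len(docs) >= min_docs
    }

# Hypothetical lists extracted from three top search results.
top_results = {
    "doc1": [["Larry Page", "Sergey Brin"], ["Gmail", "Maps"]],
    "doc2": [["larry page", "Sergey Brin"]],
    "doc3": [["Gmail", "Maps", "Chrome"]],
}
candidates = mine_candidate_items(top_results)
```

In this toy input, "chrome" appears in only one document and is filtered out, and an item that appears in no extracted list can never become a candidate at all. This is the gap that a knowledge base is meant to fill.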
PROPOSED SYSTEM:
The contributions are two-fold:
(1) By leveraging both knowledge bases and search results, QDMKB overcomes the limitation of using only search results to generate query facets, and can thus improve the quality of facets, especially recall. Although in practice it is impossible and unnecessary to show hundreds of facet items to users, recall is still of great importance. First, for some short facets such as “founders” of query “Google” and “family members” of query “Tom Cruise,” users want the complete answer. Second, even for long facets which cover dozens of items, users may still want to explore as many suggested items as possible. Listing all these items beside the traditional “ten blue links” is distracting. Instead, we could use a “more” link to guide users who want to explore further to a separate page.
(2) Knowledge bases not only act as supplemental data sources but also bring structured information to query facets. The items of facets mined by traditional methods are isolated and lean, whereas our algorithm links some facet items to knowledge bases, which yields benefits such as (a) finding more information related to each facet item through the link structure of knowledge bases; (b) using the types or properties in knowledge bases as a potential explanation of the meaning of each facet.
We use two existing datasets that are used by QDMiner [4], namely UserQ and RandQ, to evaluate the proposed method. Experimental results show that our proposed method QDMKB significantly outperforms all state-of-the-art methods including QDMiner, QF-I, and QF-J in terms of rp-nDCG. It yields significantly higher recall of facet items.
CONCLUSIONS
Existing query facet mining algorithms, including QDMiner, QF-I, and QF-J, mainly rely on the top search results from search engines. The coverage of facets mined using such methods might be limited, because usually only a small number of results are used. We propose leveraging knowledge bases as complementary data sources. We use two methods, namely facet generation and facet expansion, to mine query facets. Facet generation directly uses properties in Freebase as candidates, while facet expansion expands the initial facets mined by QDMiner in property-based and type-based manners. Experimental results show that our approach is effective, especially for improving the recall of facet items.
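The two methods can be sketched on a toy in-memory dictionary standing in for a knowledge base. Everything here is an illustrative assumption rather than the authors' implementation: `TOY_KB`, its property names, the functions `generate_facets` and `expand_facet`, and the `min_support` threshold are hypothetical, no real Freebase API is called, and only the property-based form of expansion is shown.

```python
from collections import Counter

# Toy stand-in for a knowledge base: entity -> {property: [values]}.
# Property names and entries are illustrative, not Freebase identifiers.
TOY_KB = {
    "Google": {"founders": ["Larry Page", "Sergey Brin"]},
    "Gmail": {"developer": ["Google"], "category": ["email service"]},
    "Google Maps": {"developer": ["Google"], "category": ["web mapping"]},
    "Chrome": {"developer": ["Google"], "category": ["web browser"]},
    "Firefox": {"developer": ["Mozilla"], "category": ["web browser"]},
}

def generate_facets(query_entity, kb):
    """Facet generation: each property of the query entity is a candidate
    facet, and the property's values are the facet items."""
    return {prop: list(values) for prop, values in kb.get(query_entity, {}).items()}

def expand_facet(initial_items, kb, min_support=0.5):
    """Property-based expansion: find the (property, value) pair shared by
    the most linked items, then add other entities that also have it."""
    linked = [item for item in initial_items if item in kb]
    if not linked:
        return list(initial_items)
    counts = Counter(
        (prop, value)
        for item in linked
        for prop, values in kb[item].items()
        for value in values
    )
    best_pair, support = counts.most_common(1)[0]
    # Skip expansion when too few of the linked items share the pair.
    if support / len(linked) < min_support:
        return list(initial_items)
    prop, value = best_pair
    extra = sorted(
        entity
        for entity, props in kb.items()
        if value in props.get(prop, []) and entity not in initial_items
    )
    return list(initial_items) + extra

facets = generate_facets("Google", TOY_KB)
expanded = expand_facet(["Gmail", "Google Maps"], TOY_KB)
```

With this toy data, `facets` holds a single "founders" facet, and the initial facet of "Gmail" and "Google Maps" is expanded with "Chrome" because all three share the same hypothetical developer value. "Firefox" is left out because it does not share that value.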
REFERENCES
[1] H. Shum, “Bing dialog model: Intent, knowledge, and user interaction,” http://research.microsoft.com/en-us/um/redmond/events/fs2010/presentations/Shum_Bing_Dialog_Model_RFS_71210.pdf.
[2] J. Niccolai, “Yahoo vows death to the ‘10 blue links’,” http://www.pcworld.com/businesscenter/article/165214/yahoo_vows_death_to_the_10_blue_links.html.
[3] R. Browne, “No longer just ‘ten blue links’,” http://blogoscoped.com/archive/2007-06-28-n28.html.
[4] Z. Dou, S. Hu, Y. Luo, R. Song, and J.-R. Wen, “Finding dimensions for queries,” in Proceedings of CIKM ’11, 2011, pp. 1311–1320.
[5] W. Kong and J. Allan, “Extending faceted search to the general web,” in Proceedings of CIKM ’14, 2014, pp. 839–848.
[6] W. Kong and J. Allan, “Extracting query facets from search results,” in Proceedings of SIGIR ’13, 2013, pp. 93–102.
[7] Z. Dou, Z. Jiang, S. Hu, J. Wen, and R. Song, “Automatically mining facets for queries from their search results,” IEEE Trans. Knowl. Data Eng., vol. 28, no. 2, pp. 385–397, 2016.
[8] O. Ben-Yitzhak, N. Golbandi, N. Har’El, R. Lempel, A. Neumann, S. Ofek-Koifman, D. Sheinwald, E. Shekita, B. Sznajder, and S. Yogev, “Beyond basic faceted search,” in Proceedings of WSDM ’08, 2008, pp. 33–44.
[9] M. Diao, S. Mukherjea, N. Rajput, and K. Srivastava, “Faceted search and browsing of audio content on spoken web,” in Proceedings of CIKM ’10, 2010, pp. 1029–1038.
[10] D. Dash, J. Rao, N. Megiddo, A. Ailamaki, and G. Lohman, “Dynamic faceted search for discovery-driven analysis,” in Proceedings of CIKM ’08, 2008, pp. 3–12.