Detecting Intent of Web Queries using Questions and Answers in CQA Corpus
2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology
Soungwoong Yoon Adam Jatowt Katsumi Tanaka
Graduate School of Informatics, Kyoto University
Yoshida Honmachi, Sakyo, Kyoto 606-8501 Japan
{yoon, adam, tanaka}@dl.kuis.kyoto-u.ac.jp
Abstract—Detecting intent in Web search activity is an important task for finding relevant Web information. However, extracting intents from users' queries is difficult, as users express their intent by issuing short and often ambiguous queries; yet at the same time it is a crucial factor for enhancing user satisfaction. Showing the variety of candidate intents behind a query could help users choose correct intent expressions and improve the Web search. In this paper, we propose a methodology for detecting the intent of Web queries using Community Question-Answer (CQA) information. Our assumption is that questions and their answers in a CQA corpus reflect the intents of questioners. To detect these intents, we use the semantic connections between questions and their answers. We categorize questions to find the connections of features within a question and its answers, detect intent words in answers by calculating supports of concerned CQA contents, and cluster questions and their answers by these intent words. Experimental results show that the variety of Web query intents can be found with satisfactory performance.
Index Terms—Query intent; Community Question-Answer corpus
I. INTRODUCTION
Users search for Web information following their needs, and usually their queries are explicit expressions of their search needs. We regard the information need in Web search activity as intent. However, a user query is generally not sufficient to describe intent, as it usually contains only a few terms [11]. The problem is that users may have insufficient knowledge or skills to express their intents. Users can reformulate the initial query following the search results shown to them; their knowledge span is then expanded by clues extracted from search activities.
Useful clues of intent can be detected from the query by means of a variety of background knowledge and reasoning. Data sets concerned with queries, such as click logs, are useful to generalize [1], [2] or personalize queries [13]. Other possible solutions are manipulations of the query such as query suggestion [4], expansion [5] and/or reformulation [10].
If we regard querying as asking, a Community Question-Answer (CQA) corpus is a direct representation of the asking activities that users generally perform on the Web. When compared to keyword-based searching, the intents behind the questions in a CQA corpus are clearer. In our previous research [20] we showed that questions in a CQA corpus contain useful information to find the range of different search intents behind Web queries.
In this paper, we propose a methodology for using answers as well as questions in a CQA corpus under the assumption that the answers contain valuable information for extracting Web query intent. This assumption is based on the observation that the connections between a question and its answer(s) (QA) are semantically meaningful. Suppose a QA represents a certain intent: the question is regarded as an expression of the intent, and the answer represents the target information of the intent.
Our work aims at detecting important words in a QA which clearly represent intent, called intent words. We have previously suggested a methodology to extract intent words from questions [20], but a more comprehensive approach that also considers answers is needed. Answers are usually more diverse and noisy than questions. Using semantic relations between questions and their answers, we propose a methodology to extract intent words from both the questions and the answers. Candidate intent words in answers are collected through question categorization, which shows semantic connections between features in a question and those in its answers. CQA corpus-based support of intent words in answers is calculated to find appropriate intent words in answers. Finally, QAs are clustered using a weighted K-Means clustering method for better intent representation. Extracted intent words can successfully be used to re-rank Web search results, as shown in our experiments, and to improve query modification.
II. RELATED RESEARCH
Intent finding has been an important and challenging issue in Web search. Following the query classification into basic classes [1], numerous studies were made to find user intent by using general click history [2], [14], personal profiles [13], and statistical modeling [4], [11]. External thesauri [6], [11], [15], [17] were also used for detecting the intent of Web queries. Users may change or refine intents following the knowledge accumulated during the search session while browsing results retrieved by a search engine [8], clicking a result [7], or even inputting queries [12], so discovering intent is a complex and ongoing task.
Several studies have analyzed CQA information [9], [18], [19]. Their statistical analysis showed the structure of CQA and emphasized the connections among CQA users. Semantic analysis of CQA contents has also been conducted [19]. Based on statistical analysis of user interaction data, researchers found meaningful factors and tried to estimate the effectiveness of CQA contents for enhancing Web users' satisfaction. The structure of QAs and their relationships were also analyzed as useful sources [18]. In this study, we analyze CQA contents from the novel viewpoint of intent and try to connect their characteristics with Web search activity to match a query with multiple intents. This app...