CoCQA: Co-Training Over Questions and Answers
with an Application to Predicting Question Subjectivity Orientation
Baoli Li, Emory University, csblli@gmail.com
Yandong Liu, Emory University, yliu49@emory.edu
Eugene Agichtein, Emory University, eugene@mathcs.emory.edu

Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, pages 937–946, Honolulu, October 2008. © 2008 Association for Computational Linguistics

Abstract

An increasingly popular method for finding information online is via Community Question Answering (CQA) portals such as Yahoo! Answers, Naver, and Baidu Knows. Searching the CQA archives, and ranking, filtering, and evaluating the submitted answers, requires intelligent processing of the questions and answers posed by the users. One important task is automatically detecting the question's subjectivity orientation: namely, whether a user is searching for subjective or objective information. Unfortunately, real user questions are often vague, ill-posed, or poorly stated. Furthermore, there has been little labeled training data available for real user questions. To address these problems, we present CoCQA, a co-training system that exploits the association between the questions and contributed answers for question analysis tasks. The co-training approach allows CoCQA to use the effectively unlimited amounts of unlabeled data readily available in CQA archives. In this paper we study the effectiveness of CoCQA for the question subjectivity classification task by experimenting over thousands of real users' questions.

1 Introduction

Automatic question answering (QA) has been one of the long-standing goals of natural language processing, information retrieval, and artificial intelligence research. For a natural language question we would like to respond with a specific, accurate, and complete answer that addresses the question. Although much progress has been made, automatically answering complex, opinion, and even many factual questions is still beyond the current state of the art. At the same time, the rising popularity of social media and collaborative content creation services provides a promising alternative to web search or completely automated QA. The explicit support for social interactions between participants, such as posting comments, rating content, and responding to questions and comments, makes this medium particularly amenable to question answering. Some very successful examples of Community Question Answering (CQA) sites are Yahoo! Answers¹, Naver², and Baidu Knows³. Yahoo! Answers alone has already amassed hundreds of millions of answers posted by millions of participants on thousands of topics.

¹ http://answers.yahoo.com
² http://www.naver.com
³ http://www.baidu.com

The questions posted to such CQA portals are typically complex and subjective, and rely on human interpretation to understand the corresponding information need. The questions are also usually ill-phrased, vague, and often subjective in nature. Hence, analysis of the questions (and of the corresponding user intent) in this setting is a particularly difficult task. At the same time, CQA content incorporates the relationships between questions and the corresponding answers. Because of the various incentives provided by the CQA sites, answers posted by users tend to be, at least to some degree, responsive to the question. This observation suggests investigating whether the relationship between questions and answers can be exploited to improve automated analysis of the CQA content and the user intent behind the questions posted.

To this end, we exploit the ideas of co-training, a general semi-supervised learning approach naturally applicable to cases of complementary views on a domain, for example, web page links and content (Blum and Mitchell, 1998). In our setting, we focus on the complementary views for a question, namely the text of the question and the text of the associated answers.

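
The co-training idea described above can be illustrated with a minimal sketch. This is not the CoCQA implementation: the tiny Naive Bayes classifier, the names `train_views` and `co_train`, the confidence threshold, the per-round limit, and the toy data are all assumptions made for illustration. Each example has two views (question text and concatenated answer text), one classifier is trained per view, and in each round every classifier promotes its most confident predictions on unlabeled examples into the shared labeled set.

```python
import math
from collections import Counter


class NaiveBayes:
    """Tiny multinomial Naive Bayes over whitespace tokens (add-one smoothing)."""

    def fit(self, texts, labels):
        self.labels = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {y: Counter() for y in self.labels}
        for text, y in zip(texts, labels):
            self.counts[y].update(text.lower().split())
        self.vocab = set().union(*self.counts.values())
        return self

    def predict_proba(self, text):
        n = sum(self.prior.values())
        logp = {}
        for y in self.labels:
            total = sum(self.counts[y].values()) + len(self.vocab)
            lp = math.log(self.prior[y] / n)
            for w in text.lower().split():
                if w in self.vocab:  # words never seen in training are ignored
                    lp += math.log((self.counts[y][w] + 1) / total)
            logp[y] = lp
        m = max(logp.values())
        z = sum(math.exp(v - m) for v in logp.values())
        return {y: math.exp(v - m) / z for y, v in logp.items()}


def train_views(labeled):
    """Train one classifier per view: question text and answer text."""
    ys = [y for _, _, y in labeled]
    q_clf = NaiveBayes().fit([q for q, _, _ in labeled], ys)
    a_clf = NaiveBayes().fit([a for _, a, _ in labeled], ys)
    return q_clf, a_clf


def co_train(labeled, unlabeled, rounds=5, per_round=1, threshold=0.6):
    """labeled: (question, answers, label) triples; unlabeled: (question, answers) pairs.

    rounds, per_round, and threshold are hypothetical knobs, not values from the paper.
    """
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        picked = {}
        for view, clf in enumerate(train_views(labeled)):
            scored = []
            for i, example in enumerate(pool):
                probs = clf.predict_proba(example[view])
                label = max(probs, key=probs.get)
                scored.append((probs[label], i, label))
            scored.sort(reverse=True)
            # Each view contributes only its most confident predictions.
            for conf, i, label in scored[:per_round]:
                if conf >= threshold:
                    picked.setdefault(i, label)
        if not picked:
            break
        labeled += [(pool[i][0], pool[i][1], y) for i, y in picked.items()]
        pool = [ex for i, ex in enumerate(pool) if i not in picked]
    return train_views(labeled) + (labeled,)


# Toy data, invented purely for illustration.
seed = [
    ("what is the capital of france",
     "paris is the capital according to wikipedia", "objective"),
    ("what is your favorite movie",
     "i think i love the matrix personally", "subjective"),
]
pool = [
    ("what is the boiling point of water",
     "according to wikipedia water boils at 100 degrees"),
    ("which band do you love", "i personally think the beatles"),
]
q_clf, a_clf, final = co_train(seed, pool)
for question, _, label in final:
    print(label, "|", question)
```

In this sketch the answer view can label a question whose own wording gives the question classifier nothing to work with, which is the kind of complementarity between views that co-training relies on.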
As a concrete case study of our approach we focus on one particularly important aspect of intent detection: the subjectivity orientation. We attempt to predict whether a question posted in a CQA site is subjective or objective. Objective questions are expected to be answered with reliable or authoritative information, typically published online and possibly referenced as part of the answer, whereas subjective questions seek answers containing private states, e.g., personal opinions, judgments, and experiences. If we could automatically predict the orientation of a question, we would be able to better rank or filter the answers, improve search over the archives, and more accurately identify similar questions. For example, if a question is objective, we could try to find a few highly relevant articles as references, whereas if a question is subjective, useful answers are not expected to be found in authoritative sources and tend to rank low with current question answering and CQA search techniques. Finally, learning how to identify question orientation is a crucial component of inferring user intent, a long-standing problem in web information access settings.

Figure 1: Example question (Yahoo! Answers)

In particular, we focus on the following research questions:
• Can we utilize the inherent structure of the CQA interactions and use the unlimited amounts of unlabeled data to improve classification performance, and/or reduce the amount of manual labeling required?
• Can we automatically predict question subjectivity in Community Question Answering, and which features are useful for this task in the real CQA setting?

The rest of the paper is structured as follows. We first give an overview of the community question answering setting and state more precisely the question orientation classification problem, which we use as the motivating application for our system. We then introduce our CoCQA system for semi-supervised classification of questions and answers in CQA communities (Section 3). We report the results of our experiments over thousands of real user questions in Section 4, showing the effectiveness of our approach. Finally, we review related work in Section 5, and discuss our conclusions and future work in Section 6.

2 Question Orientation in CQA

We first briefly describe the essential features of question answering communities such as Yahoo! Answers or Naver. Then, we formally state the problem addressed in this paper, and the features used for this setting.
