Natural Language Processing Graded A+ Questions and Answers 2024

  • October 31, 2024

Logistic Regression - answer A model that applies the sigmoid function to a weighted sum
of the independent variables to estimate the probability of a binary dependent variable.
INPUT x^(i) = column vector, e.g. [[1],[8],[11]]
PARAMETERS θ (learned weights; the learning rate alpha is a hyperparameter)
PREDICTION h(x^(i), θ) = sigmoid(θ.T x^(i))
LABEL y, e.g. positive sentiment = 1, negative sentiment = 0
PREDICTED LABEL y'
COST FUNCTION TO MINIMIZE L(y, y')
GRADIENT DESCENT UPDATE θ = θ - alpha * gradient of the cost with respect to θ
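
A minimal sketch of the update rule above in plain Python. The toy rows, the labels, the learning rate and the iteration count are illustrative assumptions, not values from the course; only the row [1, 8, 11] comes from the notes.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(theta, x):
    """h(x, theta) = sigmoid(theta . x) for one feature vector x."""
    return sigmoid(sum(t * xi for t, xi in zip(theta, x)))

def gradient_step(theta, X, y, alpha):
    """One update: theta = theta - alpha * gradient of the average log loss."""
    m = len(X)
    grad = [0.0] * len(theta)
    for x_i, y_i in zip(X, y):
        error = predict(theta, x_i) - y_i
        for j, x_ij in enumerate(x_i):
            grad[j] += error * x_ij / m
    return [t - alpha * g for t, g in zip(theta, grad)]

# Hypothetical rows: [bias, sum of positive frequencies, sum of negative frequencies]
X = [[1, 3, 5], [1, 5, 4], [1, 8, 11], [1, 9, 2]]
y = [0, 1, 0, 1]  # hypothetical labels: 1 = positive, 0 = negative

# A single step from theta = 0 already moves the weights in the expected directions.
first_step = gradient_step([0.0, 0.0, 0.0], X, y, alpha=0.01)

# Repeating the step is gradient descent; 500 iterations is an arbitrary choice.
theta = [0.0, 0.0, 0.0]
for _ in range(500):
    theta = gradient_step(theta, X, y, alpha=0.01)
```

After the first step the weight on the positive-frequency feature is slightly positive and the weight on the negative-frequency feature slightly negative, which is the direction one would expect for sentiment features.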

vocabulary - answer List of unique words in a corpus (the full set of documents)

sentiment analysis - answer an automated process of analyzing and categorizing social
media to determine the amount of positive, negative, and neutral online comments a
brand receives

From the vocabulary you can build two counts for every word:
a positive frequency (how often the word appears in positive examples)
a negative frequency (how often the word appears in negative examples)

sentiment analysis: Positive Frequency Dictionary - answer Positive examples:
"I am happy because I am learning NLP"
"I am happy"

vocab: I am happy because learning nlp sad not
pos: 3 3 2 1 1 1 0 0

Feature Extraction: Sparse Representation - answer A representation that contains a lot
of zeros.
Example: a vector of 1s and 0s, each entry marking whether the corresponding vocabulary
word occurs in the text.

CONS: The feature vector is as long as the vocabulary, which can mean long training
and prediction times.
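
A short sketch of such a 0/1 vector, assuming the eight-word example vocabulary used in these notes and a naive whitespace tokenizer (the function name `sparse_vector` is made up for illustration):

```python
vocabulary = ["i", "am", "happy", "because", "learning", "nlp", "sad", "not"]

def sparse_vector(text, vocabulary):
    """Return one 0/1 entry per vocabulary word: 1 if the word occurs in text."""
    words = set(text.lower().replace(",", " ").split())
    return [1 if word in words else 0 for word in vocabulary]

vector = sparse_vector("I am sad, I am not learning NLP", vocabulary)
# vector == [1, 1, 0, 0, 1, 1, 1, 1]
```

With a realistic vocabulary of tens of thousands of words, almost every entry would be 0, which is where the cost in training and prediction time comes from.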

sentiment analysis: Negative Frequency Dictionary - answer Negative examples:
"I am sad, I am not learning NLP"
"I am sad"

vocab: I am happy because learning nlp sad not
neg: 3 3 0 0 1 1 2 1
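
The two frequency dictionaries can be reproduced with a small counting sketch. The helper names `tokenize` and `build_freqs` are hypothetical, and the tokenizer is deliberately naive (lowercase, drop commas, split on spaces):

```python
from collections import Counter

def tokenize(text):
    """Lowercase, drop commas and split on whitespace."""
    return text.lower().replace(",", " ").split()

def build_freqs(tweets, labels):
    """Map (word, label) -> number of times word appears in tweets with that label."""
    freqs = Counter()
    for tweet, label in zip(tweets, labels):
        for word in tokenize(tweet):
            freqs[(word, label)] += 1
    return freqs

tweets = [
    "I am happy because I am learning NLP",
    "I am happy",
    "I am sad, I am not learning NLP",
    "I am sad",
]
labels = [1, 1, 0, 0]  # 1 = positive, 0 = negative

freqs = build_freqs(tweets, labels)
vocabulary = ["i", "am", "happy", "because", "learning", "nlp", "sad", "not"]
pos = [freqs[(w, 1)] for w in vocabulary]  # [3, 3, 2, 1, 1, 1, 0, 0]
neg = [freqs[(w, 0)] for w in vocabulary]  # [3, 3, 0, 0, 1, 1, 2, 1]
```

A `Counter` returns 0 for pairs it has never seen, so words such as "sad" get a positive frequency of 0 without special handling.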

Feature Extraction: Frequencies of Words - answer Each sample m becomes a vector of three
values, where freqs(w, class) is how often word w appears in examples of that sentiment class:
Xm = [ 1 (bias) , Σw freqs(w,1) pos., Σw freqs(w,0) neg. ]
Each sum runs over the unique words w of sample m.

vocab: I am happy because learning nlp sad not
pos: 3 3 2 1 1 1 0 0
neg: 3 3 0 0 1 1 2 1

Sample m = "I am sad, I am not learning NLP"
Σw freqs(w,1) pos. = I:3 + am:3 + sad:0 + not:0 + learning:1 + NLP:1 = 8
Σw freqs(w,0) neg. = I:3 + am:3 + sad:2 + not:1 + learning:1 + NLP:1 = 11

X has shape (m, 3), one row per sample:

X1 = [ 1 , 3, 5]
X2 = [ 1 , 5, 4]
...
Xm = [ 1 , 8, 11 ]
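
The worked sums can be checked with a few lines of Python. This is a sketch that hard-codes the frequency table from the notes; `extract_features` is an illustrative name, and each unique word is counted once, as in the worked example.

```python
freqs_pos = {"i": 3, "am": 3, "happy": 2, "because": 1,
             "learning": 1, "nlp": 1, "sad": 0, "not": 0}
freqs_neg = {"i": 3, "am": 3, "happy": 0, "because": 0,
             "learning": 1, "nlp": 1, "sad": 2, "not": 1}

def extract_features(text):
    """Return [bias, sum of positive frequencies, sum of negative frequencies]."""
    # A set keeps each word once, so "I" and "am" are not counted twice.
    words = set(text.lower().replace(",", " ").split())
    pos_sum = sum(freqs_pos.get(w, 0) for w in words)
    neg_sum = sum(freqs_neg.get(w, 0) for w in words)
    return [1, pos_sum, neg_sum]

x = extract_features("I am sad, I am not learning NLP")  # [1, 8, 11]
```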

Feature Extraction Preprocessing: Stop Words - answer Frequently used words that are
part of a sentence but add little meaning, such as conjunctions, articles and
prepositions. Punctuation is usually removed in the same preprocessing step.

Feature Extraction Preprocessing: Stemming - answer Reducing words to their stem (base
form) by stripping endings such as tense suffixes. Example:

tuning -> tun
tune -> tun
tuned -> tun

This reduces the vocabulary size.
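
To illustrate the idea only, here is a toy suffix stripper that reproduces the three examples. It is not the Porter stemmer or any library's algorithm; the suffix list and the length check are arbitrary assumptions.

```python
def toy_stem(word):
    """Strip one common suffix, keeping at least three leading letters."""
    for suffix in ("ing", "ed", "e"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

stems = [toy_stem(w) for w in ["tuning", "tune", "tuned"]]
# stems == ["tun", "tun", "tun"]
vocabulary = set(stems)  # three surface forms collapse into one entry
```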

Accuracy - answer Σi (pred^(i) == y^(i)) / m, the fraction of the m examples whose predicted label equals the true label
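
As a quick sketch, the formula translates directly into Python; the prediction and label lists below are made-up values.

```python
def accuracy(predictions, labels):
    """Fraction of examples whose predicted label equals the true label."""
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)

score = accuracy([1, 0, 1, 1], [1, 0, 0, 1])  # 3 of 4 correct -> 0.75
```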

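Sigmoid - answer h(x,θ) = 1 / (1 + e^(-θ.T x)). Plotted with θ.T x on the horizontal axis and the prediction h(x,θ) on the vertical axis; h(x,θ) = 0.5 where θ.T x = 0.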

Cost - answer Usually graphed over iterations
J(θ) = -1/m Σi [ y^(i) log(h(x^(i),θ)) + (1 - y^(i)) log(1 - h(x^(i),θ)) ]
y = 1 and h(x,θ) → 0: cost → infinity
y = 0 and h(x,θ) → 1: cost → infinity
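
A small sketch of this cost, taking already-computed predictions h as input. The probabilities below are invented to show how confident wrong predictions are penalised.

```python
import math

def cost(h_values, labels):
    """Average log loss for predictions h in (0, 1) and labels in {0, 1}."""
    m = len(labels)
    total = 0.0
    for h, y in zip(h_values, labels):
        # math.log(0) raises ValueError, the numeric face of "cost -> infinity".
        total += y * math.log(h) + (1 - y) * math.log(1 - h)
    return -total / m

good = cost([0.9, 0.1], [1, 0])  # confident and right: about 0.105
bad = cost([0.1, 0.9], [1, 0])   # confident and wrong: about 2.303
```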

Probabilities - answer the likelihood that something will happen

Conditional Probability - answer the probability that one event happens given that
another event is already known to have happened.

Only the outcomes in which the given event occurred are considered.

P(X|Y) = P(X ∩ Y) / P(Y)

Bayes Rule - answer P(positive|word) P(word) = P(positive ∩ word)
P(word|positive) P(positive) = P(positive ∩ word)
so P(word|positive) P(positive) = P(positive|word) P(word)

P(X|Y) = P(Y|X) P(X) / P(Y)
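
A numeric check of the rule with exact fractions. The counts are hypothetical: 20 tweets, 13 of them positive, and the word "happy" in 4 tweets, 3 of which are positive.

```python
from fractions import Fraction

total, n_pos, n_happy, n_pos_and_happy = 20, 13, 4, 3  # hypothetical counts

p_pos = Fraction(n_pos, total)                        # P(positive)
p_happy = Fraction(n_happy, total)                    # P(word)
p_happy_given_pos = Fraction(n_pos_and_happy, n_pos)  # P(word | positive)

# Bayes rule: P(positive | word) = P(word | positive) P(positive) / P(word)
p_pos_given_happy = p_happy_given_pos * p_pos / p_happy

# Counting directly gives the same answer: 3 of the 4 "happy" tweets are positive.
direct = Fraction(n_pos_and_happy, n_happy)
```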

Naive Bayes Classifier for Sentiment Analysis - answer predicts the probability of a
certain outcome based on prior occurrences of related events

Assumes that the variables (here, the words) are independent

For each word in the vocabulary you can create a
positive frequency list
negative frequency list

V = number of unique words in the vocabulary
N_pos = Σw freqs(w,1), the total word count in positive examples
N_neg = Σw freqs(w,0), the total word count in negative examples

For each word, calculate a new table
P(word | pos) = freqs(word,1) / N_pos
P(word | neg) = freqs(word,0) / N_neg

Σw P(w | pos) = 1
Σw P(w | neg) = 1

Words where P(w | pos) = P(w | neg) are neutral and don't add to sentiment

Avoid P(w | pos) = 0 or P(w | neg) = 0, since a zero makes the ratio zero or undefined.
A common fix is add-one (Laplace) smoothing: (freqs(w,class) + 1) / (N_class + V)

Power words are heavily skewed
P(w | pos) >> P(w | neg) or P(w | pos) << P(w | neg)

1) Annotate tweets as pos or neg
2) Preprocess tweets
Lowercase
Remove punctuation, URLs, names
Remove stop words
Stemming
Tokenize sentences
3) Get columns freqs(w,1) pos. and freqs(w,0) neg., and their sums over all words w
4) Get columns P(w | pos), P(w | neg)
5) Get column λ(w) = log(ratio(w)) = log(P(w | pos) / P(w | neg))
6) Calculate log(prior ratio) = log(P(pos)/P(neg))
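
Steps 3 to 6 can be sketched on the eight-word frequency table used earlier in these notes. Add-one (Laplace) smoothing is assumed here as one way to avoid zero probabilities, and the tweet counts used for the prior (2 positive, 2 negative) are taken from the four example sentences.

```python
import math

vocab = ["i", "am", "happy", "because", "learning", "nlp", "sad", "not"]
pos = [3, 3, 2, 1, 1, 1, 0, 0]  # freqs(w, 1)
neg = [3, 3, 0, 0, 1, 1, 2, 1]  # freqs(w, 0)

V = len(vocab)    # number of unique words
n_pos = sum(pos)  # total word count in positive examples
n_neg = sum(neg)  # total word count in negative examples

lambdas = {}
for w, f_pos, f_neg in zip(vocab, pos, neg):
    # Step 4, with add-one smoothing (an assumption) so no probability is zero.
    p_w_pos = (f_pos + 1) / (n_pos + V)
    p_w_neg = (f_neg + 1) / (n_neg + V)
    # Step 5: lambda(w) = log of the probability ratio.
    lambdas[w] = math.log(p_w_pos / p_w_neg)

# Step 6: log prior ratio for 2 positive and 2 negative tweets.
log_prior = math.log(2 / 2)
```

Neutral words such as "i" get λ = 0, "happy" gets λ = log 3 and "sad" gets λ = -log 3, so the sign of λ(w) shows which class a word favours.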

Assumptions of Naïve Bayes - answer Words in a sentence are assumed independent.
Bad: words are often used together to describe or reference another word in a sentence
and do not stand alone, for example "sunny and hot" or "cold and snowy".

Relies on the data distribution of the training set. Good training sets have equal
frequencies of each class.

Bias can be present, for example in the sentiments of the training tweets.

Applications of Naïve Bayes - answer Author identification
P(author1 | book) / P(author2 | book)

Spam filtering
P(spam | email) / P(non-spam | email)

Information retrieval
P(document k | query) ~ Πi P(query i | document k)
Retrieve document k as relevant if
P(document k | query) > threshold

Word disambiguation
P(context 1 | ambiguous word) / P(context 2 | ambiguous word)
e.g. P("river" | "bank") / P("money" | "bank")

Ratio of Probabilities - answer ratio(w) = P(w | pos)/P(w | neg)
POSITIVE: ratio(w) > 1, growing toward ∞
NEUTRAL: ratio(w) = 1
NEGATIVE: ratio(w) < 1, shrinking toward 0

Prior Ratio - answer P(pos)/P(neg)

In a balanced data set, P(pos)/P(neg) = 1

Naïve Bayes Inference Condition Rule for Binary Classification - answer Sentence = "I
am happy today; I am learning", with words w1 ... wn

Likelihood = Πi P(wi | pos)/P(wi | neg)

Naïve Bayes inference = (prior ratio) (likelihood)
A result above 1 predicts positive; a result below 1 predicts negative
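
A sketch of the inference rule on the example sentence. The probability tables are hypothetical illustrative values, the prior ratio assumes a balanced data set, and skipping words with no table entry (such as "today") is an assumption about how unseen words are handled.

```python
# Hypothetical conditional probabilities for a few words.
p_w_pos = {"i": 4/19, "am": 4/19, "happy": 3/19, "learning": 2/19, "sad": 1/19}
p_w_neg = {"i": 4/19, "am": 4/19, "happy": 1/19, "learning": 2/19, "sad": 3/19}
prior_ratio = 1.0  # balanced data set: P(pos)/P(neg) = 1

def naive_bayes_score(text):
    """Prior ratio times the product of per-word probability ratios."""
    tokens = text.lower().replace(";", " ").split()
    score = prior_ratio
    for w in tokens:
        if w in p_w_pos:  # words without an entry, such as "today", are skipped
            score *= p_w_pos[w] / p_w_neg[w]
    return score

score = naive_bayes_score("I am happy today; I am learning")
label = 1 if score > 1 else 0  # 1 = positive, 0 = negative
```

Only "happy" has a ratio different from 1 in this table, so the score comes out at roughly 3 and the sentence is labelled positive. In practice the product is often replaced by a sum of logs to avoid numerical underflow on long texts.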

Who am I buying these notes from?

Stuvia is a marketplace, so you are not buying this document from us, but from seller julianah420. Stuvia facilitates payment to the seller.
