Summary of Statistiek IV (Statistics IV): a short, concise review

Institution: Katholieke Universiteit Leuven (KU Leuven)

The most important parts of the Statistiek IV course summarized, plus some important formulas that are not on the formula sheet. Perfect for reviewing right before your exam, so that all the key points are fresh again!

Published: 8 March 2022
Number of pages: 20
Written in: 2021/2022
Type: Summary

Summary of Statistiek IV
Chapter 2: The good old one-way ANOVA

ANOVA = analysis of variance
-> Used to make inferences about means

Analyzing data -> always start with an exploratory analysis
IOT test = interocular trauma test (the pattern in the data is so obvious that no further statistical analysis is needed)

Notation and interpretation:
- Person i in condition j
o i = 1 … mj (mj persons in condition j)
o j = 1 … a (a conditions = levels of a factor)
o Balanced (number of persons across conditions is equal) or unbalanced

Statistical inferences:
1. Models and hypotheses
Full model (systematic part, i.e. the population mean muj, plus a random deviation, i.e. noise): means can differ across conditions
Reduced model (condition means are all equal to each other); nested in the full model
a. Parameter estimation (population means = unknown)
▪ Least squares estimation (= minimizes the sum of squared differences between what is observed and what the model says it should be)
▪ Fitted value = best guess for an observation based on the model
▪ Difference between yij and the estimated muj = difference between an observation and what the model tells us = residual = eij (the bigger the residuals, the worse the model fits)
▪ Reduced model: yij = mu + eij; the least squares estimate of mu is the grand sample average
▪ Full model: yij = muj + eij; the least squares estimate of muj is the sample average of condition j
b. Sum of squares
▪ A single number is needed that expresses how large the residuals are = minimized sum of squares = error sum of squares = residual sum of squares = SSEred / SSEfull
▪ SSEfull = how much variability is left unexplained under the full model, i.e. the variability within conditions/groups: taking the differences between conditions into account does not imply that all data within a condition are exactly the same
▪ Total sum of squares = SStot = measures the total variation present in the data (deviations of the observations from the grand sample average; an index of the total variability in the sample) = SSEred = variability to be explained
▪ One-way ANOVA: SStot = SSEred
▪ SSEred >= SSEfull
▪ SSEff = SSEred – SSEfull = expresses how much the error decreases by considering the different groups (or conditions) = between-condition variability = difference between the variability to be explained and the unexplained variability = measure of explained variability (what is the effect of the full model?)
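
The three sums of squares can be illustrated with a small sketch. The data below are made up (three balanced conditions A, B, C with three persons each) and the variable names are only illustrative; the point is that SSEred uses the grand average as fitted value, SSEfull uses the condition averages, and SSEff is their difference.

```python
# Hypothetical toy data: three conditions, three persons per condition.
groups = {"A": [4, 5, 6], "B": [6, 7, 8], "C": [8, 9, 10]}


def mean(values):
    return sum(values) / len(values)


all_obs = [y for ys in groups.values() for y in ys]
grand_mean = mean(all_obs)

# Reduced model: every observation is fitted by the grand sample average.
sse_red = sum((y - grand_mean) ** 2 for y in all_obs)

# Full model: every observation is fitted by the average of its own condition.
sse_full = sum((y - mean(ys)) ** 2 for ys in groups.values() for y in ys)

# Reduction in error obtained by taking the conditions into account.
sse_eff = sse_red - sse_full

print(sse_red, sse_full, sse_eff)  # 30.0 6.0 24.0
```

In this toy example 24 of the 30 units of variability to be explained are accounted for by the differences between conditions; 6 units remain within conditions.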

▪ Problems:
  - Problem of scaling (squaring) = a sum of squares can only be interpreted relative to another one
  - The error sum of squares of the reduced model is always larger than or equal to that of the full model (the full model is more complex -> more flexible -> smaller residuals) -> if H0 is true the difference will be small, but what is small/large?
  -> degrees of freedom
c. Degrees of freedom = complexity of the models
▪ Raw residuals sum to zero (without squaring)
▪ dfred: n-1
▪ dffull: n-a
▪ General: number of observations – number of freely estimated parameters (more parameters = smaller df)
▪ dfred > dffull
d. Mean squares
▪ Sum of squares / degrees of freedom
▪ df of SSEff = a-1 (= difference between dfred and dffull)
e. Alternative model parameterization with effect parameters
▪ Effect parameters have to sum to zero
▪ alphaj = effect parameter of condition j = muj – mu (estimated by plugging in the least squares estimates)
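
A minimal sketch of the degrees of freedom, mean squares and effect parameters. The input values (SSEred = 30, SSEfull = 6, n = 9, a = 3, condition averages 5, 7, 9) are hypothetical, the function names are illustrative, and the effect-parameter part assumes a balanced design so that mu can be estimated by the plain average of the condition averages.

```python
def degrees_of_freedom(n, a):
    """Return (df_red, df_full, df_eff) for n observations and a conditions."""
    df_red = n - 1    # one estimated parameter: the grand mean
    df_full = n - a   # a estimated parameters: one mean per condition
    return df_red, df_full, df_red - df_full


def mean_squares(sse_red, sse_full, n, a):
    """Return (MS_eff, MS_full): each sum of squares divided by its df."""
    _, df_full, df_eff = degrees_of_freedom(n, a)
    return (sse_red - sse_full) / df_eff, sse_full / df_full


print(degrees_of_freedom(9, 3))        # (8, 6, 2)
print(mean_squares(30.0, 6.0, 9, 3))   # (12.0, 1.0)

# Effect parameters (balanced design assumed): deviation of each
# condition average from the average of the condition averages.
group_means = [5.0, 7.0, 9.0]
mu_hat = sum(group_means) / len(group_means)
alphas = [m - mu_hat for m in group_means]
print(alphas, sum(alphas))             # [-2.0, 0.0, 2.0] 0.0
```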
2. Choice of the test statistic
o Fit of the model to the data + complexity of the model
o Is the decrease in error sum of squares (i.e. the gain in fit) of the full model large enough to justify its increase in complexity? If the additional parameters lower the error sum of squares sufficiently, then yes
o SSEff: not scale invariant + model complexity not taken into account
o F statistic = (SSEff / (a-1)) / (SSEfull / (n-a)) = ratio of two mean squares (takes both fit and complexity into account)
o Numerator: systematic differences between conditions + sampling variability
o Denominator: sampling (i.e. random) variability
o If the systematic differences increase, the F statistic also increases
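
As a sketch of how the F statistic combines fit and complexity, the function below (an illustrative helper, not taken from the course) computes the ratio of the two mean squares for made-up data. The second data set has the same within-condition spread but condition averages that lie further apart.

```python
def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups (lists of numbers)."""
    all_obs = [y for ys in groups for y in ys]
    n, a = len(all_obs), len(groups)
    grand_mean = sum(all_obs) / n
    sse_red = sum((y - grand_mean) ** 2 for y in all_obs)
    sse_full = sum((y - sum(ys) / len(ys)) ** 2 for ys in groups for y in ys)
    ms_eff = (sse_red - sse_full) / (a - 1)  # numerator: systematic + sampling
    ms_full = sse_full / (n - a)             # denominator: sampling only
    return ms_eff / ms_full


toy = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
further_apart = [[2, 3, 4], [6, 7, 8], [10, 11, 12]]

print(f_statistic(toy))            # 12.0
print(f_statistic(further_apart))  # 48.0
```

The denominator is identical in both cases (MS_full = 1), so the larger F in the second case comes entirely from the larger systematic differences between conditions.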
3. The sampling distribution of F under H0 and what to conclude
a. Sampling distribution
▪ H0 is true: F distribution with a-1 and n-a df (= df of SSEff and dffull)
▪ P value = probability, given H0, of finding an equally or more extreme F value
▪ The p value is conditional: it is defined given H0
b. What to conclude?
▪ The smaller the p value, the more evidence against H0
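
The idea of a sampling distribution under H0 can be made concrete with a simulation. This is a sketch under stated assumptions, not the analytic F distribution itself: it draws many data sets in which all condition means are equal (standard normal noise), computes F for each, and approximates the p value as the proportion of simulated F values at least as large as an observed one. The observed value 12.0, the group sizes, the number of simulations and the seed are all hypothetical choices.

```python
import random


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups (lists of numbers)."""
    all_obs = [y for ys in groups for y in ys]
    n, a = len(all_obs), len(groups)
    grand_mean = sum(all_obs) / n
    sse_red = sum((y - grand_mean) ** 2 for y in all_obs)
    sse_full = sum((y - sum(ys) / len(ys)) ** 2 for ys in groups for y in ys)
    return ((sse_red - sse_full) / (a - 1)) / (sse_full / (n - a))


def simulate_f_under_h0(group_sizes, n_sims, seed):
    """Simulate F values when H0 is true (all condition means equal)."""
    rng = random.Random(seed)
    f_values = []
    for _ in range(n_sims):
        groups = [[rng.gauss(0.0, 1.0) for _ in range(m)] for m in group_sizes]
        f_values.append(f_statistic(groups))
    return f_values


observed_f = 12.0  # hypothetical observed F with a = 3 and n = 9
null_fs = simulate_f_under_h0([3, 3, 3], n_sims=5000, seed=1)

# Approximate p value: share of F values under H0 that are equally or more extreme.
p_value = sum(f >= observed_f for f in null_fs) / len(null_fs)
print(p_value)
```

Only a small fraction of the simulated F values reaches 12, which is what "strong evidence against H0" means here. In practice the p value is read from the F distribution with a-1 and n-a df rather than simulated.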

c. ANOVA table:
▪ Between groups = SSEff (treatment)
▪ Within groups = SSEfull (residuals, error)
▪ Total = SStot = SSEred (also: total sample variance . (n-1))
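
The rows of the ANOVA table can be assembled as follows. The data are made up and the layout (source, sum of squares, df, mean square, F) is the conventional one; the function name is illustrative.

```python
def anova_table(groups):
    """Return the rows of a one-way ANOVA table as tuples
    (source, sum of squares, df, mean square, F)."""
    all_obs = [y for ys in groups for y in ys]
    n, a = len(all_obs), len(groups)
    grand_mean = sum(all_obs) / n
    ss_tot = sum((y - grand_mean) ** 2 for y in all_obs)          # = SSEred
    sse_full = sum((y - sum(ys) / len(ys)) ** 2 for ys in groups for y in ys)
    sse_eff = ss_tot - sse_full
    ms_eff, ms_full = sse_eff / (a - 1), sse_full / (n - a)
    return [
        ("Between groups", sse_eff, a - 1, ms_eff, ms_eff / ms_full),
        ("Within groups", sse_full, n - a, ms_full, None),
        ("Total", ss_tot, n - 1, None, None),
    ]


toy = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]  # hypothetical data
rows = anova_table(toy)
for source, ss, df, ms, f in rows:
    ms_txt = "" if ms is None else f"{ms:.2f}"
    f_txt = "" if f is None else f"{f:.2f}"
    print(f"{source:<15}{ss:>8.2f}{df:>4}{ms_txt:>8}{f_txt:>8}")
```

The sums of squares and the degrees of freedom of the first two rows add up to those of the Total row.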
4. Determine the size of your effect
o Reporting effect sizes is crucial! A study with a very small effect but an enormous amount of data -> very small p value
o = practical significance
a. Biased estimator of the proportion of variance explained: eta^2 (ANOVA) / R^2 (regression)
▪ eta^2 = SSEff / SStot = ratio of the amount of explained variability to the variability to be explained
▪ 0 <= SSEff <= SStot -> 0 <= eta^2 <= 1
▪ BUT: biased estimator of the true proportion of variance explained (its expected value is larger than 0, even when the true proportion is 0)

b. (Approximately) unbiased estimator of the proportion of variance explained: omega^2
▪ Smaller than eta^2
▪ BUT: can become negative (-> set to zero)
▪ Preferred over eta^2
c. Remarks on effect sizes: unitless + between 0 and 1 -> what is large/small? Rules of thumb:
▪ 1% = small
▪ 6% = medium
▪ 14% = large
d. Why not use the F statistic or the p value as a measure of effect size?
▪ F depends on sample size + effect size
e. Uncertainty of effect sizes
▪ Effect sizes are statistics, so they vary from sample to sample (how much depends on the sample size)
▪ -> report a confidence interval (CI)
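
A sketch of both effect size estimators. eta^2 follows the definition above; for omega^2 the sketch uses the commonly cited one-way formula (SSEff - (a-1) . MSfull) / (SStot + MSfull), which is an assumption here since the notes do not spell it out. All input values are hypothetical.

```python
def eta_squared(sse_eff, sse_full):
    """Proportion of explained variability: SSEff / SStot."""
    return sse_eff / (sse_eff + sse_full)


def omega_squared(sse_eff, sse_full, n, a):
    """Bias-corrected proportion of variance explained (assumed formula).
    Negative estimates are set to zero."""
    ms_full = sse_full / (n - a)
    ss_tot = sse_eff + sse_full
    omega2 = (sse_eff - (a - 1) * ms_full) / (ss_tot + ms_full)
    return max(0.0, omega2)


# Hypothetical values: SSEff = 24, SSEfull = 6, n = 9, a = 3
print(eta_squared(24.0, 6.0))                     # 0.8
print(f"{omega_squared(24.0, 6.0, 9, 3):.3f}")    # 0.710

# A tiny effect relative to the noise: the raw estimate would be negative.
print(omega_squared(1.0, 20.0, 12, 3))            # 0.0
```

As the notes state, omega^2 comes out smaller than eta^2 for the same data.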

Chapter 3: Contrasts, be more specific!

- F test: the conditions differ, but which conditions? And by how much do they differ?
- Contrast = a difference in which the averages of two or more conditions are involved
o Pairwise contrast = simple difference between the averages of two conditions
o Complex contrast = difference between two elements, where one or both elements are averages of several conditions
- Contrast = linear combination of sample averages with coefficients cj that sum to zero

A single planned contrast
- Derivation of the sampling distribution of g
o Distribution of g: if yij is normally distributed, then the sample average ȳj is also normally distributed -> every linear combination of the sample averages ȳj (= contrast) is also normally distributed
▪ E(g) = gamma -> g is an unbiased estimator of gamma
▪ Var(g) = variance of a sum = sum of the variances because the terms are independent -> the variance of a sample average equals the variance of a single observation divided by the number of observations in that sample average, so Var(g) = sigma^2 . (sum over j of cj^2 / mj)
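
A sketch of a contrast estimate g and its standard error. It assumes that the unknown error variance sigma^2 is estimated by MSfull, so SE(g) = sqrt(MSfull . sum of cj^2 / mj). The condition averages, group sizes and MSfull below are hypothetical, and the function names are illustrative.

```python
import math


def contrast_estimate(coefs, group_means):
    """g = sum of cj * (sample average of condition j); cj must sum to zero."""
    if abs(sum(coefs)) > 1e-12:
        raise ValueError("contrast coefficients must sum to zero")
    return sum(c * m for c, m in zip(coefs, group_means))


def contrast_se(coefs, group_sizes, ms_full):
    """Standard error of g, with sigma^2 estimated by MS_full."""
    return math.sqrt(ms_full * sum(c ** 2 / m for c, m in zip(coefs, group_sizes)))


group_means = [5.0, 7.0, 9.0]   # hypothetical condition averages
group_sizes = [3, 3, 3]
ms_full = 1.0

pairwise = [-1, 0, 1]           # condition 3 versus condition 1
complex_c = [-0.5, -0.5, 1]     # condition 3 versus the average of 1 and 2

print(contrast_estimate(pairwise, group_means))    # 4.0
print(contrast_se(pairwise, group_sizes, ms_full))
print(contrast_estimate(complex_c, group_means))   # 3.0
print(contrast_se(complex_c, group_sizes, ms_full))
```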
- Statistical inference for gamma
o Confidence interval for gamma
o Hypothesis test for gamma (H0: gamma = C; if H0 is true, t = (g – C) / SE(g) follows a t distribution with dffull degrees of freedom)
o Effect size: Cohen's d (= difference between two means divided by the estimate of the corresponding within-group standard deviation)
▪ Around .2 = small
▪ .5 = medium
▪ .8 = large
o Street-fighting statistics: if the sample size is large enough (dffull > 30) -> the t distribution is approximately the normal distribution
▪ CI: g ± 2 . SE(g)
▪ Rough hypothesis test: compare the absolute value of the t statistic with 2 to evaluate significance (alpha = .05)
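
The rough "street-fighting" inference can be sketched as below. All numbers are hypothetical: a pairwise contrast between two conditions with averages 5 and 9, 20 persons per condition, a = 3 conditions (so dffull = 57 > 30) and MSfull = 16. The within-group standard deviation for Cohen's d is taken as sqrt(MSfull), which is an assumption of this sketch.

```python
import math


def rough_inference(g, se, sd_within, null_value=0.0):
    """Approximate CI, t statistic, rough test and Cohen's d
    for a pairwise contrast (only sensible when df_full > 30)."""
    t = (g - null_value) / se
    return {
        "t": t,
        "ci": (g - 2 * se, g + 2 * se),   # approximate 95% CI
        "significant": abs(t) > 2,        # rough test at alpha = .05
        "cohens_d": g / sd_within,
    }


ms_full = 16.0                             # hypothetical
g = 9.0 - 5.0                              # difference between two condition averages
se = math.sqrt(ms_full * (1 / 20 + 1 / 20))
result = rough_inference(g, se, sd_within=math.sqrt(ms_full))

print(round(result["t"], 3))               # 3.162
print(result["significant"])               # True
print(result["cohens_d"])                  # 1.0
```

With |t| well above 2 the contrast is significant at the .05 level, and d = 1.0 counts as a large effect by the rules of thumb above.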


Sold on the Stuvia marketplace by the seller evamariedelarbre.
