Statistics for Educational Scientists, Part 3: Data-analytic process
1 Example
Research: motivation and creativity (Amabile, 1985)
● Research question: “What is the influence of extrinsic or intrinsic motivation on creativity?”
● Experimental design
○ 47 students randomly assigned to two groups
○ Task: write a poem
○ Group 1: list of intrinsic reasons (n = 24)
○ Group 2: list of extrinsic reasons (n = 23)
○ Poems rated by 12 poets on a 40-point scale
Questionnaire given to creative writers, to rank intrinsic and extrinsic reasons for writing:
Creativity scores in two motivation groups, and their summary statistics:
Averages differ, but can we conclude that there are mean differences in the population?
2 Data analysis workflow
2.1 Preparations
What to do during preparations for data analysis?
● Is the research question clear?
● Evaluate the design of the experiment
○ In our example we have an experiment where participants are randomly assigned to
two conditions
○ Causal inference may be possible
● Check data
○ Ex.: forgetting a decimal point
○ Ex.: score higher than maximum score
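A range check like the one in the second example can be automated. The sketch below is only an illustration: the scores are made up, and the helper name `find_suspicious` and the 0 to 40 bounds as constants are assumptions, not part of the original study. A forgotten decimal point (221 instead of 22,1) usually shows up as an out-of-range value as well.

```python
# Hypothetical creativity scores on the 40-point scale; two entries are
# deliberately implausible to show what the check catches.
MIN_SCORE = 0
MAX_SCORE = 40

scores = [19.8, 24.0, 12.9, 221.0, 17.2, 45.5, 29.7]


def find_suspicious(values, low=MIN_SCORE, high=MAX_SCORE):
    """Return (index, value) pairs that fall outside the allowed range."""
    return [(i, v) for i, v in enumerate(values) if not low <= v <= high]


suspicious = find_suspicious(scores)
for index, value in suspicious:
    print(f"row {index}: {value} is outside [{MIN_SCORE}, {MAX_SCORE}]")
```

Flagged rows should be checked against the original records rather than deleted automatically.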
2.2 Exploratory data analysis
During exploratory data analysis we use descriptive statistics to …
● Become familiar with the data
● Tentatively seek answers to research questions
● Detect outliers
● Uncover interesting aspects of the data
Example - Exploratory data analysis
A histogram for each group so we can
see how data are distributed amongst
the groups
We can see that people who had to list
intrinsic motivations generally have
higher scores than people who had to
list extrinsic motivations
Boxplots give an idea about the
distribution of scores in terms of
quartiles
We can also look at symmetry,
skewness, …
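The numbers behind a boxplot can also be computed directly. The following is a minimal sketch with made-up scores; the 1.5 × IQR rule for flagging outliers is a common boxplot convention and is assumed here, and quartile rules differ slightly between software packages.

```python
import statistics

# Hypothetical test scores for one group; 5 and 30 are planted as extremes.
scores = [14, 30, 12, 17, 5, 15, 13, 18, 16]

# method="inclusive" is one of several quartile rules; other software
# may return slightly different quartiles for the same data.
q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
iqr = q3 - q1

# Assumed convention: points more than 1.5 * IQR outside the box are outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = sorted(y for y in scores if y < lower_fence or y > upper_fence)

print("min, Q1, median, Q3, max:", min(scores), q1, median, q3, max(scores))
print("mean:", round(statistics.mean(scores), 2))
print("outliers:", outliers)
```

The mean is printed separately because a boxplot displays the median, which need not equal the mean.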
A bit of practice…
Look at the boxplots. Which statement is wrong?
A. The boys’ (J) and girls’ (M) groups achieve the same
maximum score but a different minimum score
B. The boys’ group (J) average is 13, while the
girls’ group (M) average is 15
C. The girls’ boxplot (M) is symmetric, while the boys’
boxplot (J) is skewed
D. For the boys’ boxplot (J), we are dealing with an
outlier, but not for the girls’ boxplot (M)
E. No idea
B is the wrong answer because the boxplots show us the
medians, NOT the averages/means
2.3 Statistical inference
We see the differences between groups, but are these differences also observed in the population?
To check, we need to formulate hypotheses
Step-by-step guide for statistical inference
1. Formulate H0 and H1
2. Select the significance level α (optional)
3. Calculate the test statistic
4. Determine the p-value
5. Decision (optional)
Notations:
● Yij: score of person i in group j of the dependent variable
● nj: number of observations in group j
● Ȳj: sample mean in group j
We start with a research question, which we then translate into a statistical model to answer the
research question → we use this model to formulate hypotheses
1. Formulate models and hypotheses
● H0: µ1 = µ2 (restricted model)
○ Yi1 ~ iid N(µ, σ²), i = 1, …, n1
○ Yi2 ~ iid N(µ, σ²), i = 1, …, n2
○ Yij = µ + εij, εij ~ iid N(0, σ²)
● H1: µ1 ≠ µ2 (unrestricted model)
○ Yi1 ~ iid N(µ1, σ²), i = 1, …, n1
○ Yi2 ~ iid N(µ2, σ²), i = 1, …, n2
○ Yij = µj + εij, εij ~ iid N(0, σ²)
● iid = independent and identically distributed: observations are independent and come
from the same distribution
2. Test statistic: choice and value
● What do we know about the distribution of $\bar{Y}_2 - \bar{Y}_1$ across different samples?
○ Normally distributed
○ With mean value $\mu_2 - \mu_1$
○ And standard deviation $\sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
○ σ is unknown, so we estimate it using the sample variances
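● Test statistic:
$$t = \frac{(\bar{Y}_2 - \bar{Y}_1) - (\mu_2 - \mu_1)}{SE(\bar{Y}_2 - \bar{Y}_1)} = \frac{(\bar{Y}_2 - \bar{Y}_1) - 0}{SE(\bar{Y}_2 - \bar{Y}_1)}$$
○ Where $SE(\bar{Y}_2 - \bar{Y}_1) = \sqrt{\frac{(n_1 - 1)S_1^{\prime 2} + (n_2 - 1)S_2^{\prime 2}}{n_1 + n_2 - 2}} \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
○ And $S_j^{\prime 2} = \frac{1}{n_j - 1}\sum_{i=1}^{n_j}\left(Y_{ij} - \bar{Y}_j\right)^2$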
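The formulas for the sample variance, the pooled standard error and t can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function names (`sample_variance`, `pooled_t`) and the two small data sets are hypothetical, chosen so that the result is easy to check by hand.

```python
import math


def sample_mean(values):
    return sum(values) / len(values)


def sample_variance(values):
    """Sample variance S'^2 with denominator n - 1."""
    m = sample_mean(values)
    return sum((y - m) ** 2 for y in values) / (len(values) - 1)


def pooled_t(group1, group2):
    """Return (t, SE) for H0: mu1 = mu2, using the pooled standard error."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * sample_variance(group1)
                  + (n2 - 1) * sample_variance(group2)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    t = (sample_mean(group2) - sample_mean(group1)) / se
    return t, se


# Hypothetical data: both groups have variance 2.5, the means differ by 2.
group1 = [1, 2, 3, 4, 5]
group2 = [3, 4, 5, 6, 7]
t, se = pooled_t(group1, group2)
print(f"SE = {se:.3f}, t = {t:.3f}, df = {len(group1) + len(group2) - 2}")
```

With these numbers the pooled variance is 2.5 and 1/5 + 1/5 = 0.4, so SE = √(2.5 × 0.4) = 1 and t = 2.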
3. Derive sampling distribution, determine p-value and (optional) make a decision
● Given that H0 is true: t ~ t-distribution with df = n1 + n2 − 2
● Sampling distribution = the distribution of a statistic under repeated sampling
○ If we conduct an experiment many times, we get different data each time, which we can
plot in separate histograms to look at the distribution of scores
○ By replicating the experiment many times and computing the statistic each time, we get
the sampling distribution of the statistic
● Deciding whether we reject or accept the null hypothesis
○ Compare the value of the test statistic with the t-distribution, using tables or SPSS
○ Determine the rejection region: the region of the distribution where we reject the null
hypothesis
○ If the test is two-sided, there is a rejection region in both tails of the distribution
(2 × the one-tailed probability)
● Optional
○ Compare the p-value with α to decide whether the result of the test is statistically
significant
○ Make a decision to reject or accept H0
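The idea of repeated sampling in step 3 can be made concrete with a small simulation. The sketch below assumes the restricted model (both groups drawn from the same normal distribution); the values µ = 18 and σ = 5, the seed and the number of replications are arbitrary illustrative choices, and 2.014 is the two-sided 5% critical value for df = 45.

```python
import math
import random


def pooled_t(g1, g2):
    """Pooled two-sample t statistic (mean of g2 minus mean of g1)."""
    n1, n2 = len(g1), len(g2)
    m1, m2 = sum(g1) / n1, sum(g2) / n2
    ss1 = sum((y - m1) ** 2 for y in g1)  # equals (n1 - 1) * S'1^2
    ss2 = sum((y - m2) ** 2 for y in g2)  # equals (n2 - 1) * S'2^2
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    return (m2 - m1) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def simulate_t_under_h0(n1, n2, mu, sigma, replications, seed=1):
    """Repeat the experiment many times with H0 true and collect t."""
    rng = random.Random(seed)
    t_values = []
    for _ in range(replications):
        g1 = [rng.gauss(mu, sigma) for _ in range(n1)]
        g2 = [rng.gauss(mu, sigma) for _ in range(n2)]
        t_values.append(pooled_t(g1, g2))
    return t_values


t_values = simulate_t_under_h0(n1=24, n2=23, mu=18, sigma=5, replications=4000)
share_extreme = sum(abs(t) > 2.014 for t in t_values) / len(t_values)
print(f"share of |t| > 2.014 when H0 is true: {share_extreme:.3f}")
```

A histogram of `t_values` approximates the t-distribution with 45 degrees of freedom, and the printed share should be close to α = 0.05, which is the combined area of the two rejection regions.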
4. Effect size determination
● 100(1-α)% confidence interval (CI) for difference between two averages
● $CI = (\bar{Y}_2 - \bar{Y}_1) \pm t^*_{(n_1 + n_2 - 2)} \times SE(\bar{Y}_2 - \bar{Y}_1)$
● Effect size helps evaluate “practical significance”
A bit of practice…
Which t-test statistic do you obtain from the data below?
A. t = (-) 0,09
B. t = (-) 0,32
C. t = (-) 0,87
D. t = (-) 1,63
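$$t = \frac{\bar{Y}_2 - \bar{Y}_1}{\sqrt{\frac{(n_1 - 1)S_1^{\prime 2} + (n_2 - 1)S_2^{\prime 2}}{n_1 + n_2 - 2}} \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
$$t = \frac{15 - 13{,}8571}{\sqrt{\frac{(15 - 1) \times 2{,}8284^2 + (14 - 1) \times 4{,}1111^2}{15 + 14 - 2}} \times \sqrt{\frac{1}{15} + \frac{1}{14}}} = \frac{1{,}1429}{\sqrt{12{,}2857 \times 0{,}1381}} = 0{,}87$$
→ answer C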
Example - intrinsic and extrinsic motivation
Step 1: formulate models and hypotheses
Restricted model, assumptions:
● Scores follow a normal distribution with mean µ
● µ is the same in both groups
Unrestricted model, assumptions:
● µ is different in the two groups
● The standard deviation (and hence the variance) is the same in both groups
Step 2: test statistics - choice and calculation
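$$t = \frac{(\bar{Y}_2 - \bar{Y}_1) - 0}{SE(\bar{Y}_2 - \bar{Y}_1)} = \frac{\bar{Y}_2 - \bar{Y}_1}{\sqrt{\frac{(n_1 - 1)S_1^{\prime 2} + (n_2 - 1)S_2^{\prime 2}}{n_1 + n_2 - 2}} \times \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
$$= \frac{19{,}88 - 15{,}74}{\sqrt{\frac{(24 - 1) \times 4{,}44^2 + (23 - 1) \times 5{,}25^2}{24 + 23 - 2}} \times \sqrt{\frac{1}{24} + \frac{1}{23}}} = 2{,}92$$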
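The calculation above can be checked from the summary statistics alone. The sketch below is a hedged illustration: the function name `t_from_summary` and its argument names are hypothetical, and the inputs are the group means, standard deviations and sizes used in the formula above.

```python
import math


def t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Return (t, SE) for the difference mean_a - mean_b with pooled SE."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    se = math.sqrt(pooled_var) * math.sqrt(1 / n_a + 1 / n_b)
    return (mean_a - mean_b) / se, se


# Values taken from the worked example: means 19.88 and 15.74,
# standard deviations 4.44 and 5.25, group sizes 24 and 23.
t, se = t_from_summary(19.88, 4.44, 24, 15.74, 5.25, 23)
print(f"SE = {se:.2f}, t = {t:.2f}, df = {24 + 23 - 2}")
```

Swapping the two groups only changes the sign of t, which does not matter for a two-sided test.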
Step 3: derive sampling distribution and determine p-value, and (optionally) make a decision
Compare the value of the test statistic (2,92) with the t-distribution with df = 45
Because the test is two-sided, there is a rejection region in both tails of the distribution, so
the p-value is two times the probability that the t-score is larger than 2,92
SPSS: p = 0,0054
Table D: 0,005 > p > 0,0025 (one tail), BUT the test is two-sided, so 0,01 > p > 0,005
p = 0,0054 < α → we reject H0
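Without SPSS or a table, the two-sided p-value can be approximated by integrating the density of the t-distribution numerically. This is a minimal sketch, not how SPSS computes it: the helper names `t_pdf` and `two_sided_p` and the choice of Simpson's rule with 2000 steps are assumptions made for illustration.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """P(|T| > |t|), using Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    central_half = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central_half   # both tails together


p = two_sided_p(2.92, 45)
print(f"two-sided p for t = 2.92, df = 45: {p:.4f}")
```

The result should fall inside the bracket 0,005 < p < 0,01 read from Table D.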
Step 4: effect size determination
$t^*_{45}$ → $p = \frac{0{,}05}{2} = 0{,}025$
If α = 0,05, then 95% $CI = (\bar{Y}_2 - \bar{Y}_1) \pm t^*_{45} \times SE(\bar{Y}_2 - \bar{Y}_1)$
= 4,14 ± 2,014 × 1,42 → [1,2801; 6,9999]
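The interval is plain arithmetic once the difference, the critical value and the standard error are known. A short sketch, with a hypothetical helper name `confidence_interval` and the three numbers taken from the example above:

```python
def confidence_interval(diff, t_star, se):
    """Return (lower, upper) for diff +/- t_star * se."""
    margin = t_star * se
    return diff - margin, diff + margin


# Difference in means 4.14, critical value t*(45) = 2.014, SE = 1.42.
lower, upper = confidence_interval(diff=4.14, t_star=2.014, se=1.42)
print(f"95% CI: [{lower:.4f}; {upper:.4f}]")
```

The interval does not contain 0, which agrees with rejecting H0 at α = 0,05.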
Interpreting the size of a p-value
● Is there evidence of a difference?
○ 0 < p < 0,01: convincing
○ 0,01 < p < 0,05: moderate
○ 0,05 < p < 0,1: suggestive, but inconclusive
○ p > 0,1: NO
● ATTENTION
○ The p-value is NOT the probability that H0 is wrong
○ The p-value depends on the sample size n, not only on the size of the effect
○ Do not say: “the p-value is significant”
A bit of practice…
Determine the p-value using the following data. Compare with α = 0,05. What conclusion do you
draw?
A. 0,025 < p < 0,05 → reject H0
B. 0,025 < p < 0,05 → accept H0
C. 0,05 < p < 0,10 → reject H0
D. 0,05 < p < 0,10 → accept H0
E. No idea
df = 23 + 20 - 2 = 41
Table D → look at df = 40 (if the exact df is not in the table, we always use the next smaller df)
t = 1,836 lies between the critical values for tail probabilities 0,05 and 0,025 → × 2 (two-sided): 0,05 < p < 0,1
p > α, so we accept H0 (answer D)
A bit of practice…
What is the critical t-value at an α of 0,01? Use the following data:
A. t* = 2.326
B. t* = 2.403
C. t* = 2.576
D. t* = 2.678
E. No idea
df = 35 + 20 - 2 = 53 → look at df = 50 in Table D
Two-sided test, so the tail probability is p = 0,01 / 2 = 0,005 → t* = 2.678 (answer D)
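A critical value can also be found numerically instead of from Table D, by searching for the t* whose two-sided tail probability equals α. The sketch below is an illustration only: `t_pdf`, `two_sided_p` and `critical_t` are hypothetical helper names, and numerical integration plus bisection is just one simple way to do it.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_coef) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """P(|T| > |t|) via Simpson's rule on [0, |t|]; steps must be even."""
    h = abs(t) / steps
    total = t_pdf(0, df) + t_pdf(abs(t), df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    return 1 - 2 * total * h / 3


def critical_t(alpha, df, lo=0.0, hi=20.0, tol=1e-6):
    """Bisection search for t* such that P(|T| > t*) = alpha."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if two_sided_p(mid, df) > alpha:
            lo = mid  # tails still too heavy: t* lies further out
        else:
            hi = mid
    return (lo + hi) / 2


t_star_table_df = critical_t(0.01, 50)  # df used when reading Table D
t_star_exact_df = critical_t(0.01, 53)  # actual df of the exercise
print(f"t*(50) = {t_star_table_df:.3f}, t*(53) = {t_star_exact_df:.3f}")
```

Using the smaller df from the table gives a slightly larger critical value than the exact df, so the table-based decision is a little more conservative.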
2.4 Interpretation
What should be present in the interpretation of data?
● Formulate a conclusion
○ Answer the research questions
○ Use substantive terminology
● Summarize results using plots
○ If only two groups are compared, it is not necessary to show the results in a plot
○ With multiple groups, however, it is useful
● State findings’ limitations
○ Randomization: causal inference possible
○ Random sampling: assumption questionable, so strictly speaking no inference to the
population is possible
Example - Interpretation extrinsic and intrinsic motivation
“There is strong evidence that a lower creativity score for a poem is obtained after completing the
extrinsic questionnaire (M=16) compared to the intrinsic questionnaire (M=20), (t(45) = 2.9, p< 0.01,
two-sided). The estimated difference is 4 points on a 40-point scale. The 95% confidence interval
for the decrease in creativity score due to extrinsic motivation ranges from 1 to 7 points.”
A bit of practice…
You want to investigate whether boys score significantly more or less on a test than girls. Based on
SPSS, you obtain the output below. Complete the conclusion below (with α = 0,05).
“There is significant evidence that the group “BOYS” (M=12,0714) scores differently on average on
the test than the group “GIRLS” (M=15), (t(27) = 2,638, p = 0,014). The estimated difference is 2,93
points with a 95% CI of 0,65 to 5,21 points.”
A bit of practice…
A researcher conducts a t-test to compare the extent to which 30 professors and 32 teaching
assistants attach importance to guided self-study. When she wants to check the effect size, she
obtains the following 99% CI: [1,01; 8,99]. What is the value of the corresponding test statistic?
A. 2,660
B. 2,9949
C. 3,3333
D. 5
E. No idea
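$CI = (\bar{Y}_2 - \bar{Y}_1) \pm t^*_{(n_1 + n_2 - 2)} \times SE(\bar{Y}_2 - \bar{Y}_1)$ → C = 99%, $t^*_{(60)} = 2{,}66$
Center of CI = $\frac{8{,}99 + 1{,}01}{2} = 5$
Calculate the margin m: (upper - center) = (8,99 - 5) = 3,99
$t^*_{(n_1 + n_2 - 2)} \times SE(\bar{Y}_2 - \bar{Y}_1) = 3{,}99$
$SE(\bar{Y}_2 - \bar{Y}_1) = \frac{3{,}99}{2{,}66} = 1{,}5$
$t = \frac{\bar{Y}_2 - \bar{Y}_1}{SE(\bar{Y}_2 - \bar{Y}_1)} = \frac{5}{1{,}5} = 3{,}3333$ → answer C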
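The same back-calculation can be written as a small function. This is a sketch under the assumptions of the exercise (symmetric interval, H0: difference = 0); the name `t_from_ci` is hypothetical, and the critical value 2,66 is the one read from the table for df = 60.

```python
def t_from_ci(lower, upper, t_star):
    """Recover (t, SE) from a symmetric CI for a difference in means."""
    center = (lower + upper) / 2  # estimated difference
    margin = upper - center       # equals t_star * SE
    se = margin / t_star
    return center / se, se


t, se = t_from_ci(lower=1.01, upper=8.99, t_star=2.66)
print(f"SE = {se:.2f}, t = {t:.4f}")
```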
In reality, the data analysis workflow is often more complicated!
Models involve certain assumptions, which sometimes do not hold true
Statistics for Educational Scientists, Part 3: analysis of variance with one factor (ANOVA-1)
1 Example
Example - Validation of Boston Naming Test (BNT)
46 primary school children
Variables:
● Independent variable (IV): group, with children divided among 4 groups
○ Modality-specific speech and language development disorder (STOS)
○ STOS with behavioral disorders
○ STOS with generalized cognitive impairments
○ Children without STOS (control group)
● Dependent variable (DV): number of items correct (min. 0 - max. 60)
Children get 60 different pictures, and have to name the object that is portrayed on the picture
Research questions:
● Are there differences in the population means among the four groups?
● Is there a difference between children without STOS and those with STOS? (contrast
analysis)
2 Notation and introduction to one-way ANOVA
Notation:
● Yij: score of person i in group j on the dependent variable (DV)
● nj: number of observations in group j
● N: total number of observations
● a: number of groups
● Ȳj: sample mean in group j
● Ȳ: overall sample mean
Data structure → it is usually useful to look at the data first: tabulate the data
Participant-level dataset, as usually used in software (like SPSS)
There are 3 columns:
● Person identification number
● Dependent variable
● Group
This is the general data structure you get when you load a data set into software
like SPSS
For our example of Boston Naming Test,
our dependent variable is the score a
participant gets on the test (0-60)
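The three-column layout and the ANOVA notation can be tied together in a short sketch. Everything here is illustrative: the rows are made-up scores, not the BNT data, and the group labels and variable names are assumptions chosen to mirror the notation (nj, N, a, Ȳj, Ȳ).

```python
from collections import defaultdict

# One row per child: (person id, dependent variable, group).
# All values are hypothetical and only illustrate the layout.
rows = [
    (1, 50, "control"), (2, 54, "control"), (3, 52, "control"),
    (4, 30, "STOS"), (5, 42, "STOS"),
    (6, 35, "STOS + behavioral"), (7, 37, "STOS + behavioral"),
    (8, 28, "STOS + cognitive"), (9, 32, "STOS + cognitive"),
]

scores_by_group = defaultdict(list)
for person_id, score, group in rows:
    scores_by_group[group].append(score)

a = len(scores_by_group)                                  # number of groups
n = {g: len(v) for g, v in scores_by_group.items()}       # n_j per group
N = sum(n.values())                                       # total observations
group_means = {g: sum(v) / len(v) for g, v in scores_by_group.items()}
grand_mean = sum(score for _, score, _ in rows) / N       # overall mean

print(f"a = {a}, N = {N}, n_j = {n}")
print(f"group means = {group_means}, overall mean = {grand_mean}")
```

With unequal group sizes the overall mean is the mean of all N scores, which is generally not the same as the unweighted average of the group means.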
3 Exploratory data analysis
We want to get an idea about certain patterns in the data → a preliminary indication of what’s going
on with the data
We can put the data into boxplots
This is going to visualize the distribution of our dependent variable
Boxplots CANNOT tell us the mean of a distribution
Example - Boston Naming Test
We can see an outlier in the distribution of our normal group (group 1)
The second group (STOS) has a much larger variability compared to the other groups in the
experiment