Introduction to statistical analysis YOU CAN DO THIS
Course
Introduction to statistical analysis
Institution
Erasmus Universiteit Rotterdam (EUR)
Book
Research Methods for the Behavioral Sciences
This summary covers all the lectures; however, I highly advise you to practice the in-class assignments as much as possible.
With this summary plus practicing I got a 9.3.
You can do it!
Lecture 1
Statistics: the study of how we describe and make inferences from data
• An inference is a conclusion reached on the basis of evidence and reasoning
• Distinction between descriptive and inferential statistics
Types of statistics
Univariate: describing one variable at a time
Bivariate: using two variables to come to a conclusion
Multivariate: using multiple variables to come to a conclusion
Population symbol - N
Sample symbol - n
Descriptive statistics: describing the sample, not the population.
Inferential statistics: when you have measurements on the sample and want to make statements about the
population, you use inferential statistics
Units of analysis & variables
Units of analysis: the what or who is being studied
• The unit that you will be able to draw conclusions about
• Typically, all units are the same type of thing in a single data set
• E.g. individuals, families, countries, companies, etc.
Variables: a measured property of each of the units of analysis
• E.g. age, GDP, household income, annual revenue.
Levels of measurement
Nominal level
• Group classification
• No meaningful ranking possible
• Numerical coding arbitrary
• E.g. religion types
Ordinal level
• Meaningful ranking (ordering)
• Distance between categories unknown/not equal
• E.g. how often you watch TV
Interval level
• Meaningful ranking
• Distances are equal
• E.g. temperature in degrees Celsius
Ratio level
• All properties of interval
• Absolute and meaningful zero point
• If it is zero, it does not exist
• E.g. age
QUALITATIVE --> <-- QUANTITATIVE (running from the nominal level to the ratio level)
* We always need to know the level of measurement in order to know which statistical technique
we may use for the given variable
Continuous vs discrete variables
A continuous variable is measured along a continuum
A discrete variable is measured in whole units or categories.
Example:
• A person’s height - continuous
• A person’s number of children - discrete
• Number of doctors in a country - discrete
• Surface area (km²) of a country - continuous
• Average number of children per woman in a country - continuous
Measures of central tendency & measures of variability
These measures describe (univariately) the distribution of variables at different levels of measurement
The mean
• Sample mean symbol - M
• Changing any score will change the mean
• Adding or removing a score will change the mean (unless that score is already equal to the mean)
• Adding, subtracting, multiplying or dividing each score by a given value (a “constant”) causes the
mean to change accordingly
• The sum of the differences from the mean is zero
• The sum of squared differences from the mean is minimal (smaller than around any other value)
• Can only be used for interval/ratio variables
• Most useful for describing (more or less) normally distributed variables
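The properties of the mean listed above can be checked with a few lines of Python. This is only a minimal sketch: the scores are made-up example data and the helper name sum_of_squares is hypothetical, not something from the lecture.

```python
from statistics import mean

scores = [2, 4, 6, 8, 10]   # made-up interval/ratio scores
m = mean(scores)            # (2 + 4 + 6 + 8 + 10) / 5 = 6

# Property: the differences from the mean sum to zero.
sum_dev = sum(x - m for x in scores)

# Property: the sum of squared differences is smallest around the mean.
def sum_of_squares(values, center):
    """Sum of squared distances of the values from a chosen center."""
    return sum((x - center) ** 2 for x in values)

ss_mean = sum_of_squares(scores, m)        # 16 + 4 + 0 + 4 + 16 = 40
ss_other = sum_of_squares(scores, m + 1)   # 25 + 9 + 1 + 1 + 9 = 45

# Property: a constant applied to every score changes the mean accordingly.
shifted_mean = mean([x + 3 for x in scores])   # 6 + 3 = 9
scaled_mean = mean([x * 2 for x in scores])    # 6 * 2 = 12

print(m, sum_dev, ss_mean, ss_other, shifted_mean, scaled_mean)
```

Trying any other center than the mean in sum_of_squares gives a value larger than 40, which is what "minimal" means here.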
The median
• Can be used for ordinal or interval/ratio variables
• Often used for interval/ratio variables that have skewed distributions
• The median is not as sensitive to outliers as the mean
• Also called the 50th percentile
• Whenever n is an even number, the median is the mean value of the two middle cases
• To determine the median from a frequency table, we need to identify the first category that
exceeds 50% in the ‘cumulative percent’ column
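The median rules above can be tried out in Python. This is a rough sketch with made-up data; the function median_category and the frequency table are hypothetical and only follow the "first category that exceeds 50%" rule from these notes.

```python
from statistics import mean, median

odd_n = [3, 1, 7, 5, 9]
even_n = [3, 1, 7, 5, 9, 11]

median_odd = median(odd_n)     # sorted: 1 3 5 7 9 -> middle case is 5
median_even = median(even_n)   # sorted: 1 3 5 7 9 11 -> (5 + 7) / 2 = 6.0

# One extreme score moves the mean a lot, but not the median.
with_outlier = [1, 3, 5, 7, 900]
median_out = median(with_outlier)   # still 5
mean_out = mean(with_outlier)       # 916 / 5 = 183.2

# Median from a frequency table of an ordinal variable (made-up counts).
freq_table = [("never", 10), ("sometimes", 25), ("often", 40), ("always", 25)]

def median_category(table):
    """Return the first category whose cumulative percent exceeds 50%."""
    total = sum(count for _, count in table)
    cumulative = 0
    for category, count in table:
        cumulative += count
        if 100 * cumulative / total > 50:
            return category
    return None

median_cat = median_category(freq_table)   # cumulative %: 10, 35, 75 -> "often"
print(median_odd, median_even, median_out, mean_out, median_cat)
```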
The mode
• Can be used for nominal, ordinal or interval/ratio variables
• The mode is the category with the largest number of cases
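Because the mode only counts cases per category, it also works for nominal data. A small sketch in Python, using a made-up list of answers:

```python
from collections import Counter

# Made-up nominal data: religion type of seven respondents.
religions = ["Catholic", "Muslim", "None", "Catholic",
             "Protestant", "None", "Catholic"]

counts = Counter(religions)   # number of cases per category

# most_common(1) returns a list with the single (category, count) pair
# that has the largest number of cases.
mode_category, mode_count = counts.most_common(1)[0]

print(mode_category, mode_count)   # Catholic 3
```

Note that a mean or median of these categories would be meaningless, since the categories have no ranking.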
Normal/skewed distribution
Tutorial SPSS 1
Data view: one column is one variable, one row is one person
Variable view: each row is one variable, and each column is one property of that variable.
- Variable names cannot contain spaces
- Under Label, write down what you originally asked in the questionnaire or what you mean with
that variable
- In the Values box, enter the answers people could choose, e.g. “1.00 = never watched”
- In the Measure box, enter the level of measurement; “Scale” = interval/ratio
- For a nominal variable there is no mean or median
Analyzing Data
- Click Analyze (top of the window), then Descriptive Statistics, then Frequencies, and
then choose the type of chart you would like
- All the variables on the right side are the variables that SPSS will analyze
- When reporting the mode, use the label you gave the category instead of the code number you gave it
- The median is the first category whose cumulative percent is above 50%
- Valid percent excludes the missing values; percent includes them
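To see where the percent, valid percent and cumulative percent columns come from, the frequency table can be rebuilt by hand. This is only a sketch in Python with made-up answers, not SPSS output; it assumes missing answers are stored as None and that the cumulative percent is built from the valid percent.

```python
from collections import Counter

# Hypothetical value labels, as you would enter them in the Values box.
labels = {1: "never", 2: "sometimes", 3: "often"}

# Made-up answers of 10 respondents; None marks a missing answer.
answers = [1, 2, 2, 3, None, 2, 3, 1, None, 2]

valid = [a for a in answers if a is not None]   # 8 valid cases
counts = Counter(valid)

rows = []
cumulative = 0.0
for code in sorted(counts):
    percent = 100 * counts[code] / len(answers)      # missing cases included
    valid_percent = 100 * counts[code] / len(valid)  # missing cases excluded
    cumulative += valid_percent
    rows.append((labels[code], counts[code], percent, valid_percent, cumulative))

for row in rows:
    print(row)
# ('never', 2, 20.0, 25.0, 25.0)
# ('sometimes', 4, 40.0, 50.0, 75.0)
# ('often', 2, 20.0, 25.0, 100.0)
```

Here the mode is reported as "sometimes" (the label, not code 2), and "sometimes" is also the median because it is the first category where the cumulative percent goes above 50%.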
Analyzing selected cases
- Click Data, then Select Cases, click If, choose the variable, and type in the value you want to check
exclusively. This way you analyze only males, only females, only one country, etc.
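Selecting cases is nothing more than filtering the rows on one value of a variable. A minimal sketch of the same idea in Python; the data set, the variable names and the helper select_cases are all made up for illustration:

```python
# Made-up data set: one dictionary per person (one row in Data view).
cases = [
    {"gender": "male", "country": "NL", "tv_hours": 2},
    {"gender": "female", "country": "NL", "tv_hours": 1},
    {"gender": "female", "country": "BE", "tv_hours": 3},
    {"gender": "male", "country": "BE", "tv_hours": 4},
    {"gender": "female", "country": "NL", "tv_hours": 5},
]

def select_cases(rows, variable, value):
    """Keep only the rows where the variable equals the given value."""
    return [row for row in rows if row[variable] == value]

females = select_cases(cases, "gender", "female")
dutch = select_cases(cases, "country", "NL")

print(len(females), len(dutch))   # 3 3
```

Any statistic computed afterwards (mean, median, mode) then describes only the selected group.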
Lecture 2
Measures of variability
Measures of central tendency alone do not carry enough information to adequately describe
distributions of variables; we need a second type of measure: measures of variability
Variability is also called dispersion
The range: the distance between the highest and the lowest score.
- Can be calculated for ordinal, interval, ratio
- Always reported together with the maximum and minimum score
- Is sensitive to outliers
Interquartile range (IQR)
- Based on “quartiles” that split our data into four equal groups of cases
- The IQR is the distance between Q1 and Q3
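The difference between the range and the IQR shows up as soon as there is an outlier. A small sketch in Python with made-up scores; note that there are several conventions for computing quartiles, and this example simply assumes the default ("exclusive") method of statistics.quantiles, which may differ slightly from what SPSS or the lecture uses.

```python
from statistics import quantiles

def value_range(scores):
    """Distance between the highest and the lowest score."""
    return max(scores) - min(scores)

def iqr(scores):
    """Distance between Q1 and Q3 (default method of statistics.quantiles)."""
    q1, _q2, q3 = quantiles(scores, n=4)
    return q3 - q1

with_outlier = [1, 3, 4, 5, 6, 7, 8, 9, 10, 12, 30]      # made-up scores
without_outlier = [1, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13]   # 30 replaced by 13

range_with = value_range(with_outlier)        # 30 - 1 = 29
range_without = value_range(without_outlier)  # 13 - 1 = 12

iqr_with = iqr(with_outlier)        # Q1 = 4, Q3 = 10 -> 6.0
iqr_without = iqr(without_outlier)  # Q1 = 4, Q3 = 10 -> 6.0

print(range_with, range_without, iqr_with, iqr_without)
```

The single extreme score more than doubles the range, while the IQR stays the same because it only looks at the middle half of the cases.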
The variance
The variance is based on the sum of squares (SS), the sum of the squared distances from the mean. For the calculation
of the variance, it matters whether we have sample data or population data.
How can we interpret the value of the variance?
• We don’t, but: “everything is meaningful in comparison” (i.e. when comparing variances across
groups, we can make comparative statements about more/less dispersion around the mean)
• For the purpose of interpretation, we calculate another measure of variability: the standard
deviation
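A short Python sketch of both points: variances are compared across groups, and the standard deviation is the square root of the variance, which puts it back in the original units of the variable. The two groups are made-up sample data with the same mean.

```python
from math import isclose, sqrt
from statistics import mean, stdev, variance

# Two made-up samples with the same mean (5) but different dispersion.
group_a = [4, 5, 6, 5, 5]
group_b = [1, 9, 5, 2, 8]

var_a = variance(group_a)   # SS = 2  -> 2 / (5 - 1)  = 0.5
var_b = variance(group_b)   # SS = 50 -> 50 / (5 - 1) = 12.5

# Comparative statement: group B is more dispersed around its mean.
b_more_dispersed = var_b > var_a

# Standard deviation = square root of the variance.
sd_a = stdev(group_a)
sd_b = stdev(group_b)

print(mean(group_a), mean(group_b), var_a, var_b, sd_a, sd_b)
```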
Why are there two different variance formulas for sample data / population data?
• We often use the sample variance as an ‘estimator’ for the population variance (which is typically
unknown)
• When we calculate sample variance, we therefore divide by n-1, to arrive at an unbiased
estimator of the population variance
• Note how this is particularly relevant in small samples
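The two formulas differ only in what the sum of squares is divided by. A minimal sketch in Python, assuming the standard textbook definitions (divide by N for a population, by n-1 for a sample); the scores and the helper names are made up.

```python
from statistics import mean, pvariance, variance

scores = [2, 4, 6, 8, 10]   # made-up scores, mean = 6
n = len(scores)
m = mean(scores)
ss = sum((x - m) ** 2 for x in scores)   # 16 + 4 + 0 + 4 + 16 = 40

population_var = ss / n        # treat the scores as the whole population: 8.0
sample_var = ss / (n - 1)      # treat the scores as a sample: 10.0

# The standard library makes the same distinction.
same_as_pvariance = population_var == pvariance(scores)
same_as_variance = sample_var == variance(scores)

# Dividing by n-1 instead of n multiplies the result by n / (n - 1).
factor_small = 5 / (5 - 1)        # 1.25 for n = 5
factor_large = 500 / (500 - 1)    # about 1.002 for n = 500

print(population_var, sample_var, factor_small, round(factor_large, 3))
```

With n = 5 the correction changes the variance by 25%, with n = 500 by about 0.2%, which is why the n-1 correction matters most in small samples.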
What is the difference between the definitional and the computational formula of the
variance?
• Different formulas for calculating the same thing (we use definitional formula)
• Advantage of computational formula: no need to calculate individual distances from
the mean
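Both formulas give the same sum of squares. The sketch below uses the standard textbook versions, which are assumed (not confirmed by these notes) to match the ones on the lecture slides: definitional SS = Σ(X - M)², computational SS = ΣX² - (ΣX)²/n. The scores are made up.

```python
scores = [1, 3, 4, 8]   # made-up scores
n = len(scores)

# Definitional formula: first the mean, then every distance from the mean.
m = sum(scores) / n                            # 16 / 4 = 4.0
ss_definitional = sum((x - m) ** 2 for x in scores)   # 9 + 1 + 0 + 16 = 26.0

# Computational formula: only the sum of X and the sum of X squared.
sum_x = sum(scores)                            # 16
sum_x_squared = sum(x ** 2 for x in scores)    # 1 + 9 + 16 + 64 = 90
ss_computational = sum_x_squared - sum_x ** 2 / n     # 90 - 256 / 4 = 26.0

sample_variance = ss_computational / (n - 1)   # 26 / 3

print(ss_definitional, ss_computational, sample_variance)
```

The computational version never needs the individual distances from the mean, which is handy when calculating by hand.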