Extensive SUMMARY for Psychometrics (Radboud University Nijmegen)

Author: PsychologyRadboudUniversity

Course
Psychometrics (SOWPSB2PS26E)

Institution
Radboud Universiteit Nijmegen (RU)

Book
Psychological Testing: History, Principles, and Applications, Global Edition

Extensive summary of the lectures/literature for the course psychometrics at Radboud University Nijmegen (2020). Achieved grade was a 9.5!

Summarized whole book? No
Which chapters are summarized? Relevant information for exam (2020)
Uploaded on February 4, 2021
Number of pages 39
Written in 2019/2020
Type Summary

psychological testing
psychometrics
statistics
tests
reliability
validity
item response theory
spss

Book Title:Psychological Testing: History, Principles, and Applications, Global Edition

Author(s):Robert J. Gregory

Edition: May 2014
ISBN:9781292058801
Edition:1

Education
Psychology

Week 1 – Test construction
Measuring instrument → any method that leads to quantitative data
Test → measuring instrument that consists of several components (items, from which a total score is
determined)
Questionnaire → counts as a test when a total score is calculated (a test can also measure an attitude,
for example; it is not just about performance)
Subscales/subtests → the coherent tests that together make up a larger measuring instrument

Measuring vs. classification:
Classification is based on a division that is non-debatable
- Aries, Taurus, …
- Schizophrenic, psychotic, …

Measuring is based on a theory that can be tested (e.g. a voltmeter).
- Concepts can change in response to data
- e.g. Mathematics test → language, maths, calculation

Cycle for constructing a test:

COTAN Assessment system:

Basic rule when assessing validity and reliability:
A test is only rated good once it has been proven to be good
- A "good" for reliability means that the reliability has been properly investigated AND that the
conclusion of this investigation was that the reliability is good
- A "fail" for reliability means that the reliability was insufficiently investigated AND/OR that it
was investigated and the conclusion was that the reliability is insufficient
- The same applies, analogously, to validity

Phases in a validation study:
- In the construction of a test, a validation study has to be performed to investigate reliability
and validity

1. Preparation:
- Choice of the type of properties that will be measured (for each property, a separate
subscale of several items must be made)
- Exploring the domain with literature and interviews (using existing scales increases the
comparability of your research)

2. Formulation of the items: determining the extent to which the items appear suitable in terms of
content
- Content of the individual items (each item measures only the chosen domain, not another
domain)
- Representativeness of the collection of items (items must be a good representation of the
domain)
- Number of items (sufficient items per subscale are needed)
- Precise formulation of the items (items must not be open to multiple interpretations
and must be adapted to the language and comprehension level of the target group)
- Content and number of response categories (items that elicit the same response from
every test taker are not informative, so use about 5-7 answer categories)
- Expert judgment (ask panel of experts about the quality of items)

3. Planning first administration:
- Use of multiple assessors (interrater reliability can be determined)
- Other variables that have to be measured (background variables)
- Number of test subjects that are needed

4. First administration of the scale: data collection takes place

5. Analysis of data of individual items: pre-selection, in which the worst items are removed
- Inter-rater reliability of items (whether observers show a high degree of agreement, assessed
with Cohen's kappa)
- Variance of the items (items with small variance are less suitable because they contribute
little to differentiation, and items with variance 0 must be removed)
- Skewness and unimodality of the distribution of items (in most forms of factor analysis it is
assumed that the items are normally distributed; items that deviate strongly from symmetry
are removed)
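
The item-level checks of step 5 can be sketched in Python. This is only a minimal illustration under assumptions: the function names and all data are hypothetical, and the skewness measure used here (the third standardized moment) is one common choice that the notes themselves do not specify.

```python
from collections import Counter
from statistics import mean, pvariance


def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


def skewness(scores):
    """Third standardized moment; 0 for a symmetric distribution."""
    m, sd = mean(scores), pvariance(scores) ** 0.5
    return mean(((x - m) / sd) ** 3 for x in scores)


# Hypothetical data: two observers coding the same 10 responses (1 = present).
rater_a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
rater_b = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
kappa = cohens_kappa(rater_a, rater_b)

# Hypothetical scores of 10 persons on three items (1-5 scale).
items = {
    "item_1": [1, 2, 3, 4, 5, 3, 3, 2, 4, 3],  # symmetric, usable
    "item_2": [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],  # variance 0: must be removed
    "item_3": [1, 1, 1, 1, 1, 1, 1, 2, 1, 5],  # strongly skewed
}
zero_variance = [name for name, s in items.items() if pvariance(s) == 0]
# Skewness is undefined for zero-variance items, so those are skipped.
skew = {name: skewness(s) for name, s in items.items() if pvariance(s) > 0}
```

In this toy data the observers disagree on one of ten responses, item_2 is flagged for having no variance, and item_3 shows a clearly positive skew.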

6. Analysis of the relationships between items: whether the items measure the same thing
(homogeneity or unidimensionality)
- Correlations between the items (items of the same scale must correlate positively because
they should measure the same property)
- Factor analysis of items (examines more closely whether the items measure the same aspect;
if the items are not unidimensional, the scale may have to be split into subscales or items
removed)
- Analysis of the internal consistency reliability (examines whether the scale has enough
items to view the sum score as a reliable measurement; items with a negative contribution
to reliability are removed)
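
As an illustration of the internal consistency analysis, the sketch below computes Cronbach's alpha and "alpha if item deleted". The notes do not name a specific coefficient, so the choice of Cronbach's alpha is an assumption, and the function names and scores are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(item_scores):
    """item_scores: list of items, each a list with one score per person."""
    k = len(item_scores)
    totals = [sum(person) for person in zip(*item_scores)]
    item_var = sum(pvariance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_var / pvariance(totals))


def alpha_if_item_deleted(item_scores):
    """Alpha of the remaining scale with each item left out in turn."""
    return [
        cronbach_alpha(item_scores[:i] + item_scores[i + 1:])
        for i in range(len(item_scores))
    ]


# Hypothetical scores of 6 persons on 4 items; the last item runs
# against the other three.
scores = [
    [1, 2, 3, 4, 5, 5],
    [2, 2, 3, 4, 4, 5],
    [1, 3, 3, 5, 4, 5],
    [5, 4, 3, 3, 1, 2],
]
alpha_all = cronbach_alpha(scores)
alpha_without = alpha_if_item_deleted(scores)
# Items whose removal raises alpha contribute negatively to reliability.
negative_items = [i for i, a in enumerate(alpha_without) if a > alpha_all]
```

Here only the last item (index 3) is flagged: removing it raises alpha sharply, which is the pattern that would lead to removing an item in this step.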

7. Standardization: examine what averages, SD and percentiles are for the target groups, in
order to define a norm
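
A minimal sketch of how norms might be derived from a norm group. The data and names are hypothetical, and the percentile rank used here (percentage scoring at or below a given score) is only one of several conventions.

```python
from statistics import mean, stdev


def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring at or below `score`."""
    at_or_below = sum(s <= score for s in norm_scores)
    return 100 * at_or_below / len(norm_scores)


# Hypothetical raw scores of a norm group of 10 persons.
norm_group = [12, 15, 15, 18, 20, 21, 21, 23, 25, 30]
norm_mean = mean(norm_group)
norm_sd = stdev(norm_group)

# Interpreting a new raw score against the norm.
new_score = 23
z_score = (new_score - norm_mean) / norm_sd
rank = percentile_rank(new_score, norm_group)
```

A raw score only becomes interpretable once it is placed relative to the norm group, here as a z-score and a percentile rank.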

8. Analysis of the relationships between test scores and other variables:
- Test-retest reliability (examines how stable the scores are over time)
- Criterion validation (extent to which test can predict other variables is investigated)
- Construct validation (examines whether the test truly has the theoretically expected
relationships)
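
The relationships in step 8 are often summarized with a correlation coefficient. The sketch below uses the Pearson correlation for test-retest reliability and criterion validity; that choice is an assumption, since the notes do not say which statistic is used, and all data are hypothetical.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


# Hypothetical data for 6 persons.
test_t1 = [10, 12, 15, 18, 20, 25]   # first administration
test_t2 = [11, 12, 14, 19, 21, 24]   # same test, administered again later
criterion = [2, 3, 3, 4, 5, 6]       # e.g. a later performance rating

test_retest_r = pearson_r(test_t1, test_t2)   # stability over time
criterion_r = pearson_r(test_t1, criterion)   # predictive value
```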

Construct validity:
Theoretical interpretability
Do we understand what is measured?
Do the theoretical expectations come true?
- If there are no theories about the construct, the test cannot be construct valid! There
should be theories that can be falsified. They should be investigated and not be proven
wrong.

A dimension is just an aspect that we are using to characterize people. We want unidimensional tests
(can be determined with factor analysis and item response theory).

Unidimensionality:
- Means that the items measure the same thing. It is a component of construct validity, because if
the items have to measure the same construct, they must at least measure the same thing.
The models used to examine unidimensionality have the following assumptions:
- Unidimensionality → each person can be characterized with a single number that indicates
to what extent that person has the property that is intended to be measured. This value is
generally unknown and is therefore called the latent trait. For an intelligence test, this would
be someone's unknown true intelligence. That value is indicated by the Greek letter theta.
- Monotonicity → if the quantity that we want to measure increases, the probability of a
correct (or affirmative) answer increases with it (when ability increases, the probability of a
correct answer increases; when fear increases → more vomiting → more peeing in pants)
- Local independence → within a group of subjects with the same value of theta, the items are
not correlated. In the total population, there may be high correlations between the items.
For example: someone with low aggression should have a low score on all items used (except
for noise), and if that subject's aggression increases, this should show in all items. Another example is
a written exam. It is desirable that good students have a better chance on all questions. An exam
question on which good students score poorly – that is, a question on which one scores worse the
better one knows the subject matter – is undesirable.
Unidimensionality and monotonicity cannot be distinguished empirically. If you only assumed
unidimensionality, without monotonicity or something similar, it would not be testable (Sijtsma &
Junker, 2006, p. 86). The same applies to local independence. For this reason, the term
unidimensionality is often used for the three assumptions jointly. The word therefore has two
meanings.
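
Local independence can be made concrete with a small simulation: two items correlate in a population where theta varies, but not within a group that shares the same theta. This is a sketch under assumptions: the logistic response curve, the item difficulties and all names are illustrative choices, not taken from the notes.

```python
import math
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


def simulate_item(thetas, difficulty, rng):
    """Monotone item: P(correct) rises with theta (logistic curve, assumed)."""
    return [
        int(rng.random() < 1 / (1 + math.exp(-(t - difficulty))))
        for t in thetas
    ]


rng = random.Random(0)

# Total population: theta differs between persons.
thetas = [rng.gauss(0, 2) for _ in range(20000)]
item_a = simulate_item(thetas, -0.5, rng)
item_b = simulate_item(thetas, 0.5, rng)
r_population = pearson_r(item_a, item_b)

# Group in which everyone has the same theta: only noise is left.
same_theta = [0.0] * 20000
item_a_fixed = simulate_item(same_theta, -0.5, rng)
item_b_fixed = simulate_item(same_theta, 0.5, rng)
r_fixed_theta = pearson_r(item_a_fixed, item_b_fixed)
```

In the simulated population the two items correlate clearly positively, because both depend on theta; within the fixed-theta group the correlation is close to zero, which is what local independence states.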

What is the difference between unidimensionality and internal consistency reliability?
Unidimensionality implies that the items measure the same trait, save for noise, but it does not say
anything about the size of the noise component. Internal consistency reliability says something
about the size of the noise in the total score, but does not give an answer to the question whether
the items measure the same trait.
For example: the arithmetic items '3 + 4 = ?' and '5 + 2 = ?' are unidimensional, but their total score is
not reliable because there are only two items. Conversely, the total score of an IQ test usually has a
high reliability, but intelligence is not unidimensional, because there are various types of intelligence
(such as fluid intelligence and crystallized intelligence).

- We start from the assumption that there is a numerical quantity that can be measured but
is not directly visible (like intelligence), and that it is related to observations we can
make, namely the responses that people give to the items of the test.
- We compare the predictions of the model with the data that we gathered. Either we accept
the model for the data (the quantity exists and we can measure it with the test), or the
predictions are wrong, so there is something wrong with the model and we cannot measure
the thing we want to measure with these items.

Guttman scale:
- The items and the persons can be arranged in such a way that a person answers an item
correctly if the ability (scale value) of the person is larger than the difficulty of the item.
- According to this model: if you fail on item B, you will also fail on item C (given that item C is
more difficult than item B)
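
The deterministic Guttman rule can be sketched as follows. The item difficulties, the ability value and the function names are hypothetical, and the items are assumed to be ordered from easiest to hardest.

```python
def guttman_response(ability, difficulties):
    """Guttman rule: an item is answered correctly iff ability > difficulty."""
    return [int(ability > d) for d in difficulties]


def is_guttman_pattern(responses):
    """True if no failed item is followed by a passed (harder) item.

    Assumes the responses are ordered from the easiest to the hardest item.
    """
    return all(a >= b for a, b in zip(responses, responses[1:]))


# Hypothetical difficulties of items A, B, C, D (easiest first).
difficulties = [1.0, 2.0, 3.0, 4.0]
pattern = guttman_response(2.5, difficulties)   # passes A and B, fails C and D

consistent = is_guttman_pattern([1, 1, 0, 0])
violation = is_guttman_pattern([1, 0, 1, 0])    # fails B but passes C
```

A response pattern such as failing B while passing the harder item C cannot occur under the model, so observing it in data counts as evidence against a perfect Guttman scale.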
