Summary of chapters 1, 2, 3, 4, 5, 6, 7, 8, 10, 11, 12 and 14 of the book A Conceptual Introduction to Psychometrics, written by Gideon J. Mellenbergh. The three additional articles that have to be studied for the master course Test Construction at the University of Groningen are also summarized...
Summary A conceptual introduction to psychometrics - Test construction (PSMM-6)
Summary Test Construction - A Conceptual Introduction to Psychometrics, ISBN: 9789490947293 (PSMM-6)
Written for
Rijksuniversiteit Groningen (RuG)
Master's in Clinical Psychology
Test Construction (PSMM6)
A Conceptual Introduction to Psychometrics (Gideon J. Mellenbergh)
Week 1:
Chapter 1:
DuBois (1970) mentions three roots of modern testing:
1. Civil service examinations
2. Assessment of academic achievement
3. Study of individual differences in behavior
Test definitions:
Test= instrument for measuring a person’s maximum or typical performance under standardized conditions, where the performance is assumed to reflect one or more latent attributes.
- Test= measurement instrument
- Test= measure of performance
1. Maximum performance test
2. Typical performance test
Maximum performance test= the person is asked to do his/her best to solve one or more problems. Answers on these tests can vary in correctness.
Typical performance test= the person is asked to respond to one or more tasks, where the responses are typical for the person. Answers typify the person (personality or attitude test).
Standardized conditions= performances are measured under the same conditions (test instructions, test materials, administration procedures). The reason for standardization is that test performance must be comparable between persons and occasions.
Latent attributes= attributes that cannot be observed. It is assumed that one or more latent attributes underlie the test performance and that these latent attributes affect the test performance.
Surveys= contain questions that are answered by a respondent. It is not assumed that survey questions reflect a latent variable.
Subtest= a part of a test.
Item= smallest possible subtest of a test. Items are the building blocks of a test.
- Test consisting of one item = 1-item test.
Dimensionality= the dimensionality of a test is equal to the number of latent attributes (variables) that affect test performance.
- Test measuring one latent attribute = unidimensional test
- Test measuring more than one latent attribute = multidimensional test
Test types:
Mental test= cognitive tasks.
Physical test= instruments to make somatic or physiological measurements.
Maximum performance tests:
- Pure power test= consists of problems that the test taker tries to solve. The test taker has ample time to work on each of the test items. The emphasis is on measuring the accuracy of the solutions.
- Time-limited power test= the majority of the test takers have enough time to solve the problems.
- Speed test= consists of very easy items that can be solved by all test takers. The test taker is asked to solve the problems as quickly as possible. The emphasis lies on measuring the time taken to solve the problems.
- Ability test/aptitude test= measures a person’s best performance in an area that is not explicitly taught in training or educational programs.
- Achievement test= measures performance that is explicitly taught in training and educational programs.
Typical performance tests:
- Personality tests (questionnaires)= measure a person’s personality.
- Interest inventories= measure a person’s interests.
- Attitude questionnaires= measure a person’s attitudes.
Self-report questionnaire= the test taker and the person who is measured are the same person.
Observation test= the test taker is a person other than the one who is measured.
Part I: Test development
Chapter 2: Developing maximum performance tests
Maximum performance test= asks test takers to do the best they can on the test.
Development starts with making a plan that specifies:
1. Construct of interest
2. Measurement mode of the test
3. Objectives of the test
4. Population and subpopulation where the test should be applied
5. Conceptual framework of the test
6. Response mode of the items
7. Administration mode of the test
(1) Construct of interest;
- Latent variable= general term
- Construct= term used when the latent variable is given a substantive interpretation. Constructs vary in 3 different ways:
1. Constructs vary in content from mental abilities to psychomotor skills and physical
abilities.
2. Constructs vary in scope, from (for example) general intelligence to multiplication
skill.
3. Constructs vary from educational to psychological variables.
● Achievement tests= measure constructs that result from instruction and
teaching.
● Ability tests= measure constructs that are relatively independent of
instruction and teaching.
(2) Measurement mode; different modes can be used to measure constructs.
- Self-performance mode= test taker is asked to perform a mental or physical task. This is the most common measurement mode.
- Self-evaluation mode= test taker is asked to evaluate his/her ability to perform a task.
- Other-evaluation mode= ask others to evaluate a person’s ability to perform a task.
(3) Objectives; tests are used for many different purposes.
- Scientific use or practical purpose of tests
- Individual test takers or group level test takers
- Description, diagnosis or decision
● Description= test is used to describe performance.
● Diagnosis= adds a conclusion to the description.
● Decision-making= decisions are based on the tests.
(4) Population; target population is the set of persons to whom the test has to be applied. A target
population can be split into distinct subpopulations.
(5) Conceptual framework; the definition or description of the construct that test development starts with is usually not concrete enough to write test items from. A conceptual framework gives the item writer a handle for writing items.
(6) Item response mode;
- Free response (also called constructed response); can be further divided into short-answer and essay items.
- Choice (also called selected response)
Item response scales:
- Dichotomous= 2 ordered categories (correct/incorrect).
- Partial ordinal-polytomous= the correct option is ordered above distractors, but the
distractors are not ordered among themselves.
- Ordinal-polytomous= more than 2 ordered categories (correct/partly correct/incorrect).
Options are completely ordered.
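The difference between the three scales is in which pairs of responses can be ordered. A minimal sketch of this, not taken from the book: the function names, the category labels and the option letters are assumptions made for illustration only.

```python
from typing import Optional, Sequence

# Hypothetical category labels, ordered from lowest to highest.
DICHOTOMOUS = ["incorrect", "correct"]
ORDINAL = ["incorrect", "partly correct", "correct"]

def compare_ordinal(a: str, b: str, levels: Sequence[str]) -> int:
    """Completely ordered scale (dichotomous or ordinal-polytomous).

    Returns 1 if a ranks above b, -1 if below, 0 if equal.
    """
    rank_a, rank_b = levels.index(a), levels.index(b)
    return (rank_a > rank_b) - (rank_a < rank_b)

def compare_partial(a: str, b: str, key: str) -> Optional[int]:
    """Partial ordinal-polytomous scale (e.g. a multiple-choice item).

    The correct option (key) ranks above every distractor, but two
    different distractors have no order: None is returned for them.
    """
    if a == b:
        return 0
    if a == key:
        return 1
    if b == key:
        return -1
    return None  # two different distractors cannot be ordered
```

For example, with "B" as the correct option, compare_partial("B", "A", key="B") ranks B above A, while compare_partial("A", "C", key="B") returns None because both are distractors.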
(7) Administration mode;
- Oral administration= test is presented orally by a single test administrator to a single test
taker.
- Paper-and-pencil (P&P)= test presented in the form of a booklet. Test takers read the
instructions, answer the items and report their free responses/choices.
- Computerized test (CT)= order of items is the same for each test taker. Difference with P&P
is that the test is presented on a computer instead of a booklet.
- Computerized Adaptive Test (CAT)= test is adaptive→ computer program searches for the
items that best fit the test taker. Therefore different items are presented to different test
takers.
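The adaptive idea behind a CAT can be sketched in a few lines. The notes do not say how "best fit" is determined, so this sketch makes two assumptions: every item has a known difficulty on the same scale as the ability estimate, and the best-fitting item is the unused item whose difficulty is closest to the current estimate. The fixed-step update of the estimate is also a simplification, and all names are hypothetical.

```python
from typing import Dict, Optional, Set

def next_item(difficulties: Dict[str, float],
              administered: Set[str],
              ability: float) -> Optional[str]:
    """Pick the unused item whose difficulty is closest to the ability estimate.

    Assumption: 'best fit' means smallest distance between item difficulty
    and the current ability estimate. Returns None when the bank is exhausted.
    """
    candidates = {item: d for item, d in difficulties.items()
                  if item not in administered}
    if not candidates:
        return None
    return min(candidates, key=lambda item: abs(candidates[item] - ability))

def update_ability(ability: float, correct: bool, step: float = 0.5) -> float:
    """Crude update (an assumption): move the estimate up after a correct
    answer and down after an incorrect one."""
    return ability + step if correct else ability - step
```

With the bank {"easy": -1.0, "medium": 0.0, "hard": 1.0} and a starting estimate of 0.2, the first item is "medium". After a correct answer the estimate rises to 0.7 and the next item is "hard", so two test takers who answer differently see different items.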
Item writing guidelines (p. 38-47 for examples):
- Focus on one relevant aspect
- Use independent item content
- Avoid overly specific and overly general content
- Avoid items that deliberately deceive test takers
- Keep vocabulary simple for the population of test takers
- Put item options vertically
- Minimize reading time and avoid unnecessary information
- Use correct language
- Use non-sensitive language
- Use a clear stem and include the central idea in the stem
- Word the item positively and avoid negatives
- Use three options unless it is easy to write plausible distractors
- Use one option that is unambiguously the correct or best answer
- Place options in alphabetical, logical or numerical order
- Vary location of the correct option across the test
- Keep the options homogeneous in length, content, grammar etc.
- Avoid ‘all-of-the-above’ as the last option
- Make distractors plausible
- Avoid giving clues to the correct option
Item rating guidelines:
- Rate responses anonymously
- Rate responses to one item at a time
- Provide the rater with a frame of reference
- Separate irrelevant aspects from the relevant performance
- Use more than one rater
- Re-rate the free responses
- Rate all responses to an item on the same occasion
- Rearrange the order of responses
- Read a sample of responses
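Three of these guidelines (rate anonymously, rate one item at a time, rearrange the order of responses) can be prepared before any rater sees a response. A minimal sketch under stated assumptions: responses are available as (test taker, item, text) tuples, and the function name and code format are made up for illustration.

```python
import random
from typing import Dict, List, Optional, Tuple

def build_rating_batches(
    responses: List[Tuple[str, str, str]],
    seed: Optional[int] = None,
) -> Tuple[Dict[str, List[Tuple[str, str]]], Dict[str, str]]:
    """Group free responses per item, shuffle them, and replace names by codes.

    responses: (test_taker, item, text) tuples (an assumed input format).
    Returns (batches, code_key):
      batches  maps item -> list of (code, text) in shuffled order
      code_key maps code -> test_taker and is kept away from the raters
    """
    rng = random.Random(seed)
    by_item: Dict[str, List[Tuple[str, str]]] = {}
    for taker, item, text in responses:
        by_item.setdefault(item, []).append((taker, text))

    batches: Dict[str, List[Tuple[str, str]]] = {}
    code_key: Dict[str, str] = {}
    for item, entries in by_item.items():
        rng.shuffle(entries)  # rearrange the order of responses per item
        batch = []
        for n, (taker, text) in enumerate(entries, start=1):
            code = f"{item}-{n:03d}"  # anonymous code shown to the rater
            code_key[code] = taker
            batch.append((code, text))
        batches[item] = batch
    return batches, code_key
```

Each rater then works through one batch (one item) at a time. Running the function again with a different seed gives a new order, which can be used for a second rater or for re-rating.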
Pilot studies on item quality:
- Experts’ pilots; content and technical aspects are assessed by experts in both the field of the
test and item writing.
● Sensitivity review= review by persons who are not on the panel that reviews the content and technical
aspects of the items. The sensitivity review panel is composed of members of