Automated Assessment with Multiple-choice Questions using Weighted Answers

Francisco de Assis Zampirolli*, Valério Ramos Batista, Carla Rodriguez, Rafaela Vilela da Rocha and Denise Goya
Centro de Matemática, Computação e Cognição, Universidade Federal do ABC (UFABC), 09210-580, Santo André, SP, Brazil
ORCIDs: https://orcid.org/0000-0002-7707-1793 (Zampirolli), https://orcid.org/0000-0002-8761-2450 (Batista), https://orcid.org/0000-0002-1522-3130 (Rodriguez), https://orcid.org/0000-0003-4573-3016 (Vilela da Rocha), https://orcid.org/0000-0003-0852-6456 (Goya)
*Grant #2018/23561-1, São Paulo Research Foundation (FAPESP).
In Proceedings of the 13th International Conference on Computer Supported Education (CSEDU 2021) - Volume 1, pages 254-261. DOI: 10.5220/0010338002540261. ISBN: 978-989-758-502-9. Copyright © 2021 by SCITEPRESS – Science and Technology Publications, Lda.

Keywords: Automated Assessment, Multiple Choice Questions, Parametrized Quizzes.
Abstract: Evaluation through multiple-choice questions is a resource increasingly used to assess people. However, in many cases some test alternatives are wrong only because of a detail, and scoring nought for them can be counter-pedagogical. Because of that, we propose an adaptation of the open-source system MCTest which considers weighted test alternatives. The automatic correction is carried out by a spreadsheet that stores the students' responses and compares them with the individual answer keys of the corresponding test issues. The approach applies to exams either in hardcopy or online, and this study was validated with a total of 607 students from three different courses: Networks & Communications, Nature of Information, and Compilers.
1 INTRODUCTION
In the case of multiple-choice questions, teachers and professors are expected to make an extra effort to elaborate questions that fairly assess their students' competencies and skills. There are widely ac-
cepted methods to evaluate and classify a large num-
ber of candidates, for instance the Item Response The-
ory (Aybek and Demirtasli, 2017). As an example, let
us consider the Brazilian National High School Exam
(ENEM), elaborated by Instituto Nacional de Estu-
dos e Pesquisas Educacionais An ´ısio Teixeira (INEP).
In January 2021 ENEM had almost 5.8 million stu-
dents enrolled for the classroom tests, but the absence
rate was 55.3% mostly due to the Coronavirus pan-
demic. The reader can see enem.inep.gov.br for de-
tails, but here we highlight that for the first time ap-
plicants could sit this exam online in some venues.
This was possible for 93,079 of the candidates, but precisely among this group the absence rate reached about 70%. In any case, INEP foresees that 100% of the tests will be online already in 2026. In fact, this follows the same trend as many other exams, e.g. the TOEFL language exam, which is now online (ets.org), and such a trend boosts more sophisticated studies devoted to the elaboration of applicable questions.
In (Burton, 2001) the author presents a study on
improving the reliability of multiple-choice questions by deterring examinees from just
guessing the right answer. The paper states that pure
guessing can be discouraged by fractional marks at-
tributed to wrong answers, namely ‘negative mark-
ing’ or ‘penalty scoring’. However, the final perfor-
mance can be damaged by the examinee’s uncertainty
in case they have solved a question just partially. The
same author cites some works that debate such penal-
ties, but he focuses on achieving percentage values
of unreliability of a test by studying three scenarios:
Q, where the only random element is the drawing
of some items from Question Banks (QB) in which
scope and difficulty are equally levelled, and the final
mark is exactly the number of correct answers; G, in
which the only random element is the drawing of the
alternatives; QG, which uses both random elements.
All questions must be answered. For Q and QG one
must have QB with at least five times the number of
questions in the exam. In his model, the author con-
siders an exam with sixty questions and four alterna-
tives per question. By taking the average knowledge
of 50%, the mean scores in cases Q, G and QG were
30, 37.5 and 37.5, respectively. He concludes that a
60-question four-choice test is rather unreliable, and
one of the main reasons is that G and QG allow guess-
ing, which is not the case of Q.
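As a rough sanity check of these figures (our own back-of-the-envelope reading of the model, not a derivation quoted from Burton), an examinee with 50% average knowledge answers thirty of the sixty questions correctly by knowledge alone; in G and QG the remaining thirty questions are guessed among four alternatives, so the expected marks are

E[\mathrm{score}_Q] = 60 \times 0.5 = 30, \qquad E[\mathrm{score}_G] = E[\mathrm{score}_{QG}] = 30 + 30 \times \tfrac{1}{4} = 37.5,

which reproduces the reported means and makes explicit how much guessing inflates the score.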
The authors in (Oliveira Neto and Nascimento,
2012) adapted the Learning Management System
(LMS) Moodle to make formative assessment during
the teaching-learning process with high quality feed-
back for a distance learning course of 40h per week
in Mathematical Finance. These evaluations can bet-
ter direct the student’s performance if the feedback
is quick and precise at pointing out their difficulties.
Moreover, the feedback can guide the teacher about
the adopted teaching process, and so the students’
understanding can be reinforced regarding some top-
ics that have not been assimilated yet. By analysing
the students’ answers in previous classes, the authors
have improved the QB with additional rules for tests,
error messages and links to either theoretical topics or
extra exercises.
In the elaboration of multiple-choice questions, it
is also important to consider suitable wrong options
among the alternatives, also called distractors. Unsuitable distractors enable the examinee to guess the correct answer by elimination, as discussed in (Moser
et al., 2012), where the authors present a text pro-
cessing algorithm for automatic selection of distrac-
tors. A more recent work is (Susanti et al., 2018),
but devoted to automatic production of distractors for
the English vocabulary. In (Ali and Ruit, 2015) the
authors present an empirical study on flawed alterna-
tives and low distractor functioning. They conclude
that removal or replacement of such defective distractors, together with increasing the cognitive level, improves the discrimination between high- and low-ability examinees.
Our present work introduces an automatic gen-
erator and corrector devoted to exams that consist
of multiple-choice questions with weighted alter-
natives. It is adapted from the open-source sys-
tem MCTest available on GitHub. For such exams
MCTest stores the correction in a CSV-file and emails
it to the professor. This file contains each student’s
responses compared with the individual answer key
of the exam issue received by that student. Common
programs like Excel and LibreOffice open the file in a
spreadsheet with built-in formulas that give each stu-
dent’s final mark according to the weights, as we shall
detail in this paper.
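Purely to illustrate the weighting idea (the actual CSV layout and spreadsheet formulas are detailed later in the paper; the column names and answer-key structure below are hypothetical), the final mark amounts to summing, question by question, the weight that the chosen alternative carries in the answer key of that student's exam issue:

import csv

# Hypothetical answer key (NOT MCTest's real format): for each exam
# issue and question, the weight carried by each alternative.
answer_key = {
    ("issue-01", "Q1"): {"A": 1.0, "B": 0.5, "C": 0.0, "D": 0.0},
    ("issue-01", "Q2"): {"A": 0.0, "B": 0.0, "C": 1.0, "D": 0.25},
}

def final_mark(issue, responses):
    """Sum, for each question, the weight of the chosen alternative."""
    return sum(answer_key[(issue, q)].get(choice, 0.0)
               for q, choice in responses.items())

# Hypothetical CSV layout with columns: student, issue, Q1, Q2, ...
with open("correction.csv", newline="") as f:
    for row in csv.DictReader(f):
        responses = {q: row[q] for q in ("Q1", "Q2")}
        print(row["student"], final_mark(row["issue"], responses))

This is the same quantity the generated spreadsheet computes with its built-in formulas when opened in Excel or LibreOffice.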
As a related work we cite (Presedo et al., 2015),
in which the authors use Moodle to create multiple-
choice questions with weighted alternatives. Their
system also enables the user to give an exam in hard-
copy, but with neither the student's id nor variations of the exam. Moreover, it requires the plugin Offline Quiz (moodle.org/plugins/mod_offlinequiz). Moodle offers the Calculated question type, which we call a parametric question in MCTest, in which the statement and the alternatives accept wildcards, but in Moodle only for simple mathematical operations. By contrast,
MCTest enables nominal exams, numerous variations
and wildcards that accept complex formulas written in
Python and its libraries. Details on parametric ques-
tions with MCTest can be found in (Zampirolli et al.,
2021; Zampirolli et al., 2020; Zampirolli et al., 2019).
The paper is organized as follows. Section 2 de-
scribes the adapted MCTest for an automated assess-
ment with multiple-choice questions using weighted
answers; Section 3 shows the obtained results and dis-
cusses them; finally, Section 4 presents our main con-
clusions and opportunities for future work.
2 USING ADAPTED MCTest: MATERIALS AND STEPS
This work uses the open-source Information and Communication Technology (ICT) tool MCTest, available
on GitHub (https://github.com/fzampirolli/mctest).
We have adapted MCTest in order to enable weighted answers in multiple-choice questions. In
this section, we explain how to create exams that in-
clude such questions with weighted answers.
2.1 Creating Multiple-choice Questions
After downloading MCTest from GitHub, the sys-
tem administrator must install it on a server. Be-
fore creating a question, they have to include Insti-
tution, Course, Discipline and also associate a pro-
fessor as Discipline Coordinator. The coordinator can then create discipline Topics and also add more professors.
See vision.ufabc.edu.br for details. Afterwards, any
of them can add a Class and also questions asso-
ciated to a Topic thereof. An example would be
setting [ED]<template-figure> at Choose Topic in Figure 1. Namely, this topic belongs to a discipline called ED, a mnemonic for Example Discipline. In that figure we have Short Description: template-fig-tiger-en, which is optional but
makes it easier to locate questions in Question Banks
(QB), as we shall explain in Subsection 2.2. The field
Group is also optional for the user to define a group
of questions, so that in each exam MCTest will always
draw only one question from that group. The most rel-
evant field is Description, where we can insert paragraphs in LaTeX and also combine them with Python code, as explained later in another example.
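To give a flavour of such a combination, here is a minimal, purely hypothetical sketch (not MCTest's actual Description markup, whose exact syntax the system documents): Python draws the parameters, a LaTeX fragment states the question, and each alternative carries a weight between 0 and 1.

import random

# Hypothetical sketch of a parametric, weighted multiple-choice item.
# Python draws the parameters; LaTeX renders the statement.
a, b = random.randint(2, 9), random.randint(2, 9)

statement = rf"Compute the product ${a} \times {b}$."

# Each alternative pairs a value with a weight (1 = fully correct,
# intermediate weights give partial credit, 0 = plain wrong).
alternatives = [
    (a * b,     1.0),
    (a * b + a, 0.25),  # close miss: partially rewarded
    (a + b,     0.0),   # confused product with sum
    (a * b + 1, 0.0),
]

print(statement)
for value, weight in alternatives:
    print(f"  {value}  (weight {weight})")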
