e-Assessment of Short-Answer Free-Text Questions


Comments

Slide 3

Any subject, across a range of complexity. Our focus has been on developing a robust system, currently for one- or two-mark questions. It is important to establish credibility at this level before jumping to harder questions. The technology will develop further.

Slide 15

Year groups 2, 3 and 5 were computer marked in approximately 3 hours and 45 minutes on a 2.4GHz PC running Windows XP. This entailed marking of approximately 108,000 responses. This equates to around 30 seconds per student ‘script’ of 270 items.
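The figures in this note can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the numbers quoted above; the variable names are illustrative.

```python
# Figures quoted in the note above.
total_responses = 108_000              # responses marked across Years 2, 3 and 5
items_per_script = 270                 # items in one student 'script'
marking_time_s = (3 * 60 + 45) * 60    # 3 hours 45 minutes, in seconds

scripts = total_responses / items_per_script
seconds_per_script = marking_time_s / scripts
seconds_per_response = marking_time_s / total_responses

print(f"{scripts:.0f} scripts")
print(f"{seconds_per_script:.2f} s per script")
print(f"{seconds_per_response:.3f} s per response")
```

The quoted totals imply 400 scripts at just under 34 seconds each, which is consistent with the "around 30 seconds" stated in the note.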

Slide 1

E-Assessment of Short-Answer Questions. Dr. Tom Mitchell, Intelligent Assessment Technologies.

Slide 2

Computerised Marking: How do we mark free-text responses by computer? IAT's Marking Engine does not operate on raw text, but on the output of a sentence analyser.

Slide 3

Short-Answer Questions: What sort of questions can be computer marked? Science: Why are some flowers highly scented with brightly coloured petals? Information Technology: Why do Java programs typically execute more slowly than similar C++ programs? Medical Exams: What movement is affected by rupture of the supraspinatus tendon? But not all items are suitable: × Give a brief description of the function of the CPU.

Slide 4

Short-Answer Questions. Characteristics of suitable items: short phrase or sentence response; identification of correct and incorrect answers is clear-cut. Characteristics of unsuitable items: many ways in which a correct response can be expressed; responses are complex in nature.

Slide 5

Computerised Marking: How do we mark free-text responses by computer? IAT's Marking Engine does not operate on raw text, but on the output of a sentence analyser.

Slide 6

A Computerised Mark Scheme: How do we represent the mark scheme? Each mark scheme answer is represented as a template. Each template specifies one particular form of acceptable or unacceptable answer.
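To make the idea of a template concrete, here is a minimal sketch in Python. It is not IAT's Marking Engine: the real engine works on the output of a sentence analyser, whereas this toy version only matches lower-cased words. The `Template` class, its fields, the `mark` function and the example mark scheme are all hypothetical, invented for illustration.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Template:
    """One form of answer in a toy mark scheme (hypothetical structure)."""
    required: list[set[str]]           # each set holds alternatives; one must appear
    forbidden: set[str] = field(default_factory=set)
    acceptable: bool = True            # False describes a known wrong answer

    def matches(self, response: str) -> bool:
        words = set(re.findall(r"[a-z]+", response.lower()))
        if words & self.forbidden:
            return False
        return all(words & alternatives for alternatives in self.required)


def mark(response: str, scheme: list[Template]) -> int:
    """Return 1 if the first matching template is acceptable, otherwise 0."""
    for template in scheme:
        if template.matches(response):
            return 1 if template.acceptable else 0
    return 0  # no template matched


# Invented mark scheme for the flower question quoted earlier in the deck.
scheme = [
    # The unacceptable form is listed first so that it takes priority.
    Template(required=[{"wind"}], acceptable=False),
    Template(required=[{"attract", "attracts", "lure"},
                       {"insects", "bees", "pollinators"}]),
]

print(mark("To attract insects for pollination", scheme))
print(mark("So the wind can carry the pollen", scheme))
```

Listing unacceptable templates alongside acceptable ones mirrors the slide's point that a template can describe either kind of answer; a response that matches no template scores zero in this sketch.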

Slide 8

Creating A Mark Scheme

Slide 9

Sources of Error: There are four recognisable sources of error inherent in computerised marking of short answers: missing model answers; improperly correcting wrongly spelled words; problems analysing the sentence structure; invalid qualification of otherwise correct answers. Human marking tends to be better at all of the above, but worse in terms of consistency.

Slide 10

Computer Based Testing of Medical Knowledge

Slide 11

What is the Dundee Progress Test? A test of the basic core knowledge of medical students; essential knowledge required to be a PRHO (pre-registration house officer); all five years sit the same examination; questions by subject and year in proportion to curriculum time; free-text short answers to questions, with no prompting or triggers; administered as pilot manuscript tests in 2001 and 2002, and as a summative computer-marked test in 2003.

Slide 12

Populating the Question Bank: Each test consists of approximately 270 one-mark questions. New questions are added to the bank each year. New questions require moderation the first year they are used. Sample-based checking of computerised marking for the new questions (using Year 5 students to provide the sample).

Slide 13

Test Delivery.

Slide 14

Computerised Marking (1).

Slide 15

Computerised Marking (2).

Slide 16

Computer-Assisted Moderation

Slide 17

Benefits of a Computerised Test

Slide 18

Student Reports

Slide 19

Growth in basic science knowledge 2003-05 (median, interquartile range, outliers; axis in %). Statistically significant increase by study year for Y1 questions for all test dates, apart from Y2 vs Y1 in 2003 (NS) (ANOVA plus Scheffé).

Slide 20

Growth in Outcomes knowledge throughout the curriculum, combined data 2003-05. Steeper growth in knowledge of Outcome 1 (Clinical skills) vs Outcome 4 (Management) or Outcome 8 (Basic knowledge).

Slide 21

Reliability and Correlations - 2003: Very acceptable ‘alpha reliability coefficient’ at 0.954. Significant correlations of Year 5 student data, 2003 Progress Test vs: portfolio 0.31**; EMI Y4 0.55**; CRQ Y4 0.61**; OSCE Y4 0.29**. The Progress Test appears to be related best to the written components of finals, but does not seem to be measuring quite the same attributes.
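The slide does not say how the reliability coefficient was computed. Assuming it is Cronbach's alpha, the standard formula for k items is alpha = k / (k - 1) × (1 - sum of item variances / variance of total scores). The sketch below implements that formula; the `cronbach_alpha` function name and the demo scores are invented for illustration and are not data from the Progress Test.

```python
from statistics import pvariance


def cronbach_alpha(scores: list[list[float]]) -> float:
    """Cronbach's alpha, where scores[s][i] is the mark of student s on item i."""
    k = len(scores[0])
    # Variance of each item across students.
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of each student's total score.
    total_variance = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Invented marks: four students, three one-mark items.
demo_scores = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]

print(cronbach_alpha(demo_scores))
```

Population variance is used for both the items and the totals; using sample variance for both would give the same ratio and therefore the same alpha.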

Slide 22

Utility of the Progress Test? Ensure adequate core knowledge in Y5 portfolio students; characterise progress of overall cohort knowledge from year to year; characterise progress of an individual’s knowledge from year to year; identify core curriculum topics where cohort knowledge appears deficient; feed back this information to system convenors and teachers.

Slide 23

Formative Assessment at The Open University

Slide 24

Formative Assessment

Slide 25

Formative Assessment

Slide 26

Formative Assessment

Slide 27

Findings by the OU: Of the 78 questions originally authored, four were deemed to be unworkable and removed. Accuracy test carried out by the OU (computer and six course tutors, adjudicated ‘blind’ by question author): “in all cases, the mean mark allocated by the computer system was within the range of means allocated by the human markers”; “for each of the questions, the majority of the variation was caused by discrepancies in the marking of the course tutors”. Journal paper: E-assessment for learning? The potential of short-answer free-text questions with tailored feedback, BJET 2008.
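The first quoted finding describes a simple check: the computer's mean mark for a question lies between the lowest and highest of the human markers' means. A minimal sketch of that check follows, assuming one list of marks per marker for the same set of responses. The function name and all marks are invented for illustration; they are not the OU's data or procedure.

```python
from statistics import mean


def computer_within_human_range(computer_marks, tutor_marks):
    """True if the computer's mean mark lies within the range of tutor means."""
    human_means = [mean(marks) for marks in tutor_marks]
    return min(human_means) <= mean(computer_marks) <= max(human_means)


# Invented marks for one question: five responses, three tutors.
tutors = [
    [1, 0, 1, 1, 1],
    [1, 0, 1, 0, 0],
    [1, 0, 1, 1, 0],
]

print(computer_within_human_range([1, 0, 1, 1, 0], tutors))
print(computer_within_human_range([1, 1, 1, 1, 1], tutors))
```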

Slide 28

Designing Items for Computerised Marking

Slide 29

Designing Items for Computerised Marking: Word items to constrain students into only writing about one thing at a time. Break longer response items down into more specific parts. There may be scope to increase computerised marking by careful item design, without compromising educational validity.

Slide 34

Conclusions: Computerised marking of short-answer questions is a mechanical process. Not all items are suitable. For suitable items, marking accuracies are comparable to or better than human marking. It is possible to design items to make them more suitable for computerised marking. Not a replacement for all human marking, but a valuable enhancement of the e-assessment instrument.

Slide 35

Intelligent Assessment Technologies Dr. Tom Mitchell

Summary: On-screen assessments are increasingly replacing traditional pencil-and-paper tests. The downside of computerised tests is that they typically rely on closed question types: multiple choice, drag and drop, image hot-spot, etc. But short-answer free-text questions, a favourite tool of teachers and examiners alike, can now be computer marked by natural-language-based assessment engines which aim to mimic human marking of free text. This presentation will outline the capabilities and limitations of computerised marking of short-answer questions. It will include an overview of two case studies: a summative assessment system developed for the University of Dundee, and a formative assessment system in use at the Open University.

Tags: e-assessment testing assessment online tests assessments free-text examinations
