|
Scoring
Computerized adaptive testing
The common (Verbal and Quantitative) multiple-choice portions of the exam
currently use computer-adaptive testing (CAT) methods that automatically adjust
the difficulty of questions as the test taker proceeds with the exam, depending
on the number of correct and incorrect answers given so far. The test taker is
not allowed to go back and change the answers to previous questions, and an
answer of some type must be given before the next question is presented.
The first question that is given in a multiple-choice section is considered
to be an "average level" question that half of the GRE test takers will answer
correctly. If the question is answered correctly, then subsequent questions
become more difficult. If the question is answered incorrectly, then subsequent
questions become easier, until a question is answered correctly.
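The up/down rule described above can be sketched as a simple staircase. This is
only an illustration: the 1 to 9 difficulty scale, the step size, and the
function names `next_difficulty` and `run_section` are assumptions made for the
example, not details of the actual GRE item-selection procedure.

```python
def next_difficulty(current, answered_correctly, step=1, lowest=1, highest=9):
    """Return the difficulty level for the next question.

    A correct answer moves the difficulty up by one step, an incorrect
    answer moves it down, and the result is clamped to the assumed scale.
    """
    if answered_correctly:
        proposed = current + step
    else:
        proposed = current - step
    return max(lowest, min(highest, proposed))


def run_section(responses, start=5):
    """Replay right/wrong answers and record the difficulty of each question shown.

    `start=5` stands in for the "average level" first question on the
    assumed 1 to 9 scale.
    """
    difficulty = start
    shown = []
    for answered_correctly in responses:
        shown.append(difficulty)
        difficulty = next_difficulty(difficulty, answered_correctly)
    return shown


# Two correct answers raise the difficulty; three misses lower it again.
print(run_section([True, True, False, False, False, True]))
```

An operational CAT would typically choose items from an ability estimate rather
than from a fixed step, but the staircase captures the qualitative behavior
described here.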
Adaptive administration of this kind yields scores of similar accuracy to a
fixed-form test while using approximately half as many items.
However, this effect is moderated with the GRE because it has a fixed length;
a true CAT is variable-length and stops once it has zeroed in on a candidate's
ability level.
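The difference between a fixed-length and a variable-length stopping rule can be
sketched as follows. The example assumes a one-parameter (Rasch) IRT model, in
which an item of difficulty `b` contributes information `p * (1 - p)` at ability
`theta`; the model choice, the function names, and the `se_target` threshold are
assumptions for illustration and are not taken from the GRE's published
procedures.

```python
import math


def p_correct(theta, b):
    """Rasch-model probability that ability `theta` answers an item of difficulty `b` correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def standard_error(theta, difficulties):
    """Standard error of the ability estimate: 1 / sqrt(total item information)."""
    info = sum(p_correct(theta, b) * (1.0 - p_correct(theta, b)) for b in difficulties)
    return 1.0 / math.sqrt(info)


def should_stop(theta, difficulties, max_items=None, se_target=None):
    """Decide whether to end the test after the items administered so far.

    max_items -- fixed-length rule: stop after this many items.
    se_target -- variable-length rule: stop once the standard error is
                 at or below this value.
    """
    if max_items is not None and len(difficulties) >= max_items:
        return True
    if se_target is not None and difficulties:
        return standard_error(theta, difficulties) <= se_target
    return False


# Ten well-targeted items satisfy a fixed length of 10, but not a precision target of 0.5.
administered = [0.0] * 10
print(should_stop(0.0, administered, max_items=10))
print(should_stop(0.0, administered, se_target=0.5))
```

Under these assumptions, a fixed-length test keeps administering items even
after the precision target has been reached, which is why the item savings of a
variable-length CAT are only partly realized.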
The actual scoring of the test is done with item response theory (IRT).
Although CAT is closely associated with IRT, IRT is also used to score non-CAT
exams: the GRE subject tests, which are administered in the traditional
paper-and-pencil format, use the same IRT scoring algorithm. What CAT adds is
that items are dynamically selected so that the test taker sees only items of
appropriate difficulty. Besides the psychometric benefits, this avoids wasting
the examinee's time on items that are far too hard or far too easy, as happens
in fixed-form testing.
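To illustrate how IRT scoring is separate from item selection, here is a minimal
sketch that assumes a one-parameter (Rasch) model. `estimate_ability` scores any
set of responses by maximum likelihood over a grid of ability values, whether
the items came from a fixed form or were chosen adaptively, and `pick_next_item`
shows the extra selection step that an adaptive test adds. The model, the grid
bounds, and all function names are assumptions for the example; the source does
not specify which IRT model or estimation method the GRE uses.

```python
import math


def p_correct(theta, b):
    """Rasch-model probability that ability `theta` answers an item of difficulty `b` correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def log_likelihood(theta, difficulties, responses):
    """Log-likelihood of the observed right/wrong pattern at ability `theta`."""
    total = 0.0
    for b, correct in zip(difficulties, responses):
        p = p_correct(theta, b)
        total += math.log(p) if correct else math.log(1.0 - p)
    return total


def estimate_ability(difficulties, responses, lo=-4.0, hi=4.0, steps=801):
    """Grid-search maximum-likelihood ability estimate.

    The grid is bounded, so all-correct or all-incorrect patterns (whose
    true maximum is infinite) are reported at the edge of the range.
    """
    grid = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_likelihood(t, difficulties, responses))


def pick_next_item(theta, pool):
    """Adaptive selection: under the Rasch model, the most informative item
    is the one whose difficulty is closest to the current ability estimate."""
    return min(pool, key=lambda b: abs(b - theta))


# Scoring a fixed form: one easy item answered correctly, one hard item missed.
theta_hat = estimate_ability([-1.0, 1.0], [True, False])

# An adaptive test would then administer the remaining item nearest that estimate.
next_item = pick_next_item(theta_hat, [-2.0, 0.5, 2.0])
```

The scoring function never needs to know how the items were chosen, which is
consistent with the same IRT scoring being usable for both paper-and-pencil and
adaptive administrations.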
|