One of my favorite (though not often used) test item types is the “matching exercise.” One class I teach has quite a bit of vocabulary that my students just flat-out need to memorize. Matching seems like a good, concise way of testing them with a minimum of pain on their part (writing the answers) and mine (creating the test). The sources all agree on the definition:
I was pleased to see this same source describing the types of material that can be used:
This other source, http://teaching.uncc.edu/learning-resources/articles-books/best-practice/assessment-grading/designing-test-questions, lists
As you can see, this test item format is well suited for testing the Knowledge Level of Bloom’s Taxonomy; however, several sources hint that it can also reach the Comprehension Level “if appropriately constructed.” Only one source discusses in detail how to “aim for higher order thinking skills” by describing variations that address, for example, Analysis and Synthesis. (http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf) One variation is to give a Keylist or Masterlist, that is, information about several objects, and have the student interpret the meaning of the information, make comparisons (least/greatest, highest/lowest, etc.), and translate symbols. The example gives three elements from the periodic table with their properties listed below them but no labels on the properties. The questions ask “Which of the above elements has the largest atomic weight?”, “Which has the lowest melting point?”, and other similar inquiries. Another variation is a ranking example:
These directions are followed by a list of events. While I see these variations more as the “fill-in-the-blank” types, their connections to matching properties to objects or events to a time line make it reasonable to treat them as matching types. What are the advantages and disadvantages of matching exercises? (Source: http://cft.vanderbilt.edu/guides-sub-pages/writing-good-multiple-choice-test-questions/)
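The Keylist variation above is concrete enough to sketch in code. This is my own illustration, not from the source: the element names and approximate property values are standard reference figures, and the comparison questions mirror the “largest atomic weight” / “lowest melting point” examples.

```python
# A sketch of a Keylist item: properties of several objects are listed,
# and the student interprets and compares them. Values are approximate
# standard reference figures.
elements = {
    "Hydrogen": {"atomic_weight": 1.008, "melting_point_K": 13.99},
    "Helium":   {"atomic_weight": 4.003, "melting_point_K": 0.95},
    "Lithium":  {"atomic_weight": 6.94,  "melting_point_K": 453.7},
}

def has_largest(prop):
    """Answer key for 'Which of the above elements has the largest <prop>?'"""
    return max(elements, key=lambda name: elements[name][prop])

def has_lowest(prop):
    """Answer key for 'Which has the lowest <prop>?'"""
    return min(elements, key=lambda name: elements[name][prop])

print(has_largest("atomic_weight"))   # Lithium
print(has_lowest("melting_point_K"))  # Helium
```

The point of the format is visible even in this tiny sketch: one compact table of information supports many interpretation and comparison questions.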
(Source: http://www.iub.edu/~best/pdf_docs/better_tests.pdf)
(Source: http://teaching.uncc.edu/learning-resources/articles-books/best-practice/assessment-grading/designing-test-questions)
But
There are design strategies that can reduce the amount of time it takes for students to work through the exercise, and others that don’t put so much emphasis on reading skills. We’ll look at those in the next post.

In the previous post we talked about the pros and cons of the alternative-response (e.g., true-false) types of questions as well as their application to Bloom’s Taxonomy. Next we discuss aspects to consider when writing the questions. I found this “Simple Guidelines” list helpful and informative.
All of this makes sense to me. At first I objected to “Make half the statements true and half false,” but on reflection I would aim for close to half rather than exactly half. In fact this source, http://teaching.uncc.edu/learning-resources/articles-books/best-practice/assessment-grading/designing-test-questions, suggests making the ratio more like 60% false to 40% true, since students are more likely to guess “true.” I found other points to add to the guidelines list. (Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf)
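The effect of that 60/40 suggestion on blind guessing is simple arithmetic. Here is a sketch of my own (the 70% “true” bias is an assumed example figure, not something from the source):

```python
def expected_guess_score(true_fraction, p_guess_true):
    """Expected fraction correct for a student who guesses blindly,
    answering 'true' with probability p_guess_true."""
    return (p_guess_true * true_fraction
            + (1 - p_guess_true) * (1 - true_fraction))

# A guesser biased toward 'true' breaks even on a 50/50 test (0.50),
# but falls below chance when only 40% of the items are true (0.46).
print(expected_guess_score(0.5, 0.7))
print(expected_guess_score(0.4, 0.7))
```

In other words, tilting the answer key toward “false” quietly penalizes the common guess-true habit without changing anything for students who actually know the material.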
This same source gives you a nice tip for writing true-false items:
Most of this discussion has been about True-False questions but the category is really Alternative-Response. Let’s look at the variations available to us. (Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf)
In summary, the sources all tend to agree that the best Alternative-Response items are unambiguous (“true or false with respect to what?”), concisely written, cover one idea per question, and aim at more than rote memorization. We should avoid trick questions and questions that test trivia. And the best tests with A-R items use many questions, with a True-to-False ratio of about 40:60. Next test item type: Matching!

My original intent for the title here was just “True or False Test Items,” but one source pointed out that the best name is Alternative-Response. That source is https://www.msu.edu/dept/soweb/writitem.html and defines alternative-response as
It goes on to point out the advantages of this item type:
But there is a major disadvantage: “students have fifty-fifty probability of answering the item correctly by chance alone.” It suggests making up for this by offering “a larger number of alternative-choice items than of other types of items in order to achieve a given level of test reliability.” There are other advantages and disadvantages, such as these offered by http://www.iub.edu/~best/pdf_docs/better_tests.pdf, which only addresses true-false items.
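That fifty-fifty guessing problem, and why a larger number of items helps, can be checked with a little binomial arithmetic. A sketch of my own (the 70% passing threshold is an arbitrary assumption for illustration):

```python
from math import ceil, comb

def p_pass_by_guessing(n_items, pass_fraction=0.7):
    """Probability that a blind guesser (each item correct with
    probability 1/2) scores at least pass_fraction on n_items
    true-false items."""
    need = ceil(pass_fraction * n_items)
    return sum(comb(n_items, k) for k in range(need, n_items + 1)) / 2**n_items

# More items make passing by luck alone much less likely:
# about 17% at 10 items, but well under 1% at 50.
for n in (10, 25, 50):
    print(n, round(p_pass_by_guessing(n), 4))
```

This is the quantitative reason behind the source’s advice: reliability against guessing comes from item count, not from any single cleverly written item.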
To me this seems like a contradiction: how can they be easy to write but difficult to make unambiguous? This same source offers tips on writing good true-false items, which we will address in the next post. This source, http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf, brings up some other points on true-false items.
Bloom’s Taxonomy is of concern for us here, too. Most sources I read indicate these question types address only the first level of Bloom’s, Knowledge. We saw that above with the source listing “Good for” bullet points. Other sources say it bluntly:
The documents used at the Palomar College Plenary breakout sessions for August, 2015 are below.

The slideshow in .PPTX format (you can click “View” and then “Notes View” to see each slide presented on a single page with the accompanying text): Test Writing Plenary Talk Aug 2015
The worksheet in .PDF format: Test Writing Plenary Talk Worksheet
The verb lists and question frames documents are found here as well as in context within the blog:
Bloom’s Verbs, one page
Bloom’s Verbs for Math
Bloom’s Question Frames
More Bloom’s Question Frames
Bloom’s Verbs for Science
*Graphic from http://tips.uark.edu/using-blooms-taxonomy/

It is often thought that multiple choice questions will only test on the first two levels of Bloom’s Taxonomy: remembering and understanding. However, the resources point out that multiple choice questions can be written for the higher levels: applying, analyzing, evaluating, and creating. First, we can recognize the different types of multiple choice questions. While I have used all of these myself, it never occurred to me to classify them. Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf
In fact, this source states:
So what can we do to make multiple choice questions work for higher levels of Bloom’s? Source: http://www.uleth.ca/edu/runte/tests/
Another part of this source brings up the idea of using the “inquiry process” to present a family of problems that ask the student to analyze a quote or situation.
This source gives some good ideas, too. Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf
In short, multiple choice questions, when designed with good structure and strategies, can provide an in-depth evaluation of a student’s knowledge and understanding. It can be challenging to write those good questions but the benefits are worthwhile. I thought about writing a summary of what we have learned about multiple choice questions but found this funny little quiz to be better than anything I could come up with: Can you answer these 6 questions about multiple-choice questions?

One type of objective question is multiple choice. We all know what it is but let’s look in detail at its description anyway. Source: https://www.msu.edu/dept/soweb/writitem.html
So the structure of a multiple choice question is a stem followed by options. The options contain one correct answer and a set of distractors.

The Stem

Some advice for constructing a good stem is Source: http://www.iub.edu/~best/pdf_docs/better_tests.pdf
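The stem-plus-options structure just described can be sketched as a small data type. This is my own illustration (the sample item text is made up for the example), not code from any of the sources:

```python
import random
from dataclasses import dataclass

@dataclass
class MultipleChoiceItem:
    """A stem followed by options: one correct answer plus distractors."""
    stem: str
    answer: str
    distractors: list

    def options(self, rng=random):
        """All options in shuffled order, so the key's position varies."""
        opts = [self.answer] + list(self.distractors)
        rng.shuffle(opts)
        return opts

    def score(self, response):
        """1 point for the correct answer, 0 for any distractor."""
        return 1 if response == self.answer else 0

item = MultipleChoiceItem(
    stem="Which level of Bloom’s Taxonomy covers rote recall of facts?",
    answer="Knowledge",
    distractors=["Analysis", "Synthesis", "Evaluation"],
)
print(item.score("Knowledge"))  # 1
print(item.score("Analysis"))   # 0
```

Keeping the answer and distractors as separate fields, and shuffling only at presentation time, mirrors the advice that follows: the distractors are a deliberate design element, not leftovers.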
Here are some examples of good and bad stem design: Source: http://cft.vanderbilt.edu/guides-sub-pages/writing-good-multiple-choice-test-questions/#stem
The best thought about the stem I have seen on the Internet: Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf
The Options

Sometimes known as “the alternatives”, they are composed of one right answer and a group of “foils” or distractors. One point that is emphasized regularly in the resources is that the distractors should all be plausible and attractive answers. Source: http://cft.vanderbilt.edu/guides-sub-pages/writing-good-multiple-choice-test-questions/
[Ed. Note: I have some issues with this particular example but I get the point of their suggestion.]
Other suggestions from this source:
There is more to think about for multiple choice questions, which we will examine in the next post.

Now that we have studied general test writing strategies, ideas, and tips, it is time to pull our focus inward to the details of the questions themselves. In general, question types fall into two categories:
I needed specific definitions for these, which I found here. Source: http://www.k-state.edu/ksde/alp/resources/Handout-Module6.pdf
This source also suggests guidelines for choosing between them:
And it continues with this bit of advice:
I wanted to see what different sources would say, so I also found this one. Source: http://www.helpteaching.com/about/how_to_write_good_test_questions/
I am not sure that “multiple choice” should be the primary choice, but I understand they are suggesting that we avoid open-ended questions if we want to measure reasoning or analytic skills or general comprehension. This bothers me a little. It seems to me, from reviewing the previous posts in this blog, that an open-ended question could measure those skills. The example that comes to mind is the question I had in botany about describing the cell types a pin might encounter when passing through a plant stem. That was an essay question measuring general comprehension of plant tissues.

The following source brings up good points about analyzing the results. It also notes that objective tests, when “constructed imaginatively,” can test at higher levels of Bloom’s Taxonomy. Source: http://www.calm.hw.ac.uk/GeneralAuthoring/031112-goodpracticeguide-hw.pdf
I like their point about how objective tests cannot test competence to communicate, construct arguments, or offer original answers. Training our students to take only multiple choice tests (or simply answer “true” or “false”) does not help them to learn how to explain their thoughts or even ensure that they can write coherent sentences. This is addressed by the second source and in previous posts. The suggestion is to use a variety of test item types. This can give you a better picture of what your students know, whereas using one single type can be biased against students who are not strong respondents to that type.

We are at the point of our investigation where we need to start looking in more detail at test construction. Here is a brief summary that puts together the pieces of what we have learned so far.

Our challenges

Write an accurate measure of student achievement that
Some things we can do to accomplish this

In general, when designing a test we need to
More specifically, our test goals are to
We should consider the technical quality of our tests

Quality means “conformance to requirements” and “fitness for use.” The criteria are
A useful tool is Bloom’s Taxonomy

It lists learning levels in increasing order of complexity:
To apply Bloom’s directly, we looked at
Next up: Learning what question types to use to achieve our goals.

When I first started considering Bloom’s Taxonomy, I thought it was good to help expand my ideas on how to test but I struggled with applying it directly. I appreciated the increasing cognitive levels but needed help in writing test questions that utilized them. What I found were lists of verbs associated with each level. A good one to start with is: Source: http://www.lshtm.ac.uk/edu/taughtcourses/writinggoodexamquestions.pdf
Here is an extensive list that is printable on one page, useful for reference while you are designing your test: Bloom’s Verbs, one page.

Other useful lists:
Bloom’s Verbs for Math
Bloom’s Question Frames (looks very good for English, literature, history, etc.) This gives you nearly complete questions which you can manipulate into test items appropriate to your discipline.
More Bloom’s Question Frames (2 pages).
Bloom’s Verbs for Science

What comes across to me again and again throughout the sources is that considering the hierarchy when designing exams creates a culture of learning that involves thinking deeply about the course material, taking it beyond simple rote memorization and recitation. This culture would benefit from also considering Bloom’s while you are teaching. Modeling higher level thought processes, showing joy at cognitive challenges, exploring topics in depth (if time permits) or mentioning the depth exists (if time is short) can send a strong signal that thinking is valued and important to learning.

Another view on Bloom’s as applied to test writing is to consider the knowledge domains inherent in your course material. They are: Source: http://www.lshtm.ac.uk/edu/taughtcourses/writinggoodexamquestions.pdf
When I put this list together with the verb lists, I get more ideas for test questions and directions for exploring student acquisition of the course knowledge.

One recurring recommendation in the resources is that we should consider Bloom’s Taxonomy when designing tests. To do so, we should know what it is. The triangle above is a version of the revised Bloom’s, using active verbs, with one level added and a slight reordering at the top. According to http://www.learnnc.org/lp/pages/4719,
The site goes on to say:
Further, http://www.edpsycinteractive.org/topics/cognition/bloom.html tells us,
And also,
Let’s see what each level represents. The following list is based on the original Bloom’s categories but it is still enlightening. Source: http://www.calm.hw.ac.uk/GeneralAuthoring/031112-goodpracticeguide-hw.pdf
We will look at these in more detail in the next post.