Testing Guidelines

Overview of Testing Guidelines

Testing not only lets you and your students know how much they have learned, it also provides a chance for more learning to take place, by reinforcing course material or by requiring students to use or think about what they have learned in a new way. Tests should be designed with primary course objectives in mind and should cover material from all components of a course (sections, lectures, textbooks, etc.). If you are taking over a course, go over the old tests carefully to see what was covered and how.

Students should be told in advance, preferably at the beginning of the quarter, what kinds of exams will be given in a course. Since some students may have access to old exams, it is probably fairer to give all students sample copies of at least one previous exam.

The nature of the exam will directly influence how students prepare, study and learn, as will the format and frequency of your testing. If students have reason to believe that you will mainly stress recall of information, for example, then they are much less likely to devote time to the mastery of concepts and the synthesis of material (and they may also be more likely to cram at the last minute, which makes it less likely they will remember the material very long after the exam). On the other hand, if your tests will demand a deep knowledge of the ideas discussed, students are likely to respond accordingly.

Frequent testing can enhance learning as well as provide information on student progress. In this case, returning exams promptly and going over the exams with students will maximize the benefits of frequent testing and feedback.

Choosing an Exam Format

Your choice of exam format should be based on the learning outcomes you want to test. Below are some possible exam formats that can be combined to create a well-balanced approach to testing, along with some basic guidelines for using each format.

  • Essay tests give students a chance to organize, evaluate, and think, and therefore often have the best educational value. They are, however, the hardest to grade. Make sure you, or your graders, have the time and stamina to grade essay exams well. You should discuss the criteria for their evaluation with your students and with any fellow graders before the test is given.
  • Math and science exams generally consist of problems to be solved. Numerical or logical problems primarily test the ability to apply material; introducing familiar versus new problem types (which require extending what students have practiced to new applications) can vary the challenge of the exam.
  • Multiple-choice exams are the most difficult to construct well, but can be used to measure both information recognition and concept application. If you use this format, consider writing questions throughout the quarter, while the lectures and material are fresh in your mind.
  • Completion questions test for recall of key terms and concepts. If you use completion questions, be willing to accept reasonable alternative answers that you had not considered prior to giving the exam.
  • Matching questions are useful for testing recognition of the relationships between pairs of words or between words and definitions. Supply enough answer choices so that students cannot guess simply by the process of elimination.
  • Short-answer questions help test information recall and analytic skills. They achieve goals similar to those of multiple-choice questions, but require students to recall, and not just recognize, correct answers. If you use a short-answer format, make questions specific enough that students can confidently answer the question in the allotted space.

Take-home essay exams are also popular. Although they may seem an ideal format, by providing students with a calmer environment and more time to think through answers, they have their drawbacks. You can minimize these drawbacks with some basic precautions:

  • You can put word limits on each essay, so that students with other tests do not have to compete unfairly against students with no other demands on their time. However, following the procedures of the Honor Code, there should not be a specified time limit less than the full period between the distribution of the exam and the due date.
  • There should also be explicit instructions about whether or not students can talk to each other about their answers and whether they have unlimited access to materials (course materials only? library resources? talking to people outside the class?).
  • An alternative strategy is to give out the exam in advance and allow consultation among students, but have them write the test in class without notes.

Writing Good Exams

General Guidelines: Certain standards apply to all exam formats. Good exams:

  • Are written in clear, straightforward language, so that all students can understand what you are asking for.
  • Do not require skills, knowledge, or vocabulary that are not central to the course.
  • Are an appropriate length for the exam period.
  • State directions clearly (and those directions are reviewed in class before students begin the exam).
  • Give the point value of each part of the exam so that students can prioritize their time.

For problem- or case-based exams:

  • Construct most problems so that they resemble the ones given in exercises during the quarter.
  • You can make problems more interesting by describing a “real” application for the concept or technique or by combining two concepts in a single problem.
  • Problems should be of graduated difficulty. The first problem, at least, should be one that builds confidence, so that nervous students do not become ruinously flustered at the outset.
  • Avoid “double jeopardy” (when the solution to one problem depends on successfully solving a previous one). Finally, avoid long, detailed computations. Concentrate on ideas, not endurance.

After writing the rough draft of your exam:

  • Classify the questions according to what skills they require of the students: information recall, translation, interpretation, application of principles, analysis of concepts, synthesis of ideas, or evaluation.
  • Make sure that your questions adequately cover the kinds of skills you want to assess. Particularly for multiple-choice exams, small changes to questions can demand a higher level of thought and more closely match the learning goals of the course. For example, instead of asking a student to recognize the correct word for a given definition, you might ask the student to choose the concept/term that best matches a novel example that you provide. This requires the student not only to know the definitions of the terms but also to interpret events using those definitions.
  • Check that your exam fairly represents the material of the testing period. It’s easy to fall prey to “primacy” and “recency” effects, where we overemphasize material from the beginning and end of a given testing period and underemphasize what was covered in the middle.
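
The classification and coverage checks above can be tallied mechanically. The following is a minimal sketch, assuming a hypothetical draft exam in which each question has been tagged by hand with a skill, the week its material was covered, and a point value. The function names, the ten-week term, and the sample data are illustrative assumptions, not part of any standard tool.

```python
from collections import defaultdict

# Hypothetical draft exam: (skill, week covered, points). All values are illustrative.
DRAFT = [
    ("information recall", 1, 10),
    ("application of principles", 2, 15),
    ("information recall", 9, 10),
    ("analysis of concepts", 10, 25),
    ("synthesis of ideas", 10, 40),
]

def share_by(questions, key):
    """Return the fraction of total points for each value of key(question)."""
    totals = defaultdict(int)
    for q in questions:
        totals[key(q)] += q[2]
    grand_total = sum(totals.values())
    return {k: v / grand_total for k, v in totals.items()}

def term_third(week, weeks_in_term=10):
    """Map a week number to 'early', 'middle', or 'late' in the term (assumed 10 weeks)."""
    if week <= weeks_in_term / 3:
        return "early"
    if week <= 2 * weeks_in_term / 3:
        return "middle"
    return "late"

skill_share = share_by(DRAFT, lambda q: q[0])
period_share = share_by(DRAFT, lambda q: term_third(q[1]))
```

In this sample, no points come from the middle of the term, which is the pattern the primacy and recency warning describes.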

Once the questions are written:

  • Consider the more practical concerns of arranging the exam on the page.
  • Aim for a stylistically simple, clean, and uncrowded layout.
  • If you leave space for short answers or essays, realize that the amount of space you leave is often interpreted by students as the length of the answer you want.

After constructing any kind of exam: 

  • Ask an experienced colleague or your TAs to look it over.
  • Someone else can often point out ambiguities and typos that you do not see.
  • Poorly written questions and typos are discouraging to students, who trust that you have put careful thought and attention into how they are being evaluated.

Always take the exam first yourself: For most exam formats, you should be able to finish the exam in no more than a quarter of the time the students will have.

Grading Exams

Problem Sets, Short-Answer Questions, and Multiple Choice

Although these evaluation methods usually take longer to make up than others, they are also the easiest to grade:

  • Multiple-choice exams can usually be graded by one or two people in about an hour if you use a scanner and software to grade and analyze the exams.
  • Scantron grading software provides a number of test analysis options, including item-by-item analysis of question responses. If students are doing worse than chance on a particular question, it is likely that the question was poorly worded. In this case you should either give credit for more than one answer or toss the question out (by giving everyone credit).
  • With other formats, it is often a good idea to divide the exam questions among graders. This is more likely to provide grading consistency and make it possible for a grader to spot patterns of deviation for a single question or problem. For all exam formats, you may think that you have written the perfect question with only one correct answer, but always be prepared for alternative answers. Consider allowing students to submit regrade petitions justifying their solutions.
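
The "worse than chance" check above can also be done by hand if item analysis is not available from your grading software. The following is a minimal sketch, assuming hypothetical response data and assuming every question has the same number of answer choices. It is not a description of what Scantron or any other product computes.

```python
def fraction_correct(responses, answer_key):
    """Return, for each question, the fraction of students who chose the keyed answer."""
    n_students = len(responses)
    return [
        sum(1 for student in responses if student[i] == key) / n_students
        for i, key in enumerate(answer_key)
    ]

def below_chance(responses, answer_key, n_choices):
    """Return 1-based numbers of questions answered correctly less often than guessing would give."""
    chance = 1 / n_choices
    rates = fraction_correct(responses, answer_key)
    return [i + 1 for i, rate in enumerate(rates) if rate < chance]

# Hypothetical data: 5 students, 3 questions, 4 answer choices each.
ANSWER_KEY = ["A", "C", "B"]
RESPONSES = [
    ["A", "C", "D"],
    ["A", "B", "D"],
    ["A", "C", "D"],
    ["B", "C", "B"],
    ["A", "C", "D"],
]

flagged = below_chance(RESPONSES, ANSWER_KEY, n_choices=4)
```

Here only the third question is flagged: one student in five chose the keyed answer, below the one-in-four rate expected from guessing. In a small class a rate this low can occur by chance, so treat a flag as a prompt to reread the question rather than as proof that it was poorly worded.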

Essay Exams

Usually the challenge is how to wade through all those essays while remaining both consistent and sane. When there are a number of instructors assigned to a course, this is easier, because you can divide the workload in a variety of ways:

  • If each instructor has a section and all of you have covered the same basic material, then you may prefer to mark the entire exams of just the students in your section. (The problem here, of course, is that objectivity may be harder to achieve since you may be partial toward your own students.)
  • Grading question by question, rather than student by student, may improve grading consistency.
  • If each instructor has dealt with specialized topics in lecture and section, then it is probably better to split the exam questions up so that each teacher covers the area he or she taught. This will allow you to give credit for material that you presented in section and it will give you feedback on whether the ideas you have emphasized have actually registered. At the same time, you should be guided by a grading standard that has been mutually agreed upon by all instructors.
  • Dividing the exam questions in this way also ensures that each question will be marked consistently across all students, even if one grader turns out to be more stringent or lenient than other graders.
  • However, reading 200 answers to the same question one after the other has its drawbacks; it can affect your mental health and your grading range. This is less likely to occur if you pace yourself, grade questions that you are interested in, and switch questions every once in a while if you are grading more than one question.
  • After grading has begun, consider having all graders share a sample of their A, B, and C essays to compare and sort out any inconsistencies developing among graders.

Keeping (relatively) objective: Grading essay exams involves a lot of subjective judgment, and your judgment may be influenced by things like fatigue, boredom, or impatience. In particular, you are more likely to be stringent with the first few essays you read than with the rest and you are less likely to be careful about comments when you are tired. To avoid such problems:

  • Read a few essays before you actually start grading to get an idea of the range of quality.
  • Stop grading when you get too tired or bored.
  • When you start again, read over the last couple of essays you graded to make sure you were fair.

After the Exam

When the exams have been marked, get together with the other graders to discuss and resolve any problems you have encountered:

  • Add up the total scores.
  • Double check your addition (this saves a lot of trouble later) and plot the distribution.
  • Discuss the grade distribution and what you think it says about student learning and test construction.
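
The totaling and plotting steps above can be sketched in a few lines. This is a minimal illustration, assuming hypothetical per-question scores recorded as whole numbers and an arbitrary bin width of 10 points. The identifiers are placeholders, and a spreadsheet would serve equally well.

```python
from collections import Counter

def total_scores(scores_by_student):
    """Sum each student's per-question scores."""
    return {name: sum(parts) for name, parts in scores_by_student.items()}

def distribution(totals, bin_width=10):
    """Count how many totals fall in each bin, keyed by the bin's lower bound."""
    return Counter((score // bin_width) * bin_width for score in totals.values())

def text_histogram(dist, bin_width=10):
    """Render the distribution as lines such as '70-79 | **'."""
    return [
        f"{low}-{low + bin_width - 1} | {'*' * dist[low]}"
        for low in sorted(dist)
    ]

# Hypothetical per-question scores; identifiers are placeholders, not real students.
SCORES = {
    "student_1": [18, 25, 30],
    "student_2": [20, 22, 35],
    "student_3": [15, 20, 28],
    "student_4": [19, 28, 38],
}

totals = total_scores(SCORES)
histogram = text_histogram(distribution(totals))
```

Computing the totals in code does not replace the double check recommended above. It moves the risk from addition errors to data-entry errors, so spot-check a few exams against the recorded scores.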

For maximum learning from an exam—and out of respect for the students—tests should be returned to students as soon as possible. Unless you intend to discuss them in class, hand tests back at the end of a period in order to avoid students being preoccupied while you try to cover something else. Provide a grade distribution to students to help them make sense of their numeric or letter grade. Do not post students’ grades publicly. They are legally entitled to confidentiality in this matter.

Requests for regrading: Consider having an official “regrade” policy in which students have a limited time (say, one week) to review their exam, request a regrade, and justify their request with a full written explanation. This policy has the benefits of encouraging students to review their exams in a timely manner, discouraging arbitrary grade complaints, and requiring students to examine their responses carefully.

Moving forward: After the exam has been graded and returned, place a copy of it in your files along with a note to yourself indicating which questions were most commonly missed, whether any parts unnecessarily confused students, and the grading distribution. This file will be helpful for writing future exams and will help you focus on material that you know students will have trouble with.