NCAT: PCR Assessment Resources

NCAT suspended operations on 12/31/2018. This is a curated version of the NCAT website to enable continued use of NCAT resources by the higher education community. Please direct any questions to the University of Central Florida Center for Distributed Learning, which is the custodian of NCAT resources. The collected writings of NCAT founder Dr. Carol A. Twigg as well as archived NCAT materials will be available in UCF Libraries Special Collections & University Archives with materials accessible online through STARS.

Assessment Resources

Assessing the Impact of Course Redesign: Peter Ewell, Senior Associate at NCHEMS, describes three basic assessment data-collection approaches and pertinent assessment references.

Sample Assessment Plans

The following examples show how nine of the institutions participating in the Pew Grant Program in Course Redesign will assess the impact and implementation of their redesigns.

Carnegie Mellon
Penn State University
Tallahassee Community College
University of Central Florida
University of Colorado-Boulder
University of Dayton
University of Massachusetts
University of Wisconsin-Madison
Virginia Tech

Carnegie Mellon

Summary by Peter Ewell

The redesign is a second generation effort to revamp Introduction to Statistical Reasoning using a dynamic “intelligent tutoring system” (ITS) that provides help to students as they engage in hands-on statistical problem-solving. The ITS provides many opportunities for assessment because of the “footprint” data that students leave in the software itself as they navigate the system. The institution’s assessment plan takes full advantage of the ITS capabilities.

The assessment plan itself is extremely sophisticated, involving both formative and summative components. One especially innovative aspect is the plan to use paid student subjects who have already taken statistics in an old format to pilot new course material in the form of either the ITS or a stand-alone statistics package with written instructions (as in the first generation re-design). This is a true “clinical trials” approach which is exemplary in the extreme.

A second very strong element of the plan involves exploiting the step-by-step capture capability of the ITS itself to examine exactly how students are approaching problems and where they are getting into difficulty. A particularly strong feature here is the use of specially-designed “cognitive transfer” exercises that examine students’ ability to apply learned concepts in new settings. Transfer exercises will be used in both the traditional (first generation) and redesigned (second generation) courses.

Third, the assessment design uses an existing pre-post assessment instrument to examine impact. A real strength here is the fact that the course team has historical data on this exam not only from students taking statistics recently, but also from students who have not taken it for some time. Longitudinal tracking of students to downstream courses completes the assessment design.

In an interesting contrast to other proposals, the Carnegie Mellon team does not intend to administer questionnaires to examine student attitudes and behavior. Though not stated, this is largely because most of the behavioral evidence will already have been unobtrusively collected by the students’ interactions with course material through the ITS. This is likely to provide an interesting “second-order” cost savings by reducing the potential overhead of assessment itself.

This is an exemplary assessment plan showing considerable sophistication, incorporating solid past experience, and demonstrating real recognition of the power of the technology itself to assist in the assessment effort.

The Assessment Plan

Prior Assessment Work

Carnegie Mellon has been collecting data from students taking Introduction to Statistical Reasoning. Pre- and post-tests are used to assess student learning gains within the course. Because the same tests are administered in all semesters, they can be used to compare students in the current course with students who have not taken the course for a number of years, forming a baseline about learning outcomes in the course as it is taught now. Thus, the institution can compare the learning gains of students in the newly-redesigned learning environment with the baseline measures already collected from students taking the current version of the course.

Assessing Impact

Student Learning in the Larger ITS Environment: Student learning using the ITS will be measured even before the fully redesigned course is offered. Students who have previously taken Introduction to Statistical Reasoning will be hired to solve new data analysis problems. They will receive either the ITS or a stand-alone statistics package using paper handouts from the existing course to re-train for this test. The team will gather data on students’ solution process (steps they take and amount of time to take them) and compare performance across conditions during both the learning and testing phases. In the testing phase, both groups will work in a similar, non-supportive environment to solve the same data analysis exercises designed to test their ability to transfer learning. The results of these studies will provide formative assessments of the new learning environment. The team will analyze data to test whether the new environment improves learning and transfer, and to study how students interact with the new system.
Student Learning Within the Redesigned Course: Student learning will be measured in three ways. 1) The ITS, with its ability to record how students go about solving data analysis exercises, will provide a constant stream of data on how students are learning and applying their statistical reasoning skills. 2) During certain laboratory sessions, the ITS will move into a less supportive mode (e.g., with reduced feedback, prompting questions, and other cues) during which students will be expected to solve transfer exercises designed to appear different from those previously solved. Student learning will be evaluated by analyzing their success. 3) Student learning in the redesigned environment will be measured against learning in the traditional course through standard pre- and post-tests.
Student Learning according to Student Type: Paid student subjects in the larger ITS environment will complete self-report questionnaires to determine whether different groups find the system helpful. The ITS allows tracking of students by group in the redesigned course.
Student Retention of Learning: Randomly-selected student volunteers will be tracked in downstream coursework and outside of the context of their coursework to test how well they maintain their ability to transfer statistical reasoning skills.
Student Perceptions: Students paid to help assess the larger ITS environment will be asked to complete self-report questionnaires on which aspects of the system were most useful and easy to use. Students in the redesigned course provide “ease-of-use” information simply by interacting with the ITS.
Student Continuation in the Field: The team will use data on student selection of additional statistics or statistics-related courses to compare students in the redesigned and current course.

Penn State University

Summary by Peter Ewell

The project at Penn State centers on the redesign of Statistics, a course that not only enrolls large numbers of students but also serves as a prerequisite for a variety of client disciplines at the university. The opportunity for a substantial "multiplier effect" is also present for this course because it is implemented systemwide across twenty Penn State regional campuses. The emphasis of the redesign is placed on greater modularization of course components (delivered through technology), greater use of quizzes and quick feedback, and a lab emphasis. This project plan shows unusual sensitivity to key issues of implementation and delivery, elements that will make or break a change effort. The assessment plan indicates that the campus has already had substantial experience in developing assessment approaches and tools to examine the effects of course redesign. A major resource here is the Schreyer Institute, which is providing much of the support needed to develop and institute assessment. Indeed, Schreyer has already worked with the statistics team implementing the course to develop assessment approaches.

To assess student learning, a standard test of content mastery developed by the faculty will be used. The test proposed is new for the redesign, so direct comparisons of results for students taught traditionally cannot be made because the last traditional sections were offered last year. To address this weakness, the course team will use quiz questions that can be so compared.

More significantly, the assessment plan anticipates going "downstream" to examine the performance of students in later classes that require knowledge of statistics. Actual test performance statistics will be compiled with the cooperation of faculty in client disciplines. Focus groups will also be held with faculty from these disciplines, centering on the kinds of strengths and weaknesses in statistics exhibited by students trained in the old and new formats. Students themselves will be surveyed after they leave Statistics and enroll in classes that require statistics skills and concepts to determine their perceptions of relevance and mastery of the material.

The assessment plan also shows unusual attention to issues of implementation. The proposed survey of "time on task" for the faculty and TAs is a particularly good idea, as is the use of Schreyer-designed "Student Quality Teams" in such courses. Both are candidates for best practices that could be showcased for others. Also exemplary is the proposal for engaging in actual classroom observation.

Overall this is an exemplary proposal from the assessment standpoint. It shows unusual attention to the real effectiveness questions implied by a prerequisite course (later class performance) while also demonstrating unusual thoroughness on implementation issues at the micro level. Although a "before-after" innovation design with no parallel control group is inherently weak, the testing process at least has some comparative quiz benchmarks that can be used. The design team correctly resisted the temptation to offer traditional sections "just for assessment," an impractical idea if they are convinced of the efficacy of what they are doing.

The Assessment Plan

Prior Assessment Work

The statistics department has already collaborated with Penn State’s Schreyer Institute for Innovation in Learning on an assessment plan for their current courses. The future assessment program will use existing and new assessment methods to focus on four major areas: student learning, student attitudes, cost effectiveness, and program implementation.

Additionally, Student Quality Teams (students who monitor the work processes of instructors and other students to find opportunities that lead to better learning) and measures of student learning and attitudes have already been designed and implemented in the assessment of Statistics. These will continue to be a part of the assessment plan for the redesigned course.

Assessing Impact

Student Learning: A standardized evaluation will measure student mastery of concepts. Student performance on tests, quizzes, and projects from previous semesters will be compared with a sample of similar learning artifacts from the redesigned course: pre-, post-, and follow-up tests (in subsequent courses) of statistical concepts.
Retention and Success Rates: Retention rates will be assessed by reviewing the percentage of students who performed poorly (D and F grades) and the percentage who need to re-enroll in Statistics. For example, 15% of the students who currently take Statistics need to enroll more than once. Comparisons will be made between students in the redesigned course and students in the traditional course offered in the past. In order to assess student success, the team will use surveys to gather student reports of behavioral changes (e.g., time spent on reading out of class, enrollment in additional statistics courses).
Faculty Perceptions: Focus groups will be used to identify faculty perceptions of how well students are performing. Faculty from other departments will be asked to evaluate how well prepared students from the redesigned course are in subsequent courses requiring statistics knowledge.
Student Perceptions: Through surveys, students will be asked how well prepared they were for subsequent courses that had Statistics as a prerequisite.

Assessing Implementation

Ongoing, Informal Assessment: Classroom observations will be used as an ongoing means of assessing project implementation.
Questionnaires: The team will administer a survey to ascertain the level and quality of faculty support, training, and resources received. Student Quality Teams will gather student surveys of classroom processes.

Tallahassee Community College

Summary by Peter Ewell

The course to be re-designed is College Composition, taken by approximately 3000 students each year and not completed by over 40% of them. The course is a key introductory course for further progress. At the same time, the skills it teaches are embedded in a required statewide proficiency test (CLAST), which all students must pass in order to move on. The re-design itself rests on a number of components including a new diagnostic testing system that will assign students to needed remediation modules “just in time,” enhanced tutorial assistance conducted on-line, automating most content transmission to allow instructors to concentrate on coaching in the writing process, and standardizing adjunct faculty training. Because the institution is a community college, a principal problem to be overcome is the enormous diversity among students in terms of demographics and preparation, and many aspects of the proposal appear designed to fit this context.

The assessment plan is extremely thoughtful and thorough. Its use of well-practiced common grading rubrics for written work is unusual and represents perhaps the best example of an established approach to authentic assessment that I’ve seen in these proposals over the years. It is tied directly to the grading process and is used in both College Composition and to evaluate writing in later English and humanities courses. The grading rubric will be applied to four standard writing assignment prompts administered in parallel in simultaneously-offered re-designed and traditional course sections. The course also uses a common final exam. In setting up this “quasi-experiment,” the course team has paid unusual attention to potential confounds by, among other things, eliminating data from traditional sections taught by faculty involved in the re-design process in order to control for any “halo” effect. CLAST—the common statewide achievement test in writing—provides another strong assessment feature as it allows external verification of performance. A well-described set of student surveys addressing student attitudes (like self-confidence) is to be administered before and after students take the course. Finally, students will be tracked into subsequent writing courses and their performance evaluated using the common grading rubric, and will be subsequently tracked to look at CLAST performance. A solid array of implementation measures including focus groups, monitoring faculty time on task, and looking at course assignments to verify the claim of growing standardization completes the design.

From an assessment standpoint, the Tallahassee Community College proposal is exemplary. It shows unusual sophistication about the nature and potential of assessment, is based on considerable past practice, and addresses a “non-science” topic that is not easy to assess.

The Assessment Plan

Prior Assessment Work

A variety of assessment tools are already in place. The department utilizes common grading criteria that address topic and purpose, organization and coherence, development, style, and grammar and mechanics. Specific descriptions within each of the areas are provided to distinguish between grades of A, B, C, D, and F, and faculty are trained in the interpretation of the criteria. The criteria were established collectively and are applied across all sections of College Composition. Although there are additional competencies for second-level English courses, the basic criteria established for distinguishing between letter grades in College Composition are also applied in the second-level courses. The existing exit standards are currently under review, and this process will include examining exit competencies at the state and national levels. Portfolios also are used to assess the progress of a student’s work and to determine the end of term course grade.

At the beginning of the semester, the student is given an English Language Skills test that covers sentence structure, grammar, punctuation, and editing. A post-test covering the same skills is given at the end of the semester. College Composition is also responsible for assessing the student’s skill level and reinforcing all of the College Level Academic Skills, including reading skills, English language skills, and writing skills. The list of individual learning outcomes is too extensive to relate in detail but includes 12 reading outcomes, 16 English language outcomes, and 24 writing outcomes. A pre-test and a post-test address these outcomes, and, within the context of the class, students complete four impromptu essays, including the departmental exit exam, that are facsimiles of the essay portion of the state College Level Academic Skills Test (CLAST). The four in-class impromptu essays are graded by using a 6-point scale that emerged from the state when the College Level Academic Skills Test was being developed. The reliability measure for this grading scale has been established at 0.92. Additionally, each paper is read by at least two readers. CLAST performance data are collected and analyzed at an institutional level and at the state level.

The Institutional Research Department maintains statistics on DWF rates and course completion data in all courses, as well as a wide range of demographic data. Students also evaluate courses on a regular basis, and these data are analyzed across sections to look for trends and patterns. Programs are already in place to track cohorts of students, and the framework exists for making valid comparisons between the performance and retention data of students enrolled in traditional sections compared to students enrolled in redesigned sections. This will not only be important for the pilot, but for longitudinal data to assess transfer of learning to subsequent English courses and other disciplines.

Assessing Impact

During the Spring 2001 semester, a pilot study will be conducted, with each of the English faculty who participated in the redesign teaching both a redesigned and a traditional section. One other full-time faculty member and two adjunct faculty, not involved in the redesign, also will teach both a redesigned and a traditional section.

Teacher variation is always a problem in this type of study. Teaching styles, personality traits, attitudes, and experience levels frequently have an impact on outcomes. Using the same faculty to teach both traditional and redesigned sections will allow for some comparisons that control for these variations in teacher characteristics. In order to control for a halo effect, data for traditional courses taught by redesign faculty will be compared to data for traditional courses taught by faculty not involved in the redesign process. Using other full-time and adjunct faculty who were not involved in the redesign will allow for comparisons between “expert” and “novice” practitioners with regard to understanding the purpose and focus of the redesign. Quantitative and qualitative data generated by full-time and adjunct faculty will be used to help refine and revise both the course and the orientation to the course.

Experimental and traditional sections will be selected so that morning, afternoon, and evening classes are included. The sections will be analyzed to determine equivalence with regard to student demographics, including age, gender, ethnicity, GPA, and placement score. Student data from the pre-test and post-test of grammar, mechanics, and reading comprehension will be used to assess the effectiveness of the interactive tutorials. Students in both traditional and redesigned sections will be given the same tests, although the traditional sections will use a paper and pencil version rather than a Web-based version. Of particular interest will be the degree of improvement. Therefore, the score on the pre-test will be used as a covariate in analyzing the post-test scores.
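
The covariate adjustment described above can be illustrated with a small sketch. It assumes a simple analysis of covariance in which post-test means are adjusted using the pooled within-group slope of post-test on pre-test scores. The plan does not specify the exact model, and the function names, section labels, and scores below are invented for illustration only.

```python
from statistics import mean


def pooled_within_slope(groups):
    """Slope of post-test on pre-test, pooled across sections.

    groups maps a section label to a list of (pre, post) score pairs.
    """
    sxy = 0.0
    sxx = 0.0
    for pairs in groups.values():
        pre_mean = mean(pre for pre, _ in pairs)
        post_mean = mean(post for _, post in pairs)
        sxy += sum((pre - pre_mean) * (post - post_mean) for pre, post in pairs)
        sxx += sum((pre - pre_mean) ** 2 for pre, _ in pairs)
    return sxy / sxx


def adjusted_means(groups):
    """Post-test means adjusted to a common (grand mean) pre-test score."""
    slope = pooled_within_slope(groups)
    grand_pre = mean(pre for pairs in groups.values() for pre, _ in pairs)
    adjusted = {}
    for label, pairs in groups.items():
        pre_mean = mean(pre for pre, _ in pairs)
        post_mean = mean(post for _, post in pairs)
        adjusted[label] = post_mean - slope * (pre_mean - grand_pre)
    return adjusted


# Hypothetical scores: the redesigned section starts with higher pre-test scores.
scores = {
    "traditional": [(10, 20), (20, 30), (30, 40)],
    "redesigned": [(20, 35), (30, 45), (40, 55)],
}
print(adjusted_means(scores))
```

In this made-up data the raw post-test means differ by 15 points, but most of that gap reflects the higher starting point of the redesigned section. After adjusting for the pre-test, the difference shrinks to 5 points, which is the kind of distinction the covariate is meant to expose.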

As stated earlier, part of the problem associated with the College Composition course is the highly diverse nature of the students and the difficulties encountered in addressing individual needs. With that in mind, performance and improvement data will be analyzed to examine effects of instruction on students by age, gender, ethnicity, the time of day, GPA, original placement, and whether students are ESL students or Learning Disabled students. A recent analysis of student success in College Composition shows that students who placed directly into College Composition have a success rate of 65%, while those who first completed a developmental course in English have a success rate of only 45%. For this reason, particular attention will be given to outcomes related to students who have completed a developmental course, as well as other groups who have traditionally had high failure rates, such as ESL, LD, and minority students.

With regard to written assignments, students complete four in-class impromptu writing assignments, including the final exam, and four outside writing assignments. A standard set of topics will be established for the traditional and redesigned sections. A standardized method of evaluating the impromptu essays as well as a standardized method of arriving at letter grades for the outside writing assignments has already been established and will be used in grading each assignment. Any changes in standards or exit competencies that emerge from the current review will be applied to both traditional and redesigned sections. Comparisons will be made following each assignment, and data will be analyzed to obtain improvement data as well as performance data for each of the subgroups detailed above.

In addition to performance data generated through technology-based instructional programs and the well-defined grading criteria applied to written work during the course and on the departmental final exam, attitudinal, self-evaluation, and confidence data will be collected at the beginning and end of the course. At the beginning of the semester, students will be given a survey that asks questions about how well prepared they feel, their expectations with regard to their success, their motivation to improve their writing, their confidence in their ability to write, and the amount of time they expect to spend in course-related activities. A similar survey that addresses how well prepared students feel for the next course, their expectations for success as well as attitudinal, confidence, and time-on-task data will be given at the end of the semester. Surveys will be anonymous, but students will provide demographic data on the survey so that data can be analyzed by group as well as overall.

Faculty satisfaction and observation data will be collected through journals, surveys, and focus groups, and for subsequent semesters, faculty satisfaction and observation data in other English courses and other disciplines where written work is required also will be collected and analyzed.

Students completing College Composition must choose one of the second-level English courses and, upon successful completion, choose two from among a variety of Humanities courses. Both the second-level English courses and the Humanities courses require extensive written work. Students in the initial experimental groups will be tracked through these courses to gather longitudinal data that examine performance and retention. These data will be compared to performance and retention data emerging from traditional courses. Once the redesigned course reaches full implementation, all students will be tracked through these courses and data compared to existing performance and retention data. Additional longitudinal data will be gathered for the initial group and subsequent groups with regard to the College Level Academic Skills requirement. Data will be gathered and examined to see if there is an increase in the number of students who are exempted from the College Level Academic Skills Test (CLAST) based on a C+ average in their two English courses. For those who must take the exam, data will be analyzed to see if scores increase and fewer students must repeat the exam.

Assessing Implementation

In terms of assessing the implementation of the project, several evaluation methods will be used. Faculty, staff, students, and technical support personnel will be surveyed to assess the effectiveness of materials, resources, technical support for faculty and students, and inter-group communications. Key project personnel will make weekly journal entries that will document implementation successes, problems, and unanticipated events and consequences for each aspect of the redesigned course, providing aggregate data on instructional and pedagogical issues, resource issues, personnel issues, and technical issues. Of particular importance will be faculty records indicating time-on-task for instructional activities. Given that a major intended outcome of the redesign is reduced time spent in preparation, diagnostic activities, and grading, it will be important to document the actual time spent performing these functions. Similarly, the Writing Center staff will document the activities in the Center.

Another intended outcome is greater standardization and consistency of assignments. This outcome will be validated through the collection of assignments and by analyzing them in terms of the requirements for each.

Focus groups made up of faculty members, Writing Center staff, technical support staff, and students will discuss the process in terms of “what worked, what didn’t, and what needs to be changed.” This assessment will take place during and immediately following the pilot so that revisions can be made based on evaluative data. This process will be repeated during the full implementation phase in the Fall 2002 and Spring 2003 semesters. A final piece of the evaluation will be to assess the actual cost savings in relation to the estimated cost savings.

University of Central Florida

Summary by Peter Ewell

The redesign centers on an introductory American Government course which is taken by about three-quarters of entering freshmen to help fulfill general education requirements. The focus for the redesign involves the replacement of standard lecture sessions with Web-based modules that students access on their own. A set of relatively detailed learning outcomes has been identified for the course, and these can be used as guides for assessment. But the new course also involves the provision of new competencies, not established for the traditionally configured course. This will complicate the assessment problem somewhat, as students learning under the old and new formats will not only be learning in different ways, but also will be learning different things. The assessment plan acknowledges this problem and discusses some alternative ways to help deal with it.

The assessment plan is organized around the evaluation of outcomes and of student attitudes—as well as including a well-designed process evaluation component to examine implementation issues. With regard to outcomes, three standard assessment approaches will be used. The first is a straightforward comparison of completion rates (% C or better) between the old and new formats. Some sensitivity in analysis is shown even in this straightforward measure, as the discussion indicates that individual grade ranges will be examined to see if the formats have particular differential success rates for students at different points on the performance distribution. The second consists of an examination of attrition rates, both in the course and after completion. The third and most important is a set of common examinations designed especially to meet the learning goals identified for the course.

With regard to attitudes, the primary method will be surveys of students. The descriptions of what the surveys will cover are quite detailed and well thought through. In addition, unusual attention is paid to learning style—a measurement initiative that the university has been engaged in for a number of years.

A good deal of excellent attention is given to exactly how the resulting assessment data will be analyzed, addressing important issues of controlling for student background characteristics and other potentially biasing factors. Finally, the plan addresses a range of implementation issues through faculty focus groups and interviews.

The Assessment Plan

Prior Assessment Work

The University of Central Florida is conducting a comprehensive evaluation of its distributed learning initiative that concentrates on and affective outcomes for both students and faculty. The project is composed of summative and formative components. Evaluation of the American National Government course redesign will be based on components that have been in place for the past three years.

Assessing Impact

Student Learning: Traditional and redesigned sections will share a common core of information and analytic themes. Information and themes will be tested using the same or very similar exam instruments. The most important student outcome, substantive knowledge of American Government, will be measured in both redesigned and traditional courses. To assess learning and retention, students will take three tests: a pre-test during the first week of the term, a post-test at the end of the term and, for as many students as can be contacted, a second post-test at the end of the following term. The Political Science faculty, working with the evaluation team, will design and validate content-specific examinations that are common across traditional and redesigned courses. The instruments will cover a range of behaviors from recall of knowledge to higher-order thinking skills. The examinations will be content-validated through the curriculum design and course objectives. Course objectives will be organized according to the Structure of the Observed Learning Outcome (SOLO) taxonomy that is the basis of much critical thinking work at the University.

The effect of demographics on student learning will be monitored. Students tend to self-select traditional or media-enhanced courses for a number of reasons, none of which can be controlled, so that monitoring the "effects" of demographics on the outcomes of the evaluation is necessary. UCF is continually assessing demographic trends in its online and media-enhanced courses with respect to gender, ethnicity, and age. Other variables, such as ability and prior achievement, are co-varied.

The team will analyze data on student learning by using many techniques. For example, academic success will be regressed on section format, demographics, learning style, prior ability, and achievement using the logistic model. Differences in course-specific achievement will be determined with exact probabilities, and differential level performance assessed with segmentation analysis. Similar models will be used to assess the relationship of success and withdrawal to redesigned courses. Strict attention will be paid to nesting factors such as instructor, gender, and ethnicity. The data analysis will be based on data mining techniques.
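
As a rough illustration of the logistic model mentioned above, the sketch below regresses a success indicator on section format and a single prior-ability measure. It is only a sketch under stated assumptions: the plan names no software or fitting method, so plain gradient ascent on the log-likelihood is used here for self-containment, and the variable names and data are hypothetical. A real analysis would use a statistics package and include the other predictors the plan lists.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(rows, outcomes, learning_rate=0.1, steps=5000):
    """Fit logistic regression by gradient ascent on the mean log-likelihood.

    rows: list of feature lists; outcomes: list of 0/1 values.
    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    n_features = len(rows[0])
    weights = [0.0] * (n_features + 1)  # weights[0] is the intercept
    n = len(rows)
    for _ in range(steps):
        gradient = [0.0] * (n_features + 1)
        for features, y in zip(rows, outcomes):
            z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
            error = y - sigmoid(z)
            gradient[0] += error
            for j, x in enumerate(features):
                gradient[j + 1] += error * x
        weights = [w + learning_rate * g / n for w, g in zip(weights, gradient)]
    return weights


# Hypothetical data: [section_format (1 = redesigned), centered prior GPA].
rows = [
    [0, -1], [0, -1], [0, 0], [0, 0], [0, 1], [0, 1],
    [1, -1], [1, -1], [1, 0], [1, 0], [1, 1], [1, 1],
]
success = [0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 1, 1]  # 1 = grade of C or better

intercept, format_coef, gpa_coef = fit_logistic(rows, success)
print("odds ratio for redesigned format:", math.exp(format_coef))
```

An odds ratio above 1 for the format coefficient would indicate higher odds of success in the redesigned sections after holding prior ability constant, which is the comparison the regression is intended to support.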

Retention and Success Rates: The team will measure grades (the percentage of As, Bs, Cs, etc.) and the overall percentage of Cs or higher for students in the redesigned and traditional courses. It may well be that the redesigned course affects marginally engaged (D-range) students more than motivated (A- and B-range) students who are likely to succeed in any setting. The team will also directly compare student attrition/drop-out rates across redesigned and traditional sections.
Student Perceptions: Student attitudes will be assessed using conventional survey instruments. Several survey questions will be aimed at communication of ideas and information, perceptions of student-student and student-information interaction, learning styles, and overall student satisfaction. Student-student and student-information interaction also will be assessed using data from the course management system.

Assessing Implementation

Ongoing, Informal Assessment: The process by which faculty convene around and accomplish curriculum redesign and identify objectives will be captured through observation and discussion.
Questionnaires: Student questionnaires will focus on implementation issues in addition to learning acquisition.
Focus Groups: Faculty will be asked to declare their personal theories of teaching and then, through a series of focus groups, identify shifts in those theories that they attribute to redesign. Teachers will describe changes in their roles and shifts in their expectations for students.

University of Colorado-Boulder

Summary by Peter Ewell

The University of Colorado-Boulder intends to redesign Introductory Astronomy, a course taken by over 1,100 students per term to meet basic science distribution requirements. Because of the nature of the discipline, astronomy is a topic unusually suited to the use of Web-based images and simulations. The redesign itself concentrates on technology replacing traditional lecture time through the use of Web-based modules, "banquet-style" lecture hall formats, and the use of undergraduate "coaches" to work with small (nine-person) learning teams. This project plan is extremely thoughtful regarding the need to fuse technology-based redesign with other changes in pedagogies such as team-based approaches, "coaching" functions, and so on.

The assessment plan is thorough and doable, with appropriate attention to implementation and attitudinal issues as well as learning outcomes. It proposes to employ an external consultant from a university evaluation research center to aid with the evaluation and to serve as an integral part of the course team. This is excellent, so long as the resulting attitude is one of full team membership and not of "farming out" assessment so that core faculty do not have to deal with it. Nothing in this plan suggests that this might be the case; in fact, the proposal suggests quite the opposite.

The primary technique to be used in assessing content is common-item testing for comparing learning outcomes in the redesigned and traditional formats. The team also intends to administer an instrument designed to examine student attitudes toward science—a good idea for any introductory science course that will be taken primarily by non-majors.

The real innovation in this plan is the team's intent to conduct in-depth focus groups to look at a number of learner-pedagogical mode interactions. Over 100 such groups are planned, presumably under the direction of the Evaluation Center, to look at such matters as implementation issues, gender and race/ethnicity effects, as well as the actual learning effects of the proposed innovation. If successfully implemented, this could serve as an exemplary practice.

The Assessment Plan

Prior Assessment Work

The University’s Assessment and Evaluation Group, consisting of full-time professionals and graduate students, works with faculty and students to conduct summative and formative evaluations of project effectiveness. Its overall goal is not only to assess and improve the effectiveness of teaching methods and activities but also to instill a culture of assessment at the University of Colorado-Boulder.

Many of the Web-based resources that will support course redesign have already been tested in both traditional large lecture sections (~216) and small classes (~30) to determine the effectiveness of those resources as well as other technology-based procedures (e.g., electronic discussions and submission of homework).

Assessing Impact

Student Learning: To design and carry out the assessment program, the astronomy department will work with Dr. Elaine Seymour and her team (Ethnography & Evaluation Research) at the Bureau of Sociological Research of the University of Colorado, who have considerable experience in assessing undergraduate science education. Student learning will be assessed through testing and comparing test outcomes for traditional and redesigned sections. Participating faculty will work collaboratively to design a set of student assessments, monitor the quality of student work achieved by the revised testing strategies, adjust their testing strategies in light of the results, and keep a collective record of student scores on key items.
Comparison will occur in two ways. Traditional and redesigned sections will use many of the same exam questions. Also, some instructors will switch from the traditional to the redesigned format during the assessment period, allowing comparison of the formats’ effectiveness with the same instructor. When making these comparisons, the team will ensure that the common exam questions for comparison reflect learning goals that are common to both formats. Also, as part of the assessment procedures, they will attempt to measure the ability of each student during the first week of class and assign students to learning teams so that each team has a comparable mix of strong, average, and weaker students.
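
The team-balancing step described above can be sketched as a serpentine ("snake") draft over first-week ability scores. The plan does not say how teams will actually be formed, so the method, the function name `assign_teams`, and the scores are illustrative assumptions. The default team size of nine follows the nine-person learning teams mentioned in the summary.

```python
def assign_teams(scores, team_size=9):
    """Deal students into teams so each gets a similar mix of ability.

    scores: dict mapping a student identifier to a first-week score.
    Returns a list of teams, each a list of student identifiers.
    """
    # Rank from strongest to weakest on the first-week measure.
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_teams = max(1, -(-len(ranked) // team_size))  # ceiling division
    teams = [[] for _ in range(n_teams)]
    for i, student in enumerate(ranked):
        round_number, position = divmod(i, n_teams)
        # Reverse direction every round so no team always picks first.
        if round_number % 2 == 0:
            index = position
        else:
            index = n_teams - 1 - position
        teams[index].append(student)
    return teams


# Hypothetical scores for 18 students.
example_scores = {f"s{i}": 18 - i for i in range(18)}
for team in assign_teams(example_scores):
    print(team, sum(example_scores[s] for s in team))
```

Reversing the dealing order on alternate rounds keeps team totals close, which a simple round-robin deal would not do.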

Retention and Success Rates: The team will monitor class attendance and completion throughout this initiative. Incoming students will be surveyed to determine their declared or intended majors. Students will also be surveyed at the end of the semester to check for changes in career intentions and causes of those changes. The impact of the course redesign on class attendance and career plans will be explored in focus groups.
Student Perceptions: Questionnaires will be administered at the beginning and end of the course, and in-depth focus group interviews will be conducted. Students in the astronomy course will be grouped according to their majors and some will be grouped according to gender and ethnicity. They will be asked about the following issues: 1) perceptions of the main learning goals of the course, 2) perceptions of their learning gains, 3) the degree to which they have benefited from the classroom pedagogy, 4) comparison with other forms of science teaching they have encountered, and 5) attitudes toward science and intended career paths.

Assessing Implementation

Ongoing, Informal Assessment: The team will provide formative feedback to the project participants via analysis of the interview data. Results will be shared as they emerge to allow faculty to make appropriate adjustments to their classroom strategies. GTAs will engage in regular discussion and decision-making concerning pedagogy, assessment strategies, and any other issues which they may raise.
Questionnaires: Questionnaires will be administered to students at the start and end of the term. Once interview responses are coded and patterns of codes develop, close-ended survey questions will be created to supplement routine course evaluations of future classes.
Focus Groups: UGTAs and GTAs will participate in focus group interviews about the effects of the experience on their career aspirations as science teachers.

University of Dayton

Summary by Peter Ewell

The focus of the redesign is on an Introductory Psychology course currently taken by about half of the university’s students. The redesign itself is fairly radical and involves converting dissimilar faculty-delivered lecture sections into a distributed model that relies on online materials and common examinations. This is a considerable change in culture for faculty, and the agreement on common assessments in a field like psychology might be particularly interesting to watch.

As might be expected from a psychology department, the assessment plan incorporates a true experiment to assess impact. The experiment involves random assignment of students to “experimental” (redesign) and “control” (traditional) groups operating in parallel during the pilot phase of implementation. A range of instruments including objective tests, attitudinal and learning style instruments, and subsequent tracking will be applied as outcome measures, and the results will be compared. This is a very high-end strategy and there might be doubts about the ability of the course team to pull it off. But the narrative indicates that the team members have thought about most of the potential problems and threats to validity. At the same time, they seem to have had a successful experience in implementing a similar design for a previous, admittedly smaller, course. All told, this appears to be an exemplary assessment proposal.

The Assessment Plan

Prior Assessment Work

There is an overall assessment culture at the University of Dayton that lays the groundwork for assessing the impact of the redesigned course. In 1995, the University implemented a campus-wide learning assessment plan, based on its mission, that provides both administrative support to faculty and procedures for measuring student outcomes, securing resources to remedy deficiencies, and providing feedback to faculty and students. A redesigned course in advanced psychology has already been assessed, providing a research strategy to assess the outcomes of the newly redesigned Introductory Psychology course.

Assessing Impact

Student Learning: During the pilot phase, students will be randomly assigned to either the traditional course or the technology-enhanced course. Student learning will be assessed mostly through examination. Four objectively scored exams will be developed and used commonly in both the traditional and technology-enhanced sections of the course. The exams will assess both knowledge of content and critical thinking skills to determine how well students meet the six general learning objectives of the course. Students will take one site-based final exam as well. Student performance on each learning outcome measure will be compared to determine whether students in the enhanced course are performing differently than students in the traditional course.
Student Learning according to Student Type: The team will measure learning style using Kolb’s Learning Style Inventory. They will assess personality traits using the NEO-Five Factor Inventory, Form S. Both instruments have good reliability and validity and will help us better understand the impact of the redesigned course on different types of students.
Student Perceptions: Student attitudes will be measured by adding questions (about time, difficulty, interest, learning environment, etc.) to an established course evaluation form administered at the end of each semester.
Student Continuation in the Field: A baseline measure will be obtained in fall, 2000, to determine the proportion of students in the traditional Introductory Psychology course who then switch their major either into or out of psychology. Similar measures will be obtained during the pilot and full implementation of the technology-enhanced course.

University of Massachusetts – Amherst

Summary by Peter Ewell

The proposal is to redesign an Introductory Biology class, currently a technology-enhanced version of a traditional course that uses ClassTalk to support classroom interaction. The changes proposed in the redesign include online quizzing, class preparation pages, enhanced interaction, and supplemental instruction. The incremental aspect of the proposal renders it especially interesting from an assessment standpoint, as the campus proposes to compare outcomes across all three formats: traditional, ClassTalk, and redesigned course.

The assessment plan itself involves direct comparisons of performance on common examinations and quizzes among the three alternatives, together with an examination of pass rates and grades. A particular strength is the attention paid to impacts on particular sub-populations—for example, students experiencing lower success rates in traditional course formats according to historical records and students drawn from identifiable demographic groups. The proposal is especially sophisticated in its discussion of these interactions and how to control for them analytically and statistically. In addition to direct measures, a wide range of perceptual and implementation data will be collected, including videotaping of learning focus groups and interviews with students by trained facilitators from the Center for Teaching. Equally impressive is the way the campus intends to continuously monitor online usage of particular materials and resources and include this in the analysis.

This is an exemplary effort with respect to assessment. The plan is complex, but it is succinctly and clearly presented in a manner that demonstrates a thoughtful and realistic analysis of the problems involved. The three-way comparison is especially attractive here in conjunction with the monitoring of student usage of learning resources, as it should allow the team to begin to identify the effects of not only the total redesign but of its various components. Unlike many assessment designs, this should enable the team to determine not only whether the redesign is working for learning, but also why and for whom.

The Assessment Plan

Prior Assessment Work

The University of Massachusetts–Amherst has a culture of course-based assessment, designed and implemented with significant faculty input. Assessment is supported collaboratively with resources, expertise, and data from the University’s Office of Institutional Research, Office of Planning and Assessment, Center for Teaching, and Registrar. The course redesign team also has extensive assessment experience with Introductory Biology and has already used the ClassTalk technology to assess student learning. Additionally, independently trained facilitators will use the Center for Teaching’s midterm assessment that has been used for several years.

Assessing Impact

Student Learning in Class: Student learning will be assessed by reviewing 1) test outcomes and 2) in-class behavior. The traditional, ClassTalk, and redesigned sections of the course will use the same textbook assignments and will pursue the same biology department learning goals. Quizzes, hour exams, and lab assignments will test student knowledge of the same material, and the final exam will include common multiple choice questions for all course sections. The team will track the proportion of students who receive a C or better to see if student success rates improve when ClassTalk and online quizzes are included in the curriculum. Sections will be compared for performance on specific learning outcomes as well as overall grade performance. Student work groups will be videotaped in all three class environments. Student behaviors will be coded to yield a time-use analysis that should allow us to quantitatively compare them in the different environments. In addition, the team will gather qualitative data on the nature of student discussions during problem-solving.
Student Learning Out of Class: The team will track individual student use of online course preparation pages to show correlation between use and academic performance.
Student Learning according to Student Type: All student learning and perceptions will be related to student type. Data on age, gender, and academic ability (SAT scores, high school rank) will be used as independent variables in a multivariate model that explains student academic performance in the course. Grade outcomes will be reviewed to see if students with traditionally lower success rates have better success when innovation is introduced. Out-of-class preparation will be analyzed by different student populations. Measures of other independent variables will be obtained from student surveys, including study time, learning style, and participation in formal or informal learning communities.
Student Retention of Learning: The team will track student performance in future biology courses to see if students do better in them after using ClassTalk and online quizzes in the introductory course.
Student Perceptions: Student perceptions will be assessed through questionnaires given at the beginning and end of the semester. Students will provide information about their perceptions of the science profession and how scientists work, their own study and learning practices, their career goals and, on the end-of-course questionnaire, their learning experiences with the course. Additionally, a facilitator will interview three focus groups—five students from each section—biweekly. Focus groups will be videotaped for a video documentary. Students will also offer their perceptions of course design and teaching methodologies as part of each course’s midterm assessment process.
Faculty Perceptions: Faculty participating in the redesign will be interviewed prior to the start of the course to gather information on their expectations. A facilitator will interview faculty biweekly to gather their perceptions of the teaching and learning experiences. Interviews will be videotaped for the documentary video.

Assessing Implementation

Ongoing, Informal Assessment: The course redesign team will meet regularly to discuss implementation.
Time Accounting: The team will track faculty, staff, and teaching assistant time spent on course-related activities using categories from the course planning tool. These data will be compared with the estimates to determine the actual cost savings of the redesign. Unexpected increases in time spent on specific activities in the pilot year will be addressed by adjustments to the redesign plan.
Interviews: Faculty participating in the redesign will be interviewed before the start of the course to gather information on their expectations. A facilitator will interview faculty biweekly, and interviews will be videotaped to maintain an accurate record of the events in the redesign process.
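
The time-accounting comparison described above amounts to lining up estimated and actual hours by activity category. The sketch below shows one way to do that. The category names and hour figures are hypothetical and are not taken from the course planning tool.

```python
def compare_time(estimated, actual):
    """Return (category, estimated, actual, difference) rows.

    A positive difference means more time was spent than estimated.
    Categories missing from either side are treated as zero hours.
    """
    rows = []
    for category in sorted(set(estimated) | set(actual)):
        est = estimated.get(category, 0.0)
        act = actual.get(category, 0.0)
        rows.append((category, est, act, act - est))
    return rows


def overruns(rows):
    """Categories where actual time exceeded the estimate."""
    return [category for category, _, _, diff in rows if diff > 0]


# Hypothetical hours per term for the pilot year.
estimated_hours = {"grading": 100, "lectures": 50}
actual_hours = {"grading": 80, "lectures": 60, "tech support": 5}

rows = compare_time(estimated_hours, actual_hours)
for category, est, act, diff in rows:
    print(f"{category:15s} est={est:6.1f} actual={act:6.1f} diff={diff:+6.1f}")
print("over estimate:", overruns(rows))
```

Categories flagged by `overruns` are the ones the plan says would prompt adjustments to the redesign.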

University of Wisconsin-Madison

Summary by Peter Ewell

The University of Wisconsin-Madison proposal centers on the redesign of a general Chemistry sequence taken by 2,400 students each semester. The main problems occasioning the redesign are high rates of non-completion and students’ lack of retention of concepts in subsequent coursework for which the general chemistry sequence is a prerequisite. The redesign itself concentrates on substituting Web-based modules for traditional lecture time. A good deal of preliminary work has already been accomplished with assessment, as course objectives have been developed and used and good national assessment guidance obtained through prior work with the National Science Foundation and The American Chemical Society. A particularly attractive feature of the design is the use of on-line assessments through which students can actively "test out" of modules which they have already mastered.

The assessment plan itself is divided into essentially two components—impact and implementation. The course team is working with a professional evaluator in both aspects, an individual with substantial experience in looking at science education. As in other cases in which this is done, care must be exercised to ensure that assessment activities are not totally "farmed out" so that faculty don’t have to deal with them. But there is little evidence that this will be the case from the narrative presented; indeed, the narrative reflects substantial sophistication and experience with assessment concepts.

Working with the evaluator, the team proposes to develop a tailored "experimental design" for each innovation to be tested—a particularly thorough design approach, which will allow a much more precise assignment of cause and effect. Special examination items are the principal method of assessment. Many of these items have been previously developed through a prior project using American Chemical Society content examinations that have been proven nationally. This is an especially powerful feature of the design. To adjust for the fact that content as well as pedagogy has changed, the team intends to use paired examination items—some of which are tailored to the older approach, some to the newer. This is a particularly useful approach that might be cited as a "best practice" for other programs facing the same difficulty.

In addition, the UW-Madison team will look at standard course completion statistics and, more importantly, will follow completers of the course into later coursework to examine performance. Subsequent tracking will involve cooperation with a campus center that has developed a longitudinal student tracking capability. The implementation component of the plan is equally well developed and centers on interviews and questionnaire surveys. Again, this is a technique with which the UW-Madison team already has experience through a prior project.

All told, the UW-Madison proposal is exemplary from an assessment standpoint. Particularly noteworthy are the use of well-proven and appropriate national examinations, careful attention to changes in content between new and old formats, and the kinds of cooperative arrangements forged with evaluation and other support units on campus.

The Assessment Plan

Prior Assessment Work

During a previous project, the American Chemical Society Division of Chemical Education Examinations Institute developed special content examinations for each semester of General Chemistry at the request of the UW-Madison chemistry department. These special examinations have been used for several years in the General Chemistry sequence at UW-Madison and therefore provide a baseline of student performance that can be used to measure the effect of the course redesign on performance. For each topic included in the examination, there are two types of questions: one designed to test conceptual understanding of the topic and a paired question in the format traditionally used in general chemistry courses.

As part of this previous project, UW-Madison collected data on student retention during each semester of General Chemistry. The department also collaborated with the LEAD (Learning through Evaluation, Adaptation, and Dissemination) Center on the UW-Madison campus to obtain information about what students do after they have taken the General Chemistry courses. They can track students through graduation to find out how many did graduate and in what majors, and they can obtain information about students’ success in subsequent courses and their retention rates in the university.

The special exams and student data provide a baseline for comparing students in the redesigned course with those in previous, more traditional sections of General Chemistry. Even though the team will not be able to obtain complete results until several years after this project has ended, they intend to collect and report such data.

Assessing Impact

Student Learning: The team will compare traditional and redesigned sections in terms of performance on course quizzes, exams, final exam, and course grade. The course will be divided into experimental and control groups. Each TA will have an experimental and a control group to minimize both TA and selection influences. The experimental sections will complete online homework that is also graded online to provide immediate feedback on errors and direction to study resources. The control sections will complete assigned problems from the textbook. If a significant difference in overall course performance is observed between the experimental and control groups, course grades will be normalized to ensure that neither group is penalized for its assignment to a particular treatment. Each component of the redesign (such as the effect of online homework) will be evaluated as it is completed as a means of assessing its importance to the overall course redesign.
In addition to comparing outcomes, the GALT (General Assessment of Logical Thinking), ACT/SAT math scores, and UW-Madison math placement scores will be used to divide students into subsets to analyze treatment effects on students with different logical thinking and mathematical abilities. Some students will also participate in structured interviews to obtain in-depth information about their understanding of concepts that were taught by the modules.
Retention and Success Rates: The chemistry department will continue its collaboration with the LEAD Center. The team will compare retention and success rates for students in the traditional and redesigned courses. Students in both the traditional and redesigned courses will be tracked in regard to success in subsequent courses, retention in the university, majors, and graduation statistics.
Student Perceptions: All students will be surveyed to determine how their attitudes about chemistry and science changed and how their confidence in their abilities to do science changed. As a result of a previous curriculum project funded by the National Science Foundation, the team has an attitude survey instrument that has been used with General Chemistry students for several years. They will supplement this survey with questions relating to the technology-based aspects of the course, taken from a project published by the Teaching, Learning, and Technology Group of the American Association for Higher Education. The combined survey will be used in this project.

Assessing Implementation

Faculty, Staff and TA Perceptions: As modules are developed for the redesigned course, chemistry faculty, staff, TAs, and faculty from client disciplines will be asked to evaluate the content and make suggestions for improvement. Based on this input, the modules’ content will be revised and modified.
Ongoing, Informal Assessment: Faculty, academic staff, TAs, and students will provide immediate feedback (reactions, problems) during each implementation stage. One of the project personnel assigned to troubleshoot implementation during the term will have a special e-mail account for problems. Another common forum will be created online where problems and their solutions can be aired.
Focus Groups: The designated troubleshooter will attend weekly staff meetings for each course affected by the redesign to ask faculty, staff, and TAs about their reactions and experiences.
Reports: At the end of the semester, the designated troubleshooter will report on implementation issues. This report will be used to revise procedures and materials for the following semester. It will also serve to document our problems and successes.
Questionnaires: Faculty, staff, TAs and students will be surveyed at the end of each semester regarding their experiences and to obtain their formative feedback regarding needed changes to the modules. Students will be asked what they thought about the modules and which aspects of the redesign were most used and most effective in helping them learn. A previously developed attitude survey instrument will be supplemented with questions relating to the technology-based aspects of the course, developed by the Flashlight Program, part of the Teaching, Learning, and Technology Group of the American Association for Higher Education. Each semester, the results from these surveys will be analyzed to determine whether the ways students learn have changed.

Virginia Tech

Summary by Peter Ewell

The redesign effort at Virginia Tech centers on the Linear Algebra course taken by large numbers of students as a prerequisite. The focus of the redesign is to integrate the course fully into Virginia Tech’s Math Emporium, a 500-station laboratory setting with capabilities for flexible on-line delivery of content modules and assessments. Learning objectives are in place for the class, and a common final examination has been used for the course for the past five years. Both features mean that there is substantial assessment experience in place on which to build.

The assessment plan is based on a straightforward but solid combination of features. Assessment designs will be developed in cooperation with assessment experts, as the Math department has its own assessment coordinator. In addition, the team plans to use the services of the American Association for Higher Education’s Teaching, Learning, and Technology Group to consult on assessment.

The assessment design will employ a number of methods. First, a common final examination in place for five years is tied directly to learning goals on an item-by-item basis. This means that student performance patterns can be disaggregated by particular area of strength or weakness for particular types of students, providing a powerful assessment/analytical tool for determining impact. Student performance on regular quizzes will also be monitored.

Second, the Virginia Tech team plans to examine patterns of course completion and retention. These are conceptually straightforward, but the plan also notes a number of key ratios that are good summary measures based on readily available statistics that could be used in other settings. Third, explicit partnerships with "downstream" faculty in client disciplines such as Civil Engineering will allow student performance in subsequent coursework to be evaluated. Finally, questionnaires and focus groups will be used to gather data on perceptions, motivations, and satisfaction with the two formats.

The Assessment Plan

Prior Assessment Work

The Math Department employs an assessment coordinator who works with the department head and a faculty committee on assessment. Virginia Tech also has a contractual relationship with the Teaching, Learning, and Technology (TLT) Group, an affiliate of AAHE, through which consulting time on assessment will be obtained for periodic reviews. Baseline data for assessing the redesign of Linear Algebra exists from an ongoing assessment strategy for the course. The team will use data from fall semester, 1996, the last year before the Math Emporium opened, as the baseline. Included in the data are results of common quizzes, tests, exams, and an end-of-semester survey that gathered information on such things as students’ comfort levels and learning styles. On the basis of these results, in conjunction with course grade information, a more detailed assessment plan has been developed for the redesign to include additional information about student backgrounds, development, and changes in attitudes over the progress of the semester.

Assessing Impact

Student Learning: The Math Department is working to establish student achievement through validation and reliability studies of the common final exams given each semester. Each question is coded according to the learning goals it should measure and its difficulty as perceived by the faculty. All students in Linear Algebra take the common final exam, and question-by-question results are recorded. The exam has been refined through comparisons of overall results and analysis of specific questions. The department correlates success on each question with total exam scores to judge the question’s effectiveness. It also compares faculty expectations of difficulty with actual results. Results on these tests will be the initial indicators of learning outcomes.

The goal of tightening the relationship between student learning and grade should be realized in part with the elimination of multiple sections, exams, and graders. It is possible, though, that the redesign could have unintended consequences in its effect on students who have problems adjusting to the new format or whose best learning styles have not been accommodated. Comparing grade outcomes with standard predictors and with results of the common exams should help identify remaining or new anomalies and suggest additional changes in the course.
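
The item analysis described above, correlating success on each question with total exam scores, can be sketched as a Pearson correlation between a 0/1 item score and the exam total. The function names and the response matrix are illustrative assumptions. The department's actual procedure is not specified in the plan.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation, or None if either variable has no variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def item_total_correlations(responses):
    """responses[s][q] is 1 if student s answered question q correctly, else 0.

    Returns one correlation per question against the total exam score.
    """
    totals = [sum(row) for row in responses]
    n_items = len(responses[0])
    return [
        pearson([row[q] for row in responses], totals)
        for q in range(n_items)
    ]


# Hypothetical responses: 5 students by 3 questions.
example = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
    [0, 0, 1],
    [0, 1, 1],
]
print(item_total_correlations(example))
```

A question that every student answers correctly, like the third column here, has no variance and yields `None`, so it cannot discriminate between stronger and weaker students. Because each item also contributes to the total, a common refinement is to correlate the item with the total excluding that item.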

Retention and Success Rates: Retention and success will be assessed in four ways. 1) The number of students registered at the beginning and end of the course will be compared. 2) On the common final exam, the ratio of no-shows to students still registered in the course will measure how many have essentially given up. 3) The percent of students who achieve a grade of C or higher in the course provides a measure of how well they are prepared to succeed in courses that use the material of Linear Algebra. 4) Data currently in hand and data collected as the project proceeds will be examined in order to establish useful predictors of grade success in the technology-enhanced setting.
Faculty Perceptions: A faculty advisory board including faculty from other departments will provide feedback on the quality of the course. Faculty from "downstream" courses will be asked to offer their perceptions of student competency. Math department faculty will work with faculty from "downstream" courses in a Partnership Program to develop diagnostic quizzes based on the topics and terminology appropriate to those courses. For example, Linear Algebra will partner with Civil Engineering faculty to record results of student diagnostic quizzes for assessment purposes. Additionally, an opinion survey of math department teachers has been conducted annually since 1998. The survey will determine faculty perception of how courses run in the new setting and the quality and nature of teaching and learning.
Student Perceptions: The team will investigate perceptions through selected questions on student opinion surveys, to be administered at the beginning and end of the course. Specifically, pre- and post-surveys will ascertain students’ attitudes, high school backgrounds, familiarity with certain core linear algebra concepts, familiarity and comfort level with technology, and comfort level with using desktop computers to apply mathematical skills and concepts. Prior to the beginning of the semester, an additional survey will be administered to all students taking a course in the Math Emporium. This survey will include many of the questions on the surveys for Linear Algebra, so that its data can be correlated with, and help validate, the responses provided by students in Linear Algebra. Finally, the team will conduct focus groups of students in Linear Algebra to help assessors and faculty get a more personal understanding of the students’ feelings, likes, dislikes, and ideas for change.
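
The first three retention and success measures listed above are simple ratios, and a minimal sketch of how they might be computed follows. Several details are assumptions not stated in the plan: grades are represented as grade points on a 4.0 scale with C equal to 2.0, and the success rate is taken over students who received a grade. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RetentionSummary:
    retention_rate: float      # registered at end / registered at start
    exam_no_show_ratio: float  # final-exam no-shows / registered at end
    success_rate: float        # share of graded students at or above the threshold


def summarize_retention(registered_start, registered_end, exam_no_shows,
                        grade_points, passing_threshold=2.0):
    """Compute the three ratio-based measures for one course offering.

    grade_points: final course grades as grade points (assumed 4.0 scale).
    passing_threshold: assumed grade-point value of a C.
    """
    if registered_start <= 0 or registered_end <= 0:
        raise ValueError("registration counts must be positive")
    if not 0 <= exam_no_shows <= registered_end:
        raise ValueError("no-shows must be between 0 and end-of-course registration")
    if not grade_points:
        raise ValueError("at least one grade is required")
    successes = sum(1 for g in grade_points if g >= passing_threshold)
    return RetentionSummary(
        retention_rate=registered_end / registered_start,
        exam_no_show_ratio=exam_no_shows / registered_end,
        success_rate=successes / len(grade_points),
    )
```

Computing the same summary for the fall 1996 baseline and for each redesigned semester would allow a side-by-side comparison. The fourth measure, finding predictors of grade success, calls for a statistical model and is not covered by this sketch.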

Assessing Implementation

Reports: At the end of each semester, a team consisting of one of the course designers and one member of the department’s assessment committee will write a brief report on each large-enrollment course. The report will discuss the outcomes of the common final exam and note areas of concern for course faculty and the departmental curriculum committee. For Linear Algebra, during the period when materials and format are under development, the report will also summarize progress toward implementation.
Questionnaires: A student questionnaire covering all facets of the course will be administered. It will address course materials (satisfaction and level of use); Math Emporium facilities and staff; and personal reaction to the course content, organization, and learning experience.
Focus Groups: Software development personnel, teaching faculty, and Math Emporium staff will be interviewed on progress and experience with the Web-based system.
