Assessing the Impact of Course Redesign: Peter Ewell, Senior Associate at NCHEMS, describes three basic assessment data-collection approaches and pertinent assessment references.
Sample Assessment Plans
The following examples show how nine of the institutions participating in the Pew Grant Program in Course Redesign will assess the impact and implementation of their redesigns.
Summary by Peter Ewell
The redesign is a second-generation effort to revamp Introduction to Statistical Reasoning using a dynamic “intelligent tutoring system” (ITS) that provides help to students as they engage in hands-on statistical problem-solving. The ITS provides many opportunities for assessment because of the “footprint” data that students leave in the software itself as they navigate the system. The institution’s assessment plan takes full advantage of the ITS capabilities. The assessment plan itself is extremely sophisticated, involving both formative and summative components. One especially innovative aspect is the plan to use paid student subjects who have already taken statistics in an old format to pilot new course material in the form of either the ITS or a stand-alone statistics package with written instructions (as in the first-generation redesign). This is a true “clinical trials” approach, which is exemplary. A second very strong element of the plan involves exploiting the step-by-step capture capability of the ITS itself to examine exactly how students are approaching problems and where they are getting into difficulty. A particularly strong feature here is the use of specially designed “cognitive transfer” exercises that examine students’ ability to apply learned concepts in new settings. Transfer exercises will be used in both the traditional (first-generation) and redesigned (second-generation) courses. Third, the assessment design uses an existing pre-post assessment instrument to examine impact. A real strength here is the fact that the course team has historical data on this exam not only from students taking statistics recently, but also from students who have not taken it for some time. Longitudinal tracking of students to downstream courses completes the assessment design.
In an interesting contrast to other proposals, the Carnegie Mellon team does not intend to administer questionnaires to examine student attitudes and behavior. Though not stated, this is largely because most of the behavioral evidence will already have been unobtrusively collected by the students’ interactions with course material through the ITS. This is likely to provide an interesting “second-order” cost savings by reducing the potential overhead of assessment itself. This is an exemplary assessment plan showing considerable sophistication, incorporating solid past experience, and demonstrating real recognition of the power of the technology itself to assist in the assessment effort.
The Assessment Plan
Prior Assessment Work
Carnegie Mellon has been collecting data from students taking Introduction to Statistical Reasoning. Pre- and post-tests are used to assess student learning gains within the course. Because the same tests are administered in all semesters, they can be used to compare students in the current course with students who have not taken the course for a number of years, forming a baseline about learning outcomes in the course as it is taught now. Thus, the institution can compare the learning gains of students in the newly redesigned learning environment with the baseline measures already collected from students taking the current version of the course.
Assessing Impact
Summary by Peter Ewell The project at Penn State centers on the redesign of Statistics, a course that not only enrolls large numbers of students but also serves as a prerequisite for a variety of client disciplines at the university. The opportunity for a substantial "multiplier effect" is also present for this course because it is implemented systemwide across twenty Penn State regional campuses. The emphasis of the redesign is placed on greater modularization of course components (delivered through technology), greater use of quizzes and quick feedback, and a lab emphasis. This project plan shows unusual sensitivity to key issues of implementation and delivery—elements that will make or break a change effort. The assessment plan indicates that the campus has already had substantial experience in developing assessment approaches and tools to examine the effects of course redesign. A major resource here is the Schreyer Institute, which is providing a lot of the support needed to develop and institute assessment. Indeed, Schreyer already worked with the statistics team implementing the course to develop assessment approaches in the past. To assess student learning, a standard test of content mastery developed by the faculty will be used. The test proposed is new for the redesign, so direct comparisons of results for students taught traditionally cannot be made because the last traditional sections were offered last year. To address this weakness, the course team will use quiz questions that can be so compared. More significantly, the assessment plan anticipates going "downstream" to examine the performance of students in later classes that require knowledge of statistics. Actual test performance statistics will be compiled with the cooperation of faculty in client disciplines. Focus groups will also be held with faculty from these disciplines, centering on the kinds of strengths and weaknesses in statistics exhibited by students trained in the old and new formats. 
Students themselves will be surveyed after they leave Statistics and enroll in classes that require statistics skills and concepts to determine their perceptions of relevance and mastery of the material. The assessment plan also shows unusual attention to issues of implementation. The proposed survey of "time on task" for the faculty and TAs is a particularly good idea, as is the use of Schreyer-designed "Student Quality Teams" in such courses. Both are candidates for best practices that could be showcased for others. Also exemplary is the proposal for engaging in actual classroom observation. Overall this is an exemplary proposal from the assessment standpoint. It shows unusual attention to the real effectiveness questions implied by a pre-requisite course—later class performance—while also demonstrating unusual thoroughness to implementation issues at the micro level. Recognizing the inherent weakness in a "before-after" innovation design in which there is no parallel control group, the testing process at least has some comparative quiz benchmarks that can be used. The design team correctly resisted the temptation to offer traditional sections "just for assessment," an impractical idea if they are convinced of the efficacy of what they are doing. The Assessment Plan Prior Assessment Work The statistics department has already collaborated with Penn State’s Schreyer Institute for Innovation in Learning on an assessment plan for their current courses. The future assessment program will use existing and new assessment methods to focus on four major areas: student learning, student attitudes, cost effectiveness, and program implementation. Additionally, Student Quality Teams (students who monitor the work processes of instructors and other students to find opportunities that lead to better learning) and measures of student learning and attitudes have already been designed and implemented in the assessment of Statistics. 
These will continue to be a part of the assessment plan for the redesigned course.
Assessing Impact
Summary by Peter Ewell
The course to be re-designed is College Composition, taken by approximately 3000 students each year and not completed by over 40% of them. The course is a key introductory course for further progress. At the same time, the skills it teaches are embedded in a required statewide proficiency test (CLAST), which all students must pass in order to move on. The re-design itself rests on a number of components including a new diagnostic testing system that will assign students to needed remediation modules “just in time,” enhanced tutorial assistance conducted on-line, automating most content transmission to allow instructors to concentrate on coaching in the writing process, and standardizing adjunct faculty training. Because the institution is a community college, a principal problem to be overcome is the enormous diversity among students in terms of demographics and preparation, and many aspects of the proposal appear designed to fit this context. The assessment plan is extremely thoughtful and thorough. Its use of well-practiced common grading rubrics for written work is unusual and represents perhaps the best example of an established approach to authentic assessment that I’ve seen in these proposals over the years. It is tied directly to the grading process and is used both in College Composition and to evaluate writing in later English and humanities courses. The grading rubric will be applied to four standard writing assignment prompts administered in parallel in simultaneously offered re-designed and traditional course sections. The course also uses a common final exam. In setting up this “quasi-experiment,” the course team has paid unusual attention to potential confounds by, among other things, eliminating data from traditional sections taught by faculty involved in the re-design process in order to control for any “halo” effect.
CLAST—the common statewide achievement test in writing—provides another strong assessment feature as it allows external verification of performance. A well-described set of student surveys addressing student attitudes (like self-confidence) is to be administered before and after students take the course. Finally, students will be tracked into subsequent writing courses and their performance evaluated using the common grading rubric, and will be subsequently tracked to look at CLAST performance. A solid array of implementation measures including focus groups, monitoring faculty time on task, and looking at course assignments to verify the claim of growing standardization completes the design. From an assessment standpoint, the Tallahassee Community College proposal is exemplary. It shows unusual sophistication about the nature and potential of assessment, is based on considerable past practice, and addresses a “non-science” topic that is not easy to assess. The Assessment Plan Prior Assessment Work A variety of assessment tools are already in place. The department utilizes common grading criteria that address topic and purpose, organization and coherence, development, style, and grammar and mechanics. Specific descriptions within each of the areas are provided to distinguish between grades of A,B,C,D, and F, and faculty are trained in the interpretation of the criteria. The criteria were established collectively and are applied across all sections of College Composition. Although there are additional competencies for second-level English courses, the basic criteria established for distinguishing between letter grades in College Composition are also applied in the second-level courses. The existing exit standards are currently under review, and this process will include examining exit competencies at the state and national levels. Portfolios also are used to assess the progress of a student’s work and to determine the end of term course grade. 
At the beginning of the semester, the student is given an English Language Skills test that covers sentence structure, grammar, punctuation, and editing. A post-test covering the same skills is given at the end of the semester. College Composition is also responsible for assessing the student’s skill level and reinforcing all of the College Level Academic Skills, including reading skills, English language skills, and writing skills. The list of individual learning outcomes is too extensive to relate in detail but includes 12 reading outcomes, 16 English language outcomes, and 24 writing outcomes. A pre-test and a post-test address these outcomes, and, within the context of the class, students complete four impromptu essays, including the departmental exit exam, that are facsimiles of the essay portion of the state College Level Academic Skills Test (CLAST). The four in-class impromptu essays are graded by using a 6-point scale that emerged from the state when the College Level Academic Skills Test was being developed. The reliability measure for this grading scale has been established at 0.92. Additionally, each paper is read by at least two readers. CLAST performance data are collected and analyzed at an institutional level and at the state level. The Institutional Research Department maintains statistics on DWF rates and course completion data in all courses, as well as a wide range of demographic data. Students also evaluate courses on a regular basis, and these data are analyzed across sections to look for trends and patterns. Programs are already in place to track cohorts of students, and the framework exists for making valid comparisons between the performance and retention data of students enrolled in traditional sections and those of students enrolled in redesigned sections. This will be important not only for the pilot but also for the longitudinal data needed to assess transfer of learning to subsequent English courses and other disciplines.
Assessing Impact During the Spring 2001 semester, a pilot study will be conducted, with each of the English faculty who participated in the redesign teaching both a redesigned and a traditional section. One other full-time faculty member and two adjunct faculty, not involved in the redesign, also will teach both a redesigned and a traditional section. Teacher variation is always a problem in this type of study. Teaching styles, personality traits, attitudes, and experience levels frequently have an impact on outcomes. Using the same faculty to teach both traditional and redesigned sections will allow for some comparisons that control for these variations in teacher characteristics. In order to control for a halo effect, data for traditional courses taught by redesign faculty will be compared to data for traditional courses taught by faculty not involved in the redesign process. Using other full-time and adjunct faculty who were not involved in the redesign will allow for comparisons between “expert” and “novice” practitioners with regard to understanding the purpose and focus of the redesign. Quantitative and qualitative data generated by full-time and adjunct faculty will be used to help refine and revise both the course and the orientation to the course. Experimental and traditional sections will be selected so that morning, afternoon, and evening classes are included. The sections will be analyzed to determine equivalence with regard to student demographics, including age, gender, ethnicity, GPA, and placement score. Student data from the pre-test and post-test of grammar, mechanics, and reading comprehension will be used to assess the effectiveness of the interactive tutorials. Students in both traditional and redesigned sections will be given the same tests, although the traditional sections will use a paper and pencil version rather than a Web-based version. Of particular interest will be the degree of improvement. 
Therefore, the score on the pre-test will be used as a covariate in analyzing the post-test scores. As stated earlier, part of the problem associated with the College Composition course is the highly diverse nature of the students and the difficulties encountered in addressing individual needs. With that in mind, performance and improvement data will be analyzed to examine effects of instruction on students by age, gender, ethnicity, the time of day, GPA, original placement, and whether students are ESL students or Learning Disabled students. A recent analysis of student success in College Composition shows that students who placed directly into College Composition have a success rate of 65%, while those who first completed a developmental course in English have a success rate of only 45%. For this reason, particular attention will be given to outcomes related to students who have completed a developmental course, as well as other groups who have traditionally had high failure rates, such as ESL, LD, and minority students. With regard to written assignments, students complete four in-class impromptu writing assignments, including the final exam and four outside writing assignments. A standard set of topics will be established for the traditional and redesigned sections. A standardized method of evaluating the impromptu essays as well as a standardized method of arriving at letter grades for the outside writing assignments has already been established and will be used in grading each assignment. Any changes in standards or exit competencies that emerge from the current review will be applied to both traditional and redesigned sections. Comparisons will be made following each assignment, and data will be analyzed to obtain improvement data as well as performance data for each of the subgroups detailed above. 
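Using the pre-test score as a covariate, as described above, amounts to fitting a model of the form post = b0 + b1 * pre + b2 * redesigned and reading b2 as the difference between section types after adjusting for where students started. The following is a minimal sketch under stated assumptions: the scores are invented for illustration, the function names are hypothetical, and an ordinary least-squares fit is assumed, since the plan does not name the software or the exact model to be used.

```python
def solve_linear_system(a, b):
    """Solve a small square system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back-substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ancova_fit(pre, post, redesigned):
    """Least-squares fit of post = b0 + b1*pre + b2*redesigned.

    Returns [b0, b1, b2]; b2 is the covariate-adjusted difference
    between redesigned (1) and traditional (0) sections.
    """
    rows = [[1.0, float(p), float(g)] for p, g in zip(pre, redesigned)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, post)) for i in range(3)]
    return solve_linear_system(xtx, xty)  # normal equations


# Hypothetical scores, generated from known coefficients (10, 0.8, 5)
# so the fit can be checked against them.
pre = [40, 50, 60, 70, 45, 55, 65, 75]
redesigned = [0, 0, 0, 0, 1, 1, 1, 1]
post = [10 + 0.8 * p + 5 * g for p, g in zip(pre, redesigned)]

b0, b1, b2 = ancova_fit(pre, post, redesigned)
print(f"adjusted section effect: {b2:.2f}")
```

With real data the fit would not be exact, and the same model could be run separately for each subgroup of interest (for example, students who first completed a developmental course).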
In addition to performance data generated through technology-based instructional programs and the well-defined grading criteria applied to written work during the course and on the departmental final exam, attitudinal, self-evaluation, and confidence data will be collected at the beginning and end of the course. At the beginning of the semester, students will be given a survey that asks questions about how well prepared they feel, their expectations with regard to their success, their motivation to improve their writing, their confidence in their ability to write, and the amount of time they expect to spend in course-related activities. A similar survey that addresses how well prepared students feel for the next course, their expectations for success as well as attitudinal, confidence, and time-on-task data will be given at the end of the semester. Surveys will be anonymous, but students will provide demographic data on the survey so that data can be analyzed by group as well as overall. Faculty satisfaction and observation data will be collected through journals, surveys, and focus groups, and for subsequent semesters, faculty satisfaction and observation data in other English courses and other disciplines where written work is required also will be collected and analyzed. Students completing College Composition must choose from one of the second-level English courses and, upon successful completion, choose two from among a variety of Humanities courses. Both the second-level English courses and the Humanities courses require extensive written work. Students in the initial experimental groups will be tracked through these courses to gather longitudinal data that examine performance and retention. These data will be compared to performance and retention data emerging from traditional courses. Once the redesigned course reaches full implementation, all students will be tracked through these courses and data compared to existing performance and retention data. 
Additional longitudinal data will be gathered for the initial group and subsequent groups with regard to the College Level Academic Skills requirement. Data will be gathered and examined to see if there is an increase in the number of students who exempt the College Level Academic Skills Test (CLAST) based on a C+ average in their two English courses. For those who must take the exam, data will be analyzed to see if scores increase and fewer students must repeat the exam. Assessing Implementation In terms of assessing the implementation of the project, several evaluation methods will be used. Faculty, staff, students, and technical support personnel will be surveyed to assess the effectiveness of materials, resources, technical support for faculty and students, and inter-group communications. Key project personnel will make weekly journal entries that will document implementation successes, problems, and unanticipated events and consequences for each aspect of the redesigned course, providing aggregate data on instructional and pedagogical issues, resource issues, personnel issues, and technical issues. Of particular importance will be faculty records indicating time-on-task for instructional activities. Given that a major intended outcome of the redesign is reduced time spent in preparation, diagnostic activities, and grading, it will be important to document the actual time spent performing these functions. Similarly, the Writing Center staff will document the activities in the Center. Another intended outcome is greater standardization and consistency of assignments. This outcome will be validated through the collection of assignments and by analyzing them in terms of the requirements for each. 
Focus groups made up of faculty members, Writing Center staff, technical support staff, and students will discuss the process in terms of “what worked, what didn’t, and what needs to be changed.” This assessment will take place during and immediately following the pilot so that revisions can be made, based on evaluative data. This process will be repeated during the full implementation phase during the Fall 2002 and the Spring 2003 semesters. A final piece of the evaluation will be to assess the actual cost savings in relation to the estimated cost savings.
Summary by Peter Ewell The redesign centers on an introductory American Government course which is taken by about three-quarters of entering freshmen to help fulfill general education requirements. The focus for the redesign involves the replacement of standard lecture sessions with Web-based modules that students access on their own. A set of relatively detailed learning outcomes has been identified for the course, and these can be used as guides for assessment. But the new course also involves the provision of new competencies, not established for the traditionally configured course. This will complicate the assessment problem somewhat, as students learning under the old and new formats will not only be learning in different ways, but also will be learning different things. The assessment plan acknowledges this problem and discusses some alternative ways to help deal with it. The assessment plan is organized around the evaluation of outcomes and of student attitudes—as well as including a well-designed process evaluation component to examine implementation issues. With regard to outcomes, three standard assessment approaches will be used. The first is a straightforward comparison of completion rates (% C or better) between the old and new formats. Some sensitivity in analysis is shown even in this straightforward measure, as the discussion indicates that individual grade ranges will be examined to see if the formats have particular differential success rates for students at different points on the performance distribution. The second consists of an examination of attrition rates, both in the course and after completion. The third and most important is a set of common examinations designed especially to meet the learning goals identified for the course. With regard to attitudes, the primary method will be surveys of students. The descriptions of what the surveys will cover are quite detailed and well thought through. 
In addition, unusual attention is paid to learning style—a measurement initiative that the university has been engaged in for a number of years. A good deal of excellent attention is given to exactly how the resulting assessment data will be analyzed, addressing important issues of controlling for student background characteristics and other potentially biasing factors. Finally, the plan addresses a range of implementation issues through faculty focus groups and interviews. The Assessment Plan Prior Assessment Work The University of Central Florida is conducting a comprehensive evaluation of its distributed learning initiative that concentrates on and affective outcomes for both students and faculty. The project is composed of summative and formative components. Evaluation of the American National Government course redesign will be based on components that have been in place for the past three years. Assessing Impact Student Learning: Traditional and redesigned sections will share a common core of information and analytic themes. Information and themes will be tested using the same or very similar exam instruments. The most important student outcome, substantive knowledge of American Government, will be measured in both redesigned and traditional courses. To assess learning and retention, students will take three tests: a pre-test during the first week of the term, a post-test at the end of the term and, for as many students as can be contacted, a second post-test at the end of the following term. The Political Science faculty, working with the evaluation team, will design and validate content-specific examinations that are common across traditional and redesigned courses. The instruments will cover a range of behaviors from recall of knowledge to higher-order thinking skills. The examinations will be content-validated through the curriculum design and course objectives. 
Course objectives will be organized according to the Structure of the Observed Learning Outcome (SOLO) taxonomy that is the basis of much critical thinking work at the University. The effect of demographics on student learning will be monitored. Students tend to self-select traditional or media-enhanced courses for a number of reasons, none of which can be controlled, so monitoring the "effects" of demographics on the outcomes of the evaluation is necessary. UCF is continually assessing demographic trends in its online and media-enhanced courses with respect to gender, ethnicity, and age. Other variables, such as ability and prior achievement, are co-varied. The team will analyze data on student learning by using many techniques. For example, academic success will be regressed on section format, demographics, learning style, prior ability, and achievement using the logistic model. Differences in course-specific achievement will be determined with exact probabilities, and differential level performance assessed with segmentation analysis. Similar models will be used to assess the relationship of success and withdrawal to redesigned courses. Strict attention will be paid to nesting factors such as instructor, gender, and ethnicity. The data analysis will be based on data mining techniques.
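The logistic model mentioned above relates a binary outcome (success or not) to predictors such as section format. The sketch below is an illustration only: the counts are invented, the function name is hypothetical, and a plain gradient-ascent fit stands in for whatever statistical package the evaluation team actually uses. It includes only section format as a predictor; demographics, learning style, and prior achievement would be added as further columns.

```python
import math


def fit_logistic(rows, outcomes, learning_rate=0.5, iterations=5000):
    """Fit a logistic model by gradient ascent on the mean log-likelihood.

    rows: one list of predictor values per student (no intercept column).
    outcomes: 1 for success, 0 otherwise.
    Returns [intercept, coefficient_1, ...].
    """
    design = [[1.0] + [float(v) for v in row] for row in rows]
    weights = [0.0] * len(design[0])
    n = len(design)
    for _ in range(iterations):
        gradient = [0.0] * len(weights)
        for x, y in zip(design, outcomes):
            z = sum(w * v for w, v in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted success probability
            for j, v in enumerate(x):
                gradient[j] += (y - p) * v
        weights = [w + learning_rate * g / n for w, g in zip(weights, gradient)]
    return weights


# Hypothetical counts: 10 of 20 traditional students succeed,
# 15 of 20 redesigned students succeed.
rows = [[0]] * 20 + [[1]] * 20
outcomes = [1] * 10 + [0] * 10 + [1] * 15 + [0] * 5

intercept, format_effect = fit_logistic(rows, outcomes)
odds_ratio = math.exp(format_effect)
print(f"odds ratio, redesigned vs. traditional: {odds_ratio:.2f}")
```

With a single binary predictor the fitted coefficient is the log of the odds ratio between the two groups, here (15/5) / (10/10) = 3, which gives a quick check on the fit.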
University of Colorado-Boulder
Summary by Peter Ewell
The University of Colorado-Boulder intends to redesign Introductory Astronomy, a course taken by over 1,100 students per term to meet basic science distribution requirements. Because of the nature of the discipline, astronomy is a topic unusually suited to the use of Web-based images and simulations. The redesign itself concentrates on technology replacing traditional lecture time through the use of Web-based modules, "banquet-style" lecture hall formats, and the use of undergraduate "coaches" to work with small (nine-person) learning teams. This project plan is extremely thoughtful regarding the need to fuse technology-based redesign with other changes in pedagogies such as team-based approaches, "coaching" functions, and so on. The assessment plan is thorough and doable, with appropriate attention to implementation and attitudinal issues as well as learning outcomes. It proposes to employ an external consultant from a university evaluation research center to aid with the evaluation and to serve as an integral part of the course team. This is excellent, so long as the resulting attitude is one of full team membership and not of "farming out" assessment so that core faculty do not have to deal with it. Nothing in this plan suggests that this might be the case; in fact, the proposal suggests quite the opposite. The primary technique to be used in assessing content is common-item testing for comparing learning outcomes in the redesigned and traditional formats. The team also intends to administer an instrument designed to examine student attitudes toward science, a good idea for any introductory science course that will be taken primarily by non-majors. The real innovation in this plan is the team's intent to conduct in-depth focus groups to look at a number of learner-pedagogical mode interactions.
Over 100 such groups are planned, presumably under the direction of the Evaluation Center, to look at such matters as implementation issues, gender and race/ethnicity effects, as well as the actual learning effects of the proposed innovation. If successfully implemented, this is potentially an exemplary practice example. The Assessment Plan Prior Assessment Work The University’s Assessment and Evaluation Group, consisting of full-time professionals and graduate students, works with faculty and students to conduct summative and formative evaluations of project effectiveness. Its overall goal is not only to assess and improve the effectiveness of teaching methods and activities but also to instill a culture of assessment at the University of Colorado-Boulder. Many of the Web-based resources that will support course redesign have already been tested in both traditional large lecture sections (~216) and small classes (~30) to determine the effectiveness of those resources as well as other technology-based procedures (e.g., electronic discussions, submission of homework, and so on.) Assessing Impact Student Learning: To design and carry out the assessment program, the astronomy department will work with Dr. Elaine Seymour and her team (Ethnography & Evaluation Research) at the Bureau of Sociological Research of the University of Colorado, who have considerable experience in assessing undergraduate science education. Student learning will be assessed through testing and comparing test outcomes for traditional and redesigned sections. Participating faculty will work collaboratively to design a set of student assessments, monitor the quality of student work achieved by the revised testing strategies, adjust their testing strategies in light of the results, and keep a collective record of student scores on key items.
Assessing Implementation
Summary by Peter Ewell The focus of the redesign is on an Introductory Psychology course currently taken by about half of the university’s students. The redesign itself is fairly radical and involves converting dissimilar faculty-delivered lecture sections into a distributed model that relies on online materials and common examinations. This is a considerable change in culture for faculty, and the agreement on common assessments in a field like psychology might be particularly interesting to watch. As might be expected from a psychology department, the assessment plan incorporates a true experiment to assess impact. The experiment involves random assignment of students to “experimental” (redesign) and “control” (traditional) groups operating in parallel during the pilot phase of implementation. A range of instruments including objective tests, attitudinal and learning style instruments, and subsequent tracking will be applied as outcome measures, and the results will be compared. This is a very high-end strategy and there might be doubts about the ability of the course team to pull it off. But the narrative indicates that the team members have thought about most of the potential problems and threats to validity. At the same time, they seem to have had a successful experience in implementing a similar design for a previous, admittedly smaller, course. All told, this appears to be an exemplary assessment proposal. The Assessment Plan Prior Assessment Work There is an overall assessment culture at the University of Dayton that lays the groundwork for assessing the impact of the redesigned course. In 1995, the University implemented a campus-wide learning assessment plan, based on its mission, that provides both administrative support to faculty and procedures for measuring student outcomes, securing resources to remedy deficiencies, and providing feedback to faculty and students. 
A redesigned course in advanced psychology has already been assessed, providing a research strategy for assessing the outcomes of the newly redesigned Introductory Psychology course.
Assessing Impact
University of Massachusetts–Amherst
Summary by Peter Ewell
The proposal redesigns an Introductory Biology class that is already a technology-enhanced version of a traditional course, using ClassTalk to support classroom interaction. The changes proposed in the redesign include online quizzing, class preparation pages, enhanced interaction, and supplemental instruction. The incremental aspect of the proposal renders it especially interesting from an assessment standpoint, as the campus proposes to compare outcomes across all three formats: traditional, ClassTalk, and redesigned. The assessment plan itself involves direct comparisons of performance on common examinations and quizzes among the three alternatives, together with an examination of pass rates and grades. A particular strength is the attention paid to impacts on particular sub-populations, for example students who, according to historical records, experience lower success rates in traditional course formats, and students drawn from identifiable demographic groups. The proposal is especially sophisticated in its discussion of these interactions and how to control for them analytically and statistically. In addition to direct measures, a wide range of perceptual and implementation data will be collected, including videotaping of learning focus groups and interviews with students by trained facilitators from the Center for Teaching. Equally impressive is the way the campus intends to continuously monitor online usage of particular materials and resources and include this in the analysis. This is an exemplary effort with respect to assessment. The plan is complex, but it is succinctly and clearly presented in a manner that demonstrates a thoughtful and realistic analysis of the problems involved.
The three-way comparison is especially attractive here in conjunction with the monitoring of student usage of learning resources, as it should allow the team to begin to identify the effects not only of the total redesign but also of its various components. Unlike many assessment designs, this should enable the team to determine not only whether the redesign is working for learning, but also why and for whom.
The Assessment Plan
Prior Assessment Work
The University of Massachusetts–Amherst has a culture of course-based assessment, designed and implemented with significant faculty input. Assessment is supported collaboratively with resources, expertise, and data from the University’s Office of Institutional Research, Office of Planning and Assessment, Center for Teaching, and Registrar. The course redesign team also has extensive assessment experience with Introductory Biology and has already used the ClassTalk technology to assess student learning. Additionally, independently trained facilitators will use the Center for Teaching’s midterm assessment, which has been in use for several years.
Assessing Impact
Assessing Implementation
University of Wisconsin-Madison
Summary by Peter Ewell
The University of Wisconsin-Madison proposal centers on the redesign of a General Chemistry sequence taken by 2,400 students each semester. The main problems occasioning the redesign are high rates of non-completion and students’ lack of retention of concepts in subsequent coursework for which the General Chemistry sequence is a prerequisite. The redesign itself concentrates on substituting Web-based modules for traditional lecture time. A good deal of preliminary assessment work has already been accomplished: course objectives have been developed and used, and good national assessment guidance has been obtained through prior work with the National Science Foundation and the American Chemical Society. A particularly attractive feature of the design is the use of on-line assessments through which students can actively "test out" of modules which they have already mastered. The assessment plan itself is divided into essentially two components: impact and implementation. For both, the course team is working with a professional evaluator who has substantial experience in looking at science education. As in other cases in which this is done, care must be exercised to ensure that assessment activities are not totally "farmed out" so that faculty don’t have to deal with them. The narrative gives little evidence that this will be the case, however; indeed, it reflects substantial sophistication and experience with assessment concepts. Working with the evaluator, the team proposes to develop a tailored "experimental design" for each innovation to be tested, a particularly thorough approach that will allow a much more precise assignment of cause and effect. Special examination items are the principal method of assessment.
Many of these items were developed through a prior project using American Chemical Society content examinations that have been proven nationally. This is an especially powerful feature of the design. To adjust for the fact that content as well as pedagogy has changed, the team intends to use paired examination items, some tailored to the older approach and some to the newer. This is a particularly useful approach that might be cited as a "best practice" for other programs facing the same difficulty. In addition, the UW-Madison team will look at standard course completion statistics and, more importantly, will follow completers of the course into later coursework to examine performance. Subsequent tracking will involve cooperation with a campus center that has developed a longitudinal student tracking capability. The implementation component of the plan is equally well developed and centers on interviews and questionnaire surveys. Again, this is a technique with which the UW-Madison team already has experience through a prior project. All told, the UW-Madison proposal is exemplary from an assessment standpoint. Particularly noteworthy are the use of well-proven and appropriate national examinations, careful attention to changes in content between new and old formats, and the kinds of cooperative arrangements forged with evaluation and other support units on campus.
The Assessment Plan
Prior Assessment Work
During a previous project, the American Chemical Society Division of Chemical Education Examinations Institute developed special content examinations for each semester of General Chemistry at the request of the UW-Madison chemistry department. These special examinations have been used for several years in the General Chemistry sequence at UW-Madison and therefore provide a baseline of student performance that can be used to measure the effect of the course redesign on performance.
For each topic included in the examination, there are two types of questions: one designed to test conceptual understanding of the topic and a paired question in the format traditionally used in general chemistry courses. As part of this previous project, UW-Madison collected data on student retention during each semester of General Chemistry. The department also collaborated with the LEAD (Learning through Evaluation, Adaptation, and Dissemination) Center on the UW-Madison campus to obtain information about what students do after they have taken the General Chemistry courses. The Center can track students through graduation to find out how many graduated and in what majors, and it can obtain information about students’ success in subsequent courses and their retention rates in the university. The special exams and student data provide a baseline for comparing students in the redesigned course with those in previous, more traditional sections of General Chemistry. Even though the team will not be able to obtain complete results until several years after this project has ended, they intend to collect and report such data.
Assessing Impact
Virginia Tech
Summary by Peter Ewell
The redesign effort at Virginia Tech centers on the Linear Algebra course taken by large numbers of students as a prerequisite. The focus of the redesign is to integrate the course fully into Virginia Tech’s Math Emporium, a 500-station laboratory setting with capabilities for flexible on-line delivery of content modules and assessments. Learning objectives are in place for the class, and a common final examination has been used for the course for the past five years. Both features mean that there is substantial assessment experience in place on which to build. The assessment plan is based on a straightforward but solid combination of features. Assessment designs will be developed in cooperation with assessment experts, as the Math department has its own assessment coordinator. In addition, the team plans to use the services of the American Association for Higher Education’s Teaching, Learning, and Technology group to consult on assessment. The assessment design will employ a number of methods. First, the common final examination, in place for five years, is tied directly to learning goals on an item-by-item basis. This means that student performance patterns can be disaggregated by particular area of strength or weakness for particular types of students, providing a powerful analytical tool for determining impact. Student performance on regular quizzes will also be monitored. Second, the Virginia Tech team plans to examine patterns of course completion and retention. These are conceptually straightforward, but the plan also notes a number of key ratios, based on readily available statistics, that are good summary measures and could be used in other settings. Third, explicit partnerships with "downstream" faculty in client disciplines such as Civil Engineering will allow student performance in subsequent coursework to be evaluated.
Finally, questionnaires and focus groups will be used to gather data on perceptions, motivations, and satisfaction with the two formats.
The Assessment Plan
Prior Assessment Work
The Math Department employs an assessment coordinator who works with the department head and a faculty committee on assessment. Virginia Tech also has a contractual relationship with the Teaching, Learning, and Technology (TLT) Group, an affiliate of AAHE, through which consulting time on assessment will be obtained for periodic reviews. Baseline data for assessing the redesign of Linear Algebra exist from an ongoing assessment strategy for the course. The team will use data from fall semester 1996, the last year before the Math Emporium opened, as the baseline. Included in the data are results of common quizzes, tests, exams, and an end-of-semester survey that gathered information on such things as students’ comfort levels and learning styles. On the basis of these results, in conjunction with course grade information, a more detailed assessment plan has been developed for the redesign to include additional information about student backgrounds, development, and changes in attitudes over the course of the semester.
Assessing Impact
The goal of tightening the relationship between student learning and grades should be realized in part with the elimination of multiple sections, exams, and graders. It is possible, though, that the redesign could have unintended consequences for students who have problems adjusting to the new format or whose preferred learning styles have not been accommodated. Comparing grade outcomes with standard predictors and with results of the common exams should help identify remaining or new anomalies and suggest additional changes in the course.
Assessing Implementation