Assessing quality in higher education
Douglas C. Bennett
How can we assess the quality of education offered by a college or university? How can we know reliably whether or when learning is taking place?
How can a prospective student evaluate whether she will get a good education at an institution where she is considering enrolling? How can a parent have confidence that his son or daughter is learning at the college to which he writes tuition checks? How can a governor or legislator come to terms with the effectiveness of the education offered within a state? How can a faculty assess the strengths and weaknesses of the educational program it offers?
No questions could be more important. And yet we ignore them or (just as bad) accept shallow or misleading answers. Thus, these questions are embarrassing ones. The National Center for Public Policy and Higher Education recently released Measuring Up 2000: The State-By-State Report Card for Higher Education. The study uses systematic data to prepare a report card for each state on its higher education system in terms of six categories: preparation, participation, affordability, completion, benefits, and learning. Every state received a grade in each category. But every state received an Incomplete for learning. No state has yet developed an adequate approach to assessing student learning. And yet surely this is the most important category of all. This national report card tells us a great deal about higher education in the states, except whether their institutions are fulfilling their undergraduate mission.
Value added: The only valid measure
Virtually everyone who has thought carefully about the question of assessing quality in higher education agrees that “value added” is the only valid approach. By value added we mean what is improved about students’ capabilities or knowledge as a consequence of their education at a particular college or university. Measuring value added requires assessments of students’ development or attainments as they begin college, and assessments of those same students after they have had the full benefit of their education at the college. Value added is the difference between their attainments when they have completed their education and what they had already attained by the time they began. Value added is the difference a college makes in their education.
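The definition can be made concrete with a small sketch. It assumes, purely for illustration, that each student receives a numeric score on each dimension at entry and again at completion; the function name, the data layout, and the scores are hypothetical and do not describe any existing instrument.

```python
from statistics import mean


def value_added(entry, exit_):
    """Return the mean per-student gain on each dimension.

    entry and exit_ map a dimension name (e.g. "writing") to a dict of
    {student_id: score}. Only students assessed at both points are
    compared, since value added is defined for the same students.
    """
    gains = {}
    for dimension, entry_scores in entry.items():
        exit_scores = exit_.get(dimension, {})
        paired = [
            exit_scores[student] - score
            for student, score in entry_scores.items()
            if student in exit_scores
        ]
        if paired:  # skip dimensions with no student measured twice
            gains[dimension] = mean(paired)
    return gains


# Hypothetical scores on an invented 0-100 scale.
entry = {"writing": {"s1": 50, "s2": 60, "s3": 70}}
exit_ = {"writing": {"s1": 65, "s2": 70}}  # s3 left before graduating
print(value_added(entry, exit_))
```

Note that the sketch compares only students measured at both points, so anyone who leaves before finishing drops out of the calculation. That is one reason a figure like this would need to be read alongside retention data.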
Easy as it is to state, assessment of value added is difficult to carry through. Let me briefly mention just a few of the more important difficulties.
* Value has many dimensions. No college or university is trying to develop only a single capability in students; all are trying to develop an array of capabilities. Measurements of value added must therefore attend to a number of different dimensions of value. We probably should develop several different measures of value added and invite institutions to select the measures that reflect their intentions.
* Institutions are different. Colleges and universities do not all seek to add the same kind of value to students’ development. Even liberal arts colleges do not all have the same mission. We need to assess value added against a college’s chosen aspirations–its mission. Any effort to rank colleges or universities along a single dimension is fundamentally misguided.
* Effects unfold. Some consequences of a college education may take years to express themselves. We may need to assess some aspects of value added with alumni rather than with graduating seniors.
* Complexity and cost. Measurement of value added is likely to be complex and expensive. Yet it can be more expensive for society to have no serious assessments of whether we are succeeding in having students learn.
A value-added approach is the best way to assess student learning, but higher education has not yet committed itself to developing reliable measures of the most important dimensions of a college education. There are, on the other hand, a few other possible strategies for assessing student learning that are worth considering.
Assessing outcomes: A second-best strategy
A second strategy for assessing quality is simply to measure the outcomes of a college education: evaluate students as they graduate (or shortly after) on the skills and capabilities they have acquired or the recognition they gain in further competition.
It is possible, for example, to look at GRE scores for those students who take GREs, or to measure the percentage of students who go on to further graduate study, or to look at the honors won (Rhodes, Watson, Fulbright) by graduates. At best, these measures evaluate the quality of an institution’s best graduates, not the attainments of all its graduates.
The most frequently used outcome indicator at present is the measurement of retention rates. What percentage of those admitted to a particular institution continue in the program or finally earn a degree? Retention rates tell us what percentage of an institution’s students were satisfied enough to continue at a college, and what percentage received the benefit of the institution’s full program. But they do not tell us anything about what students actually learned or attained on their way to a degree.
Retention rates are one useful outcome measure, but we need others. We need outcome measures that assess students’ attainments along a variety of dimensions: writing, quantitative abilities, problem solving, understanding of their own culture and of the cultures of others, development of a sense of civic responsibility, and the like. If we had such outcome measures, we could use them in the service of measuring value added. We could simply assess student outcomes or attainments as they begin college and again as they complete their degrees.
Inputs and Reputation: The Approach of U.S. News and World Report
The most commonly noticed and quoted effort claiming to assess quality in higher education is the annual rankings by U.S. News and World Report (USNWR). These rankings have grown extraordinarily influential. But how shall we assess their validity as an assessment of quality in higher education?
The USNWR approach to rankings changes somewhat each year, but basically it draws together data of several different kinds, blending them into a single set of rankings for various kinds of institutions. Essentially, the approach makes use of data about inputs, reputation, and outcomes.
Inputs. Several of the indicators employed by USNWR measure an institution’s inputs. First, USNWR gathers data about a college or university’s financial resources. It measures how much an institution spends, per student, on instruction. Colleges that charge more in tuition or that have larger endowments (or both) rank higher because of these measures.
A second kind of input data gathered by USNWR concerns an institution’s faculty resources: the average salary of its faculty members, the percentage of its faculty members who are full time, the percentage of its faculty with highest degrees in their field, its overall student/faculty ratio, and class size.
These input measures are at best a look at what might be ingredients of quality. Using these is a bit like evaluating cakes by looking at their list of ingredients rather than by tasting them. Whether more resources translate into better education for students depends on whether the resources are used well and wisely. In this regard, it is important to note that there is no significant research linking these resource inputs to value added.
A third kind of input data used by USNWR concerns student selectivity. Some of these measures indicate how capable or prepared the students are when they enter a college or university: entering student scores on SAT or ACT tests and the percentage of students graduating in the top 10 percent of their high school class. Perhaps these measures are useful for a student choosing a college; high-achieving students may want to keep company with other high-achieving students. But such measures are a backwards approach to assessing the quality of learning at a college: Colleges are ranked higher insofar as they start with students who have already learned more.
Outcomes. The USNWR annual rankings do make use of one kind of outcome measure: retention and graduation rates. They measure what percentage of an institution’s first-year students return for a second year, and what percentage of students graduate within six years. I believe these measures are the best aspect of the USNWR rankings. But I also believe retention and graduation rates are a very primitive outcome measure: They beg the question of whether, and what, students have actually learned.
Reputation. Reputational measures are another component of the USNWR rankings. The magazine surveys presidents, provosts and deans of admissions at institutions of similar types, asking them to rate dozens of colleges or universities on a five-point scale from distinguished to marginal. This aspect of their approach has superficial appeal, namely, ask the experts. But how much do officials of one college know, really, about the quality of education at other colleges? As someone who pays careful attention to what happens beyond my own campus in higher education, I think I could seriously evaluate, at most, two or three other colleges–not dozens. Consequently, I refuse to participate in the survey. I believe it is an exercise relying not on expert judgment but on reputation in a sense barely distinguishable from hearsay or rumor. (Though touted as peer review, I believe the approach is akin to refereeing journal articles without actually reading them.)
Some of USNWR’s student selectivity measures are also really reputational measures: a college’s acceptance rate (the ratio of students admitted to all those who apply) and its yield (the ratio of students who enroll to all those admitted). These statistics measure how many students desire to attend a particular college, and how strongly they desire to do so. The more students want to attend a particular college, the higher its standing in the rankings. But these, too, tell us nothing about whether students learn.
Expert assessment: The Templeton Guide
One kind of alternative to the inputs and reputation approach of USNWR is represented by the biennial Templeton Guide. It relies on assessments by experts. Seeking to identify and focus attention on colleges that encourage character development, the Templeton Foundation invites colleges and universities to nominate their own programs in ten categories, for example, “academic honesty programs,” “civic education programs,” and “spiritual growth programs.” These program descriptions are evaluated by a panel of experts using a set of explicit selection criteria.
It is not possible, using this methodology, for these experts to assess whether the programs they are evaluating are genuinely effective. However, it is more than a reputational approach in that the experts are at least evaluating program descriptions. More on the Templeton Guide can be found at www.templeton.org.
Self-Reports: The College Results Instrument
Another approach to assessing quality involves asking people to judge for themselves whether they have benefited from the college or university. Such an approach can survey either students or recent alumni.
Research has shown that self-reports of student learning have greater validity when, rather than overall satisfaction, they measure whether students or alumni believe their college education significantly improved skills with regard to a particular capability–writing, for example, or critical thinking.
The College Results Instrument (CRI), developed by the Institute for Research on Higher Education for the National Center for Postsecondary Improvement (NCPI), takes a different and intriguing approach. In addition to other kinds of questions, it presents ten different “scenarios” to alumni five years beyond graduation. Each scenario is a situation, task, or problem that might need to be researched, solved, and/or performed in the workplace or in life. Respondents are asked whether they believe they are well-prepared to work on the task described.
Rather than a unidimensional ranking of colleges, the CRI yields a profile of a college’s alumni–their values, abilities, and attained skills on several different dimensions. The design of the instrument helps prospective students make more informed choices among the array of colleges and universities to which they might apply. More information about the CRI can be found on the Peterson’s Guide website, www.petersons.com. Peterson’s Guide will make use of this instrument.
Processes and participation rates: The National Survey of Student Engagement
Yet another approach to assessing quality in higher education asks students to report what they actually do while they are in college–what they engage in. Such an approach focuses, that is, on processes and participation rates. It asks students at a particular college, for example, how many papers they write each semester, how much time they spend on academic work each week, whether they regularly talk with professors outside of class, and whether they participate in athletics or music or other extracurricular activities.
This approach has significant substance only to the degree that there is evidence that the process or activity in question is closely associated with student learning. Rather than trying to measure value added directly for each student, the intent is to measure whether students are educated through processes that research has shown do in fact add value to students’ attainments.
The best exemplar of such an approach is The National Survey of Student Engagement (NSSE), developed with support from The Pew Charitable Trusts and the Carnegie Foundation for the Advancement of Teaching. An all-star group of higher education researchers designed for NSSE a new instrument, the College Student Report, that can be administered to first- and fourth-year students. The survey asks students how their college is engaging them and also asks how much they have learned (self-report items). Each of the items in the survey is grounded in research evidence that associates it with significant student learning. More information can be found on the project website at www.indiana.edu/~nsse.
Two hundred seventy-six colleges and universities participated in the first (year 2000) administration of the College Student Report. About 325 institutions (a sample of 230,000 students) will participate in 2001. NSSE releases national data summarized by type of institution, but not data on individual colleges or universities. Each institution owns its own data and makes its own decisions about whether to release these to the public. NSSE is amassing a very large data set that tells us a great deal about what is happening in college. And like the College Results Instrument, it yields a profile of a college or university, a multidimensional portrait, not a single ranking.
The road ahead
I believe the U.S. News and World Report annual ranking is irretrievably flawed as an assessment of quality in higher education. It focuses heavily on input and reputational measures, uses retention as its sole outcome measure, and makes no effort whatsoever to assess value added. In several new initiatives, however, we do have promising alternatives for assessing quality: the Templeton Guide, the College Results Instrument, and (especially) NSSE’s College Student Report. I hope we will start investing in their use and paying attention to their results.
At the same time, I hope we will put our best efforts into developing value-added measures. We need to develop value-added measures focused on the particular core competencies that most people agree should be gained through a baccalaureate education–the core competencies of a liberal education. It will take thoughtful, sustained effort to develop these. We will not be able to adequately assess the various competencies simply with multiple-choice tests. The instruments (to be administered to students as they begin college and again as or after they finish) will need to involve writing, problem solving, and performance of tasks regularly encountered beyond campus boundaries. We do have the expertise to design such instruments. All that is lacking is the will and the mobilization of the needed resources.
I have a candidate for where we should start. No single capability is more important than writing well. Virtually every college and university seeks to have its students write better when they graduate than when they first enroll. Several sensible approaches to assessing writing have emerged in K-12 education, the best being that of the National Assessment of Educational Progress (NAEP). I urge that higher education develop, based on the NAEP approach, a value-added instrument to assess how well college students improve their ability to write. Once we have embarked on this we can develop other value-added measures. It is time we begin.
DOUGLAS C. BENNETT is the president of Earlham College.
COPYRIGHT 2001 Association of American Colleges and Universities
COPYRIGHT 2008 Gale, Cengage Learning