Regulators, payers, patients and others understandably want to determine which healthcare providers deliver the highest quality of care. Interest in performance measurement has increased since groups such as the Institute of Medicine, the Center for the Evaluative Clinical Sciences at Dartmouth, the CMS, and the Rand Corp. have reported estimates that 20% to 50% of all prescriptions, visits, procedures and hospitalizations in the U.S. may be inappropriate. That translates into patient deaths and injury and the waste of an estimated $400 billion annually.
Clearly there is a need to improve healthcare, and performance measurement is an increasingly applied component of quality improvement. Yet there are pitfalls to performance measurement of which many users are unaware.
We offer an outline of some important limitations we believe healthcare leaders and managers should consider as they attempt to improve quality through the use of performance measures within their organizations.
Providers must distinguish between three main uses of performance measures: required reporting, quality improvement, and evaluating organizational or individual performance. Each of these uses requires a different approach because of the possibility of invalid measures. Providers must understand the limitations and pitfalls of drawing conclusions from performance measurement because of the problems resulting from conflicting measures, which can result in decreased quality, waste and incorrect conclusions about individual health organizations or practitioners.
Components of a measure
A good performance measure consists of a valid numerator and denominator, frequency of occurrence and the data-gathering process. It is often expressed as a percentage or a rate. The text statement for a diabetic patient-care performance measure might be, "The percent of patients with a diagnosis of diabetes mellitus receiving at least one hemoglobin A1c annually."
A valid denominator means that it specifies the right base from which the measurement will be made. The denominator must have appropriate criteria for the pool of whom or what is eligible for measurement. For example, a denominator measuring clinical improvement for cervical Pap smears that does not exclude women without a cervix would be invalid.
Numerators generally count events such as something that happens to a patient or something patients receive -- typically, this is an outcome, an intervention, a service or a process. Examples of numerators are the number of patients with diabetes mellitus from the denominator (the population eligible for measurement) who have received an annual hemoglobin A1c, or the number of charts available for patient appointments at an outpatient clinic.
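To make the numerator and denominator concrete, here is a minimal sketch of how the diabetes measure quoted above might be computed. The Patient record, its field names and the sample data are hypothetical and chosen only for illustration. A real measure would also need explicit eligibility and exclusion criteria and a defined data-gathering process.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical record layout, for illustration only.
    patient_id: str
    has_diabetes: bool          # diagnosis of diabetes mellitus on record
    hba1c_tests_this_year: int  # hemoglobin A1c tests in the measurement year


def hba1c_measure(patients):
    """Return (numerator, denominator, percent) for the annual HbA1c measure."""
    # Denominator: only patients eligible for measurement (a diabetes diagnosis).
    eligible = [p for p in patients if p.has_diabetes]
    # Numerator: eligible patients who received at least one HbA1c this year.
    tested = [p for p in eligible if p.hba1c_tests_this_year >= 1]
    if not eligible:
        return 0, 0, None  # the measure is undefined with an empty denominator
    percent = 100.0 * len(tested) / len(eligible)
    return len(tested), len(eligible), percent


# Made-up sample data: three patients with diabetes, two without.
patients = [
    Patient("a", True, 2),
    Patient("b", True, 0),
    Patient("c", False, 0),
    Patient("d", True, 1),
    Patient("e", False, 1),
]

numerator, denominator, percent = hba1c_measure(patients)
print(f"{numerator} of {denominator} eligible patients tested ({percent:.1f}%)")
```

Note that patient "e" had a test but does not count, because the numerator is drawn only from the denominator. Including ineligible patients in either count would distort the percentage.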
A key point -- often misunderstood -- is that numerators should be based on valid, useful and usable scientific evidence. For health status outcomes and interventions, this requires that we know what interventions are likely to result in improvement. With rare exceptions, only well-designed and conducted randomized controlled trials can demonstrate cause and effect relationships, and only valid and useful information from trials data should be used for interventions.
Medical leaders, administrators and others should be aware of the significant potential for selection bias, observation bias, confounding and the play of chance when relying on observational data -- and performance measures are one form of observational data. These threats to validity can confound performance comparisons between institutions, units and individuals. Trying to resolve this by adjusting for case mix is analogous to using models to adjust for patient differences in observational studies dealing with therapy -- the potential to be misled by confounding factors remains high. Databases and clinical records are useful for measuring processes, but are not reliable for attempting to "prove" that a health status improvement was the result of an intervention. Database information, observational studies and opinions of experts can inadvertently mislead.
Unfortunately, because of a general lack of knowledge of these potential problems, providers may be required to report outcomes to stakeholders even when the performance measure or the organization's outcomes lack validity.
Feedback to a clinician in the form of a performance report may be of great value as a way of encouraging his or her participation in quality improvement efforts and focusing attention on improving processes of care and attention to patients' needs. However, because of significant validity and reliability problems inherent in observational data, it is altogether a different issue when an individual provider's performance is made available to others in the form of a performance "report card" or when an individual's income is based on a limited set of performance measures.
Here are several examples of problems that can result:
* The wrong denominator: A colon cancer screening quality improvement project at the Veterans Administration Hospital in San Francisco resulted in the facility failing to meet a national target and the hospital faced financial penalties. However, an audit revealed that 47% of the patients included in the measure had declined screening, 12% failed to make their appointments for screening, 11% had chart documentation that screening was not indicated and 42% of the counted patients received diagnostic testing rather than screening (i.e., they had signs or symptoms of disease). Thus, the conclusion that the hospital was failing to meet national VA benchmarks was incorrect.
* The wrong numerator: Some groups recommend routine screening of all newborns for hearing problems during postpartum hospitalization -- this is even required by law in many states. There is, however, insufficient evidence to conclude that such testing leads to improved speech and language skills at 3 years of age. It is also unclear from the best available evidence if potential benefits outweigh potential harms of false-positive tests. Unfortunately, many stakeholders demand performance data on this measure. When looking at internal quality projects it would be better to select a more valid measure.
* Problems in judging the quality of a clinician: A physician may take appropriate actions to improve quality of care, but because of patient factors, systems factors or small sample size, the physician's performance may not result in clinical improvement.
Examining the use of profiling family physicians for glycemic control in their diabetic patients is instructive. It has been reported that in a typical family practice, only 4% or less of variance in hospitalization rates, visit rates, lab utilization and glycemic control in diabetics can be attributed to differences in physician practice patterns. For profiles of glycemic control, outlier physicians could dramatically improve their profiles by pruning their panels of as few as one to three patients with the highest HbA1c levels. This gaming of the system could not be prevented by case-mix adjustment.
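The sensitivity of a small panel to a few patients can be illustrated with a short sketch. The HbA1c values below are invented for illustration and are not taken from the study referred to above. The profile statistic used here, the panel mean, is likewise an assumption, since actual profiling methods vary.

```python
from statistics import mean


def prune_highest(hba1c_values, n_removed):
    """Return the panel with the n_removed highest HbA1c values dropped."""
    # Clamp so the function behaves sensibly for out-of-range requests.
    n_removed = max(0, min(n_removed, len(hba1c_values)))
    ordered = sorted(hba1c_values)
    return ordered[:len(ordered) - n_removed]


# Hypothetical panel of 16 diabetic patients (HbA1c, in percent).
panel = [6.5, 6.8, 7.0, 7.1, 7.2, 7.4, 7.5, 7.9,
         8.0, 8.2, 8.5, 9.0, 9.4, 10.5, 11.8, 13.0]

pruned = prune_highest(panel, 3)

print(f"Mean HbA1c, full panel ({len(panel)} patients): {mean(panel):.2f}")
print(f"Mean HbA1c, three highest removed ({len(pruned)} patients): {mean(pruned):.2f}")
```

In this made-up panel, removing three patients lowers the mean by roughly three-quarters of a percentage point, although nothing about the care delivered to the remaining patients has changed.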
Leaders and managers should remember that conclusions about individuals -- and even organizations -- based on performance measures alone should be drawn with great care. Key issues are how much of a measured outcome reflects selection bias, small sample size and other threats to validity, and how much of the outcome is really under the control of the clinician or the health system.
Performance measurement is an important component of quality improvement efforts in healthcare. However, if performance measures are not designed and used correctly, they are rendered statistically or clinically meaningless -- a waste of resources and a threat to quality. To bring clinicians on board with quality efforts and to improve patient safety and outcomes, we must ensure that each measure captures the data intended and is used appropriately.
Michael Stuart is president of the Delfini Group and clinical assistant professor in the Department of Family Medicine, University of Washington School of Medicine, Seattle. Sheri Ann Strite is a principal and managing partner in the Delfini Group, Portland, Ore.