“No Estimation Without Representation!”

For so-called Statistically Valid Random Sampling (SVRS), it is necessary that the sample be “representative.” I have written other articles about the appeals of Medicare extrapolation cases, and a claimed lack of representativeness is another of the objections sometimes raised in the appeals process. Representativeness is a general principle in statistics which should be understood by everyone involved in sampling. Issues raised in appeals sometimes reveal a misunderstanding about what “representative” really means. Or, perhaps, the consultant is counting on the adjudicator buying into such a misunderstanding.

There is an intuitive version of representation that says a representative sample will “look like” the population; that is, the two will have very similar descriptive statistics, such as mean, variance, and skewness. A histogram of the sample should look like a histogram of the population, one thinks. Indeed, this is the ideal: it is the result one hopes for if sampling is to be worthwhile. And it is, in fact, the theoretical asymptotic result. However, it might not be the case for any given representative sample.

It is a general principle of sampling that random sampling ensures representativeness. Misunderstandings arise with the failure to distinguish between an intuitively “representative sample” and a “representative sampling procedure.” A sample which is chosen through a “representative sampling procedure” is, by definition, a “representative sample.” This is true even if it fails to resemble the population. We can use a small illustration to show how this occurs. Suppose we have a population consisting of {1, 2, 3, 10} and we intend to select a sample of size three. Now, a basic principle of random sampling is that every possible sample has an equal (technically, “known”) probability of being selected. In this illustration, we can enumerate all the possible samples and compare the mean and variance to the population. I leave this as an exercise for the reader. Even without doing it, you might quickly see that the sample {1, 2, 3}, which is perfectly legitimate, has very different characteristics as compared to the population. We can go further than this, because we have not stated whether sampling should be with, or without, replacement. In sampling with replacement, {10, 10, 10} is one of the possible samples, and it is pathologically unlike the population! Yet, these samples are representative, if selected randomly from the population of samples. Larger populations with larger samples will not display such extreme behavior, yet the principle seen in the illustration can still be observed.
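For readers who would rather not do the exercise by hand, the enumeration can be sketched in a few lines of Python. The function and variable names below are illustrative choices, not part of any standard sampling tool, and the sample variance uses the usual n - 1 divisor (an assumption, since the text does not say which variance is meant).

```python
from fractions import Fraction
from itertools import combinations, product

population = [1, 2, 3, 10]
n = 3  # sample size used in the illustration


def sample_mean(sample):
    """Exact mean of a sample, kept as a fraction to avoid rounding."""
    return Fraction(sum(sample), len(sample))


def sample_variance(sample):
    """Sample variance with the n - 1 divisor (an assumption of this sketch)."""
    m = sample_mean(sample)
    return sum((x - m) ** 2 for x in sample) / (len(sample) - 1)


def average_of_means(samples):
    """Average of the sample means when every sample is equally likely."""
    return sum(sample_mean(s) for s in samples) / len(samples)


# Every possible sample of size 3, without and with replacement.
without_replacement = list(combinations(population, n))
with_replacement = list(product(population, repeat=n))  # ordered draws

population_mean = sample_mean(population)

for s in without_replacement:
    print(s, float(sample_mean(s)), float(sample_variance(s)))
```

Individual samples such as (1, 2, 3) sit far from the population mean, yet the average of the sample means over all equally likely samples equals the population mean under both schemes. That is the sense in which the procedure, rather than any single sample, is representative.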

When one begins to evaluate a sample to see if it is representative, the question ultimately arises, “What will you do if you decide it is not representative?” The appeals consultant would have you throw it out and start over, but, what does that do to SVRS? According to random sampling, every sample should have a known probability of selection. But if you’re going to decide, after the fact, that some samples are not eligible, you can no longer know the ultimate probability of selection (after rejection). I suppose, if one could quantify the conditions under which a sample will be rejected (in advance), there might be a way to do this. I’ve never heard of anyone trying, and I suspect there are good reasons why not! In any case, the call to discard a sample after sampling is performed turns the sample into a judgement sample rather than a probability sample, and thus not SVRS. It’s just the same as if a scientist were to keep collecting samples until he got one he liked, and discard the others!
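The effect of after-the-fact rejection can be seen in the same toy population. The sketch below assumes a purely hypothetical rejection rule (discard any sample whose mean is more than 1 away from the population mean); no one is known to use this rule, and in a real audit the target values would not even be available to apply it. It only illustrates how rejection changes the selection probabilities and the expected value of the estimate.

```python
from fractions import Fraction
from itertools import combinations

population = [1, 2, 3, 10]
population_mean = Fraction(sum(population), len(population))


def sample_mean(sample):
    return Fraction(sum(sample), len(sample))


def looks_representative(sample, tolerance=1):
    """Hypothetical rejection rule: keep a sample only if its mean is close."""
    return abs(sample_mean(sample) - population_mean) <= tolerance


def expected_mean(samples):
    """Expected sample mean when each listed sample is equally likely."""
    return sum(sample_mean(s) for s in samples) / len(samples)


def inclusion_probability(unit, samples):
    """Chance that a given unit ends up in the finally accepted sample."""
    return Fraction(sum(unit in s for s in samples), len(samples))


all_samples = list(combinations(population, 3))
kept_samples = [s for s in all_samples if looks_representative(s)]

print("expected mean, no rejection:  ", expected_mean(all_samples))
print("expected mean, with rejection:", expected_mean(kept_samples))
for unit in population:
    print(unit,
          inclusion_probability(unit, all_samples),
          inclusion_probability(unit, kept_samples))
```

Without rejection, every unit has the same inclusion probability and the expected sample mean equals the population mean. Under the toy rule, the sample (1, 2, 3) is never accepted, the unit 10 is always included, and the expected sample mean no longer matches the population mean.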

The next question is, what is it you actually want to be representative? This ought to be a real “aha” moment. You see, sampling is performed to collect “observations” from a limited number of “subjects.” The target values of the whole population are not observed (this applies to most cases; we omit the exceptions). But, it is the target values in the population that one wants the observed values from the sample to represent. So, it is actually not possible to compare the distribution of the population values (we don’t have them) and the sample values. Evaluation of “representativeness,” in this sense, cannot be done!

So what is it that our appeals consultants are doing? They’re comparing other (known) values from the population to their corresponding values from the sample. In the Medicare extrapolation scenario, this usually means comparing the distributions of the paid amounts as a proxy for the distributions of overpayments. Of course, one hardly need mention that the distribution of the paid amounts may not be anything like the distribution of the overpayments! So even if one were to entertain the validity of rejecting a sample for not being representative, we still don’t know the all-important piece of information: whether the overpayments are representative or not! (Putting aside all the problems with defining “representative,” as described above.)

Perhaps the reader will be unable to avoid the gut feeling that, somehow, it’s just fair and right that the distribution of paid amounts for the sample should be similar to the distribution of paid amounts of the population. Well, there are some ways to deal with this that do not involve invalid sampling procedures. One helpful strategy is to deal with outliers first. Outliers can either be eliminated, or they can be put into a separate “certainty stratum.” Both approaches are done prior to sampling and do not break the rule about “known probabilities.” And speaking of “stratum,” stratified sampling plans can also be used to mitigate this problem. By controlling the stratum sample sizes, one can ensure a more even “intuitive representation” without jeopardizing validity. Of course, when stratification is done, the consultant will certainly make some claim about how the strata are not correctly defined, or how the sample (as a whole, without regard to strata) is unlike the population. The possibilities are probably endless, but one thing is clear: An SVRS, which is always conducted with known probabilities of selection, is always a representative sample.
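As a rough illustration of the certainty-stratum idea, the sketch below puts every claim whose paid amount meets a threshold into a stratum that is reviewed in full, and draws a simple random sample from the rest. The function names, the threshold, and the toy numbers are all hypothetical, and this is not presented as the procedure any contractor actually uses; it only shows that the strata are fixed before sampling, so the selection probabilities stay known.

```python
import random


def split_and_sample(paid_amounts, threshold, n, seed=0):
    """Define strata from known paid amounts, then sample the remainder.

    Returns (certainty, remainder, sampled) as lists of claim indices.
    The certainty stratum is selected with probability 1; each remainder
    claim is selected with probability n / len(remainder).
    """
    rng = random.Random(seed)  # fixed seed so the draw can be reproduced
    certainty = [i for i, paid in enumerate(paid_amounts) if paid >= threshold]
    remainder = [i for i, paid in enumerate(paid_amounts) if paid < threshold]
    sampled = rng.sample(remainder, n)  # simple random sample, no replacement
    return certainty, remainder, sampled


def estimate_total(overpayments, certainty, remainder, sampled):
    """Certainty stratum counted exactly; remainder expanded from the sample."""
    certain_total = sum(overpayments[i] for i in certainty)
    sample_mean = sum(overpayments[i] for i in sampled) / len(sampled)
    return certain_total + len(remainder) * sample_mean


# Toy data: paid amounts are known for all claims; overpayments would be
# known only for reviewed claims (listed in full here just for the demo).
paid = [10, 12, 11, 9, 500]
overpaid = [1, 2, 0, 3, 50]

certainty, remainder, sampled = split_and_sample(paid, threshold=100, n=2)
print(certainty, sampled, estimate_total(overpaid, certainty, remainder, sampled))
```

The outlier claim is always reviewed, so it can never distort the expansion of the remaining stratum, and nothing about the sample is decided after the draw.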

A flashback of thoughts on education

Today I’m posting something that isn’t (necessarily) about statistics.

For as long as I’ve been alive, and based on my parents’ comments, apparently for some time before that, education has been “going downhill.”

We still hear anecdotes about how bad education is, we still hear about American test scores not being competitive in the world, and we still hear from college professors that freshmen are increasingly unprepared for college.

And now we have some new things. It’s the “millennials,” we hear, with their faces buried in their phones, their 50-millisecond attention span, their inability to reason in depth, and so forth. Well, maybe it’s not as bad as all that. But I was going through some old posts I wrote on another blog, more than 10 years ago, and I found this gem. It seems even more relevant now than it was then. Enjoy.

Tuesday, October 18, 2005
Education vs Instant Gratification

I think I have a new take on the “schools were better in the past” or “students were better in the past” argument. It’s this: We live in a culture of instant gratification, and education doesn’t fit.

Consider being a student a couple of hundred years ago. Suppose you were hungry. What would you do? Well, you would probably have to think of it in advance, and get prepared. Maybe you’d have to butcher a chicken, which you would have had to raise up from a chick. Maybe you’d have to go hunting for something, then butcher it, then cook it, and so, in a couple of hours, you’d have something to eat. If you wanted some vegetables, you’d better have planned ahead months in advance–planted a garden, tended it, put up and preserved the goodies. Then, when you’re hungry, you could take it out and prepare it (which might involve building a fire, etc). It would take lots of effort and advance planning. Of course, as a student, you might not have done all that yourself. But, chances are, you’d have been part of the process, helping your parents do exactly those things. So you would get the idea that if you wanted to eat, you’d better be prepared to put some work into it.

Today, you run to McDonalds or throw a frozen dinner in the microwave, and in a few minutes, you can eat. It’s pretty easy and doesn’t take much planning or work.

Suppose you were a student a couple of hundred years ago, and you were cold. What would you do? Throw another log on the fire–but first, you’d have to chop the wood, stack and dry it. Or maybe you use coal–dig it out of the ground and haul it home. Or maybe you gather buffalo chips in the fall. Or, you’d put on more clothes. But where do you get them? Long ago, you would have gathered straw, spun thread, woven the cloth, and finally sewn the garment. More recently, you’d still have to buy the cloth and make your clothes. It was a long process that involved planning and work, to make sure you’d have something warm to put on. You probably participated with your parents in all these activities. You’d get the idea that if you wanted to be warm, you’d better be prepared to put some work into it.

Today, you turn up the thermostat or run to Walmart and buy a sweater. In a few minutes, you’re warm. It’s pretty easy and doesn’t take much planning or work.

Suppose you were a student a couple of hundred years ago, and you went to school. You’d know that everything important in life requires hard work and advance preparation. You’d take it for granted that nothing important comes easy. You’d automatically be prepared to work hard at school, just like everything else.

Today, every other experience of your life tells you that the things you want can be quickly and easily obtained. There is practically no chance that you would ever have to worry about not having your basic needs fulfilled, even if you do absolutely nothing. You see advertising that tells you how all the hardest jobs can be done without breaking a sweat, leaving you plenty of time to play and enjoy yourself. Unfortunately, there haven’t been any major advances in education in the last 200 years. Learning proceeds pretty much just as it always has, with lots of hard work and advance planning. But you have no analogue for this. Nothing in your life has given you a context for it. So, you scoff at your teacher’s admonition that you put hard work and effort into your learning. Life just doesn’t work that way, in your experience. Certainly, there must be a way that you can flip a switch, or run to the store, or pop something into an appliance, so that your educational needs are quickly fulfilled, and you can get back to playing and entertaining yourself.

Is there really any possible way that today’s students could be as good as yesteryear’s?

The Independence Fallacy

First, some context:

This article applies to SSOE (Statistical Sampling and Overpayment Extrapolation) as practiced in the Medicare program.  The principle is more generally applicable, but this is the specific use case.

The “Independence Fallacy” arises when a sampling plan is challenged (generally during an overpayment appeal) on the grounds that sample units are not “independent,” and thus the estimation methods employed are claimed to be invalid.

Second, what is the importance of independent sample units?

Most estimation procedures rely on certain theorems in statistics that require normal distributions and “independent, identically distributed” (iid) sample units. In particular, the challenge will usually relate to the use of the “Central Limit Theorem” (there are actually several “central limit theorems”), which says that under certain conditions, the estimate will approximately follow a normal distribution. Independence is supposed to be one of the conditions required.

It is worthwhile to note that this condition of independence contrasts with what happens in something called a “Markov Chain,” in which each sample unit is a statistical function of the preceding one.  More precisely, the probability distribution of each sample point depends on what has been observed before.  Suppose you measure the speed of a bus at certain points between two stops.  Since the bus certainly goes through a pattern of acceleration and deceleration (though affected by traffic), the probability distribution of the speed at the next measurement point depends on the speed at the last measurement point.  In a scenario with independence, the observed value (or distribution) of any sample point has no relationship with the previous sample unit.  All sample units have the same distribution.

The claim that is made in overpayment cases is that sample points (which are individual overpayments) are not independent because some come from the same patient, or the same doctor, or the same day, etc. In fact, such observations may be correlated within the population. This correlation can occur if, for some reason, all of one patient’s overpayments are, on average, greater than another patient’s overpayments, due to some special feature of the patient. In this case the probability distributions of the two patients’ overpayments, prior to payment or even billing, are different. However, correlation does not equal dependence!

And now, things are going to get really technical.  We need to understand what the random variable actually is in this scenario.  The overpayments in the universe are not random variables.  In fact, they are fixed, though unknown, values.*  This is the reason for the phrase “prior to payment or even billing” in the previous paragraph.  Once payment is made, the overpayments become fixed, not random.  They do not have probability distributions.  (There is a distribution of values, but that is not the same thing.) The probability distribution only comes into play through sampling.

“So,” you may be asking, “how do we get a probability distribution out of fixed values?” Why, the same way that we get it from tossing a die. The six numbers on a die are fixed. What has probability is not the actual number itself (or face of the die) but the outcome of a random experiment–tossing the die. And in the case of tossing a fair die, each number or face has a 1/6 probability of being “chosen.” If you toss the die six times, there is no dependence from one toss to the next. Now suppose you toss six separate fair dice, one at a time. Again, there is no dependence between tosses. The second example does not have greater independence because it involves six different dice.

And so, the overpayments in the population are like the faces of the die.  They are fixed values with no probability distribution.  If all were reviewed, the result would be an exact number with no probability or statistics involved.  However, we do not review them all.  Instead, we create a random process (toss the die) in which several sample units are selected in sequence.  The values of the sample units are not known in advance, because the identity of the units (in the population) is not determined until the sampling procedure is carried out.  The order of the selected values is also not pre-determined.  This is why the sample units have an identical distribution, even though there can be widely varying values in the population.  Any of the population elements can end up in any of the sample unit positions.  This means the probability distribution of each sample unit is the same as the fixed relative distribution (unknown) of the population, and it is the same for every sample unit.  Therefore, no sample unit’s distribution depends on the previous sample unit, and they are independent.  Any sharing of characteristics relating to the origin of the sample unit is completely irrelevant.
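The argument above can be checked exactly on a small, made-up population. The sketch below assumes draws are made with replacement, exactly like repeated tosses of a die (the text does not state the replacement scheme, so this is an assumption of the example). The patient labels and amounts are invented for illustration, and the overpayments are deliberately clustered by patient.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Fixed (non-random) overpayments, clustered by patient: patient A's are
# all large and patient B's are all small. Hypothetical values.
population = [("A", 100), ("A", 100), ("A", 120),
              ("B", 0), ("B", 0), ("B", 10)]
values = [amount for _, amount in population]


def distribution(items):
    """Relative frequency of each distinct item, as exact fractions."""
    counts = Counter(items)
    total = sum(counts.values())
    return {item: Fraction(count, total) for item, count in counts.items()}


# Every equally likely outcome of two draws with replacement.
pairs = list(product(values, repeat=2))

population_dist = distribution(values)
first_draw = distribution([a for a, _ in pairs])
second_draw = distribution([b for _, b in pairs])
joint = distribution(pairs)

# Independence: the joint probability factors into the two marginals.
independent = all(joint[(a, b)] == first_draw[a] * second_draw[b]
                  for (a, b) in joint)

print(population_dist)
print(first_draw == population_dist, second_draw == population_dist, independent)
```

Each sample position has the same distribution as the fixed relative distribution of the population, and the joint distribution of the two draws factors into the product of the marginals, regardless of which patient any value came from.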

In conclusion, the idea that sample units in overpayment extrapolations might not be independent is a fallacy, with no possible basis in statistical theory.

*An objection may be raised here, that different reviewers might determine different overpayments.  This is a separate issue, involving “measurement error” and “bias.”  These issues are addressed in the appeals process by re-reviewing claims in question, and do not affect the statistical theory.  Regardless of what different people may decide, there is a “truth” about each overpayment which the review process is intended to uncover.

Medicare Overpayment Extrapolation Consulting

I’ve been reading various websites that purport to give advice about defending against overpayment extrapolations done by various CMS contractors or the OIG. It doesn’t seem like many of them have any actual insider knowledge about how extrapolations are done or what might be helpful during the audit or during the appeals process.

In fifteen years of working with Medicare, I have never had the impression that any CMS contractor is “out to get” providers. On the contrary, I have seen repeated efforts to “bend over backwards” to help providers get into compliance with the Medicare program, sometimes working for years with extremely recalcitrant providers trying to get them to bill properly and stop abusing the Medicare payment system. Thus my first piece of Honest Advice is this: When Medicare tells you that you are doing something wrong, or may be doing something wrong and should internally investigate, LISTEN! This is your first opportunity to head off any possible extrapolation! If you have questions, or are unsure what the issue might be, work with the contractor (or your MAC) to understand the issue. Get medical review experts to advise you on proper billing and documentation.

I know documentation requirements are onerous. This is the government we’re talking about. I know medical professionals are chafing under the documentation load already. I know documentation requirements are raising the cost of medical care and not necessarily improving the quality. These are my opinions, which I think are widely shared in the industry. HOWEVER, you cannot escape the fact that Medicare payment is contingent on proper documentation. If you don’t have the documentation that is required, it is essentially illegal for you to accept payment, and it is an error for the MAC to give you payment (even when no documentation was requested). Time and time again I have seen the argument made that the government did not prove the service was not rendered, nor that it was not necessary. But this is irrelevant–if you did not prove the service was rendered, medically necessary, and covered by Medicare in your documentation, you are not entitled to payment from Medicare. It’s really that simple. In addition, if you knowingly accepted payment in the absence of proper documentation, you are getting into the realm of actual fraud, regardless of whether the actual services were proper and payable in every other way.

Many providers attempt to appeal an extrapolation by challenging the statistics. There are several (questionable) experts out there who make it their business to try to get extrapolations overturned with a variety of spurious arguments. This used to work sometimes, because ALJs were not properly educated about the statistical aspects of these appeals, and, particularly if the contractor did not attend the hearing, they were duped into believing these claims. Things have gotten much tighter now. There are a number of Departmental Appeals Board decisions and Circuit Court decisions that back up and provide legal precedent for extrapolation methods. It is true that mistakes occasionally occur in the extrapolation process, and providers should not neglect to examine the methodology closely. However, they should not rely on this, and should be prepared to accept the decision of a consultant (one with actual Medicare extrapolation experience) that the extrapolation was conducted properly.