You Can’t Trust Self-Report Data on Rape. Here’s Why

By Megan McArdle | 11:38 am, February 29, 2016

(Bloomberg View) — Rape statistics are a mess.

Rape is at the center of the national debate, and it’s no wonder: the Obama administration’s push to crack down on campus rapes through the Civil Rights Act, the notorious and retracted story alleging a gang rape at UVA, the raft of accusations against Bill Cosby. Naturally, statistics about rape have been an important part of this conversation. Unfortunately, these statistics are hard to agree upon; what you get is a duel between conflicting sets of numbers, which should be resolved by careful analysis but instead leads each side to turn up the volume and wave its favored research harder.

The “careful analysis” option turns out to be grueling. Human sexuality is complicated, and has defied almost every society’s attempt to neatly divide it into two piles labeled “right” and “wrong.” Wherever one draws the line, it’s going to create difficult boundary cases. I’ve been groped in a crowd more than once, and I’d call that “disgusting,” not “sexual assault.” Nor would I think of applying that name to the birthday reveler who grabbed me in a bar one night and kissed me. On the other hand, what if he’d slipped his hands up my shirt, or under my skirt? At some point it progresses to something that everyone in the country can agree was sexual assault — but where is that point? Some activists would say that the kiss was an assault. Others will draw the line someplace very different.


Or take drinking. At exactly how many drinks, at what level of intoxication, does one become unable to consent? Does it matter who bought the drinks? If so, if one person bought drinks for the other in the hope of a spark, is that the behavior of a rapist? Does it matter whether the person buying drinks was drinking as well? Whether the buyer was drinking more or less? None of that is a simple case like the villain slipping roofies into his date’s drink.

There’s a vast array of complications that are hard to adjudicate, even when consent is not muddied by alcohol or drugs. If a teenage boy pushes the boundaries while he and his girlfriend are fooling around, and she tells him to stop and he does, should we call that sexual assault? Surely the context matters; if a stranger walked up to the girl and did the same things to her, and stopped when she told him to, we would be much more likely to think of the act as assault.

So there’s a naming issue. But even if everyone agreed on what terms like “sexual assault” and “rape” mean, gathering data on their incidence would be a challenge. People lie to researchers, especially about behavior that is considered immoral or taboo or very emotional. So should researchers ask women whether they have been sexually assaulted, or ask about a long list of specific behaviors instead? The polling approach has been the topic of some controversy.

The starting point on gathering this data is to manage expectations: Many of the statistics that could inform our national conversation will always be “dark figures.” And not just because words fail us. It’s thorny because we know that people do make false accusations at least occasionally, and there is very rarely a third witness for sexual encounters who could give an opinion on whether they were consensual. Thornier still when both parties are intoxicated. And rape is different from other violent crimes; unlike, say, punching someone or grabbing money from the cash register, sex is often undertaken consensually and joyfully. If I punched you, investigators would probably not need to ask whether you gave affirmative consent to have your nose broken.

It’s certain that we used to undercount rape, too often assuming consent in contexts where none was given. It’s possible that we are moving toward overcounting it, by too often assuming a lack of consent. As all scientists know, there is almost always some tradeoff between these two types of error; generally any steps to reduce the false negatives will generate more false positives.

This reality rarely seems to register with the concerned people marshaling little armies of numbers. They prefer the statistics that support their side, presented with angry certainty. They shun nuance, rarely acknowledging what is unknown and what is unknowable. Nowhere is this more obvious than in discussions of false accusations of rape.

The number of false rape reports is obviously a number we’d like to have. Whether that number is many, or few, alters how vigorously police interrogate victims’ stories, how the media treats accusations of rape, how juries decide tricky cases. Maybe it shouldn’t, you’d argue, but humans are imperfect, and it inevitably does. It’s not surprising that in the wake of the Rolling Stone debacle, we’ve had a lot of feminists claiming that we should draw no wider lessons from this case because statistics show that false positives are rare, and a lot of people on the other side arguing that they can show beyond a reasonable doubt that false reports are epidemic. Both sides should stop, because they are wrong.

I don’t mean that they disagree with me. I don’t mean I think they are wrong. I mean that they are wrong.

Any number of pieces have recently been written suggesting that we actually know, or have a pretty good idea of, how many rape reports are false. Deadspin, for example, wrote of the Jameis Winston case:

“There’s no doubt that being falsely accused of rape is a dreadful thing that no one should have to endure. One of the reasons it is such a dreadful thing is that false accusations of rape basically do not happen. Statistically, between 2% and 8% of reported rapes are found to be false, but only about 40% of rapes are reported. Do a little math and that means that, for every false accusation of rape, there are up to 100 actual rapes that take place.”

When I pointed out on Twitter that the author did not know the percentage of false rape reports, and therefore could not possibly calculate the ratio of false reports to rapes, he suggested that this was a matter of opinion: Maybe I liked one study better, but he thought his was pretty good. This is not a difference of opinion; it is simply a misunderstanding about the data. He has substituted a number he knows — which is, presented in its absolutely best light, the percentage of reports that can definitely be shown to be false by investigators using stringent criteria — for a number he does not know, which is how many reports of rape are actually false.
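The arithmetic in the quoted passage can be reproduced in a few lines, which makes it easier to see how much the answer depends on the input that is not actually known. This is only a sketch of one plausible reading of that calculation, not the Deadspin author's own method: the function name is hypothetical, and it assumes that every report not counted as false corresponds to an actual rape and that the reporting rate applies only to actual rapes.

```python
def rapes_per_false_report(false_share: float, reporting_rate: float) -> float:
    """Actual rapes per false report, under the quoted passage's assumptions.

    false_share: fraction of police reports assumed to be false.
    reporting_rate: fraction of actual rapes assumed to be reported.
    """
    if not 0 < false_share < 1:
        raise ValueError("false_share must be strictly between 0 and 1")
    if not 0 < reporting_rate <= 1:
        raise ValueError("reporting_rate must be in (0, 1]")
    true_reports = 1.0 - false_share              # per single report filed
    actual_rapes = true_reports / reporting_rate  # scale up for unreported rapes
    return actual_rapes / false_share


if __name__ == "__main__":
    # 0.02 and 0.08 come from the quoted passage; 0.20 and 0.40 are
    # hypothetical values, included only to show the sensitivity.
    for share in (0.02, 0.08, 0.20, 0.40):
        ratio = rapes_per_false_report(share, reporting_rate=0.40)
        print(f"false share {share:.0%}: about {ratio:.0f} rapes per false report")
```

With the quoted inputs the ratio comes out at roughly 122 (for 2 percent) and roughly 29 (for 8 percent), in the neighborhood of the passage's "up to 100." The calculation itself is fine; the trouble is that the false share fed into it is the share of reports proven false, not the share that are false, and the output moves a long way when that input changes.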

Perhaps a parallel will make what I mean more clear. Every year, it’s virtually certain that some number of people get away with killing their spouses. More than occasionally, it happens that investigators think they killed their spouses. They’re maybe even pretty sure that they killed their spouses. But they can’t prove it. In the statistics, this will not show up as “spousal murder” or “intimate partner violence”; it will show up as an unsolved case. But they still killed their spouse. How often does this happen? We have absolutely no idea.

I’ve now spent quite a bit of time reading research on rape prevalence over the last few decades. What you see in the literature on false reports is a general move from methods designed to exclude more false negatives (finding a rape report to be true, when in fact it was false), toward methods designed to minimize the number of false positives (finding a rape report to be false, when in fact it was true). The newer studies use quite stringent criteria, where you basically need a confession, or strong evidence that the attack could not have happened as described, to declare a report false.

Proponents of this approach tend to describe it using phrases like “more methodologically rigorous,” but this oversells things a bit. What it is is more conservative, in the accounting sense: it is a nice baseline number that we can be reasonably sure of. It almost certainly excludes some reports that were, in fact, false, but could not be conclusively shown to be so. But approaches that include more questionable cases (like reports that investigators think were probably false but could have happened the way the accuser says) are going to have a lot more variability, and may well dramatically overstate the problem of false accusations.

Critics point out that the convergence upon this particular methodological choice has a political and ideological component, and of course they’re right, it does. But that doesn’t mean it’s wrong for researchers to use those criteria. It’s very useful to know how many accusations can be demonstrably proven to be false, as a floor, a minimum, that gives us a solid starting point.

The problem comes when this baseline number is abused by journalists and activists who would declare the floor to be a ceiling. This is simply wrong. It does not help to say, well, all the studies tend to cluster together around similar results, so there’s a research “consensus.” It is completely unsurprising that they cluster; if the consensus approach is the conservative method, then we would expect the results of well-done studies to cluster pretty tightly. That’s the beauty of the conservative approach.

But it has its limits. Definitionally, the number of false reports is the number of detected false reports plus the number of undetected false reports. Even assuming you can get an accurate count of the former, the latter is definitionally unknown. So unless you assume that the police have superpowers and no false reports ever go undetected, the sum of the two will always be a dark figure.
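The definition above is a one-line sum, and writing it out shows why the published figure can only be a floor. The sketch below is illustrative only: the function name and every number in it are hypothetical, and the undetected count is precisely the quantity that no study can supply.

```python
def false_report_rate(total_reports: int, detected_false: int,
                      undetected_false: int) -> float:
    """Share of reports that are false: (detected + undetected) / total."""
    if total_reports <= 0:
        raise ValueError("total_reports must be positive")
    if detected_false < 0 or undetected_false < 0:
        raise ValueError("counts must be non-negative")
    if detected_false + undetected_false > total_reports:
        raise ValueError("false reports cannot exceed total reports")
    return (detected_false + undetected_false) / total_reports


if __name__ == "__main__":
    # Hypothetical department: 1,000 reports, 50 conclusively shown false.
    total, detected = 1000, 50
    # The undetected count is unknown; these are guesses, not data.
    for undetected in (0, 50, 150):
        rate = false_report_rate(total, detected, undetected)
        print(f"undetected={undetected}: false-report rate {rate:.0%}")
```

Setting the undetected count to zero reproduces the conservative figure, and any positive guess pushes the rate above it. Nothing in the data says which guess is right.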

These numbers are often further tortured by overgeneralization. For example, I’ve seen people saying that police think false reports are more common than they actually are, when a careful understanding of the limited data would show this statement to be obviously nonsensical. Or they say that the data shows that false reports are more likely to include elements like strangers and violence, which fit the traditional narrative about rape. That may well be. It may also be that false accusations against people you know are harder for the police to conclusively disprove.

The most egregious error, however, consists of applying studies of law enforcement investigations to other contexts where there’s no reason to think that the numbers would be identical:

  • Campus judiciary processes
  • Public accusations of crimes that are well beyond the statute of limitations and are therefore not investigated
  • Media stories about sexual assaults that were not reported to the police
  • Accusations made in social contexts or support groups

The true numbers of false reports in any of these situations could be higher or lower than that of police reports. The only certainty is that we should not assume studies from the justice system are relevant.

On the flip side is the way I’ve seen people use the Kanin study, which purported to show that 41 percent of rape accusations were false, based on a police department that made “serious” offers to polygraph both accuser and accused. Polygraphs are not truth machines; they are especially unreliable when the person being polygraphed is under great emotional distress. And dismay at being distrusted could easily cause genuine victims to disengage from a hostile process, even though their disengagement could wrongly land them in the “false accuser” category. The study was also conducted in a single city decades ago, with vaguely described methodology. I see people using this study to make a mistake like the one above: treating the 41 percent figure as a floor, when at best it’s probably more like a ceiling. Maybe even a loft ceiling.

Could the number be between 3 and 8 percent? Absolutely. But it could be substantially higher than 8 percent; it could even be that 40 percent of rape accusations or more are false, though I’d bet against that. It’s possible that less than 3 percent of rape accusations are false, though again, I would offer good odds against that. The point is that we don’t know, and the groups that claim to know are wrong together.

Until we get mind reading machines, the only thing we can know about the actual prevalence of false rape reports is that we don’t know it.

This article was written by Megan McArdle from Bloomberg and was legally licensed through the NewsCred publisher network.