A study on dishonesty was based on fraudulent data

This is definitely not about writing directly, but it is quite possibly a wonderful story about human nature, which forms the basis of many works of fiction and is present at all times, everywhere.

From The Economist:

IF YOU WRITE a book called “The Honest Truth About Dishonesty”, the last thing you want to be associated with is fraud. Yet this is where Dan Ariely, a behavioural economist at Duke University, finds himself, along with his four co-authors of an influential study about lying.

In 2012 Mr Ariely, along with Max Bazerman, Francesca Gino, Nina Mazar and Lisa Shu, published a study on how to nudge people to be more honest. They concluded that when asked to affirm that information is truthful before they give it, rather than afterwards, people are more likely to be honest. The results stemmed from three experiments: two conducted in a laboratory (led by Mr Bazerman, Ms Gino and Ms Shu), and a third based on data from a car-insurance company (led by Mr Ariely and Ms Mazar).

Several researchers have tried and failed to replicate the results from the laboratory tests. But it is the car insurance study which is driving the most serious doubts. It asked policyholders to self-report the number of miles they had driven. Customers were asked to sign a statement on the reporting form which said, “I promise that the information I am providing is true”; half of the forms had this declaration at the top, half had it at the bottom. All of the car-owners had previously reported their odometer readings to the insurance company, giving a baseline for the data (the time elapsed between the baseline readings and the experiment varied for each customer). Mr Ariely and Ms Mazar found that when customers were asked to sign the statement at the top of the form, there was a 10.25% increase in the number of self-reported miles, compared with the miles reported on forms where the statement was signed at the bottom. The more miles a car has driven, the more expensive the insurance will be. The researchers concluded that signing the truthfulness statement at the top of the form resulted in people being more honest (and thus on the hook for higher insurance premiums).

With over 400 citations on Google Scholar, these findings have spread far and wide. But on August 17th Leif Nelson, Joe Simmons and Uri Simonsohn, who run a blog called Data Colada, published an article, based on the work of a group of anonymous researchers, dissecting what they believe to be evidence of fraud. There are several eyebrow-raising concerns, although two in particular stand out: the number of miles reported by the policyholders, and the way in which the numbers were supposedly recorded.

In a random sample of cars, one would expect the number of miles driven by each vehicle to follow a bell-shaped curve (such as a “normal distribution”). Some cars are driven a lot, some are barely driven, but most fall somewhere in between these extremes. But in the experiment from 2012, the number of miles driven follows a uniform distribution: just as many cars drove under 10,000 miles as drove between 40,000 and 50,000 miles, and not a single car drove more than 50,000 miles. Messrs Nelson, Simmons and Simonsohn suggest that a random number generator was used to add between zero and 50,000 to original readings submitted by the customers.

The random number generator theory is backed by the second problem with the data. Many people, when asked to write down big numbers, round to the nearest ten, hundred or thousand. This can be seen in the data for the original odometer readings: nearly 25% of the mileages end in a zero. But in the experiment, each digit between zero and nine is equally represented in the final digit of the mileage reports. Humans tend to round numbers, but random generators don’t.

All five members of the original research group admit that the data in their study were fabricated. But all say they were duped rather than dishonest. “We began our collaboration from a place of assumed trust⁠—rather than earned trust,” said Ms Shu, on Twitter. However, she declined to comment further to The Economist. Mr Ariely’s name is listed as the creator of the Excel spreadsheet containing the original data. But he says he has no recollection of the format of the data he received, speculating that he might have copied and pasted data sent to him into the spreadsheet. One explanation is that the insurance company, or a third party that collected data on its behalf, falsified the numbers. The Hartford, the Connecticut-based insurance company that allegedly provided data for the experiment, could not be reached for comment. Mr Ariely has requested that the study be retracted, as have some of his co-authors. And he is steadfast that his mistake was honest. “I did not fabricate the data,” he insists. “I am willing to do a lie detection test on that.”

Link to the rest at The Economist

From Buzzfeed News:

The paper also bolstered the reputations of two of its authors — Max Bazerman, a professor of business administration at Harvard Business School, and Dan Ariely, a psychologist and behavioral economist at Duke University — as leaders in the study of decision-making, irrationality, and unethical behavior. Ariely, a frequent TED Talk speaker and a Wall Street Journal advice columnist, cited the study in lectures and in his New York Times bestseller The (Honest) Truth About Dishonesty: How We Lie to Everyone — Especially Ourselves.

Years later, he and his coauthors found that follow-up experiments did not show the same reduction in dishonest behavior. But more recently, a group of outside sleuths scrutinized the original paper’s underlying data and stumbled upon a bigger problem: One of its main experiments was faked “beyond any shadow of a doubt,” three academics wrote in a post on their blog, Data Colada, on Tuesday.

The researchers who published the study all agree that its data appear to be fraudulent and have requested that the journal, the Proceedings of the National Academy of Sciences, retract it. But it’s still unclear who made up the data or why — and four of the five authors said they played no part in collecting the data for the test in question.

That leaves Ariely, who confirmed that he alone was in touch with the insurance company that ran the test with its customers and provided him with the data. But he insisted that he was innocent, implying it was the company that was responsible. “I can see why it is tempting to think that I had something to do with creating the data in a fraudulent way,” he told BuzzFeed News. “I can see why it would be tempting to jump to that conclusion, but I didn’t.”

. . . .

But Ariely gave conflicting answers about the origins of the data file that was the basis for the analysis. Citing confidentiality agreements, he also declined to name the insurer that he partnered with. And he said that all his contacts at the insurer had left and that none of them remembered what happened, either.

According to correspondence reviewed by BuzzFeed News, Ariely has said that the company he partnered with was the Hartford, a car insurance company based in Hartford, Connecticut. Two people familiar with the study, who requested anonymity due to fear of retribution, confirmed that Ariely has referred to the Hartford as the research partner.

The Hartford did not respond to multiple requests for comment from BuzzFeed News. Ariely also did not return a request for comment about the insurer.

. . . .

The imploded finding is the latest blow to the buzzy field of behavioral economics. Several high-profile, supposedly science-backed strategies to subtly influence people’s psychology and decision-making have failed to hold up under scrutiny, spurring what’s been dubbed a “replication crisis.” But it’s rarer that data is faked altogether.

And this is not the first time questions have been raised about Ariely’s research in particular. In a famous 2008 study, he claimed that prompting people to recall the Ten Commandments before a test cuts down on cheating, but an outside team later failed to replicate the effect. An editor’s note was added to a 2004 study of his last month when other researchers raised concerns about statistical discrepancies, and Ariely did not have the original data to cross-check against. And in 2010, Ariely told NPR that dentists often disagree on whether X-rays show a cavity, citing Delta Dental insurance as his source. He later walked back that claim when the company said it could not have shared that information with him because it did not collect it.

Link to the rest at Buzzfeed News

PG picked an online random number generator at random.

Somewhere in his brain, he remembered reading that random numbers generated by a computer are not truly random numbers, but pseudo-random numbers – he is not certain of the difference, but expects that entering “Random Number Generator” into Google and picking one of the first listings that appears will turn up a pseudo-random number generator. Or something.

At any rate, here is a list of ten random numbers that PG created with the online random number generator – pseudo or non-pseudo, he can’t tell the difference:

12654
36023
23996
13888
33367
23237
18197
18197
17624
23718

If, as the OPs suggest, the main culprit is a TED Talk speaker and a Wall Street Journal advice columnist who used a random number generator to create the mileage figures upon which the whole ground-breaking study was based, it makes PG question the expertise of TED Talk speakers and Wall Street Journal advice columnists.

Additionally, is there a reason why none of these heavy-duty university mathematics and data science experts ever noticed that none of the numbers in the study’s mileage data was rounded off?
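For the curious, here is a minimal sketch in Python of the rounding check described in the Economist excerpt, run against the ten numbers PG generated above (a hypothetical stand-in for the study’s mileage column). It simply tallies the final digit of each figure: human-entered odometer readings pile up on final zeroes (and often fives), while figures from a random number generator land on each final digit about ten percent of the time.

```python
# A minimal sketch of the last-digit check described in the OP.
# "mileages" is just PG's ten generated numbers, standing in for the
# study's reported mileage column.
from collections import Counter

mileages = [12654, 36023, 23996, 13888, 33367, 23237, 18197, 18197, 17624, 23718]

last_digits = Counter(m % 10 for m in mileages)

for digit in range(10):
    share = last_digits[digit] / len(mileages)
    print(f"final digit {digit}: {share:.0%}")

# Human-recorded readings would show a big spike at 0 (and usually 5);
# uniformly random figures land on each final digit roughly 10% of the time.
```

With only ten numbers the percentages bounce around quite a bit, which is part of the point a commenter makes below about needing much larger samples before the pattern means anything.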

23 thoughts on “A study on dishonesty was based on fraudulent data”

  1. If your random range was 1 to 100,000 then the range from 12000 to 36000 would suggest they are truly random. A human generating pseudo random numbers would probably have thrown in at least 1 number from 50000-100000.

    But mainly you need much bigger samples, i.e., thousands of times the size. Truly random numbers can have repeated values and, given a large enough sample, a fairly even distribution (see the sketch at the end of this comment).

    Pseudo-random numbers generated by a human will tend to throw in high numbers to balance out a series of low numbers and won’t have repeated numbers, unless of course the person is trying to make them look random and takes that into account.

    A failed attempt at random numbers we did in-house in the ’80s tended to be biased towards the 1-10 range. It was supposed to pick a question and answer from 1,000 choices numbered 1-1,000, but if you had written every answer down, then after perhaps 20 tries you would get a question whose answer you had already written down.

    <– Not a data science expert, just someone who has sometimes had similar issues.
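    For what it’s worth, here is a minimal sketch of that point in Python, using the standard library generator as a stand-in (no claim that this is how the study’s figures were produced): draw a large sample from a 1-to-50,000 range, count the repeated values, and see how evenly the draws fill the range.

    ```python
    # A rough illustration of the "bigger sample" point above, not a claim
    # about how the study's numbers were actually produced.
    from collections import Counter
    import random

    rng = random.Random()  # unseeded, so each run differs
    draws = [rng.randint(1, 50_000) for _ in range(10_000)]

    repeats = len(draws) - len(set(draws))
    print(f"repeated values: {repeats}")  # truly random draws do repeat

    # Bucket the draws into five 10,000-wide bands to see the roughly even spread.
    buckets = Counter((d - 1) // 10_000 for d in draws)
    for b in sorted(buckets):
        print(f"{b * 10_000 + 1:>6}-{(b + 1) * 10_000:>6}: {buckets[b]}")
    ```

    With 10,000 draws, each band typically holds around 2,000 values and a few hundred draws repeat, which is the “even distribution plus repeats” signature described above.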

  2. To answer your question, those heavy-duty university mathematics and data science experts didn’t notice the problem because they never saw any of this. While the paper undoubtedly was peer reviewed, it likely was reviewed by other behavioral economists. Usually, it makes sense to have a paper reviewed by people in the same field. The problem is that social scientists are, as a rule, terrible at math. There are honorable exceptions, but as a rule of thumb, if you are a math whiz and want to become an academic scientist, you are unlikely to go into the social science side. Notice how the recent discussion about the replication crisis is in the social sciences. We aren’t seeing this stuff about particle physics. This isn’t a slam against the social sciences in general, but about physics envy leading social scientists into waters they are unqualified to navigate.

    On a different note: “In a random sample of cars, one would expect the number of miles driven by each vehicle to follow a bell-shaped curve (such as a “normal distribution”).” One would? This is not obvious to me. I would think that below some number of miles, it wouldn’t be worth the expense to own a car. This would tend to lower the left side of the curve until the number reaches the point where owning a car becomes cheaper than using alternatives.

  3. A set of pseudo-random numbers appears random, but it can be replicated. Such a generator takes a “seed”, or starting number. Started in that way, it produces a series of numbers that appear to have a random distribution. You can then stop, go back, and start the generator again with the same seed, and get the same series of numbers.

    This means that two separate entities starting the same pseudo-random number generation algorithm with the same seed ‘see’ the same sequence of numbers and can operate in sync. This has real-world applications in crypto and, interestingly enough, in your cell phone. Cell phones and towers multiplex the available spectrum by synchronizing with pseudo-random noise (PN) sequences, which allow each to predict what the other is going to do next.
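    A minimal sketch of that “same seed, same sequence” behavior, using Python’s standard library generator (a Mersenne Twister, not a crypto-grade or cellular PN generator, but the reproducibility property is the same):

    ```python
    # Two generators started from the same seed produce identical sequences.
    import random

    gen_a = random.Random(42)
    gen_b = random.Random(42)

    seq_a = [gen_a.randint(0, 50_000) for _ in range(5)]
    seq_b = [gen_b.randint(0, 50_000) for _ in range(5)]

    print(seq_a)
    print(seq_b)
    assert seq_a == seq_b  # reproducible, which is why it is "pseudo" random
    ```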

  4. This is one more pebble in the avalanche of the replication crisis.

    As you suggest PG, TED Talk scientists and advice columnists have taken it on the chin over the last 5-6 years. Most everything you’ve heard about psychology and social science in the popular press ain’t what they make it out to be.

    All the work about “priming”, about the situational and environmental influences on our behavior — all of it central to this notion of the “nudge”, given to us dull-witted herd animals by our benevolent leaders — is about as scientific as reading the tarot.

    Richard hits on this in his comment:

    “Notice how the recent discussion about the replication crisis is in the social sciences. We aren’t seeing this stuff about particle physics. This isn’t a slam against the social sciences in general, but about physics envy leading social scientists into waters they are unqualified to navigate.”

    When social scientists try to treat human behavior as a physics problem, they run up against the limits of that methodology applied to a domain where there aren’t that kind of determinate answers.

    They are looking for effects that, if they exist at all, are weak and unreliable. They then go about using statistical methods to sift out that weak effect from large bodies of data. Sure, you can get “results” from that… just don’t expect anyone else to find them.

    And that’s before getting into questions of actual misconduct or plain old sloppiness.

    They could have just listened to Aristotle. All the way back in 350 BC he was telling us that any generalizations about human behavior would only hold in general and for the most part.

    • When social scientists try to treat human behavior as a physics problem, they run up against the limits of that methodology applied to a domain where there aren’t that kind of determinate answers.

      Because maybe it isn’t science? These are free-riders, not scientists.

      I’ve had the same argument with economists trying to pretend it’s a science.

      • Not to be rude, but that’s a silly thing to say.

        Maybe that kind of dogmatism impresses ideological physicists and math geeks with a certain temperament, but the idea that a field of study isn’t a “real science” because it isn’t doing physics sounds closer to political screeching than a well-justified position.

        Petty turf wars are rarely convincing, much less backed up by serious reasons.

        Biology is a real science — one of the first sciences in fact — and it faces many of the same limitations of quantification, even though the math tends to get you a little further than it will where humans are involved.

        Ernst Mach may have been loud and cranky, but loud and cranky doesn’t make you right.

        • Not all sciences are created equal.
          Yes, science is by definition knowledge, but there are differences among the descriptive sciences, the physical sciences, and the behavioral sciences.

          All suffer from reproducibility and observer issues, but in the hard sciences (and biology is one) focused on the external world they are minor (quantum mechanics aside), whereas the behavioral sciences like economics and the social sciences, to say nothing of “political” science, suffer from massive reproducibility issues, mostly because of observer issues like bias, both conscious and unconscious.

          Conscious bias is hard enough to police, but unconscious bias is crippling in the social sciences (the primary subject of the OP), where it is well-nigh impossible to distinguish between hardwired mechanisms and cultural drivers. Now add academic politics and partisan forces to the mix.

          Think back to Murray’s BELL CURVE.
          Who is biased? Murray? His kneejerk critics? Both?

          Social scientists can’t even agree on what Intelligence is, much less how to quantify it. And in the eyes of many out there, if you can’t reliably quantify it, it ain’t science.

          I would suggest that in the divide between hard sciences and “soft” sciences, biology belongs on the hard side, psychology straddles the border (currently evolving into a hard science) and the rest of the behavioral studies are still too fuzzy to be called even soft science. Most are barely distinguishable from parapsychology and psionics, which at least have the virtue of being useful in fiction. 😀

          • Gate-keeping from Authorities on High is as unconvincing when scientists try to do it as when publishers try to hand down edicts about what’s worth reading.

            Science is what scientists do. Any moves beyond that tautology have frustrated far better minds than us mere blog commenters.

            • If science is what scientists do (the scientific method) then social “scientists” aren’t. The issue isn’t whether behavior studies folks are trying to do science, but whether they’re succeeding and actually doing it.

              This is not a matter of authorities on high: it’s about reproducibility, falsification, and rigor, all of which are founded on the process of the scientific method. Miss any of the steps and you’re not doing science.

              https://en.wikipedia.org/wiki/Scientific_method

              The world of the fuzzy studies is more about opinions and hand-waving than hard facts, which is why the debates over the validity of the data are so central, whereas in the hard sciences you deal with tangible real-world objects and entities: scientists might debate the meaning of a fossil, but nobody has to ponder its reality. Folks are able to debate and *test* for experimental error, confident that the outcome will either validate or falsify a hypothesis. Verification or falsification can be achieved.

              The same methodology must produce the *exact* same outcome, not something “generally similar”. Only two outcomes are possible. (Even in quantum mechanics, which is probabilistic rather than deterministic.)

              That is the purpose of the method and why the method defines the activity, not a proclamation by anybody on high or down below.

              • This is not a matter of authorities on high: it’s about reproducibility, falsification, and rigor, all of which are founded on the process of the scientific method. Miss any of the steps and you’re not doing science.

                Pay close attention to this statement.

                Which Science Deciders put *these criteria* on the list of scientific methodology?

                What scientific discovery could disprove them as standards and ideals for scientific practice?

                Do you believe that mathematical physicists, cosmologists, chemists, evolutionary biologists, and archaeologists all use a single scientific method?

                These are the hard questions that this facile hard/soft distinction can’t even begin to answer.

                In fact they sound a lot like the claims of a soft, non-quantitative science!

                Here I spent all that time reading Bacon, Hume, Duhem, Poincare, Carnap, Popper, Polanyi, Quine, and Kuhn when I should have just taken the Wikipedia at face value. You learn something every day!

        • Physics has nothing to do with it. However, physics is definitely a science.

            Science is an approach to the discovery of objective facts and phenomena about the natural world. When science made huge strides, other areas of study started calling themselves sciences, but they had no set of objective facts and phenomena to present.

            The material physics studies is the same yesterday, today, and tomorrow. Other studies are based on observation of human behavior, which varies all over the place. It changes, and is not a set of objective facts or phenomena.

            People who study human behavior try to pretend it is, but we can observe that it isn’t. Screeching, politics, dogma, rudeness, impressions, temperament, turf wars, and ideology all fall in their territory.

          I encourage the study of the human condition and behavior, and wish those who do the best. However, it’s time for them to stand up on their own two feet, and present their studies as deserving of respect. There is no need for them to call themselves something they are not. Better for them to show such value that others actually want to be like them.

            • “Science is an approach to the discovery of objective facts and phenomena about the natural world.”

            I’ve no argument with this, at the most general level, though this statement is so general that it admits almost any systematic inquiry, which isn’t very helpful.

            The devil’s in the details of what counts as objective, and how/why we get to that understanding.

              “The material physics studies is the same yesterday, today, and tomorrow. Other studies are based on observation of human behavior, which varies all over the place. It changes, and is not a set of objective facts or phenomena.”

            Oh, no, that’s far too quick a leap in that last line.

              First, most every interesting phenomenon in the real world is mutable in a way that the abstractions and oversimplifications of physics — substitute your favored quantitative method of abstraction as you wish — don’t contend with.

              Try to use Newton’s equations to find a closed-form solution for a system of three or more interacting bodies.

            The abstractions tell us neat things, but it’s a profound error to confuse the model of an object with the object itself. John von Neumann, one of the legitimate geniuses of our species, warned against this in his writings.

              Second, if the concern is with the timeless, eternal, universal, immutable truths, then I’m afraid we’ve strayed out of the world of empirical facts. We’re now rubbing elbows with the likes of Plato and St. Augustine.

            Empirical facts are contingent and changing by definition. They’re always open to revision in the face of future discoveries. That’s a simple artifact of the inductive logic used to derive facts from observations.

            Universal and necessary truths belong to metaphysics and theology.

            No issue with that, if done self-consciously, but that isn’t the business scientists are in.

            Thirdly, it’s worth some time dwelling on this identification of the objective with that which does not change.

            Where’d that come from?

            I know where it came from — it came from Plato, and it’s been with us ever since — but the rhetorical point means to ask how people confused empirical studies of contingent laws and events (not to be confused with necessary and universal) with perennial metaphysics and theology.

            The question is what makes it the case that a changeable reality cannot be objective or conflicts with objectivity.

            A river changes from moment to moment, but we wouldn’t call it an illusion.

            Organisms constantly turn over matter and energy to such a degree that they’re better thought of as enduring patterns in a flow. They sure aren’t subjective hallucinations.

            Most things in the world are like rivers and organisms — complex, changing, and multi-dimensional. Abstractions have to screen out nearly all the complexity of concrete things for the sake of putting them to use.

            That’s fair, and it’s clearly an effective & useful strategy in many domains. But to confuse that abstraction for the reality is… un-scientific.

            See again John von Neumann’s warnings about the limits of using abstractions to understand complex concrete objects/phenomena.

              Michael Polanyi had a much better take than this old positivist mythology from the 19th century. The scientist is already involved in the world. There is no Absolute line to draw between object/subject, meaning/truth, empirical/necessary, since any line we could draw already involves human observers.

            Which makes sense. If you adopt a point of view that leaves the human observer as a disinterested spectator watching a show unfold up on the stage, perhaps you get some of that old-time positivism.

            But then, what human being, scientist or otherwise, ever watched reality unfold from God’s point of view?

            And where’d we get that little image if not from a history of ideas born out of religious and mythical ideas?

            That Plato, he’s always sneaking back in through the back door.

            To phrase all this differently, why should anybody *care* about the standard of knowledge imported from physics and other highly quantitative disciplines?

            What makes this very high bar so important to clear, when so many of the sciences, and really the problems facing real-life humans, have so little to do with it?

            It’s clear you don’t think much of the “lesser” sciences, which is fine, you’re free to believe what you like.

            But I’m not sure what’s gained by petty turf wars and gatekeeping games over what’s “really” science, given so much of this intellectual house is built on the quicksand of barely-hidden metaphysics and theology.

            • But I’m not sure what’s gained by petty turf wars and gatekeeping games over what’s “really” science, given so much of this intellectual house is built on the quicksand of barely-hidden metaphysics and theology.

              The reliability of findings from science is quite important. The standards demanded of physics provide a great deal of confidence. Pretending studies of human behavior provide the same is misleading.

              • Oh, I agree with that 100%.

                My disagreement is about the reasonability of *expecting* studies of human behavior to meet standards of mathematical reliability.

                They don’t… and so much the worse for those standards when human beings are the subject of study.

                • I expect nothing. I observe they don’t meet the objective mathematical or observational standards that we find in science.

                  That’s fine. I look forward to them charting their own course, and making their own unique contributions.

          • Agreed.
            Except I think “present their studies as deserving of respect.” should be “present studies deserving of respect.”

            Too much sloppiness in too many efforts.

  5. Follow the money.

    Universities encourage their professors to bring in as large a grant as possible so that they can skim off a big part of it for upkeep. They also overcharge on lab facilities.

    Plus, a big chunk is paid out to the journals that they publish in. It’s all “pay to publish”, so very little goes to the actual work.

    Plus, it’s not “sexy” to try and “replicate” the experiment. Priority on awards goes to the original work, not the “replication”, so grants are hard to get to “replicate” other people’s work. Besides, the granting agencies do not want to be shown to be handing out money for “research” that the public thinks may be fake, so don’t ask, don’t tell.

    That’s because “replication” is hard. No matter how clearly an experiment is described, it often cannot be “replicated”. Not because anything was “fake”, but because unless you have actually done the work before, you can’t duplicate it. I remember when gas lasers were developed. None of the other labs were able to “replicate” the inventor’s work, so he had to go lab to lab, “laying on hands,” to show them what they were doing wrong.

    Plus, the funding agencies have real scientists on the funding committees. They look at the applications for research ideas of their own, and will say “no” so they can steal the idea — or person — for their own lab. They will also spot possible money makers that they will say “no” to and redirect them to big Pharma where NDAs hide the work, and the money they make as part of the work.

    In things like the soft sciences people will play around with things until they find something interesting. They will then cherry-pick their data and assemble a paper that fits their results without saying what they discarded along the way.

    That has pushed many of the soft sciences to require that an experiment be “pre-registered”, with the researchers describing in advance what they are going to do, and that registration is what is used to check whether the paper is actually reporting the work that was done.

    The whole system of funding is a perverse incentive to bring in big grants and have flashy results to make it easier to bring in the next big grant. It generates “Golden boys” which is useful for Story if not for science. I have many a “Golden boy” as the villain of a piece.

    BTW, at the Highway Department I was often assigned to manage “research” that we had with the various State Universities. On one I told the boss that the “research” was a waste of time. He read me the riot act saying that it was our job to funnel Federal dollars into State Universities, and that if all we got was a nice thick report — the thicker the better — then we will have gotten our money’s worth. So I signed the approval to cut them the monthly check and never questioned it again. I knew which side my bread was buttered on, and that goes into Story as well.

  6. Keep in mind, too, that “data scientists” is nearly as wishy-washy and deceptive a term as “numerically-founded social scientists.” (I get to snark like this because I’m a hard scientist two courses short of a math minor. From back in the Cretaceous, when we were just inventing slide rules. I actually know about data gathering and analysis.) Most “data scientists” are highly skilled at manipulating numeric data and finding patterns in it… and have no [expletives delighted] clue about how difficult it is to validate data collection and replicate it even in the clean, uninterfered-with conditions of a university research lab, let alone the real world, and when the source of that data is “human behavior or preference”…

    These kinds of articles (and even the analyses of these kinds of articles) remind me all too often of Gen. Buck Turgidson explaining the potential costs of a nuclear first strike (“Mr. President, I’m not saying we wouldn’t get our hair mussed. But I do say no more than ten to twenty million killed, tops. Uh, depending on the breaks.”). Notice the 100% anticipated data variance in there? GIGO, guys. GIGO.

    • Well, a 10% casualty rate could be survivable…
      …as long as you’re in a safe, well equipped bunker. 😀

      Which I’m sure the CCP has plenty of.
      They have their own “data scientists” interested in their own survival. Publish or perish has an added incentive there. GIGO isn’t the only pitfall.
