Meta-Science 101, Part 2: Confirming confirmation bias

“You’re a scientist, and you have a hypothesis.  You really like said hypothesis…”

Believe it or not, simply liking a hypothesis is enough to potentially bias the results of a study toward that hypothesis.  This is a well-studied phenomenon known as researcher allegiance bias, and it has most commonly been associated with the field of psychotherapy.

In psychotherapy, there are multiple different ways of treating a patient with a mental disorder.  For example, for treating somebody with a fear of spiders, you could tell the patient to identify their negative thoughts surrounding spiders, find those thoughts to be irrational, and replace them with more realistic thoughts.  This would be cognitive therapy.  Alternatively, you could just show them pictures of spiders until they’re comfortable with that, and then gradually work through scarier and scarier things until you’re dumping buckets of spiders over their heads.  This would be (a caricature of) systematic desensitization.

Naturally, different researchers have different favorite techniques, and multiple studies have been done comparing these techniques against each other.  Unfortunately, these studies were apparently hopelessly confounded with researcher allegiance bias, as this discussion of the phenomenon puts it:

“Among studies by investigators identified as favoring cognitive therapy, cognitive therapy emerged as superior; correspondingly, systematic desensitization appeared the better treatment among studies by investigators classified as having an allegiance to systematic desensitization.

What made this pattern especially striking was that the analysis involved comparisons between the same two types of therapy, with the allegiance of the researchers as the only factor known to differ consistently between the two sets of studies.”

(The original meta-analysis that this discussion refers to can be found here).

It’s not a good thing when the theoretical inclinations of a researcher can reliably predict the outcome of a study.  Remember that whole thing about scientific results supposedly being a reflection of how the world works?  Yeahhhhhh.

When I said that researcher allegiance bias was a well-studied phenomenon, I meant it.  The above-mentioned meta-analysis (a systematic overview of primary studies) that found researcher allegiance bias is just one of dozens of meta-analyses done on the topic.  So what does one do when one has dozens of meta-analyses?  That’s right: a meta-meta-analysis!  In 2013, Munder et al. conducted a meta-analysis of 30 different meta-analyses and confirmed that there was a “substantial and robust” association between researcher allegiance and study outcome.

But here’s where it gets crazy.  Munder et al. also found that meta-analyses whose conductors were in favor of the researcher allegiance bias hypothesis–that is, the hypothesis that researcher allegiance bias is associated with study outcomes–found a greater association between researcher allegiance bias and study outcomes.  In other words, the meta-analyses on researcher allegiance bias themselves were confounded by researcher allegiance bias.1


Note that researcher allegiance bias doesn’t necessarily have to involve conscious intent to manipulate data in favor of your own favorite psychotherapy treatment.  More likely, subtle things like how the researcher designs the competing treatment protocols, how the researcher trains the therapists that will actually be carrying out the treatments, etc. are operative here.  But this just makes the problem of researcher allegiance bias even scarier; what we have to do battle with is not bad actors, but rather fundamental aspects of human psychology.  There have been a number of suggestions on how to moderate the effects of researcher allegiance bias (the same source I quoted above has a good discussion at the end), but I won’t talk about them here, as this blog post is already going to be long enough without addressing fixes of science as well.

Being biased towards one hypothesis over another doesn’t just play itself out in the phenomenon of researcher allegiance bias, however.  Perhaps even more powerful than personal inclination is financial interest; when you have a direct financial stake in seeing the results of your study go one way rather than another, this can have a strong biasing effect.

The most well-researched example of this involves comparing industry-funded clinical trials to independently-funded trials.  If financial interests play a role in biasing research results, we would expect industry-funded trials to show more positive results for the industry sponsor’s drugs than independently-funded trials.  This would be particularly real-world relevant if true, since drug and device companies now fund six times more clinical trials than the federal government.

Since the last time we looked at a meta-meta-analysis went so well, why don’t we do it again?

This meta-meta-analysis, published by the extremely well-regarded independent organization Cochrane in 2012, looked at 48 different papers, each of which themselves compared industry-funded studies to non-industry-funded studies; all told, the review encompassed nearly 10,000 primary studies.  The authors concluded that industry-funded studies were 32% more likely to find that the drug tested was effective, 87% more likely to find that the drug wasn’t actively harmful, and 31% more likely to come to an overall favorable conclusion for the drug.  These results were more or less in line with several previous meta-meta-analyses done on this topic (yes, there have been several).

Like with researcher allegiance bias, industry sponsorship bias seems to often be instantiated via study design.  For example, this can be done with more frequent testing against placebos than against active controls, resulting in an easier bar to clear for a drug to be considered “effective” by the study, or by using lower doses to mask adverse effects of the drug.  Whether or not these are conscious study design choices to boost the desirability of a drug I’ll leave up to the reader to decide; the bottom line is that, regardless, we know that industry funding introduces a real bias that ends up affecting the results of a study.

Continue to Part 3: P-hacking your way to publication >>>

1The snarky response here is that Munder et al. were obviously biased in favor of the researcher allegiance hypothesis hypothesis, the hypothesis that researchers with an allegiance to the researcher allegiance hypothesis are more likely to find associations between the researcher allegiance hypothesis and study outcome.  Munder et al. can’t be trusted!  We need a meta-meta-meta-analysis!


Meta-Science 101, Part 1: An introduction


(Comic taken from xkcd, here)

Science is in dire straits.

As you may know, a white man in a position of power has recently been criticizing science relentlessly, and it seems like that will continue to be the norm for at least the next four years.  His actions seem informed by a worldview in which science is mostly false or useless, provoking strong reactions from scientists worldwide.  Most troubling of all, he’s just getting started; there’s no telling what he’ll do next, emboldened by his newly acquired institutional power.

You all know who I’m talking about.  That’s right: John Ioannidis.

Uh, who?

Contrary to the misleadingly-worded intro paragraph, John Ioannidis is not someone out for the blood of scientists; rather, he’s a Stanford professor who’s played a huge role in bringing to light a multitude of problems entrenched in modern scientific practice, and he’s dedicated his career to figuring out how to fix these problems.  And he’s not alone; Ioannidis is just one representative of a larger movement in the scientific community, which has become more self-critical and introspective in recent years.  If you’ve heard talk of “p-hacking” or “the replication crisis” recently, this is why.

This post is an attempt to synthesize all the problems in science that have surfaced as a result of this scientific self-reflection.  If we can call this movement meta-science, then welcome to Meta-Science 101: a whirlwind tour through the biases and flaws that currently plague science, and the result of the past month or so of me diving into this topic.

Let’s start off with how science is supposed to work.  You’re a scientist, and you have a hypothesis.  You test the hypothesis with a well-designed experiment, and your results come back; these results give you an insight into how the world works.  Without any fudging of data, you publish your result in a journal, and await the results of peer review.  The scientific community judges whether or not your work is up to snuff, and the work gets published or not, accordingly.  If your work is published, it gets replicated by another lab, confirming your original result.  Rinse and repeat across millions of scientists practicing worldwide, shower in the output of true knowledge generated.

Now let’s take a look at how science too often actually works.  You’re a scientist, and you have a hypothesis.  You really like said hypothesis, so you design an experiment to test it.  The results come back, but they’re not super clear, so you interpret them in various ways until you find a positive, statistically significant result in favor of some variant of your original hypothesis.  Researchers in your field receive your manuscript for peer review, and make semi-arbitrary recommendations to the editor.  In the end, you publish, so your boss is pleased, and more importantly the grant committee funding your boss’s proposals is pleased, guaranteeing you funding for a while.  Phew!  Good thing you got that positive result.
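The "reinterpret the results until something comes out significant" step can actually be quantified.  When the hypothesis is false, each individual analysis has only a 5% chance of crossing the usual p < 0.05 threshold, but trying many analyses of the same noisy data inflates that chance dramatically.  Here's a minimal simulation of my own (not from any study discussed in these posts), using the fact that p-values are uniformly distributed when the null hypothesis is true:

```python
import random

def chance_of_false_positive(n_analyses, alpha=0.05, trials=10_000, seed=0):
    """Estimate the probability that at least one of n_analyses tests
    on pure-noise data comes back 'significant'.

    Under the null hypothesis, each p-value is uniform on [0, 1], so we
    can draw p-values directly instead of simulating full datasets."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        p_values = [rng.random() for _ in range(n_analyses)]
        if min(p_values) < alpha:  # any one "positive" result will do
            hits += 1
    return hits / trials

# One honest analysis: a ~5% false-positive rate.
print(chance_of_false_positive(1))
# Twenty re-analyses of the same noise: roughly a 64% chance
# that at least one looks "significant" (1 - 0.95**20).
print(chance_of_false_positive(20))
```

The exact analyses a researcher might try (different subgroups, different outcome measures, different exclusion criteria) vary by field, but the arithmetic is the same: enough bites at the apple and a positive result is nearly guaranteed.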

Meanwhile, alternate-universe-you that didn’t find a statistically significant result doesn’t publish.  The results sit in a file drawer.

But anyway, this-universe-you is happy that you published.  Your work never gets replicated, but if it were, the replication might not confirm your finding.

…There’s a ton of stuff packed into those last three paragraphs, so we’ll spend the rest of the post elaborating point by point.  First up, confirmation bias.

Continue to Part 2: Confirming confirmation bias >>>

Lions, bears, and the interplay between individual and group decision-making, oh my!

A. A fable

Once upon a time, there lived a community of animals in a forest. The animals generally lived in harmony, until one day, when everything changed.

One of the rabbits had just returned from a short trip, and she arrived with news. “Everybody listen up!” said the rabbit. “I just discovered something that will change all of our lives. This could seriously be the best thing that’ll ever happen to us.” The animals were intrigued.

“So here’s the deal,” said the rabbit. “I just went over to the human camp and I saw them making a big pile of wood. First I thought, ‘No big deal, this is just humans doing weird human things.’ But then they pulled out this…” The rabbit presented a matchbox that the humans had left behind. “…and they struck one of the sticks against the box. The next thing I know, I saw a huge, orange glow arise out of nowhere.  They call this orange glow ‘fire’.”

“And!” continued the rabbit. “The fire can be used to do all sorts of cool things. The humans use it to make food hot, and believe me, hot food tastes really good. I know because they fed me some.”

The animals were getting excited now.  “What else can it do?” asked another rabbit. “Well,” the first rabbit said, “since it’s hot, you can just sit next to it in the winter to warm yourselves up. No more cold winters, guys! It’s also really bright, so you can actually use it at nighttime to see in the dark. I could go on and on here, but it’s really just best if I show you.”

One of the squirrels was a bit concerned. “Wait,” the squirrel said. “If the fire is really hot, could this be dangerous? I know when the sun is really hot I start to hurt all over. I’m not so sure I like the idea of basically making a second sun right in our backyard. Do we even have a plan for how to get rid of it if we don’t like it?”

“That’s silly,” the rabbit shot back.  “Fire is great and you’ll definitely change your mind when I show it to you.”

The squirrel was upset that the rabbit thought his concern was silly. The other squirrels were upset as well, and started arguing with the rabbit. The other rabbits joined in, and soon it became an all-out debate of rabbits versus squirrels.

Hence started the Rabbit-Squirrel Wars of 2250.1

The debate shifted from the reasonable to the ridiculous. The rabbits’ arguments evolved into: Fire scares away demons, if you hate fire then you’re an element-ist, and squirrels have low IQs and their opinions should be disregarded. The squirrels’ arguments included: Fire has been scientifically proven to give you cancer,2 making fire is like idol worship and will upset the Sun God, and everybody knows all rabbits are liars anyway.

A line had been drawn in the dirt. Nothing could convince any of the rabbits or squirrels to change their minds. They were locked into their respective positions.

Then there were the bears. The bears watched the rabbit-squirrel debates. They understood which arguments were good, and which arguments were bad. They realized there were good arguments both to start fire and to refrain from doing so. But they didn’t really know how to weigh these good arguments against each other.  So they figured they would remain mostly neutral, at least until they could receive some guidance from some wise authority.

The lion was wise.3 After the lion heard all the arguments from the rabbits and the squirrels, she went back to her den. A few hours later, she came back out and said to her bear friend, “I’ve thought about the fire debate long and hard, and I’ve come to support the idea of starting a fire. I’ve looked into the human literature on fire, and I’ve found a number of studies supporting the benefits of fire, and not as many studies concerned about safety. I’ve done a rough calculation of the potential benefits minus the potential costs, and it seems like the benefits outweigh the costs by a factor of two when converting all measures to quality-adjusted life-years. I’d say that starting a fire is likely the way to go, with 80% confidence. You can check my analysis if you’d like.”

The bear nodded, understanding enough to follow, but being lazy, didn’t bother to check the analysis or do his own. The lion is wise, he thought, so I trust her analysis. I’ll support fire.

The bear went to tell his bear friends that the lion supports fire. The bear’s friends all understood that the lion was wise, so they all decided to support fire as well. They went to tell their friends, and those friends went to tell their friends, and so on and so forth until the entire community of bears decided to support fire. Many of the later converts didn’t even hear of the lion’s analysis; they just saw that the rest of the bear community was coming around to support fire, and the bear community was generally smart, so they would support fire, too.

Finally, one day, the lion gathered the entire forest community to take a vote. “In incredibly complex debates like these that divide communities,” stated the lion, “we have to go by majority opinion. All in favor of fire, say aye!” The rabbits and bears all said “Aye!” in unison. “All opposed, say nay!” Only the squirrels said “Nay!”

It was clear; the squirrels were outnumbered, and the pro-fire crowd had won out. The lion, trusting in the results of a direct democratic vote, took the matchbox, and, with all the animals in nervous anticipation, set fire4 to a tree…

…and soon after, the forest was no more.

B. Mo’ bears, mo’ problems

I think one of the failure modes of society is to have too many rabbits and squirrels5; that is to say, there is a real danger to having too many people who have become one with their beliefs, and who cannot even imagine changing their minds in response to opposing argument. This is something I touch upon here. But this is clearly not the problem in the story above. If it were all up to the rabbits and squirrels, the vote would have come out to a tie, and there would have been no forest fire.

I don’t think the lion’s the problem, either. She was well-intentioned, her analysis was pretty thorough, and she even put the question out to a democratic vote. Perhaps in hindsight she got the analysis wrong, but it doesn’t feel right to put any blame on her when she was just one individual making her own analysis in a transparent, honest fashion.

No, I think the problem here is that there were too many bears.


“I’m very disappointed by my fellow bears.”

Having bears in your society seems like a good thing a priori. Isn’t it good to have people in your society who listen to expert opinion, and who are willing to change their minds toward the societal consensus?

To an extent, yes. But problems can arise when a lot of people start thinking this way.

To see why, let’s first take a look at what happens in situations with few or no bears; namely, in prediction markets. Prediction markets are simple: They are betting markets where traders can buy and sell contracts that cash in for $1 if a particular outcome occurs. For example, a contract may read “The Green Bay Packers will win the 2016-2017 Super Bowl,” and it may cost $0.20. If you think that this outcome is at least 20% likely to occur, then you’ll buy the contract; if you don’t, then you’ll wait until the price falls. Every time somebody buys or sells a contract, the market price self-adjusts to reflect the new balance between supply and demand. As a result, the market price of a contract represents the consensus of all traders at any point in time on the probability of the event occurring, just as the price of a stock represents the market consensus on the future value of a company.
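The buy-or-wait logic described above is just an expected-value calculation.  Here's a tiny sketch (my own illustration; only the $0.20 contract price and the 20% threshold come from the Packers example):

```python
def expected_profit(p_outcome, price):
    """Expected profit per contract: you pay `price` now, and with
    probability p_outcome the contract pays out $1."""
    return p_outcome * 1.00 - price

def decide(p_outcome, price):
    """Trade whenever your probability estimate disagrees with the price."""
    ev = expected_profit(p_outcome, price)
    if ev > 0:
        return "buy"
    if ev < 0:
        return "sell"
    return "hold"

# You think the Packers have a 30% chance; the contract costs $0.20.
print(decide(0.30, 0.20))  # "buy": expected profit of $0.10 per contract
# A trader who thinks the chance is only 15% would sell at that price.
print(decide(0.15, 0.20))  # "sell"
```

Each trade like this nudges the price toward the trader's own estimate, which is exactly how the market price comes to encode the consensus probability.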

Prediction markets are perhaps most well-known for being applied to election forecasting. While you might think that polling already serves as a reliable way to predict the winners of elections, given that pollsters ask respondents directly who they’ll be voting for, there is good reason to believe that prediction markets are more accurate. In a study comparing the predictions of one of the oldest prediction markets, the Iowa Electronic Markets, to 964 polls taken over the course of five presidential elections from 1988 to 2004, the authors find that the IEM predictions were closer to the actual outcome 74% of the time.

The accuracy of prediction markets in election forecasting has sparked interest for implementation in other domains as well.  Many large companies either have used or now use prediction markets (Google, Hewlett-Packard, Eli Lilly, General Electric, Motorola, Best Buy) to forecast metrics such as expected demand for a product, product launch dates, or clinical trial enrollment rates. US intelligence agencies have also been interested in implementing prediction markets to forecast future political events around the globe, albeit with some pushback.

What makes prediction markets so accurate? As the theory goes, when a group of individuals is involved in making a prediction, each individual brings both useful information and error. Useful information brings an individual closer to the truth, while error pulls him or her away from the truth. With a diverse group of individuals that’s making independent judgments about the probability of an outcome, the errors tend to be random and cancel out, leaving only useful information. The group as a whole accumulates more useful information than any single individual has, and so tends to make more accurate predictions than even the most well-informed individuals in that group. The general phenomenon described here is known as the “wisdom of crowds,” and applies not only to prediction markets but to any decision-making body made up of independently-acting individuals.

Let’s return to the bears. Contrary to the individuals that make up a wise crowd, who make independent judgments, the bears made their judgments of the benefits and risks of fire in a completely dependent fashion; each bear either listened to the lion or to the consensus of the bear community, without making an independent analysis. For the bears, error is not random; the errors are all systematically pointing in the same direction as the lion’s. Rather than the errors canceling out, the bear community amplified the lion’s error by a number-of-bears-fold. And unfortunately for the forest community, the lion just happened to be wrong on this one.6
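This difference between independent and copied judgments is easy to simulate.  In the toy model below (my own construction, not from any cited study), every animal tries to estimate some true quantity; the independent crowd's noise averages out, while the bear crowd simply replicates the lion's error:

```python
import random

def crowd_error(n, dependent, true_value=100.0, noise=20.0, seed=1):
    """Average |crowd mean - truth| over many simulated crowds.

    Independent crowd: each member forms their own noisy estimate.
    Dependent ('bear') crowd: everyone repeats the lion's single noisy
    estimate, so all n errors are perfectly correlated."""
    rng = random.Random(seed)
    runs = 2000
    total = 0.0
    for _ in range(runs):
        if dependent:
            lion = true_value + rng.gauss(0, noise)
            estimates = [lion] * n  # n copies of one opinion
        else:
            estimates = [true_value + rng.gauss(0, noise) for _ in range(n)]
        crowd_mean = sum(estimates) / n
        total += abs(crowd_mean - true_value)
    return total / runs

independent = crowd_error(100, dependent=False)
bears = crowd_error(100, dependent=True)
print(f"independent crowd of 100: off by ~{independent:.1f}")
print(f"bear crowd of 100:        off by ~{bears:.1f}")
```

The independent crowd's error shrinks roughly with the square root of the crowd size; the bear crowd's error never shrinks at all, no matter how many bears you add, because it's really just one estimate wearing a hundred fur coats.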

As smart as any individual lion is, a wise crowd is generally smarter.

C. Should I be a lion or a bear?

Say it was your job to look after forest communities in general. Seeing what happened with the fire debates in the first forest, you head to the next forest over and start an education program targeted to bears. You try teaching the adult bears about the wisdom of crowds, and you tell them that it is virtuous to make your own judgments and do your own research. To the bear cubs, you say that lions are awesome and well-respected, and don’t you want to be just like a lion when you grow up? At the end of your education program, you think you did as good of a job as you could, but you notice that the bears aren’t really changing their behavior. The bears that took your course are still just replicating the lion’s opinion, because, well, the lion is wise. What’s going on here?

Above all else, bears are lazy. They don’t want to make independent judgments, because doing so takes time and energy. And as someone who thinks like a bear much of the time, I sympathize with that sentiment.

Because despite everything I said about the wisdom of crowds, and the dangers of making dependent decisions, part of me thinks being a bear is OK sometimes. If I needed to do independent research every time I wanted to have a belief about a complex matter, then I wouldn’t have much time for much else. It’s just so much easier to look up what the experts say about the matter, assume those beliefs, and then call it a day.

(I mean, why would I do my own research into the pros and cons of marijuana legalization when this already exists?)

If my goals include both having true beliefs and saving time, then being a bear is a great way to advance both those goals at once. Unfortunately, being a bear means I’m not contributing to the wisdom of crowds, and in fact am contributing to more highly correlated errors in the crowd. So there’s a tension here between what would be best and easiest for me, and what would be best for the group.

This sets up a dilemma: In forming beliefs, do I do my own independent research, taking up a lot of my time and energy but also building up robustness in the crowd consensus? Or do I just look up the relevant experts and believe what they say?

On any given topic, do I contribute to the marketplace of ideas, knowing that by doing so I’m making it less likely to hold true beliefs myself? Or do I take from the marketplace, knowing that every time I do so I weaken it just a little bit?

Put concisely: Per issue, do I be a lion, or do I be a bear?

There’s probably no general answer here, and it’s going to depend on how important the topic is to me, how much time it would take me to research the topic, and how much the topic in question is amenable to being completely understood by a small group of experts. Other than that, I don’t really have any guidance on this, and I don’t know where I would go to look it up. So…if you’re a lion when it comes to when-to-be-a-lion-and-when-to-be-a-bear, I’m listening.

(P.S. I was going to end the post here, but I just discovered something relevant and interesting that didn’t really fit anywhere else, so I’m just putting it here. So you know those prediction markets I told you about earlier? A couple months ago, the markets received some flak for “failing” to predict Brexit; they had put the probability at about 25% even just the day before. Now, it is a bit sketchy saying that any one probabilistic estimate is “wrong,” since even events with a 25% probability occur, well, one out of every four times. But anyway, what I find interesting is the retrospective analysis that occurred as a result.  In attempting to explain the market “failure,” some have stated that prediction markets have become too stable as of late, not updating to new information as much as they once used to, and becoming more inaccurate as a result. They speculate that this is due, ironically, to the increased reputation that prediction markets have received. Now, as the theory goes, rather than traders in the market acting on their own private information and making independent judgments, traders have more and more started to take the prediction market prices themselves as reflecting the “true” probabilities. So even when a new poll comes out that contradicts the current market prices, traders will discount the poll, because the market must be right. Putting things in zoological terms, the prediction market itself has become a lion, and the traders are turning into bears.  This gives new meaning to the phrase “bear market.”)

The Conclusion Box: Providing information to-go.

A. Rabbits and squirrels fight.  Bears listen to lion.  Lion says fire, so fire.

B. The problem in the previous story is that the bears were all making dependent decisions, correlating their errors with one another.  In contrast, a crowd of individuals making independent judgments typically makes a more well-informed decision as an aggregate than any single individual could; this is exemplified by the accuracy of prediction markets.

C. Unfortunately, the lesson here isn’t just as simple as “be a lion instead of a bear,” because being a lion incurs a lot of cost on our time and energy.  You’ll probably have to pick and choose on which topics to be a lion, and on which to remain a bear.


1This is set way in the future, after we genetically engineer all forest animals to speak English, because reasons.

2Citation: Squirrel, S. Sq. J. Fire Sci. 2250, 1, 1–2.

3I don’t know what kind of stable forest community includes rabbits, squirrels, bears, and lions, but this is my story, so I’ll do what I want.

4Um, I guess we’ve genetically engineered them to have opposable thumbs, too.

5This quote is not to be taken out of context.

6For more on this kind of dependent decision-making, see information cascades.

America has a Horsemen problem


Viktor M. Vasnetsov [Public domain], via Wikimedia Commons

A. Four Horsemen politics

Four major patterns of interaction have been observed to show up again and again in couples whose marriages will eventually end in divorce; namely, these are Defensiveness, Criticism, Contempt, and Stonewalling. Psychologists who study this kind of thing have ominously dubbed them the “Four Horsemen of the Apocalypse.”

When Alice asks Bob if he remembered to buy the toilet paper, and Bob responds with “No, I forgot. You put a million things on the grocery list that I was supposed to buy, how am I supposed to remember all of them? You should just go next time,” then that’s Defensiveness. Bob took an innocent question and perceived it as an attack, immediately putting his defenses up and turning the attack around on Alice. When Alice responds with, “What? You’re so forgetful. You always do stupid things like this. I can’t believe it,” then that’s Criticism. Alice made a globalized attack on Bob’s character, when really the dispute was just about toilet paper. When Bob turns around and says, “You think I’m the stupid one? You can’t even keep our bills straight half the time. That’s so easy a monkey could do it. Talk about pathetic…” that’s Contempt. Bob sees Alice as inferior to himself, and is trying to make her feel as worthless as possible. When both Bob and Alice eventually retreat to different corners of the house and refuse to engage with each other, that’s Stonewalling. Instead of confronting the issue constructively, Bob and Alice leave the issue unresolved, and the negative emotions simmer until their next encounter.

If these are the criteria, then America needs to go into couples counseling, fast. The Left and the Right are America’s Alice and Bob.

When I look around, I don’t see people just disagreeing with each other. What I see is people full of hatred (read: Contempt) for people of the opposite political persuasion. Instead of debating each other’s political beliefs, we’re debating each other’s identities (that’s Criticism). If you’re a Republican, then you must be an uneducated, racist, sexist homophobe who thinks America is the greatest country in the world, and oh you probably own multiple guns, too. If you’re a Democrat, then you’re probably a naive, hypocritical, elitist snob whose lack of moral values is driving this country to Hell, and oh that’s a nice badge from the PC police academy you got there. Rarely are we so overt, but you can feel these attitudes just below the surface of debates all over the Internet. See: Youtube comment sections, Facebook news feeds, probably Reddit (I don’t know, I don’t Reddit).

“OK, Brian, but everybody knows the Internet is a wretched cesspool where rational discussion goes to die. Nobody really feels that way.” That’s what I want to believe, too. But I think the recent rise in polarized politics is hard to deny. Over the past 20 years, the percentage of Democrats and Republicans that view the other party “very unfavorably” has more than doubled, with 38% of Democrats and 43% of Republicans doing so as of 2014. Think about that for a second. If we boil America down to twenty people living in a neighborhood, ten Democrats and ten Republicans, chances are that eight of them can’t stand half of that neighborhood and want to be as far away from them as possible. That’s going to be a neighborhood with a lot of fences.


And to make matters worse, these eight people are going to be much more likely to participate in the political process. So now it’s in the interest of politicians of [political party 1] to cater to those who hate [political party 2] the most. To get votes, these politicians will make statements that Criticize and are Contemptful of [political party 2], signaling that they are really, really committed to the causes of [political party 1], unlike those other guys that are running that are talking about compromise and stuff. These statements polarize the electorate on both sides, and maybe the eight people grow to ten. Now the politicians are even more incentivized to go down the Criticism and Contempt route; Four Horsemen politics is winning politics. This eventually polarizes the electorate even further, so maybe now the ten grows to twelve. You can see where this is going. The future of that neighborhood does not look good.

B. Truth from trade

All of this would seem very strange to an alien observer, given that everyone should be sharing the same goal: Truth. After all, many of the things that we argue about are empirical questions that have answers, at least in theory. They may be convoluted answers that defy simplification, and they may be difficult to discover, but they are out there nonetheless. Shouldn’t this common goal of Truth unite us to find out the answers to the questions we debate with each other? Shouldn’t we all be holding hands, singing Kumbaya as we march together towards Truth?

And yet our political discourse is not at all optimized for truth-maximization. When we pick our companions to join us on our truth-seeking journeys, we tend to reject those who don’t share our beliefs (that’s Stonewalling). Again and again, we find ourselves off the beaten path, and inevitably we wind up inside an intellectual echo chamber. And once we’re there, it’s really, really hard to get out, because the ideology we subscribe to has become a part of our identity. The walls of the echo chamber have grown outwards into us, and to leave the echo chamber is like leaving a part of ourselves behind.

But this echo chamber usually constitutes only a small part of a whole world of beliefs. If we imagine ourselves as explorers looking to find as many true beliefs as we can, then why constrain ourselves to only a part of the map? Half of us are in Liberal Belief-Land, half of us are in Conservative Belief-Land, and not enough of us are willing to cross between them. It’s like we’re Han China and the Roman Empire, except this time nobody’s on the Silk Road. If we aren’t willing to engage in a trade of ideas, then you won’t get my truth, and I won’t get yours, and as a result everyone is worse off.

Our best shot at Truth comes from exposing ourselves to a diversity of opinion. You may not talk to—or even know—anyone from across the political aisle. Such is the power of the social bubbles that most of us live in, whether we intended to be absorbed into these bubbles or not. But in the age of the Internet, there’s no excuse for us not to read up on worldviews that are very different from ours. If you’re a liberal, read a conservative op-ed every once in a while. If you’re a conservative, browse some liberally-disposed forums and see what you think. You may be surprised by what you find, or you may not be. But we do ourselves—and the goal of Truth—a disservice if we don’t even try.1

However, coming into contact with differing opinions is just the first step. How do we rationally evaluate the arguments that we come across?

C. Here’s how I try to rationally evaluate the arguments that I come across

There’s a silver lining to this whole Four Horsemen of the Apocalypse analogy. Apparently, couples that attended a two-day workshop that involved being trained on how to avoid the Four Horsemen were much more satisfied with their marriages one year later. And maybe similarly, if we as a nation can learn how to un-Stonewall each other and have rational discussions without resorting to Defensiveness, Criticism, or Contempt, then perhaps we can begin a long process towards being OK with one another again.

So here I present to you a list of principles that I try to follow whenever I’m having a discussion with somebody.2 Disclaimer: This list is based on personal experience, and I didn’t look to see if there is any research out there showing that these principles lead to more rational discussion, if such research is even possible. There are also probably a ton of other principles you could come up with, but these are the ones that I’ve found most valuable for me. Without further ado:

  1. Think of debates as collaborative efforts, not as adversarial encounters. This is a really important one. Remember, both you and the person you’re debating should be after the same common goal: Truth. This is easy to forget, especially considering that all the vocabulary that we typically use to describe debates implies that there is a winner and a loser. In a debate, you try to “defeat” your opponent; you try to get your opponent to “concede” a point; you try to “attack” your opponent’s argument. Even the idea of having an “opponent” in the first place implies a confrontational situation. But debates don’t have to be framed this way. An adversarial debate is a zero-sum game, but in a collaborative debate, both sides are working together to reach a better understanding of the truth. If you have been convinced by another person to change your mind, in the traditional framework, you have “lost” the debate. But in the collaborative framework, any change in belief that brings that belief closer to the truth is a net positive.

  2. Divorce your beliefs from yourself. If you’re really attached to a belief, your incentives can become misaligned with the goal of Truth. Any evidence against a belief that’s attached to you is an attack on your identity, and it can really hurt. If you’re an academic who has come up with a new theory, it’s in your personal interest to defend that theory as much as possible, even against damning evidence. If your entire network of friends and family is centered around a framework of beliefs, then it is in your interest to defend that framework at all costs, lest it fall down and shatter the world around you.

    Once a belief has been incorporated into your identity, it rigidifies, and even overwhelming evidence won’t be able to pry it loose. It’s best not to let ourselves get into these kinds of situations in the first place. Beliefs should be soft and malleable; they should be shaped by the evidence. You shouldn’t be afraid of changing your mind, and you should shed old beliefs as soon as it becomes clear that they are no longer supported by the evidence. This will be much easier if you don’t view arguments against your beliefs as attacks on you. 

  3. Don’t demonize the other side. When you present yourself as the enemy, all of your arguments become enemies, too. By attacking the other person, you will immediately trigger their defenses; as a result, they won’t accept your arguments, no matter how cogent, out of sheer pride. Their beliefs won’t budge; if anything, they will harden.

    As will yours. When you envision the other person as Satan incarnate, you really won’t want them to be right. This leads to a strawmanning of their arguments, or attacking a substantially weaker version of the argument than the one they are actually putting forth. It’s much easier to imagine that you’re in the right when you portray their arguments as so darn silly. You’ll walk away from the debate even more convinced in your position than when you started.

    This is the opposite of what we should be doing. Feeling good about defeating a strawman is like patting yourself on the back for scoring a goal on an empty net. No one ever learned anything from attacking the weakest version of an argument. We should instead be engaging with the best possible version of the other person’s argument; this is known in some quarters as steelmanning. A steelmanned argument can be an even better version of the argument than what the other person is actually saying. If there are gaps in the other person’s argument, fill those gaps in for them. If you think their argument needs a bit of fine-tuning, imagine how the fine-tuned version would play out. You’ll learn much more by considering the best possible opposition to your argument than by continuing to score goals on empty nets. 

  4. Find a balance between learning mode and persuasion mode. Often, we will approach a debate with the sole aim of persuading the other person to adopt a different position; we’re in full persuasion mode. But remember Principle #1: Both you and your fellow debater should be trying to work together to reach the truth. And the best way for the two of you to reach the truth is by taking the Silk Road to each other’s Belief-Lands and engaging in a trade of ideas. You’ll need to turn the dial towards learning mode. Ask as many questions as you give answers. Try to find out why the other person believes what they believe. Because if the two of you don’t understand each other’s positions, then you’ll just be talking past each other. Nominally you’ll be engaged in a debate, but really you’re just both talking to yourselves.

D. Practice makes perfect

Every time I go to the gym, I face the same recurring phenomenon. I’ll get set up to do whatever exercise I’m doing, with a mental checklist of all the things I need to do during the exercise to keep proper form: keep my chest up, keep my back parallel to the floor, open up my hips but not my knees, etc. And then I’ll start…only to find that my body is completely disobeying every item on my checklist. My chest drops, my back rounds, and my knees go from bent to locked. It’s just really hard for my brain to keep track of how my form should be at the same time as it’s spending energy on doing the exercise itself.

This is kind of how I feel about the list of principles in the previous section. When I get into a debate with somebody, even a mild one, my body just automatically goes into competition mode, at which point it’s really hard to keep a commitment to rationally evaluating opposing arguments. There’s just something about the nature of a debate that makes me want to defend whatever I’m saying as hard as I can, even if I wasn’t really all that committed to that position to begin with. Because I’m arguing for that position now, it’s mine. Principle #2 especially should rear its head, but, well, it just doesn’t. My brain just doesn’t bring it up.

But here’s the nice thing about debates that makes them different from exercising. While I don’t know that you could spontaneously start practicing your form anywhere outside of the gym and expect that you’ll perform better the next time you exercise (and anyway you’d look like a crazy person), you can practice these four principles pretty much anytime after a debate ends. Too caught up in the heat of the moment to steelman an opposing argument during a debate? That’s fine—you can do that afterwards in your head. What if you didn’t put yourself in learning mode during the debate and so you didn’t really get to see where the other person was coming from? You can do that afterwards, too; either ask them in casual conversation later, or browse the Internet for people who’ve stated similar opinions. Principles 1 and 2 are more about putting yourself in a different frame of mind than taking a particular action, so practicing those can just involve reflecting on the debate and seeing how you did with regard to those principles.

Once you practice enough, you’ll eventually reach the point where you’ll just instinctively do these things. You’ll have retrained your mind to be open and charitable to opposing ideas. That’s the goal at least. I’m certainly not there yet. In fact, part of the reason for me writing this blog post was to put these principles down in words, which hopefully will help me in the future. If I’m ever in a debate with any of you and I exhibit bad argumentation, please remind me of principle number whichever.

Getting rid of the toxicity that saturates political discourse nowadays might require some sort of systemic change involving the media or public education or something. I don’t know. Finding the answer to that sort of question is way above my paygrade (which for blogging is $0, so I guess everything is above my paygrade). But I think we could go a long way just by each of us individually agreeing not to contribute to a toxic discourse—because Toxic Discourse is an organism, and it will reproduce like cancer if we provide it the means. If the Four Horsemen are the substrates on which Toxic Discourse feeds, let’s resolve to starve that motherfucker out.

The Conclusion Box: Providing information to-go.

A. Psychologists have found that couples whose interactions involve Defensiveness, Criticism, Contempt, and Stonewalling (the “Four Horsemen of the Apocalypse”) are much more likely to get divorced. I think there’s an analogy to be made here with current political discourse, which is putting America down a very bad road indeed.

B. This is strange given that everyone should be after the same goal: Truth. We have a tendency to put ourselves into ideological echo chambers, which obscure Truth by only providing us a limited view of reality. We should leave our echo chambers and try to interact with as many different beliefs as possible, picking up the true ones and leaving the false; we should engage with others in a trade of ideas.

C. I provide a list of four principles that I think help provide an environment for rational discussion: Think of debates as collaborative efforts, not as adversarial encounters; divorce your beliefs from yourself; don’t demonize the other side; and find a balance between learning mode and persuasion mode.

D. While it’s difficult to apply these four principles when you’re emotionally invested in a debate, it’s possible to continue practicing them after the debate is over. If all of us practice having rational discussions with each other until that’s just the default mode of political discourse, then maybe we can start healing the vast divisions in America today.


1Let me be clear that I’m not advocating for some sort of false equivalence between any two beliefs, or between two different clusters of beliefs. Beliefs can be right or wrong, and similarly, clusters of beliefs can contain a high percentage or a low percentage of true beliefs. What I’m saying here is that, a) you often don’t know how many truths are in a cluster of beliefs until you’ve spent some time getting to know it (and in fact you may systematically underestimate it if that cluster of beliefs is opposed to your own), and b) even clusters of beliefs with a relatively lower percentage of true beliefs are worth exploring in order to find the true beliefs that are there (which may be more valuable if they are sufficiently different from the beliefs that you already hold).

2It just so happens that there are also four of them, so I wanted to take the Four Horsemen analogy way too far and name them the Four Horses–because they are unencumbered by the Horsemen, you see–but I don’t really know how to do that elegantly and it’s really bad anyway and so I’m just kind of putting it as an afterthought in this sentence here.

Two different notions of probability


Recently, my relationship with the concept of probability has been analogous to what a lot of people go through with religion. At first, I thought I perfectly understood probability as it was first presented to me; but then, I had an internal crisis, when I started to question if the notion of probability was even logically coherent; and finally, upon reflection, now I think I’ve clarified a few things that I’d like to share with you. Let me explain.

A. What does probability even mean anymore?

I’ll start off with a standard account of probability, as I and many of you learned in school. In math: P(E) ≈ N_E/N_T. In words: the probability of an event E can be approximated by the number of times event E occurred (N_E) out of a large number of repeated trials (N_T), preferably as N_T approaches infinity. The probability of event E is used to assess the likelihood of event E occurring the next time I conduct a trial. For example, I could make the statement, “The probability of me rolling a 1 on a 6-sided die is ⅙,” and this would totally make sense given this definition. If I roll a die enough times, assuming the die is fair, then I’ll end up rolling a 1 once every six times, which defines the probability of ⅙. So the next time I roll a die, I can use ⅙ as my estimate for how likely I am to roll a 1. Easy enough.
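This definition is easy to see in action with a quick simulation (a hypothetical sketch, just to illustrate the N_E-out-of-N_T idea; the function name is mine):

```python
import random

def estimate_probability(num_trials, sides=6, target=1):
    """Approximate P(E) as N_E / N_T: count how often `target` comes up
    over `num_trials` rolls of a fair die and divide by the trial count."""
    hits = sum(1 for _ in range(num_trials) if random.randint(1, sides) == target)
    return hits / num_trials

# As the number of trials grows, the estimate approaches 1/6 ≈ 0.1667.
print(estimate_probability(1_000_000))
```

With a handful of rolls the estimate bounces around a lot; the definition only pins down the probability in the limit of many trials.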

But I think statements like this constitute only a small fraction of the total number of statements we make that involve the concept of probability. For example, political forecasters make statements like this all the time: “The probability of the UK leaving the European Union, or ‘Brexit,’ by the end of the week is 40%.” Let’s see if we can apply the above definition of probability to this statement. Event E is easy to define; it’s the UK leaving the EU by the end of the week. But if I want to know how many times event E occurs over a large number of trials, I’m in big trouble. In the die-rolling case, I know what constitutes a “trial”; it’s every time I roll a die. But what’s a trial in the Brexit case? Every time this week ends? This is obviously problematic, since we don’t yet have even one instance where this week has ended. I could say a trial is every time any week has ended since the creation of the EU, but this is also a problem, since I want each trial to have the same probability distribution of outcomes as the next. Presumably the probability of the UK leaving the EU was way different a year ago than by the end of this week. Without having even the theoretical possibility of repeating a large number of trials, can I even use the concept of probability? Does the Brexit probability statement make any logical sense?

Things get even trickier when you realize that people make statements of probability that aren’t even about future events. For example, take this statement: “The probability that intelligent life exists outside of Earth is 30%.” For the die-rolling case, the probability of ⅙ defines how likely it is for me to roll a 1 the next time I roll a die. Here, the probability of 30% defines how likely it is that intelligent life exists outside of the Earth the next time…what, exactly? There is no “next time.” This statement is completely about the world as it already exists. But if this is the case, we have a huge problem. As a matter of fact, intelligent life outside of Earth either exists or it doesn’t. If we can make a statement of probability at all, it should be either 100% or 0%. Is the intelligent life probability statement just completely incoherent then?1

Despite the problems with applying the above definition of probability to the Brexit statement and the intelligent life statement, these statements still intuitively mean something to us. We don’t react to these statements the same way we would react to a logically incoherent statement like, “The probability of Brexit by the end of the week is the letter A.” It seems that these probability statements are communicating something to us. But what is that something, exactly?

B. Bins and Bayes

I don’t think there’s a good way to reconcile these probability statements with the above definition of probability. As I see it, the solution lies in a different definition of probability, one that involves a person’s subjective belief in the likelihood of an event.

To illustrate, imagine that you have a hundred different bins. Each bin has a label on it with a different probability. So there’s a 1% bin, a 2% bin, etc. all the way up to 100%. When you make a statement like “There is a 40% chance of Brexit by the end of the week,” you are placing a slip of paper that says “Brexit will happen by the end of the week” in the 40% bin. Over time, if you make enough probability statements, these bins will be full of slips of paper. If you take the 40% bin, empty it out, and put all the statements that ended up true in a pile, ideally you’d end up with 40% of the slips of paper in that pile. Similarly, the 30% bin should end up with 30% of the slips of paper in the “true” pile, and so on and so forth for the rest of the bins.

For most of us, our bins won’t be so ideal. Maybe we’re too overconfident, and our 40% bin ends up with only 20% of its statements being true. Or maybe we don’t pay much attention when we make probability statements, and we put our statements in bins more randomly than we should. But the goal, at least, for somebody who wants to make accurate probability statements would be to match the percent of true statements in each bin to the label on that bin.
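The bin-checking bookkeeping is simple enough to sketch in a few lines of Python (the function and the sample history below are hypothetical, just to show the arithmetic):

```python
from collections import defaultdict

def calibration_report(predictions):
    """predictions: list of (stated_probability, outcome_was_true) pairs.
    Groups statements into bins by stated probability and reports the
    fraction of statements in each bin that actually came true."""
    bins = defaultdict(list)
    for prob, came_true in predictions:
        bins[prob].append(came_true)
    return {prob: sum(outcomes) / len(outcomes)
            for prob, outcomes in sorted(bins.items())}

# A well-calibrated forecaster's 40% bin should hold ~40% true statements.
history = [(0.40, True), (0.40, False), (0.40, False), (0.40, False), (0.40, True),
           (0.90, True), (0.90, True), (0.90, True), (0.90, False), (0.90, True)]
print(calibration_report(history))  # {0.4: 0.4, 0.9: 0.8}
```

In this made-up history, the 40% bin is perfectly calibrated, while the 90% bin is overconfident (only 80% of its statements came true).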


This person is highly skeptical of events occurring in general.

So now we have two different concepts of probability, each of which describes distinct kinds of statements that communicate different things. When you make a die-rolling-type probability statement, you are telling me about the number of events that would occur out of a theoretical large number of trials; this often reflects the inherent symmetry of a system, like a die or a deck of cards. When you make a Brexit-type (or intelligent-life-type) statement, you’re simply telling me which bin you’re deciding to place that statement in.

At first glance, die-rolling-type statements seem more mathematically rigorous. I mean there’s a whole mathematical equation and everything. Meanwhile, Brexit-type statements are based on…subjective feelings?

But there are actually lots of ways to make Brexit-type statements remarkably accurate. The key is to base these kinds of statements on available evidence. This evidence can be historical (i.e., how often have events like event E occurred in the past?); it can be predicated on an aggregation of the opinions of a large number of people, as in prediction markets, where people bet on the potential outcomes of different events; it can be based on statistics- and science-informed analyses of existing data, as in weather forecasting; or it can even be based on abstract philosophical reasoning. So making Brexit-type statements doesn’t consist of just choosing a probability that “feels” right; a lot of rigorous analysis can go into them.

As it turns out, probability theorists are a lot smarter than me and so they figured all this out a long time ago. What I call die-rolling-type probability they call frequentist probability (because probabilities are defined as frequencies of events), and what I call Brexit-type probability they call Bayesian probability (named after Bayes’ theorem, which tells you how to update your subjective probability based on new evidence).
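The update rule itself is short enough to write down. Here’s a minimal sketch applied to the Brexit example; the 80% and 30% likelihoods are assumptions I made up for illustration, not real estimates:

```python
def bayes_update(prior, likelihood, likelihood_if_false):
    """Bayes' theorem:
    P(H|E) = P(E|H) * P(H) / (P(E|H) * P(H) + P(E|not H) * P(not H))"""
    numerator = likelihood * prior
    evidence = numerator + likelihood_if_false * (1 - prior)
    return numerator / evidence

# Start 40% confident in Brexit. A poll comes out that you'd expect to see
# 80% of the time if Brexit happens, but only 30% of the time if it doesn't.
posterior = bayes_update(prior=0.40, likelihood=0.80, likelihood_if_false=0.30)
print(round(posterior, 2))  # 0.64
```

On seeing the poll, you’d move your slip of paper from the 40% bin to the 64% bin; that re-binning in response to evidence is what Bayesian updating amounts to.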

What I find interesting is that, although Bayes’ theorem was originally published in 1763, the concept of Bayesian probability was rarely invoked until the last 20 years or so. Frequentist probability ruled for a long time2, and many of the sciences are still dominated by frequentist statistics; standard statistical techniques involving p-values, confidence intervals, and null hypotheses are brought to you by frequentist statistics. But Bayesianism is on the rise; fields ranging from genetics to medicine to machine learning now make heavy use of Bayesian statistics.

I guess the “Bayesian revolution” hasn’t made its way into high school or undergraduate education, though. I certainly never came across the concept in my introductory statistics class; all I got was frequentist statistics. And as a result, you get this post.

The Conclusion Box: Providing information to-go.

A. Probability is often defined as the number of occurrences of an event over a large number of repeated trials, but this definition seems inadequate for most of the statements we make involving probability on a daily basis, such as “The probability of Brexit is 40%.”

B. These kinds of probability statements reflect subjective judgments of the likelihood of an event; this kind of probability is known as Bayesian probability, which has been seeing increasing usage across multiple sciences recently.


1There are a couple of ways that you could imagine stretching the concept of a large number of repeated trials so that the Brexit and intelligent life statements would make sense. One way is to say that if you ran the universe over again from the beginning an infinite number of times, then in 40% of those scenarios Brexit would occur by the end of this week, or in 30% of those scenarios intelligent life would exist outside of Earth. A second way is similar; you could say that in all the universes that exist, in 40% of them Brexit will occur by the end of this week, and in 30% of them intelligent life exists outside of Earth. In each case, each universe counts as a repeated trial.

But both of these ways involve controversial metaphysical assumptions. In the first, you have to assume that the way the trajectory of the universe unfolds is not fully determined by initial conditions, i.e. the universe is non-deterministic; in the second, you have to assume the existence of a multiverse. No matter what you think about the truth or falsity of these assumptions, I don’t think most people mean to imply that they assume these things when they make a probability statement about something like Brexit.

2The success of frequentist probability was partly derived from statisticians’ aversion to any hint of subjectivity, and also from the fact that some of the greatest examples of successful applications of Bayes’ theorem occurred in wartime, and were therefore classified. Apparently Bayesian thinking was critical to the British cracking the German U-boats’ Enigma code during WWII, but evidence of this was buried after the war because the British didn’t want the Soviets to know how they did it.

Why I’m voting for Hillary Clinton in the Democratic primaries

Artwork by DonkeyHotey via Flickr (CC BY-SA).

A.  Why am I writing this?

I know, I know.  Just my second blog post in, without a secure reader base, and already I’m talking politics.  I hope you’ll stick with me here though, because I don’t mean for this to be a divisive or inflammatory post.

For those of you who haven’t kept up with where I’m at in my life, two years ago I started a PhD program in chemistry at UC Berkeley.  And while it has been a great experience so far and I have learned a lot, as a chemist working only with other chemists it can be easy to get trapped in an intellectual bubble.  We go into work every day and do chemistry for 12 hours, talk chemistry at lunch and coffee breaks, and go home and think about chemistry, because that’s what we’re expected to do as graduate students.  But what this means is that non-chemistry topics like politics don’t often enter into our discussions.  I realized as I was following this primary season that my views on the best candidate for the Democratic nomination1 were being formed almost completely free from debate or discussion with other people.  I was inhabiting my own intellectual echo chamber with the entrance blocked off, and with nobody around for miles.

So here, I attempt to lay out my rationale for why I plan to be voting for Hillary Clinton in the California Democratic primaries on June 7.  The nature of this post is not persuasive; I didn’t approach this post with the intent of convincing you, the reader, to vote for Hillary. In fact, just the opposite; I encourage all of you to give me the best reasons why I’m wrong. I’m inviting you into my echo chamber, not so you can hear my views, but so I can hear yours.  Look for logical flaws. Question my reasoning.  Tell me what factors I’ve not considered.  I’m putting my reasoning under the microscope of peer review, and I’m open to making revisions.

Let’s get started.

B. Rise of the algorithms

Now, on the issues, I tend to agree with Hillary more than with Bernie, but I won’t be talking about issues here.  I’m guessing that, if you’ve followed the Democratic primaries, you’re already familiar with both candidates’ positions, with plenty of other media outlets debating those positions’ relative merits.  If you want to talk issues, I’d be happy to indulge in the comments section.  But here I want to present reasoning that is perhaps a bit unorthodox, and hopefully informative.

First, we have to take a detour and talk about Nobel Prize-winning psychologist Daniel Kahneman, well-known for his work on heuristics and cognitive biases and for jump-starting the field of behavioral economics.2  In 1955, Kahneman was a young lieutenant in the Israeli Defense Forces.  He had a tall task in front of him: He was to come up with an interview system for new recruits for the Israeli army, which was to be used to determine the likelihood of their future success. The system that was in place before Kahneman’s arrival was a standard fifteen- to twenty-minute interview, at the end of which an interviewer would have formed a general impression of the interviewee. Unfortunately, it was found that this general impression had almost zero predictive value for the recruits’ future success.  And so Kahneman, who at the time only had a bachelor’s degree in psychology, was brought in to change things up.


This is Daniel Kahneman.  He is smiling because he has a Nobel Prize.

It turns out that Kahneman wanted to change things up so much that his suggestions were met at first with open rebellion. Specifically, he proposed scrapping the standard interview system, and instead, having the interviewers ask factual questions in order to objectively score the interviewee on six traits that he regarded as relevant to performance in the army, such as “responsibility” and “sociability.”  He would then take these scores and plug them into a super complicated statistical formula, and the magic number that popped out was to be used as a predictor of future success.

Interviewers didn’t like having their role be reduced to robotically asking factual questions, but they eventually went along with it.  After a few hundred interviews, Kahneman’s magic number for each recruit was correlated with evaluations of that recruit’s performance by commanding officers.  The result? Kahneman’s method was far superior to the old interview system.  It wasn’t perfect; in fact, Kahneman himself only described it as being “moderately useful.”  That may be an undersell, however, given that the Israeli army still uses mostly the same interview system to this day.

The moral of this story is that statistical algorithms tend to outperform expert intuition in domains where the predictability of the outcome is low.  If you think I’m cherry-picking one anecdotal example, I’m not; this turns out to be one of the more robust findings to come out of the social sciences.  Since Kahneman’s application of this principle to the interview system of the Israeli army, statistical algorithms have been applied to predictions of outcomes ranging from life expectancy of cancer patients to winners of football games; from success of new businesses to future prices of Bordeaux wines; from recidivism of juvenile offenders to evaluations of scientific presentations.  In all of these cases, algorithms either matched or exceeded predictions by experts in the relevant fields.  Overall, across about 200 studies on this phenomenon, around 60% have shown that predictions from statistical algorithms significantly outperform predictions of trained professionals; the remaining 40% show a statistical tie.  No convincing study has shown experts outperforming an algorithm.


She’s angry because an algorithm beat her in predicting the number of unicorns that would be found on Mars.

Oh, so you know how I said Kahneman used a super complicated statistical formula to predict the future success of Israeli army recruits?  I lied.  It turns out all he had to do was take the scores assigned to the six traits and add them up.  That’s right – a simple summation of the scores resulted in significant predictive validity.  Again, this is a result that has since been confirmed by further research; even back-of-the-envelope calculations tend to have an advantage over experts.  This means that any of us can apply this technique to make better decisions, whether you’re an employer looking to hire the best employee, a gambler attempting to predict the outcome of a horse race, or – well, you probably saw where I was going with this – a voter trying to decide who would be the best nominee for the Democratic Party.
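To drive home just how simple the “formula” was, here’s a sketch of unit-weight scoring in Python. The 1–5 ratings are made up, and only “responsibility” and “sociability” are trait names actually mentioned above; the other four are hypothetical placeholders:

```python
# Kahneman-style unit-weight scoring: rate each trait, then just add them up.
recruit_scores = {
    "responsibility": 4,
    "sociability": 3,
    "trait_3": 5,  # hypothetical placeholder traits
    "trait_4": 2,
    "trait_5": 4,
    "trait_6": 3,
}

predicted_success = sum(recruit_scores.values())  # no fancy statistics needed
print(predicted_success)  # 21
```

That’s the whole algorithm: the predictive power comes from scoring well-chosen traits objectively, not from any cleverness in how the scores are combined.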

C. Political figure-skating

OK, so I’m going to be using an algorithm here.  But how do I come up with the traits or characteristics that I’ll be scoring?  Well, ideally, I’d be looking for traits that are good predictors of being a good president.  For a bank of traits I suppose it’d be reasonable to start off with a personality inventory that includes the Big Five personality traits (Openness to Experience, Conscientiousness, Extraversion, Agreeableness, and Neuroticism).  And if I wanted an objective measure of the greatness of past presidents I suppose I could contact a bunch of presidential historians (maybe about 800 of them) and get their ratings of the presidents.  I could then correlate the traits of the presidents (attributed to them by about 100 presidential biographers, who I could also contact) to their greatness ratings, and see which traits correlate the best.

If I did all that I would end up with this study by political psychologists Steven Rubenzer, Thomas Faschingbauer, and Deniz Ones.  The authors broke down each of the Big Five personality traits into six facets, and correlated each facet with presidential greatness as determined by presidential historians.  Based on that study, as well as this similar study by political psychologist Dean Simonton, I’ve identified five facets that are the most highly (positively or negatively) correlated with presidential greatness.3  They are listed below.  As points of reference, Rubenzer et al. consider correlations of above 0.40 to be large correlations, and correlations between 0.25 and 0.40 to be medium correlations.

Trait (correlation with presidential greatness):

Intellectual brilliance: 0.56
Assertiveness: 0.43
Achievement-striving/competence: 0.35
Tender-mindedness: 0.31
Vulnerability: -0.27

The scoring system I’ll use is similar to that used for figure skating.  Both Clinton and Sanders will be scored on a scale of 1 to 5 in each of the five personality traits, with 1 meaning “This candidate does not exhibit this personality trait at all” and 5 meaning “This candidate personifies this trait”.  Each score will then be scaled by the corresponding correlation, and the scaled scores summed across the five traits.  Whoever has the higher total at the end wins.

OK candidates…ready, set, skate!

D.  The showdown

While I see sections B and C as relatively uncontroversial, this section, where the scoring takes place, is much more subjective.  If you wanted to throw tomatoes at me, it would be over this section (but please throw them respectfully in the comments section).  If you disagree with me on the scores, however, what’s nice is that you can still retain the same scoring system and plug in your own numbers.  If you do so, resist the urge to play around with the numbers if the final score comes back against your favored candidate, as that would immediately negate the value of this exercise.  Try to score as objectively as possible the first time around, and then accept the final weighted sums, whatever they turn out to be.  But anyway, on to my own scores:

Intellectual brilliance: What is meant by intellectual brilliance is not made incredibly clear in Simonton’s paper, but it seems to be not just intelligence, but a combination of cognitive factors that also includes creativity, curiosity, openness to alternative ideas and values, wisdom, and more.  While there are no (reliable) reports of the intelligence of Clinton and Sanders as determined by IQ tests, both have impressive educational backgrounds.  Based on what I saw in the Democratic debates, I was impressed by Clinton’s wide range of knowledge on both domestic issues and foreign policy.  Her answers seemed nuanced and detailed, and I got the sense that she had given every issue a lot of thought.  Even on Libya, widely viewed to be one of her greatest failures as secretary of state, you can’t fault her for not doing her homework.  Meanwhile, I thought Sanders faltered in the debates when it came to providing details for his proposals, especially on foreign policy.  I’m also troubled by the lack of evidence to support his economic proposals.  All in all, I don’t see the same receptiveness to evidence and intellectual rigor in Sanders as I do in Clinton, so I give Clinton the edge in this category.

Clinton 4, Sanders 3.

Assertiveness: Assertiveness is defined by a candidate’s inclination to be a leader and take charge, so past leadership experience seems important here.  Clinton’s résumé here is long: She transformed the role of First Lady by actively influencing policy, she served two terms as a Senator from New York, and she served as Secretary of State in the Obama administration.  Sanders also has plenty of leadership experience, having served for 25 years in the House and Senate.  I give Sanders a lot of credit here for being an independent in a two-party-dominated system of government, for leading the way in showing that a serious campaign can be run without a Super PAC, and for locking horns at times with the DNC.  Clinton, on the other hand, seems like less of a trailblazer when it comes to policy issues, and is much more likely to toe the party line.  I give the edge to Sanders on this one.

Clinton 3, Sanders 4.

Achievement-striving/competence: This category involves both having high aspirations, as well as the ability to realize those aspirations.  Sanders and Clinton both have one-half of the equation here.  Sanders certainly has run a campaign with lofty goals, but I question his ability to implement those goals, especially given his poor legislative record in Congress.  Clinton, on the other hand, has a penchant for compromise and pragmatism, but a platform built on compromise and pragmatism almost has to be non-ambitious.  I’m calling this one a tie.

Clinton 3, Sanders 3.

Tender-mindedness: Tender-mindedness basically means having sympathy for others.  This is a tough one to score, since I feel like this is a trait that would show much more in private than in public.  Based on what I’ve seen in public, though, I don’t really have any reason to believe that one candidate is significantly more tender-minded than the other.  I’ll call this one a tie, too.

Clinton 4, Sanders 4.

Vulnerability: Vulnerability is defined as having a general susceptibility to stressful situations.  Note that this category has a negative correlation, so a lower score is better.  I think Clinton has a pretty strong case for this category, having weathered conspiracy theories, multiple organizations dedicated to attacking her image, eight Benghazi investigations, and, at least according to one study, the most negative media coverage this election season (yes, even more than Trump).  Even after all that, she continues to put herself in the center of public attention, and she doesn’t seem to have lost her appetite for public service.  I really don’t have much evidence in favor of Sanders here; in fact, I thought he lost his composure at times during some of the (relatively tame) Democratic debates.  I give Hillary a win here.

Clinton 1, Sanders 3.

Weighting all these scores and summing them together, we end up with final scores of Clinton 5.55, Sanders 4.88. So that settles it.
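For transparency, here’s a minimal Python sketch of that arithmetic (the dictionary keys are just my own shorthand for the trait names; nothing here is from the cited papers beyond the correlations themselves):

```python
# Weighted-sum scoring: each 1-5 trait score is scaled by that trait's
# correlation with presidential greatness, then the scaled scores are summed.
weights = {
    "intellectual_brilliance": 0.56,
    "assertiveness": 0.43,
    "achievement_competence": 0.35,
    "tender_mindedness": 0.31,
    "vulnerability": -0.27,  # negative correlation: a lower score is better
}

scores = {
    "Clinton": {"intellectual_brilliance": 4, "assertiveness": 3,
                "achievement_competence": 3, "tender_mindedness": 4,
                "vulnerability": 1},
    "Sanders": {"intellectual_brilliance": 3, "assertiveness": 4,
                "achievement_competence": 3, "tender_mindedness": 4,
                "vulnerability": 3},
}

def weighted_total(candidate_scores):
    """Sum of trait scores, each scaled by its correlation weight."""
    return sum(weights[trait] * score
               for trait, score in candidate_scores.items())

for name, s in scores.items():
    print(f"{name}: {weighted_total(s):.2f}")
# Clinton: 5.55, Sanders: 4.88 -- matching the totals above
```

If you want to plug in your own scores, just edit the `scores` dictionary and rerun; the weights stay fixed.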

…OK, I’m being a bit facetious here.  That doesn’t really settle it, because this is only partial reasoning, based solely on personality traits.  As I said at the beginning of the post, though, I also have issues-based reasoning in favor of Clinton, and those reasons in combination with this analysis give me a solid foundation for why I’m planning on voting for Clinton.  But as I also said at the beginning of this post, I’m broadcasting this line of reasoning to get feedback.  Show me where you think my reasoning isn’t valid and I’ll update my beliefs accordingly (if you think the logic is sound, feel free to say that, too).

The California primaries are fast approaching, though.  They’re on June 7.  You have until then to convince me.

E.  Caveats

This is just an addendum section explaining why you might be wary of the reasoning I have laid out here:

  1. Coming into this analysis, I was already leaning towards Hillary, opening up the scoring to possible confirmation bias (although I tried to guard against that).
  2. I don’t have much of a background in statistics, and so it is possible that I am missing some of the nuances of the statistical analyses done in the political psychology papers by Rubenzer et al. and Simonton.
  3. This reasoning completely ignores electability in the general election, and only focuses on which candidate would make a better president.

The Conclusion Box: Providing information to-go.

A. I’m writing this post to get some feedback on my reasoning for why I plan on voting for Hillary Clinton in the Democratic primaries, reasoning that I had mostly developed in isolation.

B. Research from the social sciences has shown that even simple algorithms tend to outperform trained professionals in making predictions of highly unpredictable outcomes. Algorithms could in theory be applied to predicting which candidate would make a better president.

C. Based on research in political psychology, many personality traits show a correlation with presidential greatness; five of the traits with the highest magnitude correlations include intellectual brilliance, assertiveness, achievement-striving/competence, tender-mindedness, and vulnerability.  These traits could be incorporated into an algorithm where each trait is scored and scaled by the magnitude of the correlation with presidential greatness.

D. My subjective scoring of Clinton and Sanders on these five traits results in a higher final score for Clinton.

E. Some potential reasons for distrusting this analysis include confirmation bias, my lack of knowledge when it comes to statistics, and the ignoring of electability considerations.


1As for why I’m not discussing the Republican primaries here, I offer two reasons, a) I followed the Democratic primaries much more closely than the Republican primaries (it would’ve taken a ton of time to follow both), and b) at the time of the writing of this post, Donald Trump had already clinched the nomination.

2The remainder of this section is all drawn from Kahneman’s book Thinking, Fast and Slow, which mainly summarizes research over the past half century on cognitive biases and heuristics.  I highly recommend it to anyone who is interested.

3Boring details on how I extracted what I did from the two papers: I picked the facets with the highest magnitude correlations, but only one from each of the Big Five personality traits, in order to keep the facets as independent of each other as possible (having dependent traits would result in an overcounting of those traits in the algorithm).  I’ve combined Achievement-Striving and Competence into one facet, since they had almost identical correlations and they are pretty related.  Also, instead of taking a facet from Openness to Experience, I’ve substituted Intellectual Brilliance from Simonton’s paper, because it correlates better than any facets from the Rubenzer et al. paper and is supposed to be quite related to Openness to Experience.  For correlations from the Rubenzer et al. paper, I averaged the correlations that came from the Ridings & McIver ratings and the Murray & Blessing ratings, leaving out the Ridings & McIver ranks, because those would be correlated with the Ridings & McIver ratings and it seemed to me that the ratings would just be more precise.

Why you shouldn’t call other people hypocrites…or maybe you should, it’s complicated

A.  The tu quoque fallacy

I’m sure I don’t have to tell you that correctly accusing somebody of hypocrisy can be a strong rhetorical move in an argument.  If Alfred says that wearing pink is morally wrong, and I can show photo evidence that Alfred regularly wears pink, then you’d be hard-pressed to agree with his position.  This illustrates a point that we’re all familiar with: Showing that a person’s behaviors are inconsistent with their position is a good way to tarnish that position.  But should this be the case?

Let’s say that Alfred had a lot of good reasons to condemn pink-wearing.  Maybe, in this hypothetical world, pink fabric only comes from the United States of Pinkistan, which treats its workers extremely poorly.  Maybe wearing pink is a universally understood signal that you support the Pink party, which holds morally abhorrent views.  Or maybe the production of pink fabric is really bad for the environment.  Whatever the case may be, whether or not Alfred wears pink has no bearing on his argument.  It’s not logically valid for me to rebut his argument with, “Oh, Alfred, but you wear pink!” as if his wearing pink would suddenly mean that the production of pink fabric was environmentally sustainable.

Accusing somebody of hypocrisy as a way of addressing their point is a logical fallacy; it is termed tu quoque (pronounced too-kwo-kway; literally, “you also”), and can be thought of as a subcategory of the ad hominem fallacy.  As I showed above, an appeal to hypocrisy doesn’t address a stated idea itself, and so is not a logically valid form of argumentation, at least not if the purpose of the argument is to determine the truth or falsity of that idea.  Instead, it redirects the argument toward a person’s belief in that idea.  In the hypothetical scenario above, I wasn’t able to show in a logically valid way that wearing pink isn’t immoral; all I was able to show was that Alfred must not really believe that wearing pink is immoral if he does it himself.

Or was I even able to show that?

B.  We’re all akratics

I don’t think it’s that simple.  Alfred could in theory wear pink and believe that wearing pink is immoral; he would just be logically inconsistent if he did so.  This isn’t really so ridiculous.  Most of us lead logically inconsistent lives, although we might try not to. Procrastination is a universal example; although we know we should start writing that essay/doing that problem set/going to the gym now, we just…don’t.  In our minds, we judge one course of action to be the right and optimal one to take, and then behave in a completely different way.

Even ethics professors, a group of people that you might expect to self-select for logical thinking, can be vulnerable to logical inconsistency.  In a study of the moral behavior of ethics professors, philosophers Schwitzgebel and Rust surveyed over 500 professors, divided into the categories of ethics professors, non-ethicist philosophers, and non-philosophers.  On issues such as voting, vegetarianism, organ and blood donation, and charitable giving, the professors were asked about both their attitudes and behaviors towards these issues.  While it varied from issue to issue, overall, the correlation between ethics professors’ attitudes and their behaviors was low, and was not significantly different from that of non-ethicist philosophers and non-philosophers.  It seems that even spending a lifetime of reflection on one’s moral views doesn’t necessarily impel one to act in accordance with those views.

So why are we so logically inconsistent?  For one answer, we can look all the way back to Aristotle, who posited the lack of willpower, or akrasia (pronounced a-kray-sha), as one potential reason why we don’t always do what we think is right.1  The phenomenon of akrasia is backed up by psychological research, which has found that self-control is a limited resource; as you resist more and more impulses and desires, eventually you start to run out of willpower and your capacity for self-control breaks down.  The important thing here is that this phenomenon is universal.  Alfred doesn’t have to have any ill intent to behave hypocritically; he could just be suffering from akrasia, which all of us do from time to time.

(6/27/16 EDIT: The linked research in the above paragraph is now in doubt.  The idea that self-control is a limited resource that can be used up may no longer be scientifically backed up, although it seems the debate is still ongoing.  In any case, I don’t think this necessarily affects the broader phenomenon of akrasia; even if there is no limited bank of willpower, it’s still plausible to me that sometimes people just don’t have the willpower to do what they think is right.)

C.  What do we do with you, Alfred

OK, so I can’t call Alfred a hypocrite in order to rebut his argument, because that would be committing a logical fallacy.  I also can’t prove that he doesn’t really believe what he’s saying, because he could just be suffering from akrasia (maybe he thinks he looks really good in pink, so it’s hard for him to stop).  But the fact remains that he is acting against his moral standards.  Can I call him a hypocrite to change his behavior?

Part of me wants to cut Alfred some slack.  This part of me worries that if I call him out on his logical inconsistency, then he might change his moral standards rather than his behavior.  He might choose to continue to wear pink, and might start thinking, “Well, the United States of Pinkistan isn’t so bad…” in order to rationalize his choice.  

And yet another part of me doesn’t want to let Alfred off the hook so easily.  Shouldn’t we be holding people to their moral standards?  If Alfred is suffering from akrasia, then won’t pointing out his hypocrisy give him that extra push he needs to change his behavior?

The scenario with me and Alfred is at the interpersonal level.  What’s interesting is that we can view this same scenario at different magnifications.  Zooming all the way in, we reach the individual level; I am Alfred and Alfred is me. Here we face the same dilemma: How much should I, as an individual, try to hold myself to higher moral standards in order to change my behavior?  And how can I do so without leading to rationalization of the “United States of Pinkistan isn’t so bad” sort?

(Long aside: As a vegan, this is something I struggle with a lot.2  The maximum standard I could hold myself to is being 100% animal-product-free.  But should this maximum really serve as my target?  There are multiple risks to trying to hold myself to that standard.  I could use up a lot of my limited self-control, leaving less self-control for other areas of my life, which could be damaging.  I could also start to resent veganism, and subsequently start rationalizing away the principles underlying my veganism.  I’ve settled on being OK with being 90-95% vegan, because the first 90-95% is a lot easier to achieve than the last 5-10%, so there’s a bit of a diminishing returns effect going on here.  But I recognize that a reasonable person could take a look at that line of reasoning and see “rationalization” written all over it.)

Zooming out, we can take this to the group level, too, provided that the group in question is held together by some core principles.  Instead of just one Alfred, let’s imagine a whole bunch of Alfreds running around, so many that they eventually start a Society of No-Pink-Wearers (these Alfreds are not a creative bunch).  We can imagine two possible scenarios.  In one scenario, the Society of No-Pink-Wearers is super strict, so that anybody who can’t help themselves and wears pink on occasion gets shunned, and is kicked out of the society.  It wouldn’t be hard to imagine a number of people outside the society who want to join, but who suffer from akrasia, so they can’t meet the strict standards that The Alfreds enforce.  The result is that these outsiders start readjusting their moral standards; now all of them are thinking, “Well, the United States of Pinkistan isn’t so bad…”  As a result, the membership of the Society of No-Pink-Wearers dwindles and it eventually dies out.

In the second scenario, the Society of No-Pink-Wearers isn’t very strict at all.  They don’t have high standards for membership, and so while the Society starts off with a core of faithful members, it eventually becomes a large society where most of the members are nominally No-Pink-Wearers, but in reality wear pink almost as much as they would otherwise.  The Society’s message is diluted and it has almost no impact on the broader culture.  After a couple generations the message is completely lost and joining the Society becomes something you do just because your family did it.  Given these two scenarios, how much should a group hold its members to logical consistency?3

The basic question underlying all three levels of magnification is whether people, when faced with a mismatch between their moral standards and their behavior, are more likely to alter their behavior to match their principles, or the other way around.  The answer will likely vary greatly from individual to individual, from interpersonal relationship to interpersonal relationship, and from group to group, but maybe there are general trends that could be teased out if this question were studied empirically.  I would be interested if anybody has any evidence from psychology, sociology, or history that could shed some light on this.

The Conclusion Box: Providing information to-go.

A: Accusing a person of hypocrisy in order to rebut their argument is a logical fallacy, because by doing so you are not attacking their argument, but rather their belief in their position.

B: Oh wait, but actually, if a person behaves hypocritically, that doesn’t necessarily mean that they don’t believe in their position, since they could just be suffering from akrasia, or lack of willpower.

C: Pointing out hypocrisy can lead an accused person to rationalize away their moral standards, while not accusing them of hypocrisy allows them to continue acting against their moral standards.  This poses a dilemma on the individual level, the interpersonal level, and the group level.


1There are other reasons, of course (i.e. cognitive biases, lack of introspection), but I won’t get into them here.

2Well really, more broadly speaking, this is something I struggle with as a utilitarian, which underlies my veganism.  I might talk about utilitarianism in a future post.

3The scenarios I point out are two extremes on opposite ends of a spectrum, between which I think you could put many real life social groups.  On the stricter end you might find the Society of No-Meat-Eaters (vegetarians/vegans), and on the laxer end you might find many religions.


I’ve always found my personal opinions on most matters to be uncertain, even matters that most people have an opinion on.  People: “I think [argument 1] because of reasons [X, Y, Z].  What do you think?” Me: “That seems like a hard question, I don’t really know, I’ll have to consider both sides of the argument, there’s probably no easy answer, let me get back to you on that.”  Then I never get back to them on that, because even if I have reasons A–Z, I don’t feel comfortable putting my foot down and saying, “I agree/disagree with you on [argument 1].”  Probably because I am afraid of somebody else coming in and saying, “Aha, but you didn’t consider reasons [𝛂, 𝛃, 𝛄]!”  I’m not afraid of being wrong per se, but I have a constant fear of not having enough information to hold an informed opinion.  I continue saying “The jury’s still out!” even after the jury has deliberated, made its decision, publicly declared its verdict, and everybody’s gone home.

Now, there are lots of reasons to be epistemically modest, of course.  It would be wrong to have a firm opinion when there is no empirical evidence or logical argument to back it up, or worse, in the face of opposing evidence.  But epistemic modesty to the degree I have described above is problematic.  Since I always defer my opinion until more evidence comes in, I never articulate my reasoning for what I believe given the information I do have.  This means I never end up discussing important issues with anyone, and perhaps I’ve forgotten how to do that.

This blog is an attempt to change that.  If my articulation muscles have atrophied, this blog is meant to be a workout program.

I don’t know exactly what I’ll be blogging about – probably just whatever thoughts I have that I think are interesting and worth writing a blog post on. Maybe this blog will evolve into something more coherent in the future.  Maybe it’ll eventually become a science blog, or a philosophy blog, or a social commentary blog.  But for now it’s just going wherever my mind takes me.  I hope you’ll stay along for the ride.