Monthly Archives: June 2014

Communication From Another Dimension

In my post complaining about the way people talk about Guess, Ask, and Tell Cultures, I summarized them this way:

The gist of the difference is that in “ask culture” it’s normal to ask for things you want even if you don’t expect to get them, it’s normal to refuse requests, and it’s not expected to anticipate others’ needs if they don’t ask for things, whereas in guess culture, you’re expected to offer things without being asked, you don’t ask for things unless you really need them or strongly expect the other person will want to give them, and it’s rude to refuse requests. (Tell culture is a variant on ask culture where instead of just making a request, you express the strength and exact nature of your preference, so other people can respond to your needs cooperatively, balancing your interest against theirs, and suggesting better alternatives for you to get what you want.)

But the more I think about it, the more I’m sure that the problem isn’t that one or all of these is bad – it’s that these distinctions are insufficiently dimensional. Here are a few more precise axes along which communication differs:

  • Explicit vs Indirect
  • Verbal vs Nonverbal
  • Anticipation vs Self-Advocacy
  • Zero-Sum vs Coöperative

Continue reading Communication From Another Dimension


F-ing Statistics, How Do They Work?

Anyone who’s taken a sufficiently high-level statistics course, or tried to teach themselves statistics, knows that there are a bunch of different kinds of statistical procedures, and a bunch of different “statistics” and “tests” they have to do to figure out whether the results are “significant.” For example, I’ve seen the F Test introduced, explained mathematically, and derived a few times – but I never quite figured out what it was actually doing. Not during my mathematical statistics course, not during my regression or econometrics courses, not at work, not in my own reading. Then last night a friend asked what it was, I explained it in about 30 seconds, and realized that I’d finally figured it out. I figured it would be nice if someone explained this on the internet, and I’m someone, so here goes:

Analysis of Variance

Statistics courses often shove a unit on Analysis of Variance somewhere into the middle or end, but the principle behind it is the one that motivates all of frequentist statistics. Here it is in a nutshell:

Variance is a measurement of how much variation there is in the data. For example, if you’re measuring the physical location of an object in latitude/longitude, the variance of your bed is low, the variance of your housecat, Roomba, or other pet is a little higher, and the variance of you or your car is even higher.

When you build a statistical model, you are trying to explain the variation in the data you’ve observed. You will usually be able to explain some, but not all, of the variance. Let’s say grandma and grandpa (or mom and dad, if you’re a bit older) have prospered financially. They spend spring and summer up north in Connecticut or New Jersey, and fall and winter in sunny Florida. Your model has only one predictor: what season is it? This explains the vast majority of the variance in their location. While they might move a little bit to get groceries, go out to dinner, visit a friend, or play golf, you can almost always get their location within a few miles by using that one simple predictor: is it warm season, or cold season?

But it’s not enough to just see that you explained some of the variance. Sometimes this will happen by coincidence. If you divide a group of people randomly into two groups, A and B, usually one will be a little taller than the other on average – it would be a big coincidence if the average heights were literally exactly the same. But that doesn’t have predictive value – just because group A is taller this time doesn’t mean you should expect group A to be taller next time you divide people randomly into two groups. You can’t always use common sense – for example, maybe you’re not already sure whether something’s truly predictive or not – so you’d like some objective measure of how likely you are to get a prediction that works this well by chance, if there weren’t really a true relationship between the things you’re measuring.

It turns out that under certain simple conditions, you can neatly divide the observed variance into the part explained by your model, and the remaining noise. In the grandma and grandpa example, the explained part is whether they’re in their average Florida location or their average New York Area location, and the unexplained part is trips around town.
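To make the decomposition concrete, here’s a minimal simulation of the snowbird example in plain Python. All the specifics – the latitudes, the season cutoff, the size of the around-town noise – are made up for illustration; the point is just that the season predictor soaks up nearly all the variance, and local errands account for the little that’s left.

```python
import random
import statistics

random.seed(0)

# Simulate a year of latitude observations: warm season near the New York
# area (~41°N), cold season near Florida (~27°N), plus small trips around town.
latitudes, seasons = [], []
for day in range(365):
    warm = 80 <= day < 260  # roughly spring/summer
    base = 41.0 if warm else 27.0
    latitudes.append(base + random.gauss(0, 0.05))  # local errands
    seasons.append(warm)

total_var = statistics.pvariance(latitudes)

# "Explain" each observation with its seasonal mean, then measure what's left.
warm_mean = statistics.mean(x for x, w in zip(latitudes, seasons) if w)
cold_mean = statistics.mean(x for x, w in zip(latitudes, seasons) if not w)
residuals = [x - (warm_mean if w else cold_mean)
             for x, w in zip(latitudes, seasons)]
residual_var = statistics.pvariance(residuals)

explained_fraction = 1 - residual_var / total_var
print(f"fraction of variance explained by season: {explained_fraction:.4f}")
```

The total variance splits into the big jump between the two seasonal averages (explained) and the small wobble around each one (unexplained).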

Now here’s the core of statistics*:

1) Come up with a single measurement of how much your model explains.

2) Transform that measurement into a statistic that has well-defined statistical behavior.

3) Look at its value, and see whether the statistic is a lot bigger than you would expect if your model just worked by chance.

If your model didn’t “truly” explain anything, then both the apparently explained and the apparently unexplained parts of the variance would be distributed like Chi-Squared random variables. If you divide the explained variance by the unexplained variance, you get another statistic: the ratio of the two. The more of the variance your model explains, the bigger this ratio gets. And since the ratio of two independent Chi-Squared random variables, each divided by its degrees of freedom, follows the F distribution, you have a statistic with a well-behaved distribution. This is called the F Statistic. Then you just look to see: how big is this F statistic? Is it bigger than 90% of the values it would take by chance if your predictors were meaningless? 95%? 99%? This tells you how unlikely it is that your model worked as well as it did by accident.
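Here’s the whole recipe as a hand-rolled one-way ANOVA sketch in plain Python – the group means, noise level, and sample sizes are invented for illustration. The explained and unexplained sums of squares are each divided by their degrees of freedom before taking the ratio:

```python
import random
import statistics

random.seed(1)

# Two groups with genuinely different means, plus noise.
group_a = [5.0 + random.gauss(0, 1) for _ in range(30)]
group_b = [7.0 + random.gauss(0, 1) for _ in range(30)]
data = group_a + group_b

grand_mean = statistics.mean(data)
n, k = len(data), 2  # number of observations, number of groups

# Explained (between-group) sum of squares: how far each group's mean
# sits from the grand mean.
ss_explained = sum(
    len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in (group_a, group_b)
)

# Unexplained (within-group) sum of squares: leftover wobble around
# each group's own mean.
ss_residual = sum(
    (x - statistics.mean(g)) ** 2 for g in (group_a, group_b) for x in g
)

# Each sum of squares is scaled by its degrees of freedom before the ratio.
f_stat = (ss_explained / (k - 1)) / (ss_residual / (n - k))
print(f"F = {f_stat:.2f}")
```

With a real two-unit gap between group means, the F statistic comes out far larger than anything you’d expect from meaningless predictors.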

Now, I oversimplified this a bit. For one thing, the ratio of variances has to be multiplied by something based on how many records and how many predictors you’re using. For another, sometimes the F test is used to compare a model with another model, instead of comparing a model with no model. And it’s not literally true that mathematical statistics is nothing but significance tests over and over again – for example, there are fancy Bayesian techniques that explicitly estimate a prior and a posterior, there are ways to figure out whether a weird outlier is messing up your model, etc. But this is the part that keeps showing up, over and over again, and no one really bothers to explain it – because it’s too obvious, or because they never figured it out.

Another Example of This Sort

My friend Franklin reciprocated by explaining the central principle of sorting algorithms – ways to sort a bunch of records in a dataset by some key (for example, you might sort a bunch of records of people by full name). Basically, all efficient comparison-based sorting algorithms work by minimizing the number of extra comparisons they have to make: if you know that A>B, and B>C, you don’t need to compare A and C to know that A>C. The computer science courses he’d seen never covered this; they just told you about the various sorting algorithms and derived how efficient they are.

This central insight also explains both why the radix sort is so much more efficient than other sorting algorithms, and why it took so long to discover such a simple thing. Basically, you might think of comparison sorting algorithms as ways to approximate the theoretical minimum number of comparisons needed to know, for any given pair of objects in a set, which one is greater than the other. This minimum is proportional to n*log(n), where n is the number of objects.
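The n*log(n) figure falls out of a counting argument: there are n! possible orderings of the records, and each yes/no comparison yields at most one bit of information, so any comparison sort needs at least log2(n!) comparisons, which grows like n*log2(n). A quick numerical check:

```python
import math

# Any comparison sort must distinguish all n! orderings; each comparison
# yields at most one bit, so it needs at least log2(n!) comparisons.
for n in (10, 100, 1000):
    lower_bound = math.log2(math.factorial(n))
    print(f"n={n}: log2(n!) ~ {lower_bound:.0f}, n*log2(n) = {n * math.log2(n):.0f}")
```

The two columns track each other closely, which is why n*log(n) is the usual shorthand for the bound.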

But what you actually want to do is order the data while making as few comparisons as possible. And the smallest number of comparisons you can make is not n*log(n), but zero. Essentially (oversimplifying only very slightly), instead of drawing any comparisons at all, you enumerate all possible values for your sort key in order, and then put each record directly in the corresponding slot. At no point do you compare any records with each other at all.
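Here’s that idea sketched as a counting sort – the simplest building block of radix sort. The function name and example data are mine; the point is that each record drops directly into the slot for its key value, and at no step is one record ever compared to another:

```python
def counting_sort(records, key, key_range):
    """Sort records whose key is an int in range(key_range),
    without ever comparing two records to each other."""
    # One bucket per possible key value, enumerated in order.
    buckets = [[] for _ in range(key_range)]
    for record in records:
        buckets[key(record)].append(record)  # drop into the right slot
    # Concatenating the buckets in key order yields the sorted result.
    return [record for bucket in buckets for record in bucket]

people = [("Carol", 35), ("Alice", 30), ("Bob", 35), ("Dave", 30)]
by_age = counting_sort(people, key=lambda p: p[1], key_range=120)
print(by_age)  # [('Alice', 30), ('Dave', 30), ('Carol', 35), ('Bob', 35)]
```

The cost moved from comparisons to memory: you pay for one bucket per possible key value, which is why this only beats comparison sorts when the key range is manageable.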

Look for the Central Insight

In general, you’ll be able to retain a body of knowledge a lot better if you can remember its central insight – the rest can be reconstructed if you have that one intuition. Another example: the central insight of Wittgenstein’s Philosophical Investigations is thinking of language as a behavior with certain social results. The central principle of double-entry accounting is that the credit is where the money comes from, the debit is where the money goes, and you always record both sides of the movement. What are some other examples?
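To make the accounting example concrete, here’s a toy double-entry ledger in Python – the account names and amounts are invented, and real bookkeeping has more machinery, but the central insight survives: every transaction records both the source (credit) and the destination (debit), so the books always balance.

```python
from collections import defaultdict

balances = defaultdict(int)

def record(credit_account, debit_account, amount):
    """Record both sides of one movement of money."""
    balances[credit_account] -= amount  # credit: where the money comes from
    balances[debit_account] += amount   # debit: where the money goes

record("cash", "rent", 1200)   # paid rent out of cash
record("sales", "cash", 2000)  # cash received from a sale

# Because both sides of every movement are recorded, the entries sum to zero.
print(dict(balances), sum(balances.values()))
```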

Guess Culture: No Boots

The way people have been praising ask culture and tell culture makes me imagine a boot asking a human face whether it would like to be stamped on – forever. Whether it wants to or not, eventually the boot’s going to give in. But why do I feel so uncomfortable with the idea of ask/tell culture? It seems so sensible; why do I want to run away and hide whenever I hear someone explain how good it is?

Continue reading Guess Culture: No Boots

National Identities

Warning: mild spoilers above the fold, big spoilers below. There is no way to describe this book without spoilers.

The protagonist is a detective solving a mysterious murder. A body has turned up in the fictional Eastern European city of Besźel. The problem: the body has been dumped across an international border; the victim lived in, and was almost certainly murdered in, the neighboring fictional Middle Eastern city of Ul Qoma.

These aren’t like East and West Berlin, or Jewish and Arab Jerusalem, sharing a single contiguous unambiguous border. The cities occupy the same physical grid of streets with borders and “shared” areas crisscrossing the literal topographical (“grosstopic”) area. Only some unfathomed and possibly unfathomable force prevents the citizens of each city from perceiving and interacting with each other. It’s not just that it wasn’t legal to dump a body across the border – it shouldn’t have been possible at all.

I cannot tell you what makes The City & the City, by China Miéville, so good without spoiling the whole thing, but I will tell you that it does not betray the trust of a reader who expects mysteries to be about something. This is not Lost. There really is a secret to the Cities, it makes sense, and it is big enough to justify the story. To the right kind of reader, this is recommendation enough – if so, go and read it.

The big spoilers are below the fold.

Continue reading National Identities