Saturday, July 31, 2010

What Gets Measured Gets Improved (Caveat)

When you're talking about optimizing the performance of groups of people, what gets measured gets improved. You can take that to the bank.

But you shouldn't necessarily measure the thing that you want improved.

What gets measured gets worked at, certainly, and probably gets improved as a result, but it might not be getting improved quite as much as if people were working at something related but different.

I used this to great effect when I was captain of a boat club. There's a device (called an ergometer) for measuring the power output of a rower. They're often found in gyms, but most boat clubs have a few. And some of the keener rowers have one at home.

Since most boat races are 2000m long, most serious rowing crews measure ergo performance over 2000m. They then work out special training programs to try to get this as high as possible, and because they have a large pool of competitive people trying to get in their boats, the people who can produce the best 2k times on demand get picked for the 2k races. This is probably entirely sensible.

Most semi-serious boat clubs measure the same thing, because it's the standard measure.

Rowing is a seriously elite sport. Which is to say that it's mainly practised by obsessive over-competitive nut jobs. I'm probably one of the least competitive rowers in existence. But most of my non-rowing friends seem to think I'm a psycho. Even in the most recreational clubs, the most serious people are probably practising every day.

In clubs where 2ks count for reputation, people are always practising their 2ks, trying to get better scores. They are also doing stuff like weight lifting and circuit training. Sometimes they do this instead of rowing.

I realized that most of the time, what I wanted was an improvement in people's 'base fitness', or aerobic capacity, and their rowing skill.

This would allow us to form competent boats which could be trained up to race fitness (anaerobic fitness) on a few weeks' notice. Anaerobic fitness responds much more quickly to training than aerobic fitness does, but the good effects also fade much faster when you stop doing anaerobic training.

So on the basis that actually going rowing was both a fine way to build up aerobic capacity, and the only way really of improving rowing skill, I made that our only organized activity despite the fact that there's no real way to extract data about individuals from it.

But it's hard to get eight rowers, a cox and a coach together, so we couldn't do more than four outings a week. Most of us wanted to put more time into it than that, so we needed something else to do.

I decided that from the point of view of aerobic capacity, it would do us good to do half-hour ergos rather than the usual 2000m ones. They're about four times as long, and they actually have an aerobic fitness-improving effect, which 2ks don't. Theoretically, one-hour ergos would have been even better, but I personally find them so tedious that I wouldn't have been able to bring myself to do them regularly. A half-hour is at a level of intensity which is hard enough to be interesting, and limited enough in duration that it isn't boring, but it's not horribly painful and aversive like a serious attempt to do a 2k is.

Also, motivation must come from within. It's easy to keep attending a group activity such as regular rowing outings, where your absence will spoil it for everyone else, but I'd noticed before that attempts to centrally mandate that everyone also does one or two ergos a week always started well, but quickly failed, with only one or two people doing them. Even those one or two tended to stop once they realised they were the only people doing the training.

So I decided that I'd just keep a list of everyone's best recent attempts. That's all I did.

And it worked a treat.

We kept a public table of 30min ergo scores, and my friend Chris Metcalfe and I volunteered scores to get it started. There was no compulsion to do one, but a couple of the keener guys spontaneously put scores up as well.

We weight-adjusted the scores.

This was partly because boats really do go slower with more weight in them, and it's important not to measure the wrong thing.

But a side effect is to bunch them up and give the smaller guys a fighting chance. I am quite heavy myself, but I am also a bit Machiavellian. I would rather have the eighth place in a boat full of fit people than the fifth place on a table I don't believe in and a boat that hasn't trained.

My score was quite low in absolute terms, and also adjusted downwards because I'm overweight. However I was also captain. A fair number of people had a pop just to show that they were better than me. I quickly realized that it would be best if I tried to stay at 7th or 8th in the table.

There are eight rowers in a rowing boat. The captain's place in his crew should be unchallengeable on any grounds (and if it's not, then the captain should either work until he can make it so, or drop out). But beating the captain at something is a powerful motivator for people who are borderline.  As it happened, whilst initially I was underperforming deliberately, quite soon I found myself in the fight of my life to hold onto 7th place.

What really kicked things off was when Kate Hurst, a very good rower and serious athlete, who also happens to be quite startlingly beautiful, asked if she could add her time to the mix.

Without the weight adjustment she'd have been unchallenged at the top of the women's table, but distinctly second rate by male standards.

With the weight adjustment she'd overtaken half of our projected first VIII, and from a technical point of view she was easily good enough. She let it be known that she'd really like a row in the men's first division.

I was more than happy to let her come and play. At one point, in mid-Winter, she had the stroke seat as we set a new record time for our club over the local course. She got so keen that she bought her own ergo machine and kept it with her at work. At the time she was a professional sailor, so this can't have been easy.

This had a salutary effect. No matter how yoghurt-knitting and sensitive a new man he is, no man worth the bother is going to let himself be beaten by a girl in a physical competition, no matter how strong or brave she is. Those of us whose scores Kate had eclipsed were suddenly seen to be training hard. Those of us whom she was threatening also raised their games.

Finally, after many months, Kate was pushed into ninth place for good. Women are at a serious disadvantage compared to men, even taking into account their lighter weights. Kate was training at Olympian levels to stay with a bunch of second-rate club rowers.

But she left us a boat full of people who had really trained in order to beat her. Both she and we were much better as a result of the competition. We were still a fairly second-rate VIII even by local standards, but as soon as Kate saw that she wouldn't be able to force her way into our top eight against determined opposition, she changed tactic and walked into one of the best local women's VIIIs. Later that year they won the local competition outright.

So what sort of effect did all our half-hours have on our 2k times?

I have no idea! About one month before the competition we really cared about (The Town Bumps), I retired the ergo table in favour of doing the sort of training that will improve your fitness over shorter distances.

However by that time, we'd already decided who was going to be in our VIII, and everyone was so keen to do well that we could organize as much water time as possible. So we did all our short sprints in the boat either against our own speedometer or against other boats. There weren't any selection decisions to make, so I never asked anyone to do individual 2k tests at all. I didn't want them distracted by something that had become irrelevant.

The bumps is organized like a ladder. You get four races, in which you compete against the boats behind you and in front of you. There's a very steep gradient towards the top of the ladder. We won our first two races easily, our third narrowly, and drew the last one (rowed over).

This was actually fairly typical. We got better every year while we were running our ergo table, so we always moved up the ladder. The rather unpromising crew we'd started out with turned into a bunch of committed athletes who had the respect, if not the fear, of our much more naturally gifted competitors.

We tended to surprise people. Especially by the fact that we could hold our starting speed for a long time. Even boats that initially rowed away from us had a habit of becoming exhausted and dropping back into our waiting jaws. When boats came towards us off the start, we could raise our game and sit on them, knowing that they'd give up long before we did. It's a nice feeling to have. The will to prepare counts for a lot more than the will to win.

A couple of years ago was my last year as captain before I retired. A man should know when to quit and I'm getting old. In that last year, a couple of our key people were injured, and one was unavailable, and we actually managed to lose a bumps race (the second in about ten years).

Since then, my ex-club's results have been somewhat sub-optimal. Not long ago they set a new club record time for the local course, breaking the record that Kate set several years ago. But in the Bumps, which is still the only race that ever really matters to us, we're now down to 14th place from our high point of 7th. It's not clear why. It's mostly the same guys, the new captain's doing a fine job as far as I can see, and they've been a little unlucky in who their opposition have been.

But I was surprised to learn that they no longer know what their half-hour erg scores are.

Kate Hurst now has international ambitions, and has recently won a bronze medal at the National Championships. We learnt an awful lot from her. I hope she feels that she got something from us in return.

Kate's comment (by e-mail. Added here to put the record straight.)

Hey all,  

What a fab blog John, I'm sure I should have paid for something as flattering as this....;-)

I'd like to add my tuppence.

I didn't leave Chesterton first men cos I got beaten.  Cheek. As far as I know, not enough good scores were submitted by that time.   I left cos the day after Norwich head, getting spannered in the rad and sleeping in my car, my back had a dodgy afternoon and wasn't quite the same for 9 months afterwards, by which time City birds were getting pretty good, and I thought I wouldn't get to race any big London races or go to Henley with the Chesterton boys. not that it wasn't the most fun crew I've ever rowed with.

So there,


Footnote on weight adjustment

We weight adjusted the scores. This is a perennially controversial topic amongst rowers. Because they're not very clever. To use a car analogy, ergos measure the power of your engine. If you want to know how fast you're going to go, then you also need to know other things, like the weight of the vehicle. That's why motorbikes can be quite fast even though they have poxy little 600cc engines.

It's actually much more important in rowing, because the weight of the boat and rowers determines how much water you displace, and thus how much water you need to push out of the way.

So if you fail so badly that you're picking your boat on ergo score, then you really want to weight adjust those scores, because the river certainly will. And if you don't bother, then you're going to end up with a boat full of fatties and leave the fast guys on the bank.

But the real reason for weight adjusting the ergo table is that it moves the scores closer together.

In any boat club, there are big guys and small guys, fit guys and not-so-fit guys. Generally speaking, the big not-so-fit guys and the small fit guys will get roughly the same scores, the big fit guys will be out in front by miles, and the small not-so-fit guys will trail horribly.

There's not much point in the small guys taking it seriously, because if the big not-so-fit guys start training they'll overtake them quickly.

If you weight adjust it, which only makes a small difference, then the small fit guys will move slightly ahead. The small not-so-fit guys will get closer to the pack. This puts a certain amount of productive pressure on everyone involved.

I am a fairly heavy man, about 90kg when I was rowing seriously. Mostly muscle but somewhat overweight as well. Without the weight adjustment, I'd have been safely in our top eight without really trying. With it, my friend Chris Wood, who's taller than me, and used to be about 10kg lighter but 200m behind me over a half hour ergo, was suddenly right on top of me by my own formula.

So we both had to work very hard to stay ahead of each other. This led to a sort of ergo arms race, where both our scores improved out of all recognition. This happened to various pairs of people: Tom Watt and Chris Braithwaite, Chris Metcalfe and Andy Southgate. Chris Wood, James Howard and I formed a close triple. Chris Smith was always far and away out in front, but since he had no one in our little club to push him, he was never under any pressure to improve.
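For anyone who wants to play with the numbers, here is a minimal sketch of what a weight adjustment does to two scores like these. The post doesn't give the club's formula, so this borrows the factor Concept2 publishes for its own weight-adjusted rankings (body weight in pounds over 270, raised to the power 0.222) as a stand-in. The raw distances of 8000m and 7800m are invented; only the 90kg, the "about 10kg lighter" and the 200m gap come from the text above.

```python
# Sketch only: the club's own adjustment formula is not given in the post.
# This borrows Concept2's published factor, Wf = (weight_lb / 270) ** 0.222.
LB_PER_KG = 2.20462


def weight_factor(body_kg):
    """Concept2-style weight factor; equals 1.0 for a 270 lb rower."""
    return (body_kg * LB_PER_KG / 270.0) ** 0.222


def adjusted_distance(metres, body_kg):
    """Weight-adjusted distance for a fixed-time piece such as a half hour."""
    return metres / weight_factor(body_kg)


# Invented raw scores for a 90 kg rower and an 80 kg rower 200 m behind.
heavy = adjusted_distance(8000, 90)
light = adjusted_distance(7800, 80)

print(f"raw gap:      {8000 - 7800} m")
print(f"adjusted gap: {heavy - light:.0f} m")
```

With this particular factor the 200m gap all but disappears, which is the bunching-up effect described above. A different exponent would bunch the table more or less tightly.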

Friday, July 30, 2010


Townies think the night is dark. It isn't, as long as you can see the moon.

For moderns, the moon is a detail. Do you know what its current phase is?

Townies think that the night is dark because they're constantly dazzled by artificial lights.

But if you go for a walk away from all the lights on a full moon night, you'll realise after about fifteen minutes that it's bright enough to read. You can even see some colours.

Unfortunately every passing car will reset your fifteen minute clock, so you have to go somewhere quite out of the way to experience this.

Where I grew up in the Pennines, it is dark at night, and there aren't that many cars around. But even there, there's enough artificial light in the sky that it's hard to see the galaxy.

Go somewhere truly out of the way, and you will see the Milky Way like a shining band across the sky.

Light a cigar. The match flame will dazzle you.

Wait fifteen minutes for your vision to come back. Then light another behind your back so that the flame doesn't dazzle you again.

You'll find that the light from a lit cigar reflects off the grass! I have heard that frogs can detect single photons. We are not that good. But we are not bad.

That's what true human night vision is like, and it's something that all our ancestors until about 200 years ago experienced every night of their lives.

For our ancestors even until very recent times, the moon was in the same category of importance as the sun. It made the difference between the nights when you could see, and the nights when you were blind.

That's why some traditional calendars use the lunar month as the basic unit, not the solar year.

That's why Ramadan, the Muslim holiday, is not at a fixed time of year.

That's why the Passover feast of the Jews is on the first full moon in Spring.

That's why the Easter of the Christians is on the first Sunday after the first full moon in Spring. The Paschal Moon.

That's why your diary probably still follows the weird traditional practice of marking the full moon. What possible use could that be?
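The Easter rule above can actually be computed. The churches use a tabulated "ecclesiastical" full moon rather than the real one, so the rule as stated is a slight simplification. The sketch below is the well-known anonymous Gregorian algorithm (often credited to Meeus, Jones and Butcher) for the Western date; the function name is my own.

```python
from datetime import date


def gregorian_easter(year):
    """Western Easter Sunday by the anonymous Gregorian algorithm."""
    a = year % 19                 # position in the 19-year lunar cycle
    b, c = divmod(year, 100)      # century, and year within the century
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30      # lunar part of the calculation
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7    # weekday part: step on to a Sunday
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)


print(gregorian_easter(2010))  # 2010-04-04
```

The Orthodox churches use the Julian calendar for the same rule, so their date often differs.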

Here's Jane Austen in "Sense and Sensibility":

"[Sir John Middleton] had been to several families that morning, in hopes of procuring some addition to their number, but it was moonlight, and every body was full of engagements."

The lit night made a huge difference to what it was possible to do. Back in the days when I could still be persuaded to coach rowing on Winter evenings, it was easy on the night of the full moon, and both impossible and dangerous at new moon. I could always tell you the phase of the moon if asked.


Women have always been associated with the moon. I know the names of many Moon Goddesses and no Moon Gods.

I am told that women's periods synchronise when they live together. Would they also track the moon if we lived without artificial light?

Our natural sleep cycle is apparently 25 hours, not 24. It needs the sunlight to reset it every day to keep it accurate. Perhaps the human fertility cycle calibrates on the moon?

Is it impossible to imagine that in primitive tribes all the women became fertile at the same time?

Fertile women find strong men attractive. Men whose wives are straying become madly jealous. Maybe men have monthly cycles too?

Imagine what such a society would be like to live in. Every time the full moon came round, sex and jealousy and madness and anger would disrupt everything.

Maybe that's why lunatics have the moon in their name. Do wolves howl at the moon for the same reason? Is that where werewolf legends come from?

We go to a lot of trouble to conceal exactly when we're fertile, compared to most mammals where it's obvious. That would be completely pointless if everyone could tell just by looking up at the night sky.

Maybe it's the menstrual cycle that synchronizes, but the actual time of ovulation is random within that?

This idea is easily testable, if you can find a small community that lives without artificial light, and find out whether the women all menstruate at the same time. I predict that they do, and that that time is the same fixed point of the lunar cycle for all such small communities.

A second prediction: Sexual jealousy should make men insomniac or at least easily woken. And sexual desire should make women sleepless and prone to going for walks, gazing at the moon.

Is the full moon a good time for affairs? Or is it the new moon, when everything is dark?

Cultural evidence points to the full moon being a special time, but that might be because it's well lit and so a good time for celebrations. The time for affairs might be the pitch dark of the new moon. But if I had to bet, I'd go for full moon as the time of madness.

The moon reeks of romance. It is beautiful and moving. Poets sing of it.

Or alternatively, is the whole thing just an information cascade? Did some ancient sage notice that women's periods were around 28 days and decide that that was close to the moon cycle of about 29 days and so the two must be associated, and from that one observation came all the myths and legends and romantic associations?

You might think that this argument should apply to other species, but this is only going to be relevant if we have a species with sexual jealousy, which has a lengthy regular fertility cycle. It wouldn't surprise me if that was just us.

A Fire Upon The Deep (Vernor Vinge)

A new classic of speculative fiction. It has spaceships and rivets, but no dragons. However I won't quite call it science fiction, since it lies near the ever-shifting boundary between the zones.

Mediaeval cruelty, warring super-intelligences, transcendence, the awakening of ancient evil, and a desperate rescue mission which has at its object the galaxy as a whole and at the same time a single lost child. Avatars and mythical heroes, atrocities, obscenities and perversions.

Vinge illuminates small areas of a vast canvas in intricate detail, hinting at an awesome magical background that fires the imagination.

His aliens are human enough to care about, but alien enough to be interesting. The Tines are a particular joy, and I haven't seen anything like them before.

A riot. Heartily recommended!

Tuesday, July 13, 2010

Miss Williams Confounds Us as Rationalists

I have been reading Less Wrong

There were 128 competitors in the 2010 Wimbledon Ladies' Singles, thus 127 matches.

Of the 127 matches, 27 were 2-1, and the rest were 2-0.

Venus Williams won 4 matches in straight sets before being knocked out in straight sets by Tsvetana Pironkova.

So she's 4/100ths of the winners in straight sets

She's 4/127ths of the winners.

She's 1/100th of the losers in straight sets

She's 1/127th of the losers

She's 0/27 of the winners or losers by 2-1

I want to say that this makes her more likely to win in straight sets than to win.
And more likely to lose in straight sets than to lose.

I must not say this.
I must not say this.
I must not say this.
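The arithmetic behind the temptation is easy to check. This is a small sketch using only the counts given above; the variable names are mine. Venus's share of the straight-sets wins is bigger than her share of all wins only because the denominator is smaller. Turn the conditional the right way round and the effect goes away.

```python
from fractions import Fraction

competitors = 128
matches = competitors - 1                 # a knockout removes one player per match
three_setters = 27
straight_setters = matches - three_setters

venus_wins = 4                            # all four in straight sets
venus_straight_wins = 4

# Her share of each list of winners: same numerator, different denominators.
share_of_straight_wins = Fraction(venus_straight_wins, straight_setters)
share_of_all_wins = Fraction(venus_wins, matches)

# The question I actually care about: given that she won, was it in straight sets?
p_straight_given_win = Fraction(venus_straight_wins, venus_wins)

print(matches, straight_setters)
print(share_of_straight_wins, share_of_all_wins)
print(p_straight_given_win)
```

A win in straight sets is still a win, so over these matches she cannot have won in straight sets more often than she won.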

Conjunction Fallacy

I have been reading Less Wrong

A blood clot on the lung often leads to shortness of breath, but rarely to weakness. So my sources inform me.

Imagine a man is suffering from a blood clot in the lung:

Rank in order of probability the following lists of symptoms:

A Weakness
B Calf pain
C Sharp pain while breathing
D Shortness of breath and weakness
E Loss of consciousness and fast heartbeat
F Coughing up blood

Actually do this, before reading on.

Did you rank D as more likely than A?

If not, congratulations. Medical students get this wrong, apparently. As do I.

If so, what are you thinking? Can you not see that weakness has to be more likely than weakness together with something else?

A person suffering from weakness and shortness of breath is suffering from weakness.

If out of 100 people with blood clots, 50 are suffering from weakness AND shortness of breath, then AT LEAST 50 must be suffering from weakness.
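If counting heads helps, here is a small sketch that makes up 100 patients and counts them. The symptom rates are invented, chosen only to echo "often" and "rarely"; they are not medical data. Whatever rates you pick, the count for D can never exceed the count for A, because everyone counted under D is also counted under A.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Invented rates: shortness of breath "often", weakness "rarely".
P_BREATHLESS = 0.8
P_WEAK = 0.1

patients = []
for _ in range(100):
    symptoms = set()
    if random.random() < P_BREATHLESS:
        symptoms.add("shortness of breath")
    if random.random() < P_WEAK:
        symptoms.add("weakness")
    patients.append(symptoms)

# A: anyone with weakness. D: anyone with weakness AND shortness of breath.
weak = sum("weakness" in s for s in patients)
both = sum({"weakness", "shortness of breath"} <= s for s in patients)

print(f"A  weakness:                         {weak}/100")
print(f"D  shortness of breath and weakness: {both}/100")
```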

The thing is, I know about this effect, called the Conjunction Fallacy. And I know some probability and even a bit of statistics. And I still think that D outranks A. I couldn't be more surprised if I kept putting two oranges next to two oranges and getting three oranges.

Here's another one:

Suppose Venus Williams is playing tennis.

Rank in order the probability that:

A Venus loses the first set
B Venus wins the match
C Venus loses the first set and wins the match
D Venus wins the first set but loses the match
E Venus loses the match

Try this, without thinking too hard. No drawings of Venn diagrams are to be made.

I make this: B > C > A > D > E

Venus seems likely to win, and it's also quite possible that she loses the first set but wins anyway.
It's quite unlikely that she loses the first set, but even more unlikely that she wins the first set but then loses the match (she'd have to lose two sets, and that's impossible. She's a goddess!). The idea of her losing seems very unlikely indeed.

How do you rank them? Same as me? If all our brains work like this then it's a wonder we ever manage to tie our shoelaces.

There's a lot of very good stuff along these lines at the Less Wrong wiki.

In the Venus example, it absolutely has to be true that A>C, B>C, and that E>D. I can see that, but I do not feel that.
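Since I can't feel it, I can at least enumerate it. The sketch below assumes a best-of-three match in which Venus wins each set independently with the same probability. That is a crude model of my own, not a fact about tennis. It lists every possible run of sets and adds up the five probabilities. Whatever per-set probability you feed it, A is at least C, B is at least C, and E is at least D.

```python
from itertools import product


def match_probabilities(p_set):
    """Probabilities of outcomes A to E, with sets won independently at p_set."""
    probs = dict.fromkeys("ABCDE", 0.0)
    # Pretend all three sets are always played. A dead third set does not
    # change who wins the match, so the totals are the same as best-of-three.
    for sets in product("WL", repeat=3):
        weight = 1.0
        for s in sets:
            weight *= p_set if s == "W" else 1.0 - p_set
        wins_match = sets.count("W") >= 2
        loses_first = sets[0] == "L"
        if loses_first:
            probs["A"] += weight                  # loses the first set
        if wins_match:
            probs["B"] += weight                  # wins the match
        if loses_first and wins_match:
            probs["C"] += weight                  # loses first set, wins match
        if not loses_first and not wins_match:
            probs["D"] += weight                  # wins first set, loses match
        if not wins_match:
            probs["E"] += weight                  # loses the match
    return probs


for name, p in sorted(match_probabilities(0.7).items()):
    print(name, round(p, 3))
```

The 0.7 is an arbitrary choice. Try 0.99 for a goddess: the ordering of C against A and B, and of D against E, still cannot flip.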

I am never going to trust a hunch again. There is a name for my belief that hunches are right more often than they have any right to be.

What if we change the question?

Let's take a load of tennis matches, say the recent Wimbledon.

E. Take the loser from every ladies singles match result.
There'll be loads. One of them will be Venus, who didn't win the tournament.

D. Take the loser from every ladies singles match which was 2 sets to 1. There'll be fewer. I bet Venus is in there.

C. Take the winner from every ladies singles where the match was 2-1. There are the same number as for D. I bet Venus is in that list several times.

B. Take the winner from every ladies singles match. There'll be more, and I bet Venus is in that list more too, even proportionately.

A. Take every name of a woman who lost a first set. I have no intuition about how many times our hero is on that list relative to the others.

So E < D < C < B , and I have no idea where A should be.

This looks awfully like my intuitive answer to the first question. Even though it's the answer to a COMPLETELY DIFFERENT QUESTION.

Is there any way to avoid (or any reason to want to avoid) the conclusion that:

When you ask someone for the probability of A given B, they give you the probability of B given A.

Linda is a vegetarian who knits her own yoghurt and is active in the environmental movement.

What is the probability of her being a bank clerk?

What is the probability of her being a feminist bank clerk?

The Planning Fallacy

I have been reading Less Wrong

The other night in the pub I was sounding off about the Planning Fallacy.

An experiment was done where people were asked to estimate the time needed to complete a task. They were asked for estimates for the best case, worst case and normal case times.

The experimenters found no real difference between the best case and normal case estimates.

When the actual time to do the task was later measured, it was, on average, worse than the worst case estimate.

The experimenters drew the conclusion that people expect that usually, everything will go as well as it can possibly go, and that people are spectacularly bad at imagining just how wrong things can go, and how normal it is for them to go that wrong.

The following morning, we were due to go to a cricket match at Newton, which I guessed was about a 20 minute drive from my house. We had planned to meet up in the excellent Queen's Head pub at 12:00. I was giving two friends, Steve and Nick, a lift.

Being a careful person who hates being late, I told them to arrive at my house at 11:30.

At about 11:00, Beard rang to ask if he too could have a lift.

I put my bike and my cricket kit in the van early on so I wouldn't have to worry about it later.

At about 11:30, Steve turned up, while I was still loading the van. He needed a bottle of milk to take to the game. I sent him to the nearest local shop and I put the kettle on, and when he came back, we sat in the back garden while we waited for the others.

Beard and Nick showed up while we were in the garden. We finished our tea, and then loaded their cricket kit into the van, and locked Beard's bike up in my shed.

As we set off, we realised that the main road near my house was blocked by water main repairs, but we quickly worked out an alternative route.

There was little traffic, and the drive to Newton seemed to take about 20 minutes. We had to wait a few minutes at a level crossing for the London train to go through.

Absolutely nothing had gone wrong. We were only about half an hour late. We were the first ones there.