Predicting the winner of the US Open

It’s not just the players who are complaining about the scheduling of the US Open tennis. I can’t trade the final tonight as it’s my wedding anniversary today. I’ve been married even longer than I’ve been on Betfair, but one benefit of that is that the next generation of traders is following just behind me!

I saw this interesting BBC article this morning, which talks about the data collection that IBM do as part of their deal with these tournaments.

When I started predicting things many moons ago I quickly realised one important fact. You start at a top level, then slowly drill down deeper and deeper. But beyond a certain point the level of entropy you see outweighs the predictive capacity of what you are doing.

This is because you can’t say definitively that something will or won’t happen, only that it will happen within a frame of reference. The more variables you use, the less effective your prediction becomes. The entropy you introduce exceeds any useful forward-looking capacity.

I found this out on the first market I learned to price, football. Figuring out the likely number of goals was fairly straightforward, as was working out the rate at which either team scored. Then I dug deeper and started looking at what created a goal, and further back to who was going to score and why. Then even deeper, to how that was going to happen. I eventually ended up being able to rate each player and their influence on the chance of a goal and how that would occur, à la Moneyball.
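To make the top level of that concrete, here is a rough sketch of pricing total goals from two scoring rates. I'm not saying this is the model I used; it assumes goals arrive as independent Poisson processes, which is a common textbook simplification, and the rates of 1.5 and 1.2 goals per match are made up purely for illustration.

```python
import math


def poisson_pmf(k, rate):
    """Probability of exactly k goals given an average scoring rate."""
    return math.exp(-rate) * rate ** k / math.factorial(k)


def prob_over(line, home_rate, away_rate):
    """Chance that total goals exceed a half-goal line such as 2.5.

    Assumes the two teams score independently, so the total is
    Poisson with the combined rate (a simplifying assumption).
    """
    total_rate = home_rate + away_rate
    max_under = math.floor(line)
    under = sum(poisson_pmf(k, total_rate) for k in range(max_under + 1))
    return 1.0 - under


# Hypothetical scoring rates, not taken from any real match.
home_rate, away_rate = 1.5, 1.2
chance = prob_over(2.5, home_rate, away_rate)
print(f"Over 2.5 goals: {chance:.1%} (fair odds about {1 / chance:.2f})")
```

Two numbers in, one price out. Everything I describe next is an attempt to improve on those two numbers by breaking them into smaller pieces.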

But even then you figure that you can only work in percentages. The chance of this happening, then this, then that and that. The longer the chain of events you create, the more variable the outcome and the less certain the prediction. So you add them all together, average them, account for variability and end up pretty much back where you started. So there is a maximum number of variables that produces the least variable prediction, and that is where you settle. But you also have the human element, where a variable will change on one side if you start modifying it on the other. At that point I wondered whether I should go into predicting the weather. (Incidentally, for fun, each year I do attempt to forecast the number of tropical depressions, hurricanes and major hurricanes, but that’s another story!)
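The chain-of-events point can be sketched with a few lines of arithmetic. This is only an illustration under stated assumptions: every link in the chain is treated as an independent estimate, and the 80% chance with an 8-point standard error on each link is a number I've invented for the example. Under those assumptions, the relative uncertainty of the combined chance is sqrt(prod(1 + (se/p)^2) - 1), which can only grow as links are added.

```python
import math


def chain_estimate(steps):
    """Combine a chain of (probability, standard_error) estimates.

    Each link is assumed independent of the others (an assumption).
    Returns the chance that every link happens, plus the relative
    uncertainty (standard deviation / mean) of that combined figure.
    """
    mean = 1.0
    second_moment = 1.0
    for p, se in steps:
        mean *= p
        second_moment *= p * p + se * se
    std = math.sqrt(second_moment - mean * mean)
    return mean, std / mean


# Hypothetical link: 80% likely, estimated to within 8 points.
link = (0.8, 0.08)
for n in (1, 3, 6, 10):
    chance, rel = chain_estimate([link] * n)
    print(f"{n:2d} links: chance {chance:.3f}, relative uncertainty {rel:.0%}")
```

Each extra link shrinks the combined chance and widens the error around it, so past some depth the added detail costs more in uncertainty than it buys in insight.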

So, the guys at IBM will find the deeper down the rabbit hole they go, the less useful the information. But hell, it’s good PR isn’t it?

As for tonight’s match, the market is pricing it about right and I really hope it lives up to expectations. It’s priced for a close encounter and that often makes for excellent trading fodder, but unfortunately I won’t be around to watch it. But, like IBM, I’ll be gathering data on it for later analysis.

