Complex software

One of the areas I have worked in is modeling complex manufacturing systems in order to analyze the cost functions that could be used to approximate the behavior of the system. This required lots of machinery, and reviewers were at a huge disadvantage, wondering what was going on with all my software. I tried to think of ways to “open” the code and the methods, but it was quite difficult. Getting the paper published was no picnic.

The global warming software debate is very similar. I don’t know whose software is more complicated (maybe not mine?). Here’s an article that talks about the lack of (and need for) software reviewing. Academe needs to confront this issue seriously.

Simpson’s paradox

Averages seem like simple things, but not always. Consider this simple baseball example:

Tony and Joe are competitive friends and so they compare batting averages. At the All-Star break, Tony is batting .300 and Joe is only batting .290. Joe mentions that batting in the second half of the season is more important, and so he and Tony agree to compare their batting averages for the second half of the season (and only the second half). When they finally meet, it turns out that Tony batted .390 in the second half of the season. Joe did better, too, but only batted .375. Tony wins both halves of the season.

Question: whose batting average was higher for the entire year? It turns out we don’t know, and it could very easily be Joe! (I’ll post an example later.)

What the paradox states is that a relationship that holds within every subgroup can be reversed when the subgroups are combined, because the overall average weights each subgroup by its size. So Tony could win both halves of the season but have a lower batting average for the entire season, for instance if most of his at-bats came in the first half while most of Joe’s came in the second.
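Here is a minimal sketch of how the reversal can happen. The averages match the story above, but the hit and at-bat counts are hypothetical numbers picked to make the arithmetic work; they are not real statistics.

```python
from fractions import Fraction

# Hypothetical (hits, at-bats) chosen to match the averages in the story.
tony = {"first half": (90, 300), "second half": (39, 100)}
joe = {"first half": (29, 100), "second half": (150, 400)}


def batting_average(hits, at_bats):
    """Exact batting average as a fraction."""
    return Fraction(hits, at_bats)


def season_average(player):
    """Pool hits and at-bats over both halves; do not average the averages."""
    total_hits = sum(hits for hits, _ in player.values())
    total_at_bats = sum(at_bats for _, at_bats in player.values())
    return batting_average(total_hits, total_at_bats)


for half in ("first half", "second half"):
    t = batting_average(*tony[half])
    j = batting_average(*joe[half])
    print(f"{half}: Tony {float(t):.3f}, Joe {float(j):.3f}")

print(f"season: Tony {float(season_average(tony)):.3f}, "
      f"Joe {float(season_average(joe)):.3f}")
```

With these made-up counts, Tony finishes the season 129 for 400 and Joe finishes 179 for 500, so Joe has the higher season average even though Tony won each half. The trick is that most of Joe’s at-bats fall in the half where both players hit well.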

I do not know the details about Climategate, but it is very interesting to me, with all the statistics. Are average temperatures going up, going down, or something else? Here’s an article that, about midway through, mentions this batting average paradox. I wonder if I have a new example of the paradox involving averages.

Financial collapse

First, know that I am not especially savvy about financial instruments. I sort of understand puts and calls, but not some of the financial instruments that are all over the news. And you can read lots of articles from people who think these people or those people are responsible for the mess. And some say that civilization is coming to an end, in deflation or hyperinflation.

Second, know that I am not a stockholder in JPMorgan Chase, although I do have one of their credit cards.

But I read this (long) letter from the CEO of JPMorgan Chase to his shareholders and came away feeling like it was one of the more honest assessments I’ve read in a while. The part that interested me starts on page nine. I think I will come back to it in a year or two and see how Mr. Dimon’s analysis has held up.

Means and medians

Here is an article from the Wall Street Journal about the housing market in Detroit. Here’s how the story starts:

On a grassy lot on a quiet block on a graceful boulevard stands the answer to a perplexing question: Why does the typical house in Detroit sell for $7,100?

The brick-and-stucco home at 1626 W. Boston Blvd. has watched almost a century of Detroit’s ups and downs, through industrial brilliance and racial discord, economic decline and financial collapse. Its owners have played a part in it all. There was the engineer whose innovation elevated auto makers into kings; the teacher who watched fellow whites flee to the suburbs; the black plumber who broke the color barrier; the cop driven out by crime.

The last individual owner was a subprime borrower, who lost the house when investors foreclosed.

Then the article cites some statistics:

And the median selling price for a home stood at a paltry $7,100 as of July, according to First American CoreLogic Inc., a real-estate research firm — down from $73,000 three years earlier. A typical house in Cleveland sells for $65,000. One in St. Louis goes for $120,000.

Now I understand that median house prices are the standard way of talking about what is “typical.” Using the mean creates problems when most houses sell near one value but a few high-priced outliers pull the average up.
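A quick sketch of that outlier effect, using a made-up list of ten sale prices (nine modest houses and one mansion; the numbers are assumptions, not data from the article):

```python
from statistics import mean, median

# Hypothetical sale prices: nine modest houses and one mansion.
prices = [80_000] * 9 + [2_000_000]

print(f"mean:   ${mean(prices):,.0f}")
print(f"median: ${median(prices):,.0f}")
```

In this toy list the mean is $272,000, a price no house actually sold near, while the median stays at $80,000.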

But in this article, is median misleading in a different way? Did the house on Boston Blvd sell for $7,100? How many houses were sold in Detroit? Voluntarily? Can the dynamics associated with foreclosures make the median a poor choice to report?
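To see why the mix of sales matters, here is a sketch with invented numbers. The counts and prices below are assumptions for illustration only; I do not know the real breakdown of Detroit sales.

```python
from statistics import median

# Hypothetical sales: these counts and prices are made up for illustration.
foreclosure_sales = [5_000] * 60 + [9_000] * 10
voluntary_sales = [60_000] * 15 + [75_000] * 15

all_sales = foreclosure_sales + voluntary_sales
foreclosure_share = len(foreclosure_sales) / len(all_sales)

print(f"foreclosure share of sales: {foreclosure_share:.0%}")
print(f"median, all sales:          ${median(all_sales):,.0f}")
print(f"median, voluntary only:     ${median(voluntary_sales):,.0f}")
```

When more than half of the sales in a period are distressed, the median simply is a distressed price, and it says nothing about what a voluntary seller might get. In this toy data the overall median is $5,000 while the voluntary-sale median is $67,500.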

Often we try to summarize with a few statistics, and that can be useful. But here I think there is a need for more information to really understand this situation. The reporter probably had access to that data, if they wanted to report it.

Decision making when you know you will be second guessed

I teach quite a bit about structuring problems to help make decisions. Usually models can help illuminate the tradeoffs, and sometimes it becomes much clearer what the right strategy is.

But sometimes the downside of the “right” decision comes to dominate the thinking of the decision maker. This is a fundamental part of the dilemma facing the contestant on the game show Let’s Make a Deal when they are offered the chance to change their original decision. Even if they are better off changing, the emotional pain of changing and then losing makes them stay with their first choice. They can hear their friends saying “You won the car and then you gave it away.”
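For the record, here is a small sketch of why changing is the better choice. It assumes the classic three-door version of the game, in which the host knows where the car is and always opens a door hiding a goat before offering the switch; the real show did not always work this way.

```python
from fractions import Fraction
from itertools import product

DOORS = (0, 1, 2)


def win_probability(switch):
    """Exact chance of winning the car, enumerating every equally likely case."""
    wins = 0
    cases = 0
    for car, first_pick in product(DOORS, repeat=2):
        cases += 1
        if switch:
            # The host opens a goat door, so switching lands on the car
            # exactly when the first pick was wrong.
            wins += first_pick != car
        else:
            wins += first_pick == car
    return Fraction(wins, cases)


print("stay:  ", win_probability(switch=False))
print("switch:", win_probability(switch=True))
```

Under those assumptions, staying wins 1/3 of the time and switching wins 2/3. The model says nothing about how it feels to switch away from the car, which is the whole point of the paragraph above.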

Here’s an interesting article about football coaches going for it on fourth down. I am not sure about all the details of the study, but it is probably true that the thought of what the sports writers and talk radio people will say plays an important part in the calculation of when to go for it and when to punt.

Statistics versus anecdotes

When you report on something and you have statistics and anecdotes, which should win? Which should be the basis for your conclusions? If you had data on a large part of the US population and then you talked to your brother-in-law, which would be more important to report?

Check out this NYTimes article about the Exodus from Facebook, and think about how statistics and anecdotes are used. Ouch.

Fun with probabilities

Here’s a real life application of probability theory. It is similar to the sports betting company business model:

You get a mailing list (email, it is cheaper), and you pick a big football game. You send half the list an advisory that one of the teams will win, and you send the other half an advisory that the other will win, along with the opportunity to subscribe to your newsletter for a low, low price.

To the half of the people that got the correct prediction, you send another letter, picking another big game, and again telling half of them that one team will win and the other half that the other team will win.

People who get two correct predictions in two weeks will be amazed and will be more likely to subscribe to your newsletter.

But you can split that group in half, and send out a third prediction, then split the people who got three weeks in a row right and send out a fourth letter, and so on. You might recycle some of the losers, too, so that some people see that your accuracy is 5 out of 6 weeks and so on.

Imagine getting a letter where the predictions were right five weeks in a row! What is the probability of that? This newsletter must be really good!

Or the newsletter writer knows some probability…
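The arithmetic behind the scheme is short enough to sketch. The starting list size of 32,000 is a hypothetical number, and the sketch assumes every game has a winner (no ties) and that nobody drops off the list.

```python
from fractions import Fraction


def recipients_with_perfect_record(list_size, weeks):
    """Count people who have seen only correct predictions after each week."""
    remaining = list_size
    history = []
    for _ in range(weeks):
        # Half the remaining group was told the team that actually won.
        remaining //= 2
        history.append(remaining)
    return history


# Chance that five honest coin-flip guesses are all right.
p_five_in_a_row = Fraction(1, 2) ** 5

# Hypothetical starting list of 32,000 addresses.
print(recipients_with_perfect_record(32_000, 5))
print(p_five_in_a_row)
```

A single guesser has only a 1/32 chance of going five for five, but with 32,000 addresses the sender is guaranteed 1,000 people who saw exactly that.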

Dilbert economics and a decision tree

Scott Adams has an interesting blog where he tries to reason his way through some of the issues of the day (a bit more seriously than in his cartoons). He has been blogging about all of the decisions associated with building his new house, and one of the big decisions is whether or not to use solar power.

Here is a post where he recognizes that the usual analysis (decision) about solar is just a choice of yes or no. This sounds okay, because you’re in the middle of building the house and have to make a decision, right?

Well, there’s another branch that people often overlook – wait and retrofit solar later. Nice observation, especially since that branch may be the best one.
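Here is a rough sketch of what the three-branch tree might look like. Every number in it is a hypothetical placeholder (they are not from Scott Adams’s post or from any real quote), and it ignores discounting, so only the structure should be taken seriously.

```python
from fractions import Fraction

# All numbers are hypothetical placeholders, not estimates for any real house.
YEARS = 20                    # planning horizon
ANNUAL_SAVINGS = 1_500        # electricity savings per year with solar
COST_NOW = 30_000             # install during construction
WAIT_YEARS = 5                # how long the "wait" branch delays
RETROFIT_PREMIUM = 3_000      # extra cost of installing on a finished house
P_PRICE_DROP = Fraction(3, 5) # chance that panel prices fall while waiting
COST_LATER_IF_DROP = 15_000
COST_LATER_IF_NOT = 30_000


def net_cost_install_now():
    return COST_NOW - ANNUAL_SAVINGS * YEARS


def expected_net_cost_wait():
    """Chance node (price drop or not), then a second decision at each outcome."""
    savings = ANNUAL_SAVINGS * (YEARS - WAIT_YEARS)
    retrofit_if_drop = COST_LATER_IF_DROP + RETROFIT_PREMIUM - savings
    retrofit_if_not = COST_LATER_IF_NOT + RETROFIT_PREMIUM - savings
    # After waiting, you can still decide never to install (net cost 0).
    best_if_drop = min(retrofit_if_drop, 0)
    best_if_not = min(retrofit_if_not, 0)
    return P_PRICE_DROP * best_if_drop + (1 - P_PRICE_DROP) * best_if_not


branches = {
    "no solar": 0,
    "install now": net_cost_install_now(),
    "wait and retrofit": expected_net_cost_wait(),
}
best = min(branches, key=branches.get)
for name, cost in branches.items():
    print(f"{name}: expected net cost {float(cost):,.0f}")
print("lowest expected cost:", best)
```

With these invented inputs the wait branch comes out ahead, mostly because waiting keeps the option to skip solar if prices do not fall. Different inputs can easily flip the answer; the point is only that the third branch belongs in the tree.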

I also can’t help but smile when I read the summary:

My new home will have solar power. It was a city requirement. I plan to brag about it to people who are passionate about the environment and bad at math.