RECENT READS



COOL USE OF DECISION TREES in the New York Times. They should develop some trees for individuals' votes as well (the graphic below is for counties' votes).


The tree above was developed with a decision tree learning algorithm. When you look at the tree above, what you don't realize is that a computer has looked at all other possible trees (or a large subset of them) and selected this model for a reason: It does the best job of explaining how counties vote.

For example: The algorithm almost certainly tested whether it was smarter to split the first node (percentage African American) at 20% or 21%. It chose 20% because this split had more predictive power -- not only than other potential splits related to the black population, but also than potential splits on any of the other variables (such as high school graduation and median income).
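To make that search concrete, here is a minimal sketch of how a tree learner might pick a split. It is not the Times' method (they didn't publish the algorithm); it assumes a Gini-impurity criterion, and the function names, variable names and toy county numbers are all made up for illustration.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Try a threshold midway between each pair of adjacent distinct values.

    Returns (threshold, weighted_impurity) for the split that leaves the
    two resulting groups as pure as possible.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, float("inf"))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


def best_split_over_variables(columns, labels):
    """Compare the best split on every variable and keep the overall winner."""
    winner = (None, None, float("inf"))
    for name, values in columns.items():
        threshold, score = best_split(values, labels)
        if score < winner[2]:
            winner = (name, threshold, score)
    return winner


# Hypothetical toy data: six counties, 1 = county went for candidate A.
counties = {
    "pct_black": [5, 12, 18, 22, 30, 45],
    "median_income": [40, 55, 38, 52, 41, 50],
}
voted_a = [0, 0, 0, 1, 1, 1]

name, threshold, impurity = best_split_over_variables(counties, voted_a)
print(name, threshold, impurity)
```

A real learner repeats this search inside each resulting group to grow the tree, then prunes it back using something like the complexity parameter.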

I was amazed by how well the Times' tree fit -- there seems to be a huge amount of explanatory value just from the first three splits (which use only two variables).

One of the strengths of trees is their ease of interpretation and visualization to non-experts. Sadly, constructing one with a good fit requires some specialization. Bravo to the Times for hiring someone with the background to construct the tree. Decision tree learning is a relatively advanced statistical technique for a popular publication.

Shame they didn't publish the model's overall fit, the complexity parameter or the details of the learning algorithm used (there are a few distinct approaches to solving this problem). Guess that's probably asking too much.

I've been meaning to run one of these algorithms on the Google prediction market dataset -- I think the resulting trees will be nice for business presentations where people don't want to see those confusing regression tables.

For example, a tree might be used to create a model of whether someone will become a trader or not. Email me if you have ideas.

Link via Frances Haugen.












ABCNEWS' 20/20 REPORT ON PREDICTION MARKETS seems to have worked, at least for Intrade. The words "Intrade" and "In Trade" were some of the hottest rising Google queries in the US in the hours after its broadcast:

For more on the intersection between search queries and prediction markets, check out the Yahoo Tech Buzz Game.












IN NOVEMBER '07 I VISITED DUBAI to speak at a global strategy conference for McKinsey, its senior strategy partners and strategists at client firms. I was on a panel with Todd Henderson at Chicago Law, Jeff Severts at Best Buy and James Surowiecki of the New Yorker. The transcript of the talk appears in this month's McKinsey Quarterly -- including this amusing sketch of me:


I'm better looking in real life, promise ;). Anyhow, the panel was moderated by McKinsey guru Renee Dye (author of these articles) -- an excellent host who posed thoughtful questions and drew out each panelist's distinct strengths. Jed Christiansen has a review of the article in case you're not able to see it.

I enjoyed spending time with my fellow panelists, and the rest of the crowd was fantastic. I'm quite thankful to McKinsey and Renee for the opportunity, and hope the article contributes substantively to the growing conversation about using markets inside of organizations.

UPDATE: How did the artist do? See the original pic here.

AFTERMATH: In 2001, Ruler of Dubai Sheikh Mohammed bin Rashid bought Jonabell Horse farms in Lexington, KY from family friends. This gave me some local contacts to show me around the area after the conference. Many thanks to Ajay and gang for a great experience. For travellers to the area, be sure you visit Khasab, Oman.












WHOA, LIKE, I'M SO IMPRESSED: "When [new Facebook COO Sheryl Sandberg] was completing her senior thesis as an undergraduate at Harvard, she ran so much data through one of the school's computers that she crashed it."

Please.












THIS NEW YORK TIMES ARTICLE ON CORPORATE PREDICTION MARKETS doesn't mention our implementation at Google, but the accompanying graphic (below) represents what Justin Wolfers, Eric Zitzewitz and I did better than anything I've seen (including our own fairly cool graphic).



No company mentioned in the article (or anywhere else, to my knowledge) is actually doing what is depicted in the picture besides Google.

Notice that the manager of the market (above) can see much, much more than current prices and trading volume. He can see who bets how and where they sit. He can compare how specific teams or social groups are betting and whether managers for a project are betting more optimistically than their employees.

And so on. The view is a lot more interesting when you're looking at the market like this instead of like this.












NPR HAS A VERY GOOD DESCRIPTION of why everyone liked Zack Morris from Saved by the Bell so much.












"EVEN IN A BORDER-FREE EUROPE, EVERYONE WANTS A HOMELAND": The Economist says that Europe has made peace with nationalism and notes:
An essay in the current issue of Foreign Affairs makes the incendiary suggestion that the EU has kept the peace for 60 years thanks to nationalism, not despite it. The author, Jerry Muller of the Catholic University of America, argues that the brutal genocides and forced population shifts of the 20th century helped to make peace possible. With a few exceptions (he cites Belgium as one), Europe's ethnic and state boundaries now match (ie, most Germans live in Germany, Greece is dominated by Greeks and so on). That has removed a big reason for fighting. Thus the post-war peace may not mark a defeat for ethnic nationalism, but rather demonstrates its "success".






JUST FINISHED READING Dreams and Shadows: The Future of the Middle East by Robin Wright. Interesting portrait of the Middle East from the early part of this decade to the present. Great storytelling, but no big conceptual insights.












TNR HAS A COOL STORY about the struggle for control over Barack Obama's and Hillary Clinton's Wikipedia pages. Great reporting from Eve Fairbanks.












WELL, YEAH:
"If Barack gets past the primary," said the Rev. Jeremiah Wright to the New York Times in April of last year, "he might have to publicly distance himself from me. I said it to Barack personally, and he said yeah, that might have to happen."

Pause just for a moment, if only to admire the sheer calculating self-confidence of this. Sen. Obama has long known perfectly well, in other words, that he'd one day have to put some daylight between himself and a bigmouth Farrakhan fan. But he felt he needed his South Side Chicago "base" in the meantime. So he coldly decided to double-cross that bridge when he came to it.

And now we are all supposed to marvel at the silky success of the maneuver.
--Christopher Hitchens, Slate.












ME TOO: "I liked TNR better when it was treating politicians as politicians, not boyband heartthrobs." -- TNR reader on the magazine's Obama crush.

For that matter, I liked it better when my generation treated politicians as politicians, and not as boyband heartthrobs.

On the other hand, this is pretty funny: "I'm angry," says black congressman Charlie Rangel, "I'm looking for the white people that are insulting me, and I can't find them."












SOME MORE REMARKS about applications that combine prediction markets and organizational data (org charts, social networks, seating locations). The obstacle to these applications is not a lack of data. Jed mentions privacy concerns -- and if he thinks this is a big obstacle then I'd be interested in discussing his thoughts.

A bigger problem is that current PM vendors and consultants cannot support these applications. At heart, these vendors are software engineers and salespeople, not statisticians or data miners. They want to write one system that can support lots of clients. At conferences, one hears PM vendors complain about having to do "customization" work for clients.

This approach would not work for the applications I describe for two reasons:
  1. The inputs for different clients won't be the same. Each client's organizational data will likely take a different structure. This makes it difficult for PM vendors to architect a single system that can serve many clients (yet another challenge with integrating markets with other corporate IT services).
  2. The outputs for different clients won't be the same. The business relevance and statistical power of each analysis will differ with each client's data.
PM vendors may also need to familiarize themselves with the statistical learning methods necessary to fully utilize these rich datasets.

So what's the solution? First, move to a software-and-consulting model. By 'consulting,' I don't mean 'consulting on how to implement the market.' I'm talking about helping the client solve its problem using a variety of data, including PM data.

Second, the vendors also need to pitch prediction markets as more than a forecasting tool. People in the business world commonly identify as data junkies -- probably more so than they identify with the 'wisdom of crowds' ethos. It is unclear how much companies really care about accurate forecasting anyway.






FOLLOWING UP ON MY PREVIOUS POST on Jed Christiansen's comments on our prediction market paper: I've also heard that other companies would find it impossible to analyze the interaction between their market and the organization. Why? Lack of data. Our analysis benefited from a wealth of internal data (including GPS coordinates of offices) that other companies don't store.

You may be surprised at how much data average companies really have. For example, Google had social network surveys; many companies do not. However, many standard corporate applications (such as email, calendars, telephones and code reviews) contain implicit social networks that can be used in place of data gathered from surveys.
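As a rough illustration of what "implicit social network" could mean in practice, here is a minimal sketch that turns an email log into weighted ties between people. The log format, the names and the `min_messages` cutoff are all assumptions for the example, not a description of what we did at Google.

```python
from collections import Counter


def build_network(messages, min_messages=2):
    """Count messages exchanged between each pair of people.

    `messages` is a list of (sender, [recipients]) records. A pair counts as
    a tie only if at least `min_messages` messages passed between them, in
    either direction. Returns {frozenset({a, b}): message_count}.
    """
    counts = Counter()
    for sender, recipients in messages:
        for recipient in recipients:
            if recipient != sender:  # ignore notes-to-self
                counts[frozenset((sender, recipient))] += 1
    return {pair: n for pair, n in counts.items() if n >= min_messages}


# Hypothetical email log.
email_log = [
    ("ann", ["bob"]),
    ("bob", ["ann", "cat"]),
    ("cat", ["ann"]),
    ("ann", ["bob", "cat"]),
]

network = build_network(email_log)
for pair, n in network.items():
    print(sorted(pair), n)
```

The same counting works for calendar invites, phone records or code reviews; only the shape of the input records changes.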

Or, consider this: I recently met with people from Google's real estate management group. Turns out, they have records of the floorplans of Google's offices in electronic format. Not only can you use these records to find the distance between offices (without GPS coordinates) -- you can also find the total area and perimeter of each office, which desks are open (cube-style) vs. enclosed, the walking distances between offices and more.
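Here is a minimal sketch of the kind of geometry a floorplan makes possible, assuming each office is stored as a list of (x, y) corner coordinates in meters. That storage format, the function names and the two toy offices are my assumptions; real facility management systems will have their own formats.

```python
import math


def area(polygon):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def perimeter(polygon):
    """Sum of the edge lengths, closing the loop back to the first corner."""
    n = len(polygon)
    return sum(math.dist(polygon[i], polygon[(i + 1) % n]) for i in range(n))


def center(polygon):
    """Average of the corners (a simple stand-in for the true centroid)."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def distance_between(office_a, office_b):
    """Straight-line distance between office centers (not walking distance)."""
    return math.dist(center(office_a), center(office_b))


# Hypothetical offices: two 4m x 3m rooms along the same hallway.
office_a = [(0, 0), (4, 0), (4, 3), (0, 3)]
office_b = [(10, 0), (14, 0), (14, 3), (10, 3)]

print(area(office_a), perimeter(office_a), distance_between(office_a, office_b))
```

Walking distance would need the hallways and doors as well, typically as a graph with a shortest-path search on top.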

Surprised and impressed, I asked if it was typical for companies to have all of this information. The response was: "Any Fortune 1000 company would have this data about their offices." Everyone in the room said his previous employer had the same data -- typically managed through computer-aided facility management systems such as Archibus or Infor.












This work is licensed under a Creative Commons License.

Disclaimer: Opinions expressed on this site are the author's and not necessarily his employer's.