The question is whether his model will work. Ace commenter Brian has confidence that it will. In the comments section over at his blog, he said the following:
Well, the way I see it, Silver's (apparently radical) approach to predicting who people are going to vote for is...looking at every possible data set asking people who they are going to vote for. I had taken the view that you have to pay attention to the behavior of the campaigns and, from my perspective, the Obama campaign has been acting in a way that is strongly reminiscent of the way the George H. W. Bush campaign was acting in 1992 -- lotsa personal attacks, anger and scrambling around. Meanwhile, Romney's campaign has been moving forward and contesting a number of states that had been assumed unavailable, including Minnesota.
He nailed it last time around. If he does it again, a lot of people who run their mouths for a living--and who base their predictions on things like their memories and impressions of campaigns past, intangibles like "enthusiasm", etc.--are going to have to contend with the fact that their intuitions just don't seem to matter very much. And there will be strong empirical evidence of that.
In a longish but very good article over at RedState, Dan McLaughlin makes what I think is a very important observation about the limitations of Silver's model, using an example that I personally experienced:
Mathematical models are all the rage these days, but you need to start with the most basic of facts: a model is only as good as the underlying data, and that data comes in two varieties: (1) actual raw data about the current and recent past, and (2) historical evidence from which the future is projected from the raw data, on the assumption that the future will behave like the past. Consider the models under closest scrutiny right now: weather models such as hurricane models. These are the best kind of model, in the sense that the raw data is derived from intensive real-time observation and the historical data is derived from a huge number of observations and thus not dependent on a tiny and potentially unrepresentative sample.
Yet, as you watch any storm develop, you see its projected path change, sometimes dramatically. Why? Because the models are highly sensitive to changes in raw data, and because storms are dynamic systems: their path follows a certain logic, but does not track a wholly predictable trajectory. The constant adjustments made to weather models ought to give us a little more humility in dealing with models that suffer from greater flaws in raw data observations, smaller sample sizes in their bases of historical data, or that purport to explain even more complex or dynamic systems – models like climate modeling, financial market forecasts, economic and budgetary forecasting, or the behavior of voters. Yet somehow, liberals in particular seem so enamored of such models that they decry any skepticism of their projections as a “War on Objectivity,” in the words of Paul Krugman. Conservatives get labeled “climate deniers” or “poll deniers” (by the likes of Tom Jensen of PPP, Markos Moulitsas, Jonathan Chait and the American Prospect) or, in the case of disagreeing with budgetary forecasts that aren’t really even forecasts, “liars.” But if history teaches us anything, it’s that the more abuse that’s directed towards skeptics, the greater the need for someone to play Socrates.
Consider an argument Michael Lewis makes in his book The Big Short: nearly everybody involved in the mortgage-backed securities market (buy-side, sell-side, ratings agencies, regulators) bought into mathematical models valuing MBS as low-risk based on models whose historical data didn’t go back far enough to capture a collapse in housing prices. And it was precisely such a collapse that destroyed all the assumptions on which the models rested. But the people who saw the collapse coming weren’t people who built better models; they were people who questioned the assumptions in the existing models and figured out how dependent they were on those unquestioned assumptions. Something similar is what I believe is going on today with poll averages and the polling models on which they are based. The 2008 electorate that put Barack Obama in the White House is the 2005 housing market, the Dow 36,000 of politics. And any model that directly or indirectly assumes its continuation in 2012 is – no matter how diligently applied – combining bad raw data with a flawed reading of the historical evidence.
Emphasis mine. As regular readers of this feature know, I was a program analyst for Bank of America during the mortgage boom of the mid-2000s, and for our line of business (corporate relocation) one of my tasks was to estimate the value of future business opportunities. We used a mathematical model that frankly green-lighted every venture that a sales manager proposed. We assumed we'd get and convert a certain number of mortgages because, well, we would. We always had.
In 2005, the model worked very, very well. I left B of A in 2006 when the office relocated -- the office left me, I guess. In the subsequent years, many of my colleagues who went out to Oregon ended up losing their jobs there when the overall market changed and -- crucially -- when B of A acquired Countrywide.
The problem with the model we used in 2005 was that we didn't have enough historical data to test the validity of the model longitudinally. And we couldn't have seen that our parent company would make what in retrospect was a ridiculous mistake in acquiring Countrywide, which was hip-deep in the mess but looked like a going concern at the time.
I don't have a lot of time for this post, and I don't want to oversimplify things, but McLaughlin sums up what I think Silver's problem is going to be in this cycle:
Poll toplines are simply the sum of their internals: that is, different subgroups within the sample. The one poll-watchers track most closely is the partisan breakdowns: how each candidate is doing with Republican voters, Democratic voters and independent voters, two of whom (the Rs & Ds) have relatively predictable voting patterns. Bridging the gap from those internals to the topline is the percentage of each group included in the poll, which of course derives from the likely-voter modeling and other sampling issues described above. And therein lies the controversy.
My thesis, and that of a good many conservative skeptics of the 538 model, is that these internals are telling an entirely different story than some of the toplines: that Obama is getting clobbered with independent voters, traditionally the largest variable in any election and especially in a presidential election, where both sides will usually have sophisticated, well-funded turnout operations in the field. He’s on track to lose independents by double digits nationally, and the last three candidates to do that were Dukakis, Mondale and Carter in 1980. And he’s not balancing that with any particular crossover advantage (i.e., drawing more crossover Republican voters than Romney is drawing crossover Democratic voters). Similar trends are apparent throughout the state-by-state polls, not in every single poll but in enough of them to show a clear trend all over the battleground states.
If you averaged Obama’s standing in all the internals, you’d capture a profile of a candidate that looks an awful lot like a whole lot of people who have gone down to defeat in the past, and nearly nobody who has won. Under such circumstances, Obama can only win if the electorate features a historically decisive turnout advantage for Democrats – an advantage that none of the historically predictive turnout metrics are seeing, with the sole exception of the poll samples used by some (but not all) pollsters. Thus, Obama’s position in the toplines depends entirely on whether those pollsters are correctly sampling the partisan turnout.
Emphasis in original. This is why I've been squawking throughout this cycle about D +7 models: you have to assume that voter enthusiasm on the Obama side is at the same level as 2008, or even exceeding it, for the toplines to make any sense. Based on my instincts and nearly 40 years of observing these things, I don't see anything like that out there. Perhaps Obama's team can squeeze every conceivable vote out of his own coalition, in numbers sufficient both to counter the Republican efforts and to offset the evidence that independents are supporting his opponent. We'll likely know in six days.
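For anyone who wants to see the arithmetic behind McLaughlin's point that toplines are the sum of their internals, here is a rough sketch in Python. Every number in it is made up for illustration and comes from no actual poll, and the function and variable names are mine, not anything Silver or any pollster uses. All it shows is that the same internals can produce different toplines depending on the partisan mix of the sample.

```python
def topline(shares, support):
    """Weighted sum of subgroup support: sum of (group share * support in group)."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("group shares must add up to 1")
    return sum(shares[group] * support[group] for group in shares)


# HYPOTHETICAL internals: share of each party-ID group backing each candidate.
# Illustrative only; not taken from any real poll.
obama_support = {"D": 0.92, "R": 0.06, "I": 0.43}
romney_support = {"D": 0.07, "R": 0.93, "I": 0.53}

# HYPOTHETICAL sample compositions: a D+7 sample and an even one.
d_plus_7 = {"D": 0.39, "R": 0.32, "I": 0.29}
even_split = {"D": 0.35, "R": 0.35, "I": 0.30}


def obama_margin(shares):
    """Obama topline minus Romney topline for a given sample composition."""
    return topline(shares, obama_support) - topline(shares, romney_support)


for label, shares in [("D+7 sample", d_plus_7), ("Even sample", even_split)]:
    print(f"{label}: Obama margin {obama_margin(shares) * 100:+.1f} points")
```

With these invented inputs, the candidate who trails among independents by ten points still leads in the D+7 sample and trails in the even one, which is the whole argument over sample composition in miniature.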