You don’t need a weatherman to know which way the wind blows (Bob Dylan, Subterranean Homesick Blues)
As you probably know, we are not fans of the statisticians’ magical arts when it comes to selling or buying a home. We’re no experts on this, but this guy is. Frankly, we never met a buyer who bought a house because it was below the average or median home price or turned one down because it was above the average or median. Love is blind, even to median home prices. But some of these wizards have gone beyond the standard rabbit trick of pulling median housing prices out of their hats to predict housing health (the groundhog is as reliable). In the case of the Zillowists, they are calculating the median of UNSOLD homes to measure a housing market’s health. Zestimated home data pollutes the pure data pool of actual sales. Any report built on these is bupkes in our book. But the Trulia Trend Report is different.
Like sushi lovers, we eat up raw data. And the Trulia Trend Report serves up mostly raw data based on actual searches done on their site. Yes, they give median or average prices, but we ignore those (except to the extent that there is a large gap, which tells us where the cheaper seats are). We think it’s useful to know what people are searching for. This is valuable because it gauges popularity. Popularity equals demand, which equals sales. There are no search guesstimates to gum up the works.
According to the February Trend Report, here are the top searched ski towns (go to the Report to see those silly average list prices):
1. Breckenridge, CO
2. Park City, UT
3. Aspen, CO
4. Mammoth Lakes, CA
5. Jackson, WY
Those areas are for the rich kids. If you still want a place in the mountains, and money left over to buy groceries, try Big Bear Lake, CA, Taos, NM, or Killington, VT for the cheaper lift seats.
Want to know where Trulia visitors are searching now? Here are the top ten areas with more than double the regular traffic. Four are in Virginia. (Virginia is for lovers, after all.) The Jersey Shore rush is due to the wise crowd trying to lock up a summer rental. You gotta start looking in January & February or you get stuck with the shacks reeking of smoke from the senior classes, 5 blocks from the beach. Ah, the good ol’ summertime. But we digress.
1. Suffolk, VA +193%
2. Stuart, FL
3. Springfield, VA
4. Toms River, NJ
5. Point Pleasant Beach, NJ
6. Virginia Beach, VA
7. Woodbridge, VA
8. Panama Beach, FL
9. Queens, NY
10. Akron, OH +148%
As for the average searched home on Trulia: 3.2 beds, 2.2 baths, under $500,000 and about 2,000 sq. ft.
Looks like it’s going to be partly cloudy. Or is it mostly sunny?
Actual Trulia Trend Report for February 2007 is here.
Related Post:
Trulia Heat Maps & Trend Reports Can Be Useful Tools.

Hi Sellsius,
My quest to explain Zillow’s quarterly reporting continues … good news is that we have middle ground to work from — thanks for linking to the Surowiecki article. The quote that explains the author’s criticism of traditional house value reporting is … “Because nominal median prices compare completely different groups of homes (all those sold in August, 2005, say, and all those sold in August, 2006), they can overstate how much prices go up during booms and understate how much they go down during busts.”
That is why reporting on house value trends using median sales is flawed. In the article he says this is how NAR statistics are reported — it’s frankly the most common method. As Surowiecki explained, this approach becomes increasingly flawed with increasing fluctuation in the state of the housing market. Few people would contest that we’re in a fluctuating market.
Surowiecki’s point is that the only way to accurately measure a trend is to normalize the sample. Only Zillow does this.
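The composition problem in the Surowiecki quote can be shown with a toy market. This is only a sketch: the six homes, their values, and which ones sell in each period are all made up for illustration. No home changes value between the two periods, yet the median of homes sold moves a lot because a different mix sold.

```python
from statistics import median

# Hypothetical market: every home keeps the same value in both periods.
values = {
    "condo_a": 200_000, "condo_b": 220_000, "condo_c": 240_000,
    "house_a": 500_000, "house_b": 550_000, "house_c": 600_000,
}

# Only some homes sell in each period, and the mix shifts toward houses.
sold_2005 = ["condo_a", "condo_b", "condo_c", "house_a"]
sold_2006 = ["condo_c", "house_a", "house_b", "house_c"]

def median_of(names):
    """Median value of the named homes."""
    return median(values[name] for name in names)

median_2005 = median_of(sold_2005)  # median of homes sold in period 1
median_2006 = median_of(sold_2006)  # median of homes sold in period 2
fixed_sample = median_of(values)    # same six homes in every period

print("Median of 2005 sales:", median_2005)
print("Median of 2006 sales:", median_2006)
print("Median of fixed sample:", fixed_sample)
```

The sales-based median more than doubles even though nothing appreciated, while the fixed-sample median stays put. Whether the values fed into a fixed sample are any good is a separate question.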
Your other criticism seems to be that “zestimated homes are mixed with sold homes”. This is not the case. In fact, we don’t include any sales in the report and only look at Zestimates (sales are inputs to the Zestimates only). We’re trying to understand where the market IS, not where it WAS at various times (sale dates) during the past 3 months.
Now that you know we’re only reporting Zestimates, the obvious question is “what about the Zestimates that are way off?”
That’s why we use a median to calculate the Zindex; it excludes inaccurate outliers. Obviously it’s only effective if the outliers are evenly distributed. If you read this week’s WSJ analysis of Zestimate accuracy, you’ll notice they confirm our internal accuracy measurements and that their study showed no Zestimate bias … i.e. when we’re wrong, we’re as likely to be “high” as we are to be “low”.
So, Zestimate inaccuracies are ignored when we calculate the Zindex (trend).
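The claim about medians and unbiased errors can be checked with a small simulation. This is a sketch under invented assumptions, not Zillow's method: the home values, the 8% typical error, and the 5% share of wild outliers are made-up figures, and `noisy_estimate` is a hypothetical stand-in for an estimate that is as likely to be high as low.

```python
import random
from statistics import median

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical true values for 10,001 homes.
true_values = [random.lognormvariate(12.5, 0.4) for _ in range(10_001)]

def noisy_estimate(value):
    """Estimate with an error that is equally likely to be high or low."""
    pct = random.gauss(0, 0.08)  # typical error (assumed)
    if random.random() < 0.05:   # 5% are wild outliers (assumed)
        pct = random.choice([-1, 1]) * random.uniform(0.5, 0.9)
    return value * (1 + pct)

estimates = [noisy_estimate(v) for v in true_values]

true_median = median(true_values)
est_median = median(estimates)

# Relative gap between the two medians, and the worst single estimate.
gap = abs(est_median - true_median) / true_median
worst = max(abs(e - t) / t for e, t in zip(estimates, true_values))

print(f"Gap between medians: {gap:.2%}")
print(f"Worst individual error: {worst:.2%}")
```

Under these assumptions the two medians land close together even though some individual estimates miss by half the home's value or more. The sketch says nothing about how accurate any single estimate is, and it depends entirely on the errors having no lean in either direction.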
Bottom line … Zillow is the only source of house value trend reporting that is immune to shifts in local markets. That said, all statistics benefit from an expert’s interpretation — which is why we now release our raw data to local experts to add their local flavor. Zillow’s Q4 reports in 75 Metro areas can be found here: http://www.zillow.com/quarterlies/QuarterlyReports.htm
I know that these statistical concepts are hard to grasp. Let me know if you have any questions about my comments.
As always, we welcome your comments, David. Yet, we must disagree with your analysis.
A housing report which does not reveal the percentage of the data which is actual sales vs. unsold zestimates (of undetermined inaccuracy) cannot be said to be better by virtue of those unreliable zestimates. If you agree with Surowiecki, as you seem to, that the median of “actual sales” is suspect, how can the median of guesstimates be better? The Zindex Housing Report is a mixed bag, only we don’t know the mixture. If most of the data is zestimates of UNKNOWN error rates, what you have is a Zestimate Report. And since a Zestimate Report contains data of uncertain accuracy, it is not likely to be a true indicator; it looks more like a STARTING POINT for a report. My logic goes like this: if a zestimate is a starting point, a Housing Report containing those same zestimates must also be only a Starting Point. No?
(If, on the other hand, most of the data in Z’s Report is actual sales, you have a more traditional report, which as Jim says, is also flawed. Pick your poison.) To repeat myself once again, medians of actual sales are flawed and medians of guesstimates are more flawed.
The Zindex Drinking Water
I have used the analogy of drinking water. Actual sales are pure water. Zestimates are contaminants because they may be way off (highly toxic). I do not want to drink from the Zindex Report unless I know the mixture. Why doesn’t Z reveal the % of data in the Report that comes from actual sales and the % that comes from zestimates? Theoretically, if a market had no sales, the entire report is based on contaminated (in various degrees) zestimates, guesses, starting points, first steps. In a word, bupkes to the serious. As you have stated:
“Zestimates are a useful starting point in researching house values, but serious buyers will get to a point where they move beyond the Zestimate…” A serious housing report must do the same.
Why Advertise Median Error rates? Why calculate them against actual sales? Because actual sales are the standard the zestimate is measured against.
The reason Z publishes its median error rates is to make the zestimate seem fairly accurate. And the zestimate’s accuracy is ultimately measured against “actual sales”. The actual sales, the reality, trigger the error rate, which measures the value of the zestimate POST FACTO. If the error rate were zero you would in fact have the equivalent of ACTUAL sales. Thus, housing reports with zestimates, which aspire to be as accurate as actual sales, cannot logically be more valuable than the actual sales themselves. This concept may be hard to grasp but once you do, you’ll realize why zestimates are always compared to actual sales to measure their value.
A final question. We always talk about median, but rarely average. Does Zillow know their “average” error rates? If so, will they reveal them? If not, why not?
Sorry to put you on the spot, David, but we really want to know.
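Why the median-versus-average question matters can be seen with a handful of numbers. The sale prices and estimates below are invented for illustration and are not Zillow data. Each error is measured the way described above, against the actual sale price.

```python
from statistics import mean, median

# Hypothetical (sale_price, estimate) pairs; the last two are far off.
pairs = [
    (200_000, 204_000),
    (250_000, 242_500),
    (300_000, 312_000),
    (400_000, 380_000),
    (500_000, 530_000),
    (350_000, 325_500),
    (450_000, 486_000),
    (300_000, 480_000),
    (200_000, 380_000),
]

# Absolute percentage error of each estimate against its actual sale.
errors = [abs(estimate - sale) / sale for sale, estimate in pairs]

median_error = median(errors)
mean_error = mean(errors)

print(f"Median error: {median_error:.1%}")
print(f"Average error: {mean_error:.1%}")
```

With these made-up figures the median error is 6% while the average is above 20%, because two bad misses pull the average up and leave the median untouched. The same data can look fairly accurate or fairly alarming depending on which number gets published.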
In case I wasn’t clear:
1) ZERO percent of Zillow’s quarterly reports cover actual sales.
2) Zestimate error rate is of no consequence to the median calculation because Zestimate errors are unbiased.
It really is just that simple.
“A starting point” explains that there are outliers in INDIVIDUAL Zestimates. That issue is irrelevant in the aggregate because again, medians effectively exclude all outliers when there’s no bias in the error.
If no actual sales are in the report then it’s totally a zestimate report. That’s the problem. To the extent the zestimates are flawed (i.e. not likely to match actual sales), the report is flawed.
Zestimate error “unbias” is really not the issue. It is zestimate accuracy. If zestimates aspire to perfect accuracy, they aspire to be (match) actual sales. To me that is the essence of it. Z would rather issue housing reports based on medians of zestimates, when in their perfect state (0 error rate), these zestimates would be actual sales.
You can’t escape the fact that zestimate accuracy is a function of comparison to actual sales. The difference from the actual sales is the error rate. If actual sales are the standard against which zestimates are measured, how can zestimates be more valuable?
BTW, please, please tell us if Z knows the “average” error rates of its zestimates. If so, what are they? We have a feeling the average error rate is “shocking” and that is why Z only discloses the median. Prove us wrong.