The Zindex Illusion or How to Sell A Housing Report Without Using Actual Home Sales


Zillow’s Zindex does not include recent home sales, yet it professes to be a reliable indicator of home values.

From Zillow’s May 6, 2008 Press Release:

Home values in the first quarter of 2008 fell 1.6 percent from the fourth quarter and 7.7 percent from the year-ago quarter, marking the most significant year-over-year decline in the past 12 years*. (emphasis added)

From Stan Humphries’s post at the Zillow Blog:

Conditions continued to worsen in Q1 as U.S. home values continued their slide down with the Zindex posting a 7.7% year-over-year decline. (emphasis added)

Sounds like serious stuff, this Zindex. It’s not. It’s a statistical sleight of hand– a Zillow illusion– a Zillusion, if you will. Why? Because a Zindex does NOT contain actual home sales. That’s right.

Not a single home sale is included in the Zindex as a data point.

So what exactly is in this Zindex stew? Nothing but zestimates and boiled potatoes (those folks who proclaim the Zindex as an authoritative housing value report). For those who don’t know it, zestimates are NOT home values, despite what Mr. Humphries writes or Zillow publishes.

Zestimates are Zillow estimates of home market values based on computer calculations which depend in large part on data from the local tax assessor’s office, data that may be outdated, incomplete or outright wrong (neither the computer nor the Zillow stat man visits your home). Zestimates are, thus, inherently erroneous. To what degree is unknown: the zestimate’s degree of error is not determined UNTIL a home is actually sold. This is a key point. The actual sale determines the zestimate’s worth as a home value indicator; it is the actual home sale which creates the error rate that Zillow touts as a measure of a zestimate’s accuracy and reliability.
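To make the point concrete, here is a minimal sketch of how a zestimate's error can only be computed once a sale price exists. The function name and the dollar figures are hypothetical, chosen for illustration; this is not Zillow's own error calculation, which I have not seen.

```python
from typing import Optional


def zestimate_error_pct(zestimate: float, sale_price: Optional[float]) -> Optional[float]:
    """Signed percentage by which a zestimate misses the actual sale price.

    Returns None when there is no sale, because without a sale
    there is nothing to measure the zestimate against.
    """
    if sale_price is None:
        return None  # no sale yet, so the error is unknown
    if sale_price <= 0:
        raise ValueError("sale_price must be positive")
    return (zestimate - sale_price) / sale_price * 100.0


# Hypothetical figures, for illustration only.
sold = zestimate_error_pct(zestimate=310_000, sale_price=300_000)
unsold = zestimate_error_pct(zestimate=310_000, sale_price=None)

print(f"sold home error: {sold:.2f}%")
print(f"unsold home error: {unsold}")
```

For the unsold home the function has nothing to return, which is the whole problem: the sale is what supplies the measuring stick.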

So, if actual sales are the gold standard of value against which zestimates are measured, why are they excluded from the Zindex as data points? The short answer is Zillow believes its zestimates are more reliable than recent actual sales to measure housing value trends. (David G of Zillow has called recent sales data a questionable “legacy approach”). This is probably based on the notion that it is better to have estimates of all homes than actual sales prices of some. While this is an attractive theory when there is a dearth of recent sales, it is nonetheless flawed when one is using zestimates of questionable worth. (It is even more unreliable when sales are robust.) Keep in mind that by Zillow’s own admission, a zestimate is only a starting point for determining a home’s value. So, start by asking yourself how a gaggle of starting points can rise to any level of authority of housing values.

Yet, the Zindex database is wholly comprised of zestimates, these starting points, of unknown error. And actual home sales, the ending points, and the zestimate’s gold standard, are left out. But the Zindex is what Zillowsticians are peddling to the real estate industry and the public as authority on the state of home values. Thanks to newspapers picking up the story (and giving the Zindex credibility), the public may be buying it. I’m not. Now, you can expect a comment from David G that because the Zindex is the median zestimate (middle value), the Zindex somehow overcomes the negative effect of having no actual home sales data points and an unknown error rate. Hooey with a capital H, Mr. G. Actual recent home sales are more valuable than zestimates as data points.

Question Zillow (and David G) will not answer: Does the exclusion of recent sales data points make the Zindex a more reliable report than one that does contain actual sales?

In my opinion, the exclusion of recent home sales makes the Zindex less reliable than one that includes recent home sales.

Tip to Zillow to Improve its Zindex: Since you believe all homes should be represented in your Zindex, you should include recent home sales, in lieu of zestimates, for those homes. (You can send me my fee in cash in a brown paper bag.)
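A rough sketch of what that tip would look like in practice, assuming the Zindex is simply the median zestimate for an area, as Zillow's own description states. The homes, prices, and function names below are hypothetical; the second function is my suggested variant, not anything Zillow publishes.

```python
from statistics import median

# Hypothetical homes: (zestimate, recent sale price or None if no recent sale).
homes = [
    (250_000, None),
    (300_000, 280_000),
    (350_000, None),
    (400_000, 340_000),
    (450_000, None),
]


def zindex(homes):
    """Median of the zestimates only; sale prices are ignored as data points."""
    return median(z for z, _ in homes)


def zindex_with_sales(homes):
    """Suggested variant: use the sale price where one exists, else the zestimate."""
    return median(sale if sale is not None else z for z, sale in homes)


print(zindex(homes))             # 350000
print(zindex_with_sales(homes))  # 340000
```

Every home is still represented in the second version, so the "all homes" argument survives; the only change is that a known sale price replaces an estimate of it.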

*Since Zillow’s Zindex only came into existence in 2006, the Press Release reference to a 12 year low is misplaced. It’s like comparing zapples to zoranges.

[Image: Waterfall by M.C. Escher]

Update: So as to avoid any misunderstanding, the Zindex data points are exclusively zestimates.  Even if there is a recent sale, the Zindex will not take the sales price as a data point but, instead, take the zestimate. (To the extent there were recent sales, they are crunched by the Zillow machine into the zestimates, which nonetheless remain different than the recent sale prices.)  See the comments below for further clarification and examples.

Related Posts:

Zillow Housing Reports: The Statistical Lie of Estimated Truth

Zillow: Truth by Association

The Power of Print: Perception as Truth (see David G’s comment that “recent sales data” is a questionable “legacy approach”. Oh yeah, well this legacy of recent sales is what tax assessors use and, thus, Zillow.)

  • Real estate statistics are often confusing, even when based on actual data. I wouldn't know what to make of them if they're based on estimates. Case in point: my last post was about April's Days On the Market being 57% down compared to March. This could make a good news story, "Houses are selling fast in Princeton NJ", which I know is not the case. I found an explanation for this "good" number and provided this explanation in my post. If people base their home buying/selling decisions on data, they need to understand not only the underlying basis for such data, but also how to interpret it.
    The actual sales price data is better than estimates, but may also not reflect the direction of the market, unless one lives in one of the 20 Case-Shiller markets and uses their data. This fact is often not understood by people who read about real estate prices.
  • It's no wonder consumers are confused.
    There's so much misinformation circulating that it's nearly unbelievable.
    We live in a world of endless "noise".
  • Up-to-date sales data is critical in estimating home valuations. I've been told that in some markets, appraisers must include sales comparables that are only three months old, rather than looking six months back, as has been typical. In our current market, tax valuations are all over the board and do not appear consistent at all with how the actual real estate market is performing.
  • Joe,

    Please correct your post. You are incorrect. I'm sure that you know better, but if you'd looked into this you'd have realized that the Zindex and Zillow's quarterly home value report ARE based on recent home sales data. The primary data input to these calculations is recent sales information.

    Joe, you could not be more wrong. Please correct your post.

    The Zindex is the median Zestimate value. Zestimate values are calculated using recent sales. Recent sales are the most important data input to Zestimate values. The Zindex therefore certainly includes recent sales transactions. The Zindex is a preferable mechanism for studying home value trends than the traditional median sales price analysis because it includes the value of all homes. For a detailed discussion of the zindex vs. median sales price analysis, see this blog post:
    http://www.zillowblog.com/debunking-the-median-...

    Please correct your post. Please also cease and desist to publish and spread misinformation about Zillow.com and its products.

    Thank you,

    David Gibbons
  • To answer your question ... "since we do not exclude recent home sales from the Zindex analysis that question makes no sense."
  • David,

    Perhaps you need to read up on the Zindex ;)

    According to Zillow.com:

    "The Zindex home valuation index is the median Zestimate valuation for a given geographic area on a given day."

    ...suppose there are 101 homes in your county. Zillow would create a Zestimate for each of these homes. We then arrange all 101 Zestimates from lowest value to highest and, starting from the smallest value, we would pick the middle one — the 51st — and this would be the Zindex for your county on that particular day.

    "A Zestimate home valuation is Zillow's estimated market value. It is not an appraisal. Use it as a starting point to determine a home's value."
    http://www.zillow.com/howto/WhatsaZindex.htm

    Thus, a median "zestimate" is a zestimate, a Zillow starting point, NOT a sale price.

    Please correct your Zindex, or your comment.

    Also, please correct the following in the Zillow press release:

    1. The reference to a 12 year record is misleading IMO since the Zindex has only existed since 2006. What exactly is Zillow comparing the Zindex to when it refers to a 12 year low? It can't be another Zindex --- perhaps a report using actual sales? Zapples to Zoranges, David.

    2. The press release and Mr. Humphries's post state "home values" have declined. More accurately, "Zillow estimates of home values" have declined. Please correct this misstatement. Home values are not known until a sale has taken place.

    Zillow can't have it both ways: it can't claim zestimates are only starting points and then tout a Zindex Housing Report containing only zestimates as in any way authoritative. Geez. Indeed, the zestimate-only Zindex is, at best, a starting point housing valuation report.
  • Joe,

    You should research "logic." Here is an example of how "logic" works ...

    1) Blog posts are made up of words
    2) A blog is a collection of blog posts.
    3) So, a blog is made up of words.

    Now, let's apply this amazing "logic" to Zillow's Zindex ...

    1) Zestimates are calculated from recent sales.
    2) The Zindex is calculated from Zestimates.
    3) So, the Zindex is calculated from recent sales.

    Now that we have proven that Zillow's Housing Reports do use actual home sales will you please correct your post?
  • David

    I thank you for your logic lesson.

    I give you, in return, a lesson to distinguish a fact from a spin of a fact:
    Fact: Sales prices are NOT data points in the Zindex.
    Spin: Zestimates incorporate sales prices. Therefore sales prices are part of the zestimate and, by extension, the Zindex.
    What your logic/spin ignores is that zestimates without recent sales to incorporate remain only zestimates, i.e., starting points of unknown error. And these "starting points" are all data points in the Zindex.

    Now, for an example to show the difference between a fact and its spinning partner:

    This home sold for $429,936 on 2/6/08
    The zestimate is $436,500 (updated 5/9/08)

    Guess which number goes into the Zindex? Yes, the zestimate, NOT the sale price.
    I say the actual sales price should be the data point, NOT the zestimate which incorporates it.

    http://www.zillow.com/HomeDetails.htm?zprop=644...

    In the above example, I used a zestimated home with a recent sale price. If we take a home without a recent sale price, we have only a naked zestimate, i.e., merely a starting point of unknown error.

    As always, a pleasure to dance with you.
  • Joe,

    Both the sales transaction value and the Zestimate value you mention above are used in calculating the Zindex value for the Home Value Reports.

    In your post you erroneously claim that Zillow does not use recent sales to calculate home value reports. That is incorrect. Please correct your post.
  • David,

    I'm sorry you misunderstood the post and I take responsibility for that.

    So, I clarified the point that Zillow will use a zestimate over a recent sales price as a data point. And for those homes with no recent sales the Zindex will take the starting point (of unknown error) zestimate as a data point. I hope that helps.

    Also, I notice you did not dispute my example of the recently sold home, where Zillow used the zestimate instead of the sales price as a data point in the Zindex. Am I to assume that is a correct statement?
  • Joe,

    If you read my previous comment you'd have noticed that I did dispute your example. Both the sales value and the Zestimate value in your example are used to calculate the Zindex. The sales value in your example is in fact used many times in the Zindex calculation whereas the Zestimate value is used only once.

    Even your post's title claims that the Zillow housing report does not use actual home sales; that is incorrect.

    Please correctly correct your post.
  • David,

    Let me state this as simply as possible. In my example, the number used as the data point in the Zindex IS the zestimate of $436,500. This is a correct statement. For you to suggest otherwise is misleading to the public, IMO.

    The Zindex simply does NOT use recent sales prices as data points (they only use the zestimates, which is not the same number, by anyone's logic). This too is a correct statement. You obviously agree when you state "the zestimate value is used only once". Exactly! As a Zindex data point. Thank you for making the point, although obtusely.

    I have quoted from zillow.com on how the Zindex is determined. Here it is again for your benefit:

    "....suppose there are 101 homes in your county. Zillow would create a Zestimate for each of these homes. We then arrange all 101 Zestimates from lowest value to highest and, starting from the smallest value, we would pick the middle one — the 51st — and this would be the Zindex for your county on that particular day."

    You will notice that nowhere does it state that BOTH the sales price and zestimate are used as data points-- only 101 zestimates for 101 homes. Simple to understand, no? (actually, you just need to know how to read the words)

    Until the Zindex language is corrected, my post stands as accurate. I say it again: Only zestimates, and NOT recent sales prices are the data points comprising a Zindex. (I notice you avoid the use of the term "data point", suggesting to me you are intentionally trying to skirt the issue --- I can't imagine you don't know what a data point is)

    Re: calculation in a Zindex. (I think here is where you go amiss.)

    There is no "calculation" in a Zindex other than listing the zestimates and counting to the middle one. Count along with me, David-- zestimate 1, zestimate 2... Easy. Did you count a recent sales price? If so, you made a boo-boo. Do not pass Go, do not collect $200 ;)

    If you still think I am wrong on the makeup of a Zindex, I suggest you go back and read what data points are counted when a Zindex is arrived at and come back with a link showing where recent sales prices are part of the data points counted. You won't find it, so I won't wait up.

    And let's not forget this key Zindex point: for homes without recent sales prices, only zestimates of unknown inaccuracy are filling up that Zindex, right? The more recent sales prices missing, the more the Zindex turns into zestimate mush.

    As always, you are the whirling dervish of Zillow spin doctors.

    PS: Hey, David, any chance, in the name of transparency, I can see the raw data making up a Zindex for a county?
  • Oh, I almost forgot. Can you explain how a 2 year old Zindex can arrive at a number that sets a 12 year record?
  • "Can you explain how a 2 year old Zindex can arrive at a number that sets a 12 year record?"

    Of course I can but I'd rather you think about it first. If you haven't figured it out by the morning, I'll tell you. It's a brain-teaZer.
  • No need David, as much as I enjoy a brainteaZer :) But thanks.
  • Why do I feel like I'm watching a cat play with a mouse :)
  • I love that M.C.Escher drawing at the beginning of the post.