Real Estate Scraping


[Image: real-estate-scraping.gif]

Unauthorized republication of copyrighted content by third parties has been on the rise recently. These sites aggregate our copyrighted content, display it on their own pages, and place advertising around it. The goal is to profit from our content.

This morning we stumbled on the project posting reproduced below, which went up today. The company is looking for developers to build a project that will “aggregate data” from other sites for publication on its own site. What “data” do you think would be OK to scrape: beds, baths, square footage, public data?

Question: Do they intend to build a search engine or a scraping site?
Anyone? Anyone…?

Project: CMS / Drupal Development
ID: 1172840643
Status: Open
Budget: N/A
Created: 3/2/2007 at 8:04 EST
Bidding Ends: 3/9/2007 at 8:04 EST (6 days left)
Project Creator: eapen
Rating: 10.00/10 (1 review)
Description: We like to develop a site that aggregates data from 6 of the most popular For Sale By Owner (FSBO) web sites that comes up on popular search engines, For Sale By Owner properties on zillow.com and all properties on realtor.com, including related blogs. We should then cache applicable data into a database and publish it to our own site. Design is not that important as long as its a clean and business like interface.

Must automatically refresh the data multiple times a day to be current.

Database must be scalable to millions of records.

Give me a turnkey pricing on this. We will get started with this and fine tune and expand to a few other categories. Gives us a feel of your capabilities and commitment before we go further.

Let me know what kind of hosting framework you will need as well.

Job Type:
  • Ajax
  • Joomla
  • PHP
  • MySQL
  • Website Design

Update:

Seems like Liam, General Counsel for Zillow.com, caught wind of our post and left them a little message.

Programmer: liam
Bid: $5
Delivery Within: 1 day
Time of Bid: 3/2/2007 at 16:40 EST
Rating: (No Feedback Yet)
Message: This is a violation of Zillow’s Terms of Use. I’m assuming you would check before proceeding, but this message further puts you on notice. I’m asking you not to pursue this project, at least with respect to Zillow.com. You should also check the terms on the other sites you’re targeting. Liam at Zillow.com General Counsel

Programmer: tesfsh
Bid: $500
Delivery Within: 15 days
Time of Bid: 3/2/2007 at 15:48 EST
Rating: (No Feedback Yet)
Message: hi there we can do it for you using well known and easy cms .trust us just lets start the job know. portofilo : www.mideanhotel.com www.teklufurnitures.com www.rapidfone.com/shop/ our web site: www.tesfalem.byethost33.com

Programmer: manikandan
Bid: $1,500
Delivery Within: 10 days
Time of Bid: 3/2/2007 at 19:32 EST
Rating: (No Feedback Yet)
Message: Yes It can be Done have a Look in PM

5 Responses to “Real Estate Scraping”


  1. John Wubbel Mar 3rd, 2007 at 12:13 am

    Well, I think many folks get offensive the minute someone implies infringement on content. Unfortunately, very little critical thinking takes place on these matters in an effort to ascertain the real value. We just start flaming away on the newsgroups or blogs.

    Somehow the idea of aggregation seems appealing, yet the business model behind the idea is probably weak at best. It is difficult to take seriously any snippet postings on sites where they bid out software development without sound software engineering functional specifications. Obviously the application has not been thought through a logical life cycle in detail.

    I have scraped many web sites for commercial real estate listings. While many of the bulletin board real estate listing services like to think they own the data, in reality the person who is marketing a commercial property wants the most exposure he or she can get, and so will list on every possible venue. It seems almost ludicrous to say the owner of the property does not also own the information/data which describes the features and attributes of the listing. After all, listing your place for sale is not condemnation by power of eminent domain on the Internet.

    Scraping web sites has a two-fold problem. First, the quality of data mined from other web sites is pretty lousy. It is either lacking details or it is old and useless. Second is a problem of presentation. I think the second problem has to do with perspective. Either you are a legacy technology person trying to invent new ways of using relatively old infrastructure, or you are a bold person willing to build Web 2.0 technology, which can be risky. I find, working within the context of Web 2.0, that I am not happy with using the same old relational database technology for search and presentation. Thus, I experimented with the concept of building Property Banks. Property Banks put me into a whole new dimension on how I view property for sale or lease without the use of a database. In order to do this, I utilized the Solvent screen scraping tool and stored the listings into Property Banks that can be kept private or made public. Then, I used the Firefox extension from the MIT Simile project called Piggy Bank to view the listings in my Property Bank. This is where I feel the value of aggregation is real.

    If you are simply scraping to aggregate into a database, you end up with redundancy to the point where everyone looks syndicated. And as an investor, realize you still have to work much harder than simply finding a property listing on a bulletin board service somewhere, because it goes back to the quality of your data and the analytics you can perform on that information for selling or purchasing a good investment.

  2. sellsius° Mar 3rd, 2007 at 12:42 am

    Very interesting John. Would you be interested in posting on this subject?

  3. Ross Gordon Mar 7th, 2007 at 12:42 am

    Almost looks like they are urinating on that wall. Well, except for the lady….

  4. Louisville Properties Apr 30th, 2007 at 1:50 pm

    Very, very interesting… You would think that someone who is knowingly breaking a law or violating a terms-of-use policy would be a little more reserved than just putting it out there for everyone to see!
    It’s amazing to me what people do to take the “easy path.”
    FSBO Louisville

  5. John Wubbel Apr 30th, 2007 at 6:50 pm

    What exactly are you referring to in your Louisville Properties post: the paint scrapers or the screen scrapers? I am assuming the screen scraper comments. If you would like more depth, please read my Active Rain blog from the bottom up and you will see we are not taking the easy way out. We are simply working smarter by applying Web 2.0 technologies. Cheers!
