Screen scraping has had a bad rap for a long time, and its reputation is not entirely undeserved. Ryanair, an Irish-based airline, announced in August 2008 that it would cancel all tickets purchased through websites that employed screen scraping techniques, such as BravoFly, Opodo, Atrapalo, and OTBeach. There have also been countless examples of entire websites being duplicated using screen scraping. But does that mean that all screen scraping is bad? There are plenty of legitimate reasons to use techniques and technologies that allow you to get information off a website. Hopefully, this article will address the stigma attached to screen scraping by discussing some of its legitimate uses.
It’s been a long-standing practice of retail companies around the world to keep an eye on the competition’s pricing. By knowing your competitors’ prices, you’re able to make adjustments to your own pricing and remain an attractive shopping option. Now that most companies have moved their prices online, you no longer have to send “spies” into retail locations, spend hours leafing through newspaper inserts, or make price-inquiry phone calls. Many websites have no published policy on the use of screen scraping techniques. That’s not an open invitation to do whatever you want on the site, but it may mean that collecting publicly posted prices is tolerated, as long as you’re not causing an unreasonable strain on the site’s servers.
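As a rough illustration of the extraction step, the sketch below pulls prices out of an HTML fragment using only Python's standard library. The markup, the `price` class name, and the helper names `PriceParser` and `extract_prices` are assumptions made up for this example; a real site will use its own structure.

```python
from html.parser import HTMLParser


class PriceParser(HTMLParser):
    """Collects the text of every element whose class attribute includes 'price'."""

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # An attribute without a value is reported as None, hence the "or".
        classes = (dict(attrs).get("class") or "").split()
        if "price" in classes:
            self._in_price = True

    def handle_endtag(self, tag):
        # Leaving any element ends the price we were reading, if any.
        self._in_price = False

    def handle_data(self, data):
        text = data.strip()
        if self._in_price and text:
            # Drop the currency symbol and thousands separators before converting.
            self.prices.append(float(text.lstrip("$").replace(",", "")))
            self._in_price = False


def extract_prices(html):
    parser = PriceParser()
    parser.feed(html)
    return parser.prices


# Hypothetical competitor page fragment (not taken from any real site).
SAMPLE_HTML = """
<ul>
  <li><span class="name">Widget</span> <span class="price">$19.99</span></li>
  <li><span class="name">Gadget</span> <span class="price">$1,249.00</span></li>
</ul>
"""

competitor_prices = extract_prices(SAMPLE_HTML)
```

Fetching the page is deliberately left out. To avoid straining the site's servers, a real run would check the site's robots.txt (for instance with the standard `urllib.robotparser` module) and pause between requests.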
Forums can contain a wealth of useful information for product manufacturers, service providers, and marketers, but getting to that information is often clumsy and time-consuming. Provided the site doesn’t restrict the use of data extraction techniques, screen scraping can make a world of difference. Imagine you’re a cell phone manufacturer. You just released a new phone and want to keep an eye on the public’s reaction. Users are likely to be far more candid with the anonymity a forum offers than they would be in a more intimate setting such as a focus group. So, by monitoring a forum, you may be able to find useful information such as design successes and flaws, manufacturing defects, and consumer demand. The same principles can be used to monitor blogs or blog comments when no RSS feed is available.
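Once forum posts have been extracted, even a very simple tally can hint at what people are talking about. The sketch below counts how many posts mention each topic of interest. The sample posts, the topic list, and the function name `count_topic_mentions` are invented for illustration, and plain keyword matching is only a crude stand-in for real analysis.

```python
import re
from collections import Counter


def count_topic_mentions(posts, topics):
    """Return how many posts mention each topic keyword at least once (case-insensitive)."""
    counts = Counter()
    for post in posts:
        # Split the post into lowercase words so "Battery" and "battery," both match.
        words = set(re.findall(r"[a-z']+", post.lower()))
        for topic in topics:
            if topic in words:
                counts[topic] += 1
    return counts


# Hypothetical posts about a newly released phone.
posts = [
    "Battery drains in half a day, but the screen is gorgeous.",
    "Love the screen. The battery could be better.",
    "Antenna drops calls constantly.",
]
mentions = count_topic_mentions(posts, ["battery", "screen", "antenna"])
```

Each post counts at most once per topic, so one long rant does not outweigh several short complaints.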
Getting product information to the people who need it can be a pain (especially if you’re one of the people who need it). Distributors, wholesalers, and dropshippers often use archaic methods (CDs, Excel files, physical product catalogs) to get product information out. None of these methods gives people up-to-date information at the time they need it. This can make it impossible to determine inventory levels, adjust pricing, and stay aware of new product offerings or discontinuations. Screen scraping can provide a rather elegant solution. Whether you are scraping your own site and providing the information to resellers or scraping the site of your distributor, you’re able to extract the information you need in a timely, simple fashion. Some solutions, such as Mozenda, offer the ability not only to run a screen scraping agent on a regular schedule, but also to export that information automatically to a file or a website. This means that you can alert distributors to changes or, if you are a distributor, monitor suppliers’ changes, all without investing additional time and effort.
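To make the monitoring idea concrete, here is a minimal sketch that compares two scraped catalog snapshots and writes the differences as CSV. It does not reflect how Mozenda or any other product works; the snapshot layout (`{sku: (price, stock)}`), the SKUs, and the function names `diff_catalog` and `to_csv` are all assumptions for this example.

```python
import csv
import io


def diff_catalog(previous, current):
    """Compare two {sku: (price, stock)} snapshots and list what changed."""
    changes = []
    for sku in sorted(set(previous) | set(current)):
        if sku not in previous:
            changes.append((sku, "new"))
        elif sku not in current:
            changes.append((sku, "discontinued"))
        elif previous[sku] != current[sku]:
            changes.append((sku, "updated"))
    return changes


def to_csv(changes):
    """Render the change list as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["sku", "change"])
    writer.writerows(changes)
    return buffer.getvalue()


# Hypothetical snapshots from two consecutive scheduled runs.
yesterday = {"A100": (19.99, 12), "B200": (5.49, 0), "C300": (42.00, 7)}
today = {"A100": (17.99, 12), "C300": (42.00, 7), "D400": (8.25, 30)}
report = to_csv(diff_catalog(yesterday, today))
```

Reporting only the differences keeps the export small, and the CSV text can be saved to a file or passed on to whatever system the resellers already use.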
The above examples are only a handful of the thousands of legitimate uses for screen scraping. Hopefully, responsible users will go on to find new legal and ethical reasons to better organize and repurpose information from the web. When that time arrives, screen scraping, or whatever you choose to call it, won’t have such a stigma attached to it.