See also: Web Scraping, Screen Scraping, Web Mining, Data Scraping
Web Data Extraction is the process of retrieving unstructured or semi-structured data from web pages and importing it into a structured data system like a database or spreadsheet.
Web data extraction simulates human browsing by issuing the commands a person would perform: clicking, operating drop-down menus, and ultimately extracting specific data fields from a web page. Its value lies in automating data collection and in transforming the data from an unstructured format into a structured one, such as a CSV, TSV, or XML file. These files can then be downloaded and analyzed.
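As a minimal sketch of that unstructured-to-structured step, the Python example below pulls two fields out of a small HTML fragment and writes them as CSV using only the standard library. The sample markup, the `product`, `name`, and `price` class names, and the `ProductParser` and `html_to_csv` helpers are all assumptions made for illustration; this is not how Mozenda itself works.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical sample page; a real page would be fetched over HTTP.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Widget</span><span class="price">9.99</span></li>
  <li class="product"><span class="name">Gadget</span><span class="price">24.50</span></li>
</ul>
"""


class ProductParser(HTMLParser):
    """Collects the text of the name and price spans for each product item."""

    def __init__(self):
        super().__init__()
        self.rows = []      # one dict per product
        self._field = None  # field currently being read, if any

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "li" and cls == "product":
            self.rows.append({})  # start a new record
        elif tag == "span" and cls in ("name", "price") and self.rows:
            self._field = cls

    def handle_data(self, data):
        # Text outside a tracked span (such as whitespace) is ignored.
        if self._field:
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def html_to_csv(html):
    """Return the extracted product rows as CSV text."""
    parser = ProductParser()
    parser.feed(html)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["name", "price"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(parser.rows)
    return out.getvalue()


print(html_to_csv(SAMPLE_HTML))
```

The resulting text can be saved to a file and opened in a spreadsheet or loaded into a database.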
Common uses of web data extraction include business intelligence, price comparison or mapping, data monitoring, and mash-ups, to name a few. Web data extraction is also used heavily in Big Data implementations to meet external data-gathering requirements that demand high volumes of web data be processed and analyzed at scheduled intervals.
Web data extraction is an efficient way to collect contact information from the web. In the past, it meant writing specialized programs to pick out items of interest from the many lines of otherwise uninteresting HTML on a web page. Developing these programs was time-consuming, and the results were unreliable.
In 2007, Mozenda's founders created a way to mimic a person operating objects on a web page. This user-friendly method lets users operate the program without programming experience and identify target text by its location on the page, using criteria such as specific content, or by its appearance, as with email addresses or phone numbers.
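To illustrate what identifying text "by its appearance" can mean, here is a rough Python sketch that finds email addresses and US-style phone numbers with regular expressions. The patterns and the `find_contacts` helper are simplified assumptions for illustration only. They will miss valid formats, and they do not describe Mozenda's actual matching logic.

```python
import re

# Deliberately simple patterns; real-world email and phone formats vary widely.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}")


def find_contacts(text):
    """Return every email-like and phone-like string found in the text."""
    return {
        "emails": EMAIL_RE.findall(text),
        "phones": PHONE_RE.findall(text),
    }


sample = (
    "Contact sales@example.com or call (555) 010-1234. "
    "Support: help@example.org, 555.010.5678"
)
print(find_contacts(sample))
```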
The Mozenda Advantage
Unlike other parsing- or wrapper-based web data extraction software, Mozenda uses browser rendering technology, which allows the Mozenda application to look and behave like a web browser but act like a web scraper. This approach has multiple benefits:
- Mozenda loads and navigates pages just like a browser.
- Mozenda can click and activate any item on a web page and wait for it to load.
- Mozenda can easily navigate through sub-pages.
- With Mozenda, you only have to set up the navigation and capture action once, and it will replicate across pages and categories.
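The idea of defining a capture action once and replaying it across pages can be sketched in Python as a loop that applies one capture function to each page and follows a "next" link until none remains. Everything here is hypothetical: the in-memory `PAGES` site, the markup, and the `capture` and `crawl` helpers are assumptions for illustration, and the regex-based parsing is a shortcut that would be fragile on real HTML. Mozenda itself drives a rendered browser rather than parsing text this way.

```python
import re

# Hypothetical in-memory "site" standing in for real HTTP responses.
PAGES = {
    "/items?page=1": '<p class="item">Alpha</p><p class="item">Beta</p>'
                     '<a rel="next" href="/items?page=2">Next</a>',
    "/items?page=2": '<p class="item">Gamma</p>'
                     '<a rel="next" href="/items?page=3">Next</a>',
    "/items?page=3": '<p class="item">Delta</p>',
}

ITEM_RE = re.compile(r'<p class="item">(.*?)</p>')
NEXT_RE = re.compile(r'<a rel="next" href="([^"]+)">')


def capture(html):
    """The capture action, defined once: return the item names on a page."""
    return ITEM_RE.findall(html)


def crawl(start_url, fetch, max_pages=100):
    """Apply capture() to each page, following 'next' links until they run out."""
    results, url, seen = [], start_url, set()
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)  # guard against link cycles
        html = fetch(url)
        results.extend(capture(html))
        match = NEXT_RE.search(html)
        url = match.group(1) if match else None
    return results


items = crawl("/items?page=1", PAGES.__getitem__)
print(items)
```

Passing `fetch` as a parameter keeps the loop independent of how pages are retrieved, so the same capture logic could be pointed at a real HTTP client.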
Mozenda Training Videos
- Input Text Into a Form (0:44)
- Click the “Next” Button to Load the Next Page of Results (1:58)
- Schedule an Agent to Run Regularly (1:08)
- Combine the Contents of Two Fields (1:16)